Scrapy and friends

Open source at our heart

Where it all started

Make building spiders a breeze

Scrapy is an open source Python framework for web scraping, created by Zyte co-founders Pablo Hoffman and Shane Evans. Out of the box, Scrapy spiders download web page data (HTML, JSON, XML…), parse and process it, and save it in a structured format such as CSV, JSON, or XML.

Robust web scraping capabilities

Powerful open source technology

Scrapy ships with a wide range of built-in extensions and middlewares for handling cookies and sessions, as well as HTTP features such as compression, authentication, caching, user agents, robots.txt, and crawl depth restriction. It is also easy to extend: you can add custom middlewares or pipelines to your web scraping projects to get the specific functionality you require.

Giving you the power of Data Extraction

Scrapy

Scrapy is our open source web crawling framework written in Python. It is one of the most widely used and highly regarded frameworks of its kind: powerful yet easy to use.

Github Scrapy.org

Spidermon

Spidermon is our battle-tested open source spider monitoring library for Scrapy.

DateParser

DateParser is our library for parsing human-readable dates and times. Supports 18 languages.

Eli5

A library for debugging machine learning classifiers and explaining their predictions.

Formasaurus

Formasaurus uses machine learning to figure out the type of an HTML form: login, search, sign-up, password recovery, contact form, and so on.

W3lib

W3lib provides a number of useful web-related functions for your web scraping projects.

ScrapyRT

ScrapyRT lets you reuse your spider’s logic to extract data from web pages through a single HTTP request.

Queuelib

Queuelib lets you create disk-based queues in Python.

Parsel

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors.

Cssselect

CSS Selectors for Python

Itemloaders

Library to populate items using XPath and CSS with a convenient API

Itemadapter

Common interface for data container classes

Protego

A pure-Python robots.txt parser with support for modern conventions.

Price-parser

Extract price amount and currency symbol from a raw text string

Number-parser

Parse numbers written in natural language

Dev tools that make scraping easy

Zyte API

Unblock websites with one powerful API

Highest success rate with lowest response times
Lowest total cost of ownership
Highest compliance standards built in
Only pay for what you use
AI Scraping for product data

Zyte Data

Get web data delivered quickly and accurately.

We extract data for the largest companies in the world so they don't have to
Tell us about your project, we'll handle the rest
Leverage our world-class legal team to inform compliance
Standard and bespoke web data extraction projects

Start scraping the web in minutes

© Zyte Group Limited 2026