Where it all started

Make building spiders a breeze

Scrapy is an open source Python framework built specifically for web scraping by Zyte co-founders Pablo Hoffman and Shane Evans. Out of the box, Scrapy spiders download web page data (HTML, JSON, XML…), parse and process it, and save it in a structured data format (e.g. CSV, JSON, XML).

Robust web scraping capabilities

Powerful open source technology

Scrapy ships with a wide range of built-in extensions and middlewares for handling cookies and sessions, as well as HTTP features like compression, authentication, caching, user-agents, robots.txt and crawl depth restriction. It is also easy to extend: you can add custom middlewares or pipelines to your web scraping projects to get the specific functionality you require.
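To give a feel for what a custom middleware can look like, here is a small sketch of a downloader middleware that adds a default header to outgoing requests. It assumes the classic `process_request(request, spider)` hook; the class name, the header name and value, and the `myproject.middlewares` module path are all hypothetical.

```python
class DefaultHeaderMiddleware:
    """Downloader middleware sketch: add a header to every outgoing request."""

    def __init__(self, header_name="X-Crawl-Project", header_value="demo"):
        # Header name and value are placeholders for illustration only.
        self.header_name = header_name
        self.header_value = header_value

    def process_request(self, request, spider=None):
        # setdefault keeps any value the spider already set on the request.
        request.headers.setdefault(self.header_name, self.header_value)
        # Returning None lets Scrapy continue handling the request as usual.
        return None


# Enabling the middleware is a settings.py entry; the module path is hypothetical.
DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.DefaultHeaderMiddleware": 543,
}
```

The number in `DOWNLOADER_MIDDLEWARES` controls where the middleware sits relative to the built-in ones, so the right value depends on which built-ins it needs to run before or after.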

Documentation
Learn

Giving you the power of Data Extraction

Scrapy

Scrapy is our open source web crawling framework written in Python. It is one of the most widely used and highly regarded frameworks of its kind: powerful yet easy to use.

Github
Scrapy.org

Spidermon

Spidermon is our battle-tested open source spider monitoring library for Scrapy.

Github

DateParser

DateParser is our library for parsing human-readable dates and times. It supports 18 languages.

Github

Eli5

A library for debugging machine learning classifiers and explaining their predictions.

Github

Formasaurus

Formasaurus figures out the type of an HTML form using machine learning: whether it is a login, search, sign-up, password recovery or contact form, and so on.

Github

W3lib

W3lib provides a number of useful web-related functions for your web scraping projects.

Github

ScrapyRT

ScrapyRT lets you reuse your spider’s logic to extract data from web pages through a single HTTP request.

Github
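As a sketch of what that single HTTP request can look like, the helper below builds a ScrapyRT request URL. It assumes a ScrapyRT server running with its documented defaults (port 9080 and the `crawl.json` endpoint with `spider_name` and `url` parameters); `build_crawl_url` itself is a hypothetical helper, not part of ScrapyRT.

```python
from urllib.parse import urlencode


def build_crawl_url(spider_name, page_url, host="localhost", port=9080):
    """Build the URL for one ScrapyRT crawl request (hypothetical helper)."""
    # spider_name picks the spider; url is the page that spider should process.
    query = urlencode({"spider_name": spider_name, "url": page_url})
    return f"http://{host}:{port}/crawl.json?{query}"


crawl_url = build_crawl_url("quotes", "https://quotes.toscrape.com/")
```

Sending a GET request to `crawl_url` with any HTTP client should return a JSON document whose `items` field holds what the spider extracted, provided the server is running inside a Scrapy project that defines that spider.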

Queuelib

Queuelib lets you create disk-based queues in Python.

Github

Parsel

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors.

Github

Cssselect

CSS Selectors for Python

Github

Itemloaders

Library to populate items using XPath and CSS with a convenient API

Github

Itemadapter

Common interface for data container classes

Github

Protego

A pure-Python robots.txt parser with support for modern conventions.

Github

Price-parser

Extract price amount and currency symbol from a raw text string

Github

Number-parser

Parse numbers written in natural language

Github


Dev tools that make scraping easy

Zyte API

Unblock websites with one powerful API


  • Highest success rate with lowest response times

  • Lowest total cost of ownership

  • Highest compliance standards built in

  • Only pay for what you use

  • AI Scraping for product data

Zyte API - Ban Handling
Zyte API - AI Scraping

Zyte Data

Get web data delivered quickly and accurately.


  • We extract data for the largest companies in the world so they don't have to

  • Tell us about your project, we'll handle the rest

  • Leverage our world-class legal team to inform compliance

  • Standard and bespoke web data extraction projects

Talk to us
Learn more

Start scraping the web in minutes

Documentation
Scrapy Cloud


© Zyte Group Limited 2026
Scrapy and friends

Open source at our heart