Scrapy in 2026: New release brings modern async crawling standards

Read time: 6 min · Posted on January 12, 2026 · By Robert Andrews

Scrapy 2.14.0 is released with a major under-the-hood modernization. Say goodbye to Twisted Deferreds.
Table of Contents
  • Introduction
  • The ‘async’ revolution
  • Smarter scheduling by default
  • Action required: Clean up your spiders
  • Other improvements
  • Stack to the future
The world’s most-used open source data extraction framework just rang in the new year with a new release that brings a big structural shift.


If you have been waiting for Scrapy to fully embrace modern Python async/await patterns, the new version 2.14.0 is the release for you.

You might not see flashy new scraping tools in this release. But you will see a framework that is significantly more robust, future-proof, and aligned with modern standards. Think of this as an infrastructure upgrade, one that swaps out aging copper wiring for fiber optics.

The ‘async’ revolution

For years, Scrapy has relied heavily on Twisted’s Deferred objects. While powerful, they predate modern Python’s native async capabilities. In 2.14.0, Scrapy replaces a huge chunk of these internals with native coroutines.


This release introduces AsyncCrawlerProcess and AsyncCrawlerRunner. These are counterparts to the standard runners you know, designed to offer coroutine-based APIs.


What does this mean for you? If you are running Scrapy from a script (common in production pipelines), AsyncCrawlerProcess allows your crawler to play much nicer with other asyncio libraries.


It looks remarkably similar to the CrawlerProcess you are used to. You don't need to rewrite your setup entirely, but under the hood you are now running on a modernized, coroutine-friendly foundation.

It is now easier to integrate Scrapy into broader asynchronous applications without fighting against conflicting event loops or legacy Deferred chains.

Smarter scheduling by default

If you run large-scale crawls, you know that managing concurrency is an art. In 2.14.0, the DownloaderAwarePriorityQueue is now the default priority queue.


Previously, Scrapy’s scheduler could be a bit "blind," pushing requests without fully accounting for the downloader's current load on specific domains. The new default queue is "downloader aware": it manages request priorities more intelligently based on the downloader's state.


You don’t need to change a single line of code; your crawls should simply run smoother, with fewer bottlenecks when scraping multiple domains.
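The queue is still an ordinary setting, though, so it can be overridden; a settings sketch using Scrapy's SCHEDULER_PRIORITY_QUEUE setting (both class paths below ship with Scrapy):

```python
# settings.py sketch: Scrapy 2.14.0 now defaults to the downloader-aware
# queue, equivalent to setting it explicitly:
SCHEDULER_PRIORITY_QUEUE = "scrapy.pqueues.DownloaderAwarePriorityQueue"

# To restore the pre-2.14 behaviour instead, pin the old default:
# SCHEDULER_PRIORITY_QUEUE = "scrapy.pqueues.ScrapyPriorityQueue"
```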

Action required: Clean up your spiders

Scrapy is standardizing how spiders are configured, deprecating the use of class attributes for specific download settings in favor of the dictionary-based custom_settings.


The old way: If your spiders define download_timeout or user_agent directly as class attributes, you will start seeing warnings.

Other improvements

Beyond the async work, Scrapy 2.14.0 brings several smaller improvements:


  • Automatic image rotation: The ImagesPipeline now automatically transposes images based on EXIF data. If you scrape mobile-uploaded content (like real estate or classifieds), this fixes those annoying "sideways" photos automatically.

  • Python 3.9 support dropped: Scrapy 2.14.0 now requires Python 3.10+.


  • Better custom download handlers: For advanced users building custom protocol handlers, the API has been documented and improved with a new BaseDownloadHandler class, making it easier to extend Scrapy’s core capabilities.
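Registration is unchanged, still per URL scheme through Scrapy's DOWNLOAD_HANDLERS setting; the dotted handler path below is hypothetical:

```python
# settings.py sketch: route a URL scheme to a custom handler class.
# "myproject.handlers.MyFtpHandler" is a hypothetical path; in 2.14.0 such
# a class can build on the newly documented BaseDownloadHandler API.
DOWNLOAD_HANDLERS = {
    "ftp": "myproject.handlers.MyFtpHandler",
}
```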

Stack to the future

Scrapy 2.14.0 is about longevity.


By adopting async internals and modernizing the scheduling logic, the maintainers are ensuring Scrapy remains the go-to framework for serious web data extraction in 2026 and beyond.


Check out the full release notes or the Scrapy website for more information.

© Zyte Group Limited 2026