PINGDOM_CHECK

#ExtractSummit2026 The world's largest web scraping conference returns. Austin Oct 7–8 · Dublin Nov 10–11.

Register now
Data Services
Pricing
Login
Try Zyte APIContact Sales
  • Unblocking and Extraction

    Zyte API

    The ultimate API for web scraping. Avoid website bans and access a headless browser or AI Parsing

    Ban Handling

    Headless Browser

    AI Extraction

    Enterprise

    DocumentationSupport

    Hosting and Deployment

    Scrapy Cloud

    Run, monitor, and control your Scrapy spiders however you want to.

    Coding Agent Add-Ons

    Agentic Web Data

    Plugins that give coding agents the context to build production Scrapy projects. Starts with Claude Code.

  • Data Services
  • Pricing
  • Blog

    Learn

    Case Studies

    Webinars

    Videos

    White Papers

    Join our Community
    Web scraping APIs vs proxies: A head-to-head comparison
    Blog Post
    The seven habits of highly effective data teams
    Blog Post
  • Product and E-commerce

    From e-commerce and online marketplaces

    Data for AI

    Collect and structure web data to feed AI

    Job Posting

    From job boards and recruitment websites

    Real Estate

    From Listings portals and specialist websites

    News and Article

    From online publishers and news websites

    Search

    Search engine results page data (SERP)

    Social Media

    From social media platforms online

  • Meet Zyte

    Our story, people and values

    Contact us

    Get in touch

    Support

    Knowledge base and raise support tickets

    Terms and Policies

    Accept our terms and policies

    Open Source

    Our open source projects and contributions

    Web Data Compliance

    Guidelines and resources for compliant web data collection

    Join the team building the future of web data
    We're Hiring
    Trust Center
    Security, compliance & certifications
Login
Try Zyte APIContact Sales

Zyte Developers

Coding tools & hacks straight to your inbox

Become part of the community and receive a bi-weekly dosage of all things code.

Join us
    • Zyte Data
    • News & Articles
    • Search
    • Social Media
    • Product
    • Data for AI
    • Job Posting
    • Real Estate
    • Zyte API - Ban Handling
    • Zyte API - Headless Browser
    • Zyte API - AI Extraction
    • Web Scraping Copilot
    • Zyte API Enterprise
    • Scrapy Cloud
    • Solution Overview
    • Blog
    • Webinars
    • Case Studies
    • White Papers
    • Documentation
    • Web Scraping Maturity Self-Assesment
    • Web Data compliance
    • Meet Zyte
    • Jobs
    • Terms and Policies
    • Trust Center
    • Support
    • Contact us
    • Pricing
    • Do not sell
    • Cookie settings
    • Sign up
    • Talk to us
    • Cost estimator
Home
Blog
How To Set Up A Custom Proxy In Scrapy
Light
Dark

How to set up a custom proxy in Scrapy?

Read Time
5 Mins
Posted on
August 8, 2019
Handling Bans
By
Attila Toth
×

Try Zyte API

Zyte proxies and smart browser tech rolled into a single API.
Start FreeFind out more
Subscribe to our Blog

How to set up a custom proxy in Scrapy?

When scraping the web at a reasonable scale, you may come across a series of problems and challenges. You may want to access a website from a specific country/region.

Or maybe you want to work around anti-bot solutions. Whatever the case, to overcome these obstacles you need to use and manage proxies.

In this article, I'm going to cover how to set up a custom proxy inside your Scrapy spider in an easy and straightforward way.

Also, we're going to discuss what are the best ways to solve your current and future proxy issues.

You will learn how to do it yourself but you can also just use Zyte Proxy Manager to take care of your proxies.

Why you need smart proxies for your web scraping projects?

If you are extracting data from the web at scale, you’ve probably already figured out the answer. IP banning. The website you are targeting might not like that you are extracting data even though what you are doing is totally ethical and legal.

When your scraper is banned, it can really hurt your business because the incoming data flow that you were so used to is suddenly missing.

Also, sometimes websites have different information displayed based on country or region.

To solve these problems, we use a set of techniques for bypassing IP bans, such as rotating proxies for successful requests to access the public data we need.

Setting up proxies in Scrapy

Setting up a proxy inside Scrapy is easy. There are two easy ways to use proxies with Scrapy - passing proxy info as a request parameter or implementing a custom proxy middleware.

Option 1: Via request parameters

Normally when you send a request in Scrapy you just pass the URL you are targeting and maybe a callback function. If you want to use a specific proxy for that URL you can pass it as a meta parameter, like this:

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
def start_requests(self):
for url in self.start_urls:
return Request(url=url, callback=self.parse,
headers={"User-Agent": "My UserAgent"},
meta={"proxy": "http://192.168.1.1:8050"})
def start_requests(self): for url in self.start_urls: return Request(url=url, callback=self.parse, headers={"User-Agent": "My UserAgent"}, meta={"proxy": "http://192.168.1.1:8050"})
def start_requests(self):
    for url in self.start_urls:
        return Request(url=url, callback=self.parse,
                       headers={"User-Agent": "My UserAgent"},
                       meta={"proxy": "http://192.168.1.1:8050"})

The way it works is that inside Scrapy, there’s a middleware called HttpProxyMiddleware which takes the proxy meta parameter from the request object and sets it up correctly as the used proxy. The middleware is enabled by default so there is no need to set it up.

Option 2: Create custom middleware

Another way to utilize proxies while scraping is to actually create your own middleware. This way the solution is more modular and isolated. Essentially, what we need to do is the same thing as when passing the proxy as a meta parameter:

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
from w3lib.http import basic_auth_header
class CustomProxyMiddleware(object):
def process_request(self, request, spider):
request.meta[“proxy”] = "http://192.168.1.1:8050"
request.headers[“Proxy-Authorization”] =
basic_auth_header(“<proxy_user>”, “<proxy_pass>”)
from w3lib.http import basic_auth_header class CustomProxyMiddleware(object): def process_request(self, request, spider): request.meta[“proxy”] = "http://192.168.1.1:8050" request.headers[“Proxy-Authorization”] = basic_auth_header(“<proxy_user>”, “<proxy_pass>”)
from w3lib.http import basic_auth_header

class CustomProxyMiddleware(object):
    def process_request(self, request, spider):
        request.meta[“proxy”] = "http://192.168.1.1:8050"
        request.headers[“Proxy-Authorization”] =
                          basic_auth_header(“<proxy_user>”, “<proxy_pass>”)

In the code above, we define the proxy URL and the necessary authentication info. Make sure that you also enable this middleware in the settings and put it before the HttpProxyMiddleware:

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
DOWNLOADER_MIDDLEWARES = {
'myproject.middlewares.CustomProxyMiddleware': 350,
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 400,
}
DOWNLOADER_MIDDLEWARES = { 'myproject.middlewares.CustomProxyMiddleware': 350, 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 400, }
DOWNLOADER_MIDDLEWARES = {
    'myproject.middlewares.CustomProxyMiddleware': 350,
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 400,
}

How to verify if your custom proxy is working?

To verify that you are indeed scraping using your proxy you can scrape a test site that tells you your IP address and location (like this one). If it shows the proxy address and not your computer’s actual IP it is working correctly.

Rotating proxies

Now that you know how to set up Scrapy to use a proxy you might think that you are done. Your IP banning problems are solved forever. Unfortunately not! What if the proxy we just set up gets banned as well? What if you need multiple proxies for multiple pages? Don’t worry there is a solution called IP rotation and it is key for successful scraping projects.

When you rotate a pool of IP addresses what you’re doing is essentially randomly picking one address to make the request in your scraper. If it succeeds, aka returns the proper HTML page, we can extract data and we’re happy. If it fails for some reason (IP ban, timeout error, etc...) we can’t extract the data so we need to pick another IP address from the pool and try again. Obviously, this can be a nightmare to manage manually so we recommend using an automated solution for this.

IP rotation in Scrapy

If you want to implement IP rotation for your Scrapy spider you can install the scrapy-rotating-proxies middleware which has been created just for this.

It will take care of the rotating itself, adjusting crawling speed, and making sure that we’re using proxies that are actually alive.

After installation and enabling the middleware you just have to add your proxies that you want to use as a list to settings.py:

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
ROTATING_PROXY_LIST = [
'proxy1.com:8000',
'proxy2.com:8031',
# ...
]
ROTATING_PROXY_LIST = [ 'proxy1.com:8000', 'proxy2.com:8031', # ... ]
ROTATING_PROXY_LIST = [
    'proxy1.com:8000',
    'proxy2.com:8031',
    # ...
]

Also, you can customize things like ban detection methods, page retries with different proxies, etc…

Conclusion

So now you know how to set up a proxy in your Scrapy project and how to manage simple IP rotation. But if you are scaling up your scraping projects you will quickly find yourself drowned in proxy related issues. Thus, you will lose data quality and ultimately you will waste a lot of time and resources dealing with proxy problems.

For example, you will find yourself dealing with unreliable proxies, you’ll get poor success rate and poor data quality, etc... and really just get bogged down in the minor technical details that stop you from focusing on what really matters: making use of the data. How can we end this never-ending struggle? By using an already available solution that handles well all the mentioned headaches and struggles.

This is exactly why we created Zyte Proxy Manager . Zyte Proxy Manager enables you to reliably crawl at scale, managing thousands of proxies internally, so you don’t have to. You never need to worry about rotating or swapping proxies again. Here's how you can use Zyte Proxy Manager with Scrapy.

If you’re tired of troubleshooting proxy issues and would like to give Zyte Smart Proxy Manager a try then signup today. It has a 14-day FREE trial!

Useful links

  • Scrapy proxy rotating middleware
  • Discussion on Github about Socks5 proxies and scrapy
smart proxy manager

×

Try Zyte API

Zyte proxies and smart browser tech rolled into a single API.
Start FreeFind out more

Get the latest posts straight to your inbox

No matter what data type you're looking for, we've got you

G2.com

Capterra.com

Proxyway.com

EWDCI logoMost loved workplace certificateZyte rewardISO 27001 iconG2 rewardG2 rewardG2 reward

© Zyte Group Limited 2026