How to use Zyte Smart Proxy Manager with Scrapy

Posted on November 14, 2019

Zyte Smart Proxy Manager is a proxy service, specifically designed for web scraping. In this article, you are going to learn how to use Zyte Smart Proxy Manager inside your Scrapy spider.

By Attila Toth

How Zyte Smart Proxy Manager works

Zyte Smart Proxy Manager is a smart HTTP/HTTPS downloader. When you make requests through it, it routes them through a pool of IP addresses. When necessary, it automatically introduces delays between requests and discards IP addresses to help manage crawling challenges. As a result, you get successful responses without having to manage proxies yourself.
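To make the idea of rotating, throttling, and discarding IP addresses more concrete, here is a toy sketch in Python. It is not how Zyte Smart Proxy Manager is implemented; the class name `ToyProxyPool`, the round-robin strategy, the back-off step, and the example addresses are all assumptions made purely for illustration.

```python
import time


class ToyProxyPool:
    """Toy illustration of rotating, throttling, and discarding proxies."""

    def __init__(self, proxies, delay=0.0):
        self.proxies = list(proxies)  # addresses still in rotation
        self.delay = delay            # seconds to wait before each request
        self._index = 0               # position used for round-robin rotation

    def next_proxy(self):
        """Return the next proxy in round-robin order, pausing first if needed."""
        if not self.proxies:
            raise RuntimeError("no proxies left in the pool")
        if self.delay:
            time.sleep(self.delay)
        proxy = self.proxies[self._index % len(self.proxies)]
        self._index += 1
        return proxy

    def discard(self, proxy):
        """Drop a proxy that seems to be blocked and slow down a little."""
        if proxy in self.proxies:
            self.proxies.remove(proxy)
            self.delay += 0.01  # hypothetical back-off step


# Example addresses come from the documentation-only 192.0.2.0/24 range.
pool = ToyProxyPool(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
first = pool.next_proxy()
pool.discard(first)  # pretend the target site blocked this address
```

A managed service takes care of this bookkeeping for you, which is why the Scrapy setup in the rest of this article only needs a middleware and an API key.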

Zyte Smart Proxy Manager with Scrapy

To use Zyte Smart Proxy Manager, you need an account with a Zyte Smart Proxy Manager subscription. If you haven’t signed up yet, you can sign up for free. When you subscribe to a plan, you get an API key, which you will need in your Scrapy project to use Zyte Smart Proxy Manager.

Install Zyte Smart Proxy Manager middleware

The first thing you need to do is install the Zyte Smart Proxy Manager middleware:

pip install scrapy-zyte-smartproxy

Scrapy settings

Next, add these lines to the project settings:

# enable the middleware
DOWNLOADER_MIDDLEWARES = {'scrapy_zyte_smartproxy.ZyteSmartProxyMiddleware': 610}

# enable Zyte Smart Proxy Manager
ZYTE_SMARTPROXY_ENABLED = True

# the API key you get with your subscription
ZYTE_SMARTPROXY_APIKEY = '<your_zyte_proxy_apikey>'
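If you would rather not hard-code the API key in your project, one option is to read it from an environment variable. The sketch below shows the same settings written that way. The environment variable name `ZYTE_SMARTPROXY_APIKEY` is an assumption chosen for this example, not something the middleware requires.

```python
# settings.py (sketch): same settings as above, with the key read from the environment
import os

# enable the middleware
DOWNLOADER_MIDDLEWARES = {
    "scrapy_zyte_smartproxy.ZyteSmartProxyMiddleware": 610,
}

# enable Zyte Smart Proxy Manager
ZYTE_SMARTPROXY_ENABLED = True

# Assumption: the key was exported as ZYTE_SMARTPROXY_APIKEY before the crawl starts.
# The empty-string fallback only keeps this module importable when it is missing.
ZYTE_SMARTPROXY_APIKEY = os.environ.get("ZYTE_SMARTPROXY_APIKEY", "")
```

This keeps the key out of version control. Requests will not authenticate if the variable is unset, so you may prefer to raise an error instead of falling back to an empty string.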

Zyte Smart Proxy Manager settings

The middleware adds Zyte Smart Proxy Manager-specific settings to your project, and you can override them in your Scrapy settings. It’s also recommended to adjust the following standard Scrapy settings when crawling through the proxy:

  • disable the AutoThrottle extension
  • increase the maximum number of concurrent requests
  • increase the download timeout

AUTOTHROTTLE_ENABLED = False
CONCURRENT_REQUESTS = 32
CONCURRENT_REQUESTS_PER_DOMAIN = 32
DOWNLOAD_TIMEOUT = 600 
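If only some spiders in a project should go through the proxy, the same values can be collected in a dictionary and applied per spider. The helper below is a hypothetical convenience function written for this article, not part of Scrapy or the middleware. Its default values simply mirror the settings shown above.

```python
def smartproxy_settings(apikey, concurrency=32, timeout=600):
    """Return the settings from this article as a dict (hypothetical helper)."""
    return {
        # enable the middleware and the proxy
        "DOWNLOADER_MIDDLEWARES": {
            "scrapy_zyte_smartproxy.ZyteSmartProxyMiddleware": 610,
        },
        "ZYTE_SMARTPROXY_ENABLED": True,
        "ZYTE_SMARTPROXY_APIKEY": apikey,
        # recommended Scrapy settings when crawling through the proxy
        "AUTOTHROTTLE_ENABLED": False,
        "CONCURRENT_REQUESTS": concurrency,
        "CONCURRENT_REQUESTS_PER_DOMAIN": concurrency,
        "DOWNLOAD_TIMEOUT": timeout,
    }


settings = smartproxy_settings("<your_zyte_proxy_apikey>")
```

In a spider, the returned dictionary can be assigned to Scrapy's `custom_settings` class attribute, so that the overrides apply to that spider only.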

If you want to learn more about using Zyte Smart Proxy Manager, go to the support page. You can also read the story behind Smart Proxy Manager and how it was created.