Valdir Stumm Junior
1 min read
The Scrapy tutorial part I: Learn How To Use Scrapy for Web Scraping
Developed by Pablo Hoffman and Shane Evans, Scrapy is an open-source Python framework built specifically for web data extraction. With Scrapy spiders, you can download HTML, parse and process the data, and save it in CSV, JSON, or XML format.
By default, Scrapy does not use proxies. However, it makes them easy to add: you can plug in one of several proxy services, or write custom code for a bespoke proxy on top of Scrapy's built-in HTTP proxy downloader middleware (HttpProxyMiddleware). Services like Zyte Smart Proxy Manager (formerly Crawlera) are built to work with Scrapy and provide open-source middlewares that integrate their proxies into your scraping project.
This video covers the basics of web scraping using your web browser, Scrapy shell, and CSS selectors.
After watching this, you will know: