Explore resources by topic or category
Blog
Solution architecture: Conducting web scraping legal review
Sanaea Daruwalla
6 Mins
May 23, 2019
In this third post in our solution architecture series, we will share with you our step-by-step process for conducting a legal review of every web scraping project we work on.
Blog
Navigating compliance when extracting alternative data for finance
Sanaea Daruwalla
10 Mins
March 21, 2019
When it comes to using web data as alternative data for investment decision making, one topic rules them all: compliance.
Blog
Proxy management: In-house or off-the-shelf?
Ian Kerins
7 Mins
February 21, 2019
Proxy management is the thorn in the side of most web scrapers. Without a robust and fully featured proxy infrastructure, you will often face reliability issues and spend hours putting out proxy fires, a situation no web scraping professional wants to deal with.
Blog
GDPR and Web Scraping: IAPP Europe Data Protection Congress
Sanaea Daruwalla
4 Mins
December 13, 2018
I was recently invited to speak at the IAPP Europe Data Protection Congress in Brussels about web scraping and GDPR.
Blog
Data Quality Assurance For Enterprise Web Scraping
Ian Kerins
7 Mins
September 27, 2018
When it comes to web scraping, one key element is often overlooked until it becomes a big problem.
Blog
What I Learned As A Google Summer Of Code Student At Zyte
Chau Tung Lam Nguyen Bhatt
4 Mins
September 12, 2018
Google Summer of Code (GSoC) was such a great experience for students like me. I learned so much about open source communities as well as contributing to their complex projects.
Blog
GDPR Compliance For Web Scrapers: The Step-by-step Guide
Sanaea Daruwalla
9 Mins
July 25, 2018
Unless you’ve been living under a rock for the past few months, you know that the EU’s General Data Protection Regulation (GDPR) is upon us.
Blog
Do Androids Dream Of Electric Sheep?
Mikhail Korobov
18 Mins
June 19, 2017
It has become very easy to do machine learning: you install an ML library like scikit-learn or xgboost, choose an estimator, feed it some training data, and get a model that can be used for predictions.
Blog
Promoting Open Data for Increased Economic Opportunities
Cecilia Haynes
8 Mins
October 19, 2016
During the 2016 Collision Conference held in New Orleans, our Content Strategist Cecilia Haynes interviewed conference speaker Dr. Tyrone Grandison.
Blog
Embracing The Future Of Work: How To Communicate Remotely
Cecilia Haynes
5 Mins
September 22, 2016
What does “the Future of Work” mean to you? To us, it describes how we approach life at Scrapinghub.
Blog
A Not-So-Short Story on Getting Decent Internet Access
agustin
7 Mins
April 27, 2016
This is a tale of trial, tribulation, and triumph. It is the story of how I overcame obstacles including an inconveniently placed grove of eucalyptus trees, armed with little more than a broom and a pair of borrowed binoculars, to establish a stable internet connection.
Blog
Vizlegal: The Rise of Machine-Readable Laws and Court Judgments
Sanaea Daruwalla
4 Mins
January 13, 2016