Webinar

Assuring data quality in web extraction projects

James Kehoe

30 min read · March 2, 2023

Accurate web data for accurate decisions

The business value of any web extraction project depends not just on data but on quality data. Because quality determines the value of the whole project and justifies the company resources spent on it, we brought two Data Quality specialists together to show how early QA starts in a web scraping project and how to carry it out.

Learn how to evaluate the data's accuracy, completeness, consistency, and timeliness, and get helpful tips on regularly monitoring the crawlers for quick identification and resolution of any issues that may arise.
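As a rough illustration of those four dimensions, here is a minimal Python sketch of per-item checks. It is not taken from the webinar: the field names (`url`, `name`, `price`, `scraped_at`), the one-day freshness window, and the `check_item` helper are all hypothetical assumptions chosen for the example, and a real project would tailor each rule to its own schema.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical schema: these field names are assumptions for illustration.
REQUIRED_FIELDS = ("url", "name", "price", "scraped_at")


def check_item(item, now, max_age=timedelta(days=1)):
    """Return a list of quality problems found in one scraped item."""
    problems = []

    # Completeness: every required field is present and non-empty.
    for field in REQUIRED_FIELDS:
        if item.get(field) in (None, ""):
            problems.append(f"missing:{field}")

    # Accuracy (basic plausibility only): price must be a positive number.
    price = item.get("price")
    if price is not None:
        is_number = isinstance(price, (int, float)) and not isinstance(price, bool)
        if not is_number or price <= 0:
            problems.append("invalid:price")

    # Consistency: every URL should follow the same format.
    url = item.get("url")
    if url and not url.startswith("https://"):
        problems.append("inconsistent:url")

    # Timeliness: the record must be newer than the allowed maximum age.
    scraped_at = item.get("scraped_at")
    if scraped_at and now - scraped_at > max_age:
        problems.append("stale:scraped_at")

    return problems


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    item = {
        "url": "http://example.com/p/2",
        "name": "",
        "price": -1,
        "scraped_at": now - timedelta(days=3),
    }
    print(check_item(item, now))
```

Checks like these could run after each crawl, with the resulting problem counts feeding whatever monitoring the team already uses.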

All of this is covered in the fourth episode of the five-part webinar series On-demand Guide to Accessing Web Data, brought to you by Zyte, its host Neha Setia (Web Data Evangelist), and public data extraction specialists from around the globe.

How the right definition of data quality can improve the QA process.
When to use automation tools like the open-source Spidermon to track spiders’ quality.
Best practices to detect and avoid issues.
Post-development and post-processing ideas to ensure data quality.

Hosted by

Pierluigi Vinciguerra - Co-Founder and CTO at Re Analytics - Databoutique.com
Artur Sarduski - Data Scientist at Zyte
Neha Setia Nagpal - Web Data Evangelist at Zyte

