Should AI companies build their own web scraping pipelines? Learn when in-house scraping makes sense and when it becomes costly and hard to maintain at scale.
Learn what AI data provenance is and why it matters. Understand data origin, collection methods, governance, and how provenance supports trust and compliance.
Discover how web data enables digital shelf analytics vendors to track prices, availability, and product trends at scale—fueling real-time retail intelligence and competitive advantage.
Discover the seven habits that set high-performing data teams apart—from treating data as a product to ensuring data trust, quality, and decision impact. Learn how leading teams scale reliable data systems.
Spidermon is an open-source monitoring framework for Scrapy. You attach it to your spider, define what "success" looks like, and it automatically checks your crawl results after the spider closes, flagging anything that doesn't meet your standards.
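A minimal sketch of what that looks like in practice, assuming a Scrapy project named myproject and an illustrative 1,000-item threshold (both hypothetical; the monitor pattern follows Spidermon's documented API):

```python
# monitors.py -- a minimal Spidermon monitor suite.
# "myproject" and the 1,000-item threshold are illustrative assumptions.
from spidermon import Monitor, MonitorSuite, monitors


@monitors.name("Item count")
class ItemCountMonitor(Monitor):
    @monitors.name("Minimum number of items scraped")
    def test_minimum_number_of_items(self):
        # self.data.stats exposes Scrapy's crawl stats after the spider closes.
        items_scraped = getattr(self.data.stats, "item_scraped_count", 0)
        self.assertTrue(
            items_scraped >= 1000,
            msg="Fewer than 1,000 items scraped; the crawl likely failed.",
        )


class SpiderCloseMonitorSuite(MonitorSuite):
    # Every monitor listed here runs automatically when the spider closes.
    monitors = [ItemCountMonitor]
```

Attaching the suite to the spider is a matter of a few lines in settings.py:

```python
# settings.py -- enable Spidermon and register the close-time suite.
SPIDERMON_ENABLED = True
EXTENSIONS = {
    "spidermon.contrib.scrapy.extensions.Spidermon": 500,
}
SPIDERMON_SPIDER_CLOSE_MONITORS = (
    "myproject.monitors.SpiderCloseMonitorSuite",
)
```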
The script was working. Requests were going out, and responses were coming back with HTTP 200. But the response body was unreadable noise: a wall of binary characters that crashed the JSON parser and left the script reporting "no data found". No error code, no timeout, no network failure; just garbage where structured data should have been.
Discover how autonomous, agent-driven data pipelines are transforming web scraping in 2026, enabling self-healing systems, API discovery, and end-to-end automation.
Programmers were raised on long-standing core principles of the craft. What if those tenets are no longer relevant?