AI-assisted data extraction

Articles from the Zyte blog about AI-assisted data extraction.

New Zyte add-ons: Agent Skills, Codex, GitHub and more updates

A host of additions to Zyte’s agentic scraping toolbelt helps developers go from prompt to working spider code and beyond.

Valter Sciarrillo · 10 min read · July 22, 2026

Claude Fable 5 is the new best model for writing scrapers

We ran nine models, including the new GPT-5.6, through our Zyte Scraping Code Benchmark. Claude Fable 5 puts Sol and others in the shade - but it’s pricey, and the best extraction code still depends on the best infrastructure.

Konstantin Lopukhin · 8 min read · July 10, 2026

The harness matters more than the model - Podcast EP07

"The model is the engine — but the harness is everything else." In Episode 7, we dig into why the infrastructure layer around your AI model matters more than the model itself, rank the best models available right now, and ask whether the open-weight revolution is about to make frontier subscriptions obsolete.

John Rooney · 11 min read · June 27, 2026

What's becoming of web scraping developers in the age of AI agents?

AI agents can generate code, suggest selectors, and draft crawl logic. What they can't do is design the system that decides when to stop, what to trust, and how to recover when the web pushes back. That job still belongs to a human.

Neha Setia Nagpal · 14 min read · June 9, 2026

What multi-agent orchestration looks like in a large-scale web scraping project

Multi-agent orchestration is having its moment. The diagrams are everywhere now. Boxes for planners, boxes for hands, boxes for daemons, arrows to a shared brain, a human floating at the top. They keep getting prettier. The part where the web pushes back is still the part nobody draws.

Neha Setia Nagpal · 18 min read · May 31, 2026

I built scraping agents for 30 days - here’s what I learned

For the last 30 days, I did one thing almost exclusively: I built scraping systems with AI agents, from the ground up, across real targets, with real deadlines. Not prototypes designed to impress in a demo, not isolated experiments running against a toy website, but production-grade pipelines that needed to ship and keep running.

John Rooney · 11 min read · May 25, 2026

The Scrapy whisperer: Adrian Chaves on Web Scraping Copilot

An interview with Scrapy maintainer Adrian Chaves on Zyte’s Web Scraping Copilot, AI-generated parsing code, and building reliable scraping workflows.

Neha Setia Nagpal · 10 min read · March 23, 2026

Code is cheap, show me the talk: How copilots are re-engineering developers

Discover how AI copilots like Zyte’s Web Scraping Copilot are transforming developer workflows—making code a commodity and shifting value to problem-solving and prompting skills.

Ayan Pahwa · 10 min read · March 20, 2026

Claude skills, MCP or Web Scraping Copilot: Which should you choose?

Compare Claude skills, MCP servers, and Web Scraping Copilot to understand when to use each for AI-powered web scraping, data extraction, and production pipelines with Zyte API.

Neha Setia Nagpal · 10 min read · March 11, 2026

Supercharging web scraping with Claude skills

Learn how Claude skills can automate HTML fetching, AI parsing, selector generation, and structured data extraction to build faster, smarter web scraping workflows.

John Rooney · 10 min read · March 11, 2026

Claude Sonnet 4.6 is the new best model for writing scrapers

Claude Sonnet 4.6 is now the top model in Zyte’s Web Scraping Copilot benchmark, narrowly beating Gemini 3 Pro on extraction quality, with a small increase in code complexity.

Konstantin Lopukhin · 10 min read · February 18, 2026

AI and the web: What 2025 changed and what comes next

2025 was the year AI learned to reason. From reasoning-first LLMs to autonomous agents and a reshaped web economy, this retrospective explores what changed—and what’s coming next.

Iván Sánchez · 10 min read · December 22, 2025