Extract Summit 2021: Highlights and key takeaways

Read time: 4 mins
Posted on: October 12, 2021
Announcement
By Sarah Lang

*For information on 2022 Extract Summit visit this link*

It’s a wrap! Last week, for the third time, Extract Summit brought together web data experts and enthusiasts to learn, share and inspire. Sessions, workshops, panels, contests – this year’s summit had so much to offer, I don’t even know where to start.

Extract Summit at a glance

With all the uncertainties still swirling around COVID-19 in 2021, we decided to stay safe and host a virtual event again. Yet, we wanted to offer all attendees and speakers an outstanding experience and the opportunity to connect with each other. Using the event platform Hubilo helped us organize an interactive and fun event. 

“The event interface looks sick! I'm not gonna lie, remote conferences don't feel the same as the real thing because you don't ‘feel there’. I like how Zyte is trying to change that!” said one of the 2,000 participants.

We had a lot of fun connecting with all attendees, especially in those disconnected times. 

1 day, 2 tracks, 24 speakers: Key takeaways

Every year, we try to put together a well-mixed agenda delivered by inspiring web data extraction thought leaders and web scraping experts to give you the best overview of the current web data trends. This year, we had an amazing line-up of great speakers covering many different fields and aspects of web data. Here are a few highlights of the day and you can watch all the recordings here.

A demonstration of the hybrid web scraping approach, adaptive learning, and a sneak-peek into Zyte’s data extraction quality evaluation

Konstantin Lopukhin, Head of Data Science at Zyte, took a deep dive into the data extraction quality evaluation process, discussed common pitfalls, and shared how Zyte handles them to ensure the quality of extracted web data.

Mikhail Korobov, Head of Development for Automatic Extraction at Zyte, guided us through an experiment in which he extracted data from 20 websites in 3 hours using a hybrid web scraping approach that combines classic and fully automated methods.

Continuing the theme of automated web scraping backed by machine learning, Joshua Odmark, founder and CTO at Pandio, gave a live demonstration of how adaptive learning with PandioML works.

Talks about different use cases and lessons learned

We’re always keen to learn how companies are using web data to thrive, so it’s no surprise that we had a lot of interesting presentations showcasing the usage and importance of data.

Abhijit HK, CEO at Codewave, shared with us his experience building data dashboards, and some hacks for building web scraping spiders. Niall Hurley, CEO at Eagle Alpha, introduced us to the world of alternative data for finance, explained the customer journey and gave us a few interesting use cases. Linus Nilsson from NilssonHedge showcased his hedge fund database – including input routines, cleaning strategies, and how he ensures it’s high quality.

Kabir Fahria, System Developer at Codemill, presented a great case study on the use of web data for contextual advertising.

Offering our audience some practical tips, Eric Platow, Senior Architect at LexisNexis, took us on his journey of taming the World Wide Web and shared the lessons learned after scraping 100K.

Legal hot topics in web data extraction

An all-time favorite for all of us is the discussion of legal aspects. We had a panel full of experts: Victoria Vlahoyiannis and Kate O-Brien, Legal Counsels at Zyte; Tricia Higgins, Co-founder and CEO of Fort Privacy; and Nina Fletcher, Legal Counsel at YipitData.

Together they covered website terms and conditions and when they are legally binding, GDPR in the context of web scraping, and the recent Van Buren case.

Deep dives into anti-bot and headless browsers, an AMA session, and all things technical

As the biggest event within the web data extraction industry, Extract Summit also covers very technical topics. Thanks to our experts from diverse backgrounds, we were able to host an AMA session to answer burning questions about web scraping best practices, anti-ban management and reverse engineering.

Evgeny Slaikovsky, one of our talented reverse engineers, talked about the cat-and-mouse game of anti-bot evolution. Paweł Miech, Senior Technical Team Lead within the development department, explained what headless browsers are and when we should and shouldn’t use them.

Rain Leander, Technical Evangelist at Cockroach Labs, gave an overview of the world of data structure and storage and explored the pros and cons of three major types of databases available today.

Ljubica Lazarevic, Developer Advocate at Neo4j, showed us how she built a scraper and used a graph database to recommend conferences to submit talks to – an interesting session not only for our fellow developer advocates!

Scrapy and hands-on coding sessions

We had two live coding sessions in which speakers showed their web scraping skills in action:

Attila Tóth, Developer Advocate at Timescale, guided us step by step through building a real estate market monitoring tool with Scrapy.

His colleague, Jônatas Paganini, showed us live how he builds a small blog scraper with TimescaleDB.

Live coding contest & other highlights

Besides the amazing talks Extract Summit had to offer, we wanted to give all developers the chance to show off their own web scraping skills, so we hosted a live coding contest. It was huge fun for all involved!

Speaking of highlights, we definitely have to mention our live comedy show with Damian Clark and Eddie Mullarkey. They gave us some nice giggles and made the break a different kind of experience.

If you want to be a part of Extract Summit 2022, you can pre-register here.
