PINGDOM_CHECK

#ExtractSummit2026 The world's largest web scraping conference returns. Austin Oct 7–8 · Dublin Nov 10–11.

Register now
Data Services
Pricing
Login
Try Zyte APIContact Sales
  • Unblocking and Extraction

    Zyte API

    The ultimate API for web scraping. Avoid website bans and access a headless browser or AI Parsing

    Ban Handling

    Headless Browser

    AI Extraction

    SERP

    Enterprise

    DocumentationSupport

    Hosting and Deployment

    Scrapy Cloud

    Run, monitor, and control your Scrapy spiders however you want to.

    Coding Agent Add-Ons

    Agentic Web Data

    Plugins that give coding agents the context to build production Scrapy projects. Starts with Claude Code.

  • Data Services
  • Pricing
  • Browse

    • BlogArticles, podcasts, videos
    • Case studiesCustomer outcomes
    • White papersIn-depth reports
    • EventsConferences, webinars, recordings

    Subscribe

    • NewsletterSwiftly delivered
    • Discord communityExtract Data community
  • Product and E-commerce

    From e-commerce and online marketplaces

    Data for AI

    Collect and structure web data to feed AI

    Job Posting

    From job boards and recruitment websites

    Real Estate

    From Listings portals and specialist websites

    News and Article

    From online publishers and news websites

    Search

    Search engine results page data (SERP)

    Social Media

    From social media platforms online

  • Meet Zyte

    Our story, people and values

    Contact us

    Get in touch

    Support

    Knowledge base and raise support tickets

    Terms and Policies

    Accept our terms and policies

    Open Source

    Our open source projects and contributions

    Web Data Compliance

    Guidelines and resources for compliant web data collection

    Join the team building the future of web data
    We're Hiring
    Trust Center
    Security, compliance & certifications
Login
Try Zyte APIContact Sales

Zyte Developers

Coding tools & hacks straight to your inbox

Become part of the community and receive a bi-weekly dosage of all things code.

Join us
    • Zyte Data
    • News & Articles
    • Search
    • Social Media
    • Product
    • Data for AI
    • Job Posting
    • Real Estate
    • Zyte API - Ban Handling
    • Zyte API - Headless Browser
    • Zyte API - AI Extraction
    • Web Scraping Copilot
    • Zyte API Enterprise
    • Scrapy Cloud
    • Solution Overview
    • Blog
    • Webinars
    • Case Studies
    • White Papers
    • Documentation
    • Web Scraping Maturity Self-Assesment
    • Web Data compliance
    • Meet Zyte
    • Jobs
    • Terms and Policies
    • Trust Center
    • Support
    • Contact us
    • Pricing
    • Do not sell
    • Cookie settings
    • Sign up
    • Talk to us
    • Cost estimator
All articles
AI60, 60 articles
Data quality13, 13 articles
Developer interest57, 57 articles
Integration2, 2 articles
Open-source40, 40 articles
Proxies29, 29 articles
Scraping practice17, 17 articles
Scraping strategy26, 26 articles
Web data60, 60 articles
Web scraping APIs33, 33 articles
Zyte API59, 59 articles
Scrapy48, 48 articles
Scrapy Cloud10, 10 articles
Web Scraping Copilot12, 12 articles
AI & Machine Learning1, 1 articles
Automotive2, 2 articles
E-commerce & retail26, 26 articles
Entertainment & Streaming2, 2 articles
Financial Services8, 8 articles
Government2, 2 articles
Market Research & Intelligence3, 3 articles
Media & publishing8, 8 articles
Real Estate2, 2 articles
Recruitment & HR3, 3 articles
Transportation & Logistics2, 2 articles
Travel & hospitality2, 2 articles
Extract Summit25, 25 articles
PyCon1, 1 articles

Appearance

Discord Community
BlogHow ToMachine Learning With Web Scraping: New MonkeyLearn Addon
ArticleHow To

Machine Learning With Web Scraping: New MonkeyLearn Addon

We deal in data. Vast amounts of it. But while we’ve been traditionally involved in providing you with the data that you need, we are now taking it a step

C

Cecilia Haynes

5 min read · April 14, 2016

Machine Learning With Web Scraping: New MonkeyLearn Addon

Machine learning with web scraping: New MonkeyLearn addon

Say Hello to the MonkeyLearn Addon

We deal in data. Vast amounts of it. But while we’ve been traditionally involved in providing you with the data that you need, we are now taking it a step further by helping you analyze it as well.

To this end, we’d like to officially announce the MonkeyLearn integration for Scrapy Cloud. This feature will bring machine learning technology to the data that you extract through Scrapy Cloud. We also offer a MonkeyLearn Scrapy Middleware so you can use it on your own platform.

Zyte-MonkeyLearn-Addon-02

MonkeyLearn is a classifier service that lets you analyze text. It provides machine learning capabilities like categorizing products or sentiment analysis to figure out if a customer review is positive or negative.

You can use MonkeyLearn as an addon for Scrapy Cloud. It only takes a minute to enable and once this is done, your items will flow through a pipeline directly into MonkeyLearn's service. You specify which field you want to analyze, which MonkeyLearn classifier to apply to it, and which field it should output the result to.

Say you were involved in the production of Batman vs Superman and you’re interested in how people reacted to your high budget movie. You could use Scrapy to track mentions of this film across the web and then use MonkeyLearn to perform sentiment analysis on the samples that you collect. But don't get too excited because you might not like the results of your search...

sad affleck

There are so many ways that you can use this addon with our platform, so we’ll be featuring a series of tutorials that will help you make the most out of this partnership.

Getting Started with MonkeyLearn

MonkeyLearn provides public modules that are already ready to go or you can create your own text analysis module by training a custom machine learning model.

For example, using traditional sentiment analysis on comments filled with trolls would return a 100% negative rating. To develop a “Troll Finder” you would need to create a custom model with a higher tolerance for the extreme negativity. You could create categories like "troll", "ubertroll", and "trollmaster" for further categorization. Check out MonkeyLearn’s tutorial to help you through this task.

Before you get started with the MonkeyLearn addon on Scrapy, you first need to sign up for the MonkeyLearn service. They offer a free account, so you don’t need to worry about the cash monies. Once you’ve signed up, you’ll be taken to your dashboard:

Screenshot 2016-04-12 16.48.46

Click the “Explore” option in the top menu to check out the whole range of ready-made classifiers that you can apply to the scraped data. There are a ton of different options to choose from including sentiment analysis for product reviews, language detectors, and extractors for useful data such as phone numbers and addresses.

Screenshot 2016-04-12 16.46.48

Choose the classifier that you’re interested in and make a note of its ID. You can find the ID in the URL:

Screenshot 2016-04-12 16.32.48 copy

And now that you’re all set on the MonkeyLearn side, it’s time to head back over to Scrapy Cloud.

Addon Walkthrough

You can access the MonkeyLearn addon through your dashboard. Navigate to Addons
Setup:

Add ons page

Enable the addon and click Configure:

Screenshot 2016-04-08 17.29.52

Head down to Settings:

Screenshot 2016-04-12 16.35.11 copy

To configure the addon, you need to set your MonkeyLearn API key, specify the classifier you want to use and the field in which you want the result to be stored. You’ll need the classifier ID you chose earlier from the MonkeyLearn platform.

MonkeyLearn reads the content from the classifier fields you've specified, performs the classification task on the data, and returns the result of the classification/analysis in the field that you defined as categories field.

For example, in order to detect the category of a movie based on the title, you would need to add the ID from the module you want to use in the first text box. In the second text box you would list your authorization token and the item field you want to analyze (title, in our case) in the third text box. In the fourth text box you would list the name of the field that is going to store the results from MonkeyLearn.

Screenshot 2016-04-12 16.34.57 copy

And you’re all done! Locked and loaded and ready to go with MonkeyLearn.

Using MonkeyLearn with Scrapy

The MonkeyLearn addon is a part of Scrapy Cloud, so you can use it with your Scrapy spiders. Scrapy is also open source, so you can easily run it on your own system.

The addon means you don’t need to worry about learning MonkeyLearn’s API and how to route requests manually. If you need to use MonkeyLearn outside of Scrapy Cloud, you can use the middleware for the same purpose.

When to use MonkeyLearn

We’re really excited about this integration because it is a huge step in closing the gap between data acquisition and analysis.

MonkeyLearn offers a range of text analysis services including:

  • Classifying products into categories based on their name
  • Detecting the language of text
  • Sentiment analysis
  • Keyword extractor
  • Taxonomy classifier
  • News categorizer
  • Entity extraction

We’ll delve into what you can do with each of these tools in future tutorials. For now, feel free to experiment and explore this integration in your web scraping projects.

Wrap Up

Data and textual analysis is more efficient by combining MonkeyLearn's machine learning capabilities with our data extraction platform. Whether you are using this for personal projects (tracking and monitoring advance reviews for Captain America: Civil War [Team Cap]) or for professional tasks, we're excited to see what you come up with.

Keep your eyes peeled, the first tutorial will walk you through using the Retail Classifier with the MonkeyLearn addon. Sign up for free for Scrapy Cloud and for Monkeylearn and give this addon a whirl.

Try Zyte API

Build your first scraper in minutes

Free trial, no credit card. From a single request to production in an afternoon.

Get started
How To
C

Cecilia Haynes

More from this author

In this article

  • Say Hello to the MonkeyLearn Addon
  • Getting Started with MonkeyLearn
  • Addon Walkthrough
  • Using MonkeyLearn with Scrapy
  • When to use MonkeyLearn
  • Wrap Up

Follow

Get the latest

Zyte and the data web in your inbox — or wherever you already are.

Subscribe

Or follow elsewhere

Continue reading

Teaching AI to scrape like a pro: how we measure LLMs’ data quality
How To

Teaching AI to scrape like a pro: how we measure LLMs’ data quality

AI-enabled code editors can now conjure scraping code on command. But is it any good? Here’s how Zyte re-engineered LLMs with Web Scraping Copilot to drive best-in-class output.

Theresia Tanzil·10 min·February 23, 2026
Analyze web data quickly with Jupyter Notebooks and Zyte API
How To

Analyze web data quickly with Jupyter Notebooks and Zyte API

With AI Scraping in Zyte API, you can pull data from any e-commerce website straight into your Jupyter notebooks.

Neha Setia Nagpal·2 mins·December 13, 2024
Overcoming web scraping challenges of Puppeteer and Playwright
How To

Overcoming web scraping challenges of Puppeteer and Playwright

Discover the challenges of scaling web scraping with Playwright & Puppeteer, from browser farm management to IP rotation and anti-scraping tactics.

Neha Setia Nagpal·1 mins·December 5, 2024

The Community · Newsletter

The best of Zyte and the data web, in your inbox.

One curated edition — new articles, product updates, and the stories shaping the data web. No noise.

G2.com

Capterra.com

Proxyway.com

EWDCI logoMost loved workplace certificateZyte rewardISO 27001 iconG2 rewardG2 rewardG2 reward

© Zyte Group Limited 2026