Introducing the Datasets Catalog: A Treasure Trove of Data

Introducing the Datasets catalog

Posted on June 9, 2016
By Cecilia Haynes
Read time: 2 mins

Folks using Portia and Scrapy are engaged in a variety of fascinating web crawling projects, so we wanted to provide you with a way to share your data extraction prowess with the world.

With this need in mind, we’re pleased to introduce the latest addition to the Zyte platform: the Datasets Catalog!

This new feature lets you immediately share the results of your Zyte projects as publicly searchable datasets. Not only is this a great way to collaborate with others, but you can also save time by using other people’s datasets in your own projects.

As fans of the open data movement, we hope that this new feature will ease the process of disseminating data. Open data has been used to help foster transparency in governmental and corporate systems worldwide. Researchers and developers have also benefited from the mutual sharing of information. A couple of our own engineers have even used open data to power transportation apps and to help journalists expose corruption.

Read on to get some ideas on how to use the Datasets Catalog in your workflow.

The Datasets Catalog at a Glance

We are launching the Datasets Catalog with the following features:

  • Publish the data collected by your Portia or Scrapy spiders/web crawlers as easily accessible datasets
  • Highlight your scraped data and help others locate the information they need by giving each dataset a name and a description
  • Let others discover your datasets through search engines like Google
  • Browse publicly available datasets that other people are sharing
  • Choose how to share your dataset using three different privacy settings:
    • Public datasets are accessible by anyone (even those without a Zyte account) and are indexed by search engines
    • Restricted datasets are accessible only to users to whom you explicitly grant access (they need to have a Zyte account)
    • Private datasets are accessible only by the members of your organization
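
To make the three privacy settings concrete, here is a minimal sketch of the access rules as described in the list above. It is an illustration only, not the platform's implementation: the `Privacy`, `User`, and `Dataset` types and the `can_access` and `is_indexed_by_search_engines` functions are hypothetical names invented for this example, and the sketch does not model how the dataset owner is treated.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class Privacy(Enum):
    """The three privacy settings listed above (hypothetical names)."""
    PUBLIC = "public"
    RESTRICTED = "restricted"
    PRIVATE = "private"


@dataclass(frozen=True)
class User:
    """A signed-in account holder (hypothetical type)."""
    name: str
    organization: str


@dataclass
class Dataset:
    """A published dataset with a name, a description, and a privacy setting."""
    name: str
    description: str
    owner_organization: str
    privacy: Privacy
    # Only meaningful for RESTRICTED datasets: users explicitly granted access.
    granted_users: Set[str] = field(default_factory=set)


def can_access(dataset: Dataset, user: Optional[User]) -> bool:
    """Apply the rules from the list; user=None means a visitor with no account."""
    if dataset.privacy is Privacy.PUBLIC:
        return True  # anyone, even without an account
    if user is None:
        return False  # restricted and private datasets both require an account
    if dataset.privacy is Privacy.RESTRICTED:
        return user.name in dataset.granted_users
    # PRIVATE: only members of the owning organization
    return user.organization == dataset.owner_organization


def is_indexed_by_search_engines(dataset: Dataset) -> bool:
    """Per the list above, only public datasets are indexed."""
    return dataset.privacy is Privacy.PUBLIC
```

In this sketch, a visitor without an account is represented by `None`, so `can_access(dataset, None)` returns `True` only for public datasets.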

How Does it Work?

You can find the new “Datasets” option in the top navigation bar. On the main Datasets Catalog page, you can browse available datasets along with those that you have recently visited.

Publishing your scraped data into complete datasets takes just one click. This tutorial will get you started on publishing and sharing your extracted data.

Wrap Up

And there you have it, a way to not only showcase your web crawling and data extraction skills, but to also help others with the information that you provide.

We invite you to contribute your datasets and play your part in helping drive the open data movement forward. Reach out to us on Twitter to let us know which datasets you would like to see featured and whether you have any recommendations for improving the Datasets experience.

We're excited to see what you come up with!
