One of the biggest pain points we heard from our Zyte Smart Proxy Manager customers last year was the inconvenience of having to jump from one Zyte Smart Proxy Manager plan to another when they needed more requests in a month.
“How does Zyte Smart Proxy Manager work?” is the most common question we get from customers who, after struggling for months (or years) with constant proxy issues, see them disappear completely when they switch to Zyte Smart Proxy Manager.
Let’s face it, managing your proxy pool can be an absolute pain and the biggest bottleneck to the reliability of your web scraping!
Many governments worldwide have laws requiring them to publish their expenses, contracts, decisions, and so forth, on the web.
Support for JavaScript has been a much-requested feature ever since Portia’s first release two years ago. The wait is nearly over, and we are happy to inform you that we will be launching these changes in the very near future.
EuroPython 2015 is happening this week and we’re having the largest company meetup so far as a part of it, with more than 30 members from our fully remote-working team attending.
Earlier this week, Scrapinghub was invited along with several other fully-distributed companies to participate in a remote working Q&A hosted by Startups Canada.
Earlier this month we attended PyCon Philippines as a gold sponsor, presenting on the 2nd day. This was particularly exciting as it was the first time the whole Philippines team was together in one place and it was nice meeting each other in person!
We are very excited to be participating again this year in Google Summer of Code.
Here at Zyte we are a remote team of 100+ engineers distributed among 30+ countries. As part of their standard contract, Zytebers get 20 vacation days per year and local country holidays off, and yet we spend almost zero time managing this. How do we do it? The answer is “git”, and here we explain how.
Gender inequality is a hot topic in the tech industry. Over the last several years we’ve gathered business profiles for our clients, and we realized this data would prove useful in identifying trends in how gender and employment relate to one another.
At Zyte we're always building and running large crawls: last year we had 11 billion requests made on Scrapy Cloud alone.
In case you aren’t familiar with Portia, it’s an open-source tool we developed for visually scraping websites. Portia allows you to make templates of pages you want to scrape and uses those templates to create a spider to scrape similar pages.
We are veterans in the chat group arena. We have been using one form or another since we started Zyte in 2010, and I've been personally using corporate group chats since 2004.
Joanne O’Flynn meets with Pablo Hoffman and Shane Evans to find out what inspired them to set up web crawling company Zyte.
A common roadblock when developing spiders is dealing with sites that rely heavily on JavaScript. Many modern websites run entirely on JavaScript and require scripts to be run in order for the page to render properly.
We are proud to announce some exciting changes we've introduced this week. These changes bring a much more pleasant user experience, and several new features including the addition of Portia to our platform!
We’re proud to announce our new open source project, ScrapyRT! ScrapyRT, short for Scrapy Real Time, allows you to extract data from a single web page via an API using your existing Scrapy spiders.
One year ago we were looking back at the great 2013 we had and realized we had quite a big challenge in front of us if we were to grow as much again as we had that year.
Here at Zyte, we love open source. We love using and contributing to it. Over these years we have open sourced a few projects that we keep using over and over, in the hope that they will make others' lives easier.
We’re excited to welcome Marcos Campal to the Zyte engineering team.
We are proud to introduce Zyte Smart Proxy Manager, a smart web downloader designed specifically for web crawling.
Our customers often ask us what the best workflow is for working with Scrapy projects.
We often have to write spiders that need to log in to sites in order to scrape data from them. Our customers provide us with the site, username and password, and we do the rest.
After a year considering it, we have decided to go ahead and drop support for Python 2.5 in Scrapy. Starting from 0.15, Scrapy will require Python 2.6 or above.