Extract clean and valuable data with the best scraping solution
Rating distribution: 30.8%, 53.8%, 15.4%, 0%, 0%
The best aspect of the service is the knowledgeable staff who are responsive and friendly.
There aren't any major downsides to using the platform.
Learn to adapt custom spiders for use with Scrapy Cloud to ease scaling your deployments as time goes on.
Crawlera isn't the silver bullet it once was, unfortunately. A sign of the times. Anti-scraping techniques are just getting better and better!
Scalability, availability and support effectiveness
Non-stop problems with the web dashboard and the stats API
Scraping Google at high speed
Scrapy is an absolute go-to for web scraping, and Scrapinghub does a phenomenal job of scaling your spiders so you don't have to worry about architecture. I could easily build multiple spiders, deploy, monitor, and iterate in no time!
Crawlera and Splash could use improvements around the ease of coding and implementing scripts.
Cut costs by easily building scalable web crawlers in-house
Trying to build a data-harvesting infrastructure that can scale easily. Scrapinghub offers a one-stop shop to deploy your spiders and forget about them.
Zyte provides Smart Proxy Manager, which takes over the responsibility of sending appropriate headers with each request. This is crucial because we shouldn't send X-Forwarded-For or other proxy-related headers, as they would be visible to the target websites. We also leverage its X-Crawlera-Profile header to replicate real-browser specifications and user-agent dependencies.
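For context, here is a minimal sketch of sending one request through Smart Proxy Manager with that X-Crawlera-Profile header, using Python requests. The API key, proxy endpoint, and CA-certificate path are placeholders to adapt to your own account, not verified values.

import requests

API_KEY = "<YOUR_SMART_PROXY_MANAGER_KEY>"  # placeholder credential
proxies = {
    "http": f"http://{API_KEY}:@proxy.zyte.com:8011",
    "https": f"http://{API_KEY}:@proxy.zyte.com:8011",
}
headers = {
    # Ask the proxy to emulate a coherent desktop-browser header profile
    # rather than sending our own, possibly inconsistent, headers.
    "X-Crawlera-Profile": "desktop",
}
resp = requests.get(
    "https://toscrape.com",
    proxies=proxies,
    headers=headers,
    verify="zyte-smartproxy-ca.crt",  # the proxy's CA bundle; path is a placeholder
)
print(resp.status_code)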
Optimizing concurrent per-domain crawler requests would give us more of an advantage while working with Zyte Smart Proxy Manager. Since multiple requests need to be processed in parallel, there are a few limits on cumulative per-domain crawler requests. In those situations we have to lower the concurrency manually; if that could be automated, it would be useful for us.
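For what it's worth, Scrapy's built-in AutoThrottle extension is one way to approximate that automatic back-off today. A minimal settings.py sketch, with illustrative values:

# settings.py
CONCURRENT_REQUESTS_PER_DOMAIN = 8     # hard cap on parallel requests per domain
AUTOTHROTTLE_ENABLED = True            # adjust delay dynamically from response latency
AUTOTHROTTLE_START_DELAY = 1.0         # initial download delay, in seconds
AUTOTHROTTLE_MAX_DELAY = 30.0          # ceiling when the target site slows down
AUTOTHROTTLE_TARGET_CONCURRENCY = 2.0  # average requests to keep in flight per site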
We had speed discrepancies while using our own proxies, so we shifted to Zyte Smart Proxy Manager to throttle our requests and avoid delays. In some use cases we can also introduce delays for proper crawling of websites using its headless proxy tool. Its dashboard offers plenty of insights: a security overview, session details, and request statistics based on our customization. To save capacity, we can use its direct-access feature, which helps bypass Crawlera during request execution.
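That direct-access behaviour can also be driven per request from Scrapy. A sketch using the scrapy-crawlera middleware's dont_proxy request-meta flag, assuming the middleware is already enabled in settings; the spider name and URLs are illustrative:

import scrapy

class MixedSpider(scrapy.Spider):
    name = "mixed"  # hypothetical spider

    def start_requests(self):
        # Routed through Smart Proxy Manager via the middleware.
        yield scrapy.Request("https://example.com/protected")
        # Fetched directly, bypassing the proxy to save request capacity.
        yield scrapy.Request("https://example.com/cheap", meta={"dont_proxy": True})

    def parse(self, response):
        yield {"url": response.url, "status": response.status}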
The assistance of the Scrapinghub team (I have no further comments for this part of the review; I know Scrapinghub is a good extraction platform, but as a team manager I do miss tools for getting reports on our projects).
Not being able to extract project reports with the same info as the job page (spider used, items extracted, requests made, errors, log, runtime, started, finished, and also the periodic-jobs info). That would be very helpful for keeping projects under control.
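For readers with the same need, much of this per-job information can be pulled with the python-scrapinghub client. A minimal sketch, with a placeholder API key and a hypothetical project ID; the summary field names follow the Jobs API and may vary:

from scrapinghub import ScrapinghubClient

client = ScrapinghubClient("<YOUR_API_KEY>")  # placeholder credential
project = client.get_project(12345)           # hypothetical project ID

for job in project.jobs.iter():               # yields job-summary dicts
    print(
        job.get("spider"),   # spider used
        job.get("items"),    # items extracted
        job.get("errors"),   # error count
        job.get("state"),    # e.g. finished / running
    )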
The same recommendations I gave above.
It works quite efficiently and has flexible tools to customize the spider to our needs.
Ease of use for Scrapy programmers (especially freelancers) and for users (their clients), plus the free tier.
I consistently get "Sorry, can't find that page" on every navigation through the app (even the home page, app.scrapinghub.com). Please fix that!
Find an easier way to keep key-value order. Customers need that a lot.
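Until the platform handles this natively, Scrapy's own FEED_EXPORT_FIELDS setting is one known way to pin the key order of exported items; the field names below are illustrative:

# settings.py
FEED_EXPORT_FIELDS = ["title", "price", "url"]  # fields exported in exactly this order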
To easily set up and obtain results for clients who will use our spiders for their businesses. They find it amazing that it requires no setup and they can get their results instantly.
It gives me access to 80% more web content than any other proxy service I have used
That the subscription is time-based. Right now the service dashboard is showing incorrect download counts, which means I have to pause the service for six days while the glitch is sorted out. I lose six days of activity, yet my subscription is both time-based and clean-request-based. I would like to pay for a certain number of clean requests and use them as I see fit, without the concern that I might run out of time and lose that month's clean requests. Does this make sense to anyone?
Test and test again
I have access to public web content that was more difficult to get at before.
Effective data extraction and help to identify duplicates
Cannot process heavy data sets: data from various sources, big data.
Data modeling
Mostly the versatility, ability to selectively scale and ease of deployment
I feel that compliance around the scraped data is still a challenge; at times we had to either delete the data or prune data elements to remain legally compliant.
Compliance has to be the focus and should sit right above speed and volume.
Understanding consumer behaviour, market trends, and next-best-thing analytics have been the primary areas of our focus, all driven by data fetched in multiple ways (flat files, third-party paid data, web-scraped information, etc.). Scrapinghub helped us fetch a cleaner version of the data we focused on and saved a ton of time in data preparation.
We have been using scrapinghub for over a year for a production job and it has been very reliable and fits our needs. We also use crawlera for some scrapinghub jobs as well as multiple internal jobs. The documentation is great and the UI is informative and allows us to manage resources between applications and environments.
Nothing considerable. I was going to say notifications and alerts would be nice, but a quick search shows that there is a Monitoring AddOn for that.
Crawling easily to get page content without having to manage the infrastructure or the IP rotation.