Compare Products
ParseHub

Features

* Easy quick select feature - Just point & click on a webpage to extract the information you want. ParseHub will guess similar data elements for you. You can always switch out of the easy mode to use all of ParseHub’s advanced features.
* ParseHub API - Easily call your data and build products powered by ParseHub. You can also download data in CSV or JSON format.
* Flexible and powerful - Our intelligent relationship engine recognizes patterns in data for you. Want to manipulate text using regular expressions? Go for it! You also have the power to modify CSS selectors and edit the element attributes.
* Built for interactive & complicated websites - ParseHub is built to handle most poorly designed and difficult edge cases. You have the flexibility to combine our tools to handle redirects, forms, dropdowns, maps, infinite scroll, logins and any other AJAX and Javascript surprises.
* Split-second feedback loop - ParseHub instantly shows you a sample of data as you are working. You don’t need to wait to run and download files just to see your data.
* Seamless navigation between pages - Easily handle websites with thousands of linked pages and pagination. You can design a different set of instructions for each page, link pages together with one click or redirect to an entirely different url if you like without having to manually enter a bunch of urls.
* Automatic IP rotation - We route all requests through a pool of available IPs so you can maintain your privacy and anonymity.
* Cloud hosting & scheduling - Your data is stored for you and accessible at any time. You can also schedule to retrieve your data every minute, hour, day, week or month.
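As a minimal sketch of calling the ParseHub API mentioned above: the snippet below builds the request for a project's most recent finished run using only the standard library. The endpoint path and parameters follow ParseHub's v2 REST API as documented at the time of writing; the token and key are placeholders, so confirm the details against the official API reference before relying on it.

```python
import json
import urllib.parse
import urllib.request

BASE = "https://www.parsehub.com/api/v2"  # assumed v2 API root; check current docs


def build_data_url(project_token: str, api_key: str, fmt: str = "json") -> str:
    """Build the URL for a project's last finished run (CSV or JSON)."""
    query = urllib.parse.urlencode({"api_key": api_key, "format": fmt})
    return f"{BASE}/projects/{project_token}/last_ready_run/data?{query}"


def fetch_last_run(project_token: str, api_key: str) -> dict:
    """Download and decode the run's data. Network call; needs real credentials."""
    with urllib.request.urlopen(build_data_url(project_token, api_key)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Passing `fmt="csv"` instead of the default requests the CSV export described in the feature list.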
Scrapy

Features

* Fast and powerful - write the rules to extract the data and let Scrapy do the rest.
* Easily extensible - extensible by design, plug new functionality easily without having to touch the core.
* Portable, Python - written in Python and runs on Linux, Windows, Mac and BSD.
* Built-in support for selecting and extracting data from HTML/XML sources using extended CSS selectors and XPath expressions, with helper methods to extract using regular expressions.
* An interactive shell console (IPython aware) for trying out the CSS and XPath expressions to scrape data, very useful when writing or debugging your spiders.
* Built-in support for generating feed exports in multiple formats (JSON, CSV, XML) and storing them in multiple backends (FTP, S3, local filesystem).
* Robust encoding support and auto-detection, for dealing with foreign, non-standard and broken encoding declarations.
* Strong extensibility support, allowing you to plug in your own functionality using signals and a well-defined API (middlewares, extensions, and pipelines).
* Wide range of built-in extensions and middlewares for handling:
  * cookies and session handling
  * HTTP features like compression, authentication, caching, user-agent spoofing, robots.txt, and crawl depth restriction
* A Telnet console for hooking into a Python console running inside your Scrapy process, to introspect and debug your crawler
* Plus other goodies like reusable spiders to crawl sites from Sitemaps and XML/CSV feeds, a media pipeline for automatically downloading images (or any other media) associated with the scraped items, a caching DNS resolver, and much more!
|              | ParseHub    | Scrapy            |
|--------------|-------------|-------------------|
| Languages    | Other       | Python            |
| Source Type  | Closed      | Open              |
| License Type | Proprietary | Open source (BSD) |
| OS Type      |             |                   |
| Pricing      |             |                   |