Compare Products

Apache Nutch vs. Ficstar

Features (Apache Nutch)

* Fetching and parsing are done separately by default, which reduces the risk of an error corrupting the fetch and parse stages of a crawl.
* Plugins have been overhauled as a direct result of removing the legacy Lucene dependency for indexing and search.
* The set of plugins shipped with Nutch for processing various document types has been refined. Plain text, XML, OpenDocument (OpenOffice.org), Microsoft Office (Word, Excel, PowerPoint), PDF, RTF, and MP3 (ID3 tags) are now all parsed by the Tika plugin. The only parser plugins now shipped with Nutch are Feed (RSS/Atom), HTML, Ext, JavaScript, SWF, Tika, and ZIP.
* Distributed filesystem (via Hadoop)
* Link-graph database
* NTLM authentication

Features (Ficstar)

* Custom Web Crawler - All web crawlers are custom designed for your business needs. Whether you want to track prices or in-stock status, the custom web crawler is the ultimate tool to help you find exactly the data you want.
* Web Data Collection - We specialize in advanced web data collection for crawling difficult sites with challenging scripts and collecting hard-to-get embedded web content. Our web crawlers scan millions of web pages in hours and save billions of records a day.
* Data Processing - The system intelligently matches and compares data from different sources and refines all results into the exact output format you want.
* Output Reporting - All data output is presented in high-quality reports in an aggregated format based on your requirements. Data files are delivered on your required schedule through secure web servers.
* Easy Data Input - Our no-hassle importing process gets your input data into the system. Or you can simply send us your input data, and we’ll take care of it.
* Simple Scheduling Steps - Set up schedules the way you want to receive data deliveries and updates: daily, weekly, monthly, or however you want.
* Automatic Email Notifications - Add your email address to the system to receive notifications for important messages such as data delivery or system errors.

Languages

  • Apache Nutch: Other
  • Ficstar: Other

Source Type

  • Apache Nutch: Open
  • Ficstar: Closed

License Type

  • Apache Nutch: Apache
  • Ficstar: Proprietary

OS Type

Pricing

  • Apache Nutch: Free Trial No Card, By Quotation
  • Ficstar: By Quotation