Scrapy Scraping App

Scrapy

by scrapy

framework for crawling web sites and extracting structured data
Helps with: Scraping, Web Frameworks
Similar to: Web-Scraping-SDK, Apache Nutch, Event Registry news API, Screen Scraping
Source Type: Open
License Types:
Supported OS:
Languages: Python

What is it all about?

An open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.

Scrapy is an application framework for crawling web sites and extracting structured data, which can be used for a wide range of applications such as data mining, information processing, or historical archival.

Key Features

* Fast and powerful - write the rules to extract the data and let Scrapy do the rest.
* Easily extensible - extensible by design, plug new functionality easily without having to touch the core.
* Portable, Python - written in Python and runs on Linux, Windows, Mac and BSD.
* Built-in support for selecting and extracting data from HTML/XML sources using extended CSS selectors and XPath expressions, with helper methods to extract using regular expressions.
* An interactive shell console (IPython aware) for trying out the CSS and XPath expressions to scrape data, very useful when writing or debugging your spiders.
* Built-in support for generating feed exports in multiple formats (JSON, CSV, XML) and storing them in multiple backends (FTP, S3, local filesystem).
* Robust encoding support and auto-detection, for dealing with foreign, non-standard and broken encoding declarations.
* Strong extensibility support, allowing you to plug in your own functionality using signals and a well-defined API (middlewares, extensions, and pipelines).
* Wide range of built-in extensions and middlewares for handling: cookies and session handling; HTTP features like compression, authentication and caching; user-agent spoofing; robots.txt; crawl depth restriction.
* A Telnet console for hooking into a Python console running inside your Scrapy process, to introspect and debug your crawler.
* Plus other goodies like reusable spiders to crawl sites from Sitemaps and XML/CSV feeds, a media pipeline for automatically downloading images (or any other media) associated with the scraped items, a caching DNS resolver, and much more!
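As a rough illustration of the pipeline and feed-export features listed above, the sketch below shows an item pipeline together with the settings that would enable it. An item pipeline is a plain Python class with a `process_item(item, spider)` method. The class name `CleanTextPipeline`, the module path `myproject.pipelines`, and the output file `items.json` are hypothetical, and the sketch assumes items are dicts.

```python
class CleanTextPipeline:
    """Item pipeline sketch: strips surrounding whitespace from string fields."""

    def process_item(self, item, spider):
        # Scrapy calls this once per scraped item; the returned item is
        # passed on to the next pipeline (or to the feed exporter).
        for key, value in list(item.items()):
            if isinstance(value, str):
                item[key] = value.strip()
        return item


# The two settings below would normally live in the project's settings.py.

# Enable the pipeline; the number (0-1000) sets its order among pipelines.
# "myproject.pipelines" is a hypothetical module path.
ITEM_PIPELINES = {
    "myproject.pipelines.CleanTextPipeline": 300,
}

# Export scraped items as JSON to a local file (hypothetical file name).
FEEDS = {
    "items.json": {"format": "json", "encoding": "utf8"},
}
```

Because the pipeline has no dependency on the rest of the framework, it can be checked in isolation, for example `CleanTextPipeline().process_item({"text": "  hi "}, spider=None)`.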


Pricing

Description

free
