medusa-crawler 1.0.0.pre.1

Medusa: a Ruby crawler framework

Medusa is a Ruby framework for crawling websites and collecting useful information about the pages it visits. It is versatile, allowing you to write your own specialized tasks quickly and easily.

#### Features

  • Choose the links to follow on each page with `focus_crawl()`

  • Multi-threaded design for high performance

  • Tracks 301 HTTP redirects

  • Allows exclusion of URLs based on regular expressions

  • HTTPS support

  • Records response time for each page

  • Obeys robots.txt

  • In-memory or persistent storage of pages during crawl using Moneta adapters

  • Inherits OpenURI behavior (redirects, automatic charset and encoding detection, proxy configuration options)
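
Two of the features above, choosing which links to follow and excluding URLs by regular expression, can be illustrated in plain Ruby. The sketch below is not Medusa's API: `select_links` and its `skip_patterns:` keyword are hypothetical names invented for illustration, and the URLs are placeholders. It only shows the general idea of filtering a page's links before a crawler queues them.

```ruby
require 'uri'

# Hypothetical helper, NOT part of Medusa: drops links matching any skip
# pattern, then lets the caller's block choose which remaining links to follow.
def select_links(links, skip_patterns: [])
  kept = links.reject { |link| skip_patterns.any? { |re| re.match?(link.to_s) } }
  block_given? ? yield(kept) : kept
end

# Placeholder links, as if extracted from a fetched page.
links = %w[
  https://example.com/
  https://example.com/blog/post-1
  https://example.com/login
  https://example.com/assets/logo.png
].map { |s| URI(s) }

# Skip login pages and images, then follow only links under /blog.
followed = select_links(links, skip_patterns: [/login/, /\.(png|jpe?g|gif)\z/]) do |candidates|
  candidates.select { |uri| uri.path.start_with?('/blog') }
end

puts followed.map(&:to_s)
```

Applying the exclusion patterns first keeps the follow block simple, since it only ever sees links that are already allowed.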

Versions:

  1. 1.0.0 August 17, 2020 (23 KB)
  2. 1.0.0.pre.2 August 14, 2020 (23 KB)
  3. 1.0.0.pre.1 August 06, 2020 (24 KB)
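
The version list mixes prereleases (`.pre.N`) with a final release. As a minimal sketch of how RubyGems orders them, the snippet below uses the standard `Gem::Version` class; the helper name `sort_versions` is an assumption made for this example.

```ruby
require 'rubygems'

# Sorts version strings the way RubyGems orders them, oldest first.
# (Illustrative helper; the name is not from any library.)
def sort_versions(strings)
  strings.map { |s| Gem::Version.new(s) }.sort.map(&:to_s)
end

# Versions as they appear in the list above.
listed = ['1.0.0', '1.0.0.pre.2', '1.0.0.pre.1']

sort_versions(listed).each do |v|
  kind = Gem::Version.new(v).prerelease? ? 'prerelease' : 'release'
  puts "#{v} (#{kind})"
end
```

Prereleases sort before the final release with the same number, so `1.0.0.pre.1` and `1.0.0.pre.2` both precede `1.0.0`.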

Runtime Dependencies (3):

moneta ~> 1.3, >= 1.3.0
nokogiri ~> 1.3, >= 1.3.0
robotex ~> 1.0, >= 1.0.0
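
To see what a constraint such as `~> 1.3, >= 1.3.0` admits, it can be checked with RubyGems' own `Gem::Requirement` class. This is a small sketch: the helper `accepts?` and the candidate version numbers are chosen for illustration and are not taken from the listing.

```ruby
require 'rubygems'

# Returns true when `version` satisfies every constraint in `constraints`.
# (Illustrative helper; the name is an assumption for this example.)
def accepts?(constraints, version)
  Gem::Requirement.new(*constraints).satisfied_by?(Gem::Version.new(version))
end

# Constraint copied from the moneta dependency line above.
moneta = ['~> 1.3', '>= 1.3.0']

# Candidate versions are made up to probe the boundaries.
%w[1.2.9 1.3.0 1.4.2 2.0.0].each do |v|
  puts "moneta #{v}: #{accepts?(moneta, v) ? 'accepted' : 'rejected'}"
end
```

The pessimistic operator `~> 1.3` allows any `1.x` release from `1.3` upward but stops short of `2.0`, so the extra `>= 1.3.0` bound does not narrow the range any further.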

Authors:

  • Mauro Asprea, Chris Kite

Total downloads: 5,060

For this version: 1,442

License:

MIT

Required Ruby Version: >= 0

Required RubyGems Version: > 1.3.1