Released OpenWebSpider v0.1.2

CHANGELOG:

  • BUG: fixed the regex used to extract URLs from (I)FRAME
  • New feature: OpenWebSpider# can index images (new table: images)
  • New feature: new command-line argument: −−images
  • Improved Stress-test facility: now OpenWebSpider# doesn’t require a configuration file and a MySQL Server and it doesn’t check robots.txt (in stress-test mode)
  • Timeout in execution of SQL queries set to 120 seconds (2 minutes)
  • New feature: new configuration file fields: CRAWLER NAME and CRAWLER VERSION
  • New feature: CRAWLER NAME used over robots.txt

Source code and binary are available in the package: Download

Documentation of OpenWebSpider# v0.1

Share and Enjoy:
  • Digg
  • del.icio.us
  • Facebook
  • Mixx
  • Google Bookmarks
  • Furl
  • Live
  • Reddit
  • Segnalo
  • StumbleUpon
  • Technorati
  • Upnews
  • Wikio
  • YahooMyWeb