Released OpenWebSpider v0.1.2
CHANGELOG:
- BUG: fixed the regex used to extract URLs from (I)FRAME
- New feature: OpenWebSpider# can index images (new table: images)
- New feature: new command-line argument: --images
- Improved stress-test facility: in stress-test mode, OpenWebSpider# no longer requires a configuration file or a MySQL server, and it does not check robots.txt
- Timeout for SQL query execution set to 120 seconds (2 minutes)
- New feature: new configuration file fields: CRAWLER NAME and CRAWLER VERSION
- New feature: CRAWLER NAME is now used when processing robots.txt
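
For readers curious about the (I)FRAME fix in the first item, here is a minimal sketch of regex-based extraction of URLs from FRAME and IFRAME tags. It is written in Python for illustration only: the pattern and the names FRAME_SRC_RE and extract_frame_urls are assumptions made for this example, not the regex or code shipped in OpenWebSpider.

```python
import re

# Illustrative pattern (an assumption, not OpenWebSpider's actual regex).
# Matches <frame ...> and <iframe ...> opening tags, case-insensitively,
# and captures the src value whether it is double-quoted, single-quoted,
# or unquoted.
FRAME_SRC_RE = re.compile(
    r"""<\s*i?frame\b[^>]*?\bsrc\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))""",
    re.IGNORECASE,
)


def extract_frame_urls(html):
    """Return the src URLs of all FRAME/IFRAME tags, in document order."""
    urls = []
    for match in FRAME_SRC_RE.finditer(html):
        # Exactly one of the three groups is set, depending on quoting style.
        url = match.group(1) or match.group(2) or match.group(3)
        if url:
            urls.append(url)
    return urls


if __name__ == "__main__":
    sample = (
        '<FRAMESET><FRAME SRC="menu.html">'
        "<iframe width=100 src='/ads/box.php'></iframe>"
        '<frame src=main.htm></FRAMESET><a href="other.html">x</a>'
    )
    print(extract_frame_urls(sample))
```

A regex like this is only a rough approximation of HTML parsing; it ignores closing tags, FRAMESET, and ordinary links, which is the intent here.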
Source code and binary are available in the package: Download