Tagged: web crawler

OpenWebSpider# v0.1.3

Released OpenWebSpider v0.1.3
CHANGELOG:
New feature: CRAWLER NAME and CRAWLER VERSION are used in the User-Agent string of HTTP requests
New feature: new configuration file field: sql_hostlist_where
New feature: new command-line argument: --keep-dup
BUG: fixed...

OpenWebSpider# v0.1.2

Released OpenWebSpider v0.1.2
CHANGELOG:
BUG: fixed the regex used to extract URLs from (I)FRAME
New feature: OpenWebSpider# can index images (new table: images)
New feature: new command-line argument: --images
Improved stress-test facility: now OpenWebSpider#...