Released OpenWebSpider v0.1.3
CHANGELOG:
- New feature: CRAWLER NAME and CRAWLER VERSION are now used in the User-Agent string of HTTP requests
- New feature: New configuration file field: sql_hostlist_where
- New feature: New command-line argument: --keep-dup
- BUG: fixed the regex used to extract URLs from <BASE>
- BUG: fixed a bug in the function that extracts URLs
- BUG: fixed a bug in page.cs::normalizePage()
- BUG: fixed minor bugs
- BUG: fixed a bug in the robots.txt parser
- BUG: fixed a bug in the page-rels handler
Source code and binary are available in the package: Download