Scheduler
- The task queue size is now more accurate; you can use it to determine whether the scheduler has finished all tasks.
Fetcher
- Fix tornado losing cookies during 30x redirects.
- You can now use cookies and a Cookie header at the same time.
- Fix a bug where the proxy did not work.
- Enable proxy by default.
- Proxy now supports username and password authorization. soloradish
- ETag and Last-Modified headers are disabled when the last crawl failed.
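
The ETag / Last-Modified rule above can be illustrated with a minimal sketch. The function name `conditional_headers` and the shape of the `last_fetch` dict are assumptions made for this example, not pyspider's internal format; the sketch only shows the idea that conditional request headers are skipped when the previous crawl failed.

```python
def conditional_headers(last_fetch):
    """Build conditional request headers for a re-crawl.

    `last_fetch` is a hypothetical record of the previous fetch:
    {"ok": bool, "headers": {...}}. It is an assumption for this
    sketch, not the structure pyspider stores.
    """
    headers = {}
    if not last_fetch or not last_fetch.get("ok"):
        # Previous crawl failed or never happened: send no validators,
        # so the server returns the full page instead of 304.
        return headers

    # Header names are case-insensitive, so normalize before lookup.
    previous = {k.lower(): v for k, v in last_fetch.get("headers", {}).items()}
    if "etag" in previous:
        headers["If-None-Match"] = previous["etag"]
    if "last-modified" in previous:
        headers["If-Modified-Since"] = previous["last-modified"]
    return headers


failed = {"ok": False, "headers": {"ETag": '"abc"'}}
succeeded = {"ok": True, "headers": {"ETag": '"abc"'}}
print(conditional_headers(failed))     # {}
print(conditional_headers(succeeded))  # {'If-None-Match': '"abc"'}
```

Without this rule, a failed crawl followed by a 304 Not Modified response could leave the project with no usable content for that page.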
Databases
- MySQL default engine changed to InnoDB laapsaap
- MySQL result column enlarged: changed to MEDIUMBLOB (up to 16 MB). laapsaap
WebUI
- WebUI now uses the same arguments as the fetcher, fixing the bug where the proxy did not work for the WebUI.
- Results will be sorted in the order of updatetime.
One Mode
- Script exception logs are now printed to the screen.
New Command `send_message`
You can use the command `pyspider send_message [project] [message]` to send a message to a project from the command line.
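
When the command is invoked from a script, the message should travel as a single argument even if it contains spaces. The sketch below only builds the argument list in the form given above; the helper name `send_message_argv` and the project name `my_project` are made up for illustration, and actually running the command assumes `pyspider` is installed and on the PATH.

```python
import shlex


def send_message_argv(project, message):
    """Build argv for: pyspider send_message [project] [message].

    Hypothetical helper for this example. Passing a list keeps the
    message as one argument, so no manual quoting is needed.
    """
    return ["pyspider", "send_message", project, message]


argv = send_message_argv("my_project", "hello world")

# Shell-safe rendering of the same command, e.g. for logging.
print(shlex.join(argv))  # pyspider send_message my_project 'hello world'

# To execute it (assumes pyspider is installed):
# import subprocess
# subprocess.run(argv, check=True)
```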
Other
- Use locally hosted test web pages.
- Remove the version pin on lxml; you can use apt-get to install any version of lxml.