33769cc03b
Scrapy 1.5.0: This release brings small new features and improvements across the codebase. Some highlights:

* Google Cloud Storage is supported in FilesPipeline and ImagesPipeline.
* Crawling with proxy servers becomes more efficient, as connections to proxies can now be reused.
* Warning, exception and logging messages are improved to make debugging easier.
* The scrapy parse command now allows setting custom request meta via the --meta argument.
* Compatibility with Python 3.6, PyPy and PyPy3 is improved; PyPy and PyPy3 are now supported officially, by running tests on CI.
* Better default handling of HTTP 308, 522 and 524 status codes.
* Documentation is improved, as usual.

Backwards Incompatible Changes

* Scrapy 1.5 drops support for Python 3.3.
* The default Scrapy User-Agent now uses an https link to scrapy.org. **This is technically backwards-incompatible**; override :setting:`USER_AGENT` if you relied on the old value.
* Logging of settings overridden by custom_settings is fixed; **this is technically backwards-incompatible** because the logger changes from [scrapy.utils.log] to [scrapy.crawler]. If you're parsing Scrapy logs, please update your log parsers.
* LinkExtractor now ignores the m4v extension by default; this is a change in behavior.
* 522 and 524 status codes are added to RETRY_HTTP_CODES.

New features

- Support <link> tags in Response.follow (see the example below)
- Support for ptpython REPL
- Google Cloud Storage support for FilesPipeline and ImagesPipeline (see the example below)
- New --meta option of the "scrapy parse" command allows passing additional request.meta
- Populate the spider variable when using shell.inspect_response
- Handle HTTP 308 Permanent Redirect
- Add 522 and 524 to RETRY_HTTP_CODES
- Log version information at startup
- scrapy.mail.MailSender now works in Python 3 (it requires Twisted 17.9.0)
- Connections to proxy servers are reused
- Add a template for a downloader middleware
- Explicit message for NotImplementedError when a parse callback is not defined
- CrawlerProcess got an option to disable installation of the root log handler (see the example below)
- LinkExtractor now ignores the m4v extension by default
- Better log messages for responses over the :setting:`DOWNLOAD_WARNSIZE` and :setting:`DOWNLOAD_MAXSIZE` limits
- Show a warning when a URL is put into Spider.allowed_domains instead of a domain

Bug fixes

- Fix logging of settings overridden by custom_settings; **this is technically backwards-incompatible** because the logger changes from [scrapy.utils.log] to [scrapy.crawler], so please update your log parsers if needed
- The default Scrapy User-Agent now uses an https link to scrapy.org. **This is technically backwards-incompatible**; override :setting:`USER_AGENT` if you relied on the old value.
- Fix PyPy and PyPy3 test failures, support them officially
- Fix the DNS resolver when DNSCACHE_ENABLED=False
- Add cryptography to the Debian Jessie tox test env
- Add verification to check if the Request callback is callable
- Port extras/qpsclient.py to Python 3
- Use getfullargspec under the hood for Python 3 to stop a DeprecationWarning
- Update deprecated test aliases
- Fix SitemapSpider support for alternate links
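A minimal sketch of how the new Google Cloud Storage support for ImagesPipeline might be configured; the bucket path and project id are placeholders, and an installed google-cloud-storage client library is assumed:

```python
# settings.py -- illustrative GCS configuration (bucket name and project id are placeholders)
ITEM_PIPELINES = {
    "scrapy.pipelines.images.ImagesPipeline": 1,
}

# Store downloaded images in a Google Cloud Storage bucket instead of the local filesystem.
IMAGES_STORE = "gs://example-bucket/images/"

# Project that owns the bucket.
GCS_PROJECT_ID = "example-project-id"
```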
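A minimal sketch of following <link> elements with Response.follow; the spider name, start URL and CSS selector are illustrative only:

```python
import scrapy


class LinkFollowSpider(scrapy.Spider):
    # Illustrative spider: name and start URL are placeholders.
    name = "linkfollow"
    start_urls = ["https://example.com/"]

    def parse(self, response):
        # Since Scrapy 1.5, response.follow also accepts a selector pointing at a
        # <link> element (not only <a>), taking the URL from its href attribute.
        for link in response.css('link[rel="next"]'):
            yield response.follow(link, callback=self.parse)
```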
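A minimal sketch of running a crawl without the root log handler that CrawlerProcess installs by default; MySpider and the myproject module are hypothetical:

```python
from scrapy.crawler import CrawlerProcess

from myproject.spiders import MySpider  # hypothetical spider class


# Passing install_root_handler=False keeps Scrapy from attaching its own handler
# to the root logger, so the application can configure logging itself.
process = CrawlerProcess(settings={"LOG_LEVEL": "INFO"}, install_root_handler=False)
process.crawl(MySpider)
process.start()
```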
ALTERNATIVES
DESCR
distinfo
Makefile
PLIST