29 commits

Author SHA1 Message Date
schmonz
c95fc1acb0 Update to 5.2.1. From the changelog:
* Fix #22 (pip package keeps upgrading all the time)
* Support PyPy
* Remove the HTTP Status 9001 test that caused unit test tracebacks
* Remove the completely-untested HTML tidy code
* Remove BeautifulSoup as a dependency
* Remove the XFN microformat parsing code
* Remove the rel_enclosure microformat parsing code
* Remove the rel_hcard microformat parsing code
* Remove the rel_tag microformat parsing code
* Replace the regex-based RFC 822 date parser with a procedural one
* Replace the Python-licensed W3DTF date parser
* Support HTML5 audio/source/video element relative URL's
* Remove the unparsed itunes_keywords key from the result dictionary
* Fix issue 321 just a little more (yet another code path was missed)
* Issue 62 (support georss and gml namespaces)
* Issue 296 (GUID's are always treated like relative URI's)
* Issue 334 (media:restriction element content is not returned)
* Issue 335 (sub-elements of media:group are not parsed and returned)
* Issue 342 (support multiple dc:creator elements)
* Issue 357 (loose parser breaks ampersands in link element URL's)
* Issue 374 (support the Podlove Simple Chapters namespace)
* Issue 380 (support media:rating element)
* Issue 384 (fix chardet support in Python 3)
* Issue 389 (elements in unknown uppercase namespaces are ignored)
* Issue 392 (tags element subverts 'tags' key in result dictionary)
* Issue 396 (Podlove Simple Chapters version 1.0 causes a KeyError)
* Issue 399 (docs call `request_headers` parameter `extra_headers`)
* Issue 401 (support additional dcterms and media namespaces elements)
* Issue 404 (support asctime datetime strings with timezone information)
* Issue 407 (decode forward slashes encoded as character entities)
* Issue 421 (delay chardet invocation as long as possible)
* Issue 422 (add return types docstrings)
* Issue 433 (update the list of allowed MathML elements and attributes)
2015-10-31 14:18:32 +00:00
wiz
aa67e11089 Mark packages as not ready for python-3.x where applicable;
either because they themselves are not ready or because a
dependency isn't. This is annotated by
PYTHON_VERSIONS_INCOMPATIBLE=  33 # not yet ported as of x.y.z
or
PYTHON_VERSIONS_INCOMPATIBLE=  33 # py-foo, py-bar
respectively. Please use the same style for other packages,
and check during updates.

Use versioned_dependencies.mk where applicable.
Use REPLACE_PYTHON instead of handcoded alternatives, where applicable.
Reorder Makefile sections into standard order, where applicable.

Remove PYTHON_VERSIONS_INCLUDE_3X lines since that will be default
with the next commit.

Whitespace cleanups and other nits corrected, where necessary.
2014-01-25 10:29:56 +00:00
schmonz
fb88ef19ee The revived rss2email (not yet in pkgsrc) requires Python 3.2 or
higher, and depends on this, so this must work with Python 3.2 or
higher.
2013-06-06 01:57:55 +00:00
schmonz
42b6204ba5 Update to 5.1.3. From the changelog:
* Consolidated and simplified the character encoding detection code
* Issue 346 (the gb2312 encoding isn't always upgraded to gb18030)
* Issue 350 (HTTP Last-Modified example is incorrect in documentation)
* Issue 352 (importing lxml.etree changes what exceptions libxml2 throws)
* Issue 356 (add support for the HTML5 attributes `poster` and `preload`)
* Issue 364 (enclosure-sniffing microformat code can throw ValueError)
* Issue 373 (support RFC822-ish dates with swapped days and months)
* Issue 376 (uppercase 'X' in hex character references cause ValueError)
* Issue 382 (don't strip inline user:password credentials from FTP URL's)
2013-01-14 14:03:58 +00:00
asau
1f96787c11 Drop superfluous PKG_DESTDIR_SUPPORT, "user-destdir" is default these days. 2012-10-25 06:55:37 +00:00
schmonz
f63b4f238c Update to 5.1.2. From the changelog:
* Minor changes to the documentation
* Strip potentially dangerous ENTITY declarations in encoded feeds
* feedparser will now try to continue parsing despite compression errors
* Fix issue 321 a little more (the initial fix missed a code path)
* Issue 337 (`_parse_date_rfc822()` returns None on single-digit days)
* Issue 343 (add magnet links to the ACCEPTABLE_URI_SCHEMES)
* Issue 344 (handle deflated data with no headers nor checksums)
* Issue 347 (support `itunes:image` elements with a `url` attribute)
* Fix mistakes, typos, and bugs in the unit test code
* Fix crash in Python 2.4 and 2.5 if the feed has a UTF_32 byte order mark
* Replace the RFC822 date parser for more extensibility
* Issue 304 (handle RFC822 dates with timezones like GMT+00:00)
* Issue 309 (itunes:keywords should be split by commas, not whitespace)
* Issue 310 (pubDate should map to `published`, not `updated`)
* Issue 313 (include the compression test files in MANIFEST.in)
* Issue 314 (far-flung RFC822 dates don't throw OverflowError on x64)
* Issue 315 (HTTP server for unit tests runs on 0.0.0.0)
* Issue 321 (malformed URIs can cause ValueError to be thrown)
* Issue 322 (HTTP redirect to HTTP 304 causes SAXParseException)
* Issue 323 (installing chardet causes 11 unit test failures)
* Issue 325 (map `description_detail` to `summary_detail`)
* Issue 326 (Unicode filename causes UnicodeEncodeError if locale is ASCII)
* Issue 327 (handle RFC822 dates with extraneous commas)
* Issue 328 (temporarily map `updated` to `published` due to issue 310)
* Issue 329 (escape backslashes in Windows path in docs/introduction.rst)
* Issue 331 (don't escape backslashes that are in raw strings in the docs)
2012-05-26 16:51:59 +00:00
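
Several of the 5.1.2 items above concern RFC 822 dates (single-digit days, timezone forms, extraneous commas). As a rough illustration of the single-digit-day case from issue 337, the sketch below parses such a date with Python's standard library. It is not feedparser's own parser: `parse_rfc822` is a hypothetical helper name, and `email.utils.parsedate_to_datetime` is swapped in purely for demonstration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_rfc822(date_string):
    """Hypothetical helper: return a datetime for an RFC 822-style date, or None."""
    try:
        return parsedate_to_datetime(date_string)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError on unparseable input,
        # newer ones raise ValueError; treat both as "no result".
        return None


# A date whose day of month has a single digit, as in issue 337.
single_digit_day = parse_rfc822("Sat, 5 May 2012 16:51:59 +0000")
```

Returning `None` instead of raising mirrors the behaviour the changelog describes for unparseable dates, though the real library's handling may differ in detail.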
obache
8745cdd5e7 No compiler is required. 2012-02-04 12:45:36 +00:00
joerg
527ac6f204 Simplify. Don't allow Python 3 due to unsupported setuptools dependency. 2012-01-12 18:28:30 +00:00
schmonz
37c917eecf On a system without setuptools, this fails to build, therefore it
must be an egg.
2012-01-11 23:31:44 +00:00
schmonz
ba6f8d094c Update to 5.1. From the changelog:
* Extensive, extensive unit test refactoring
* Convert the Docbook documentation to ReST
* Include the documentation in the source distribution
* Consolidate the disparate README files into one
* Support Jython somewhat (almost all unit tests pass)
* Support Python 3.2
* Fix Python 3 issues exposed by improved unit tests
* Fix international domain name issues exposed by improved unit tests
* Issue 148 (loose parser doesn't always return unicode strings)
* Issue 204 (FeedParserDict behavior should not be controlled by `assert`)
* Issue 247 (mssql date parser uses hardcoded tokyo timezone)
* Issue 249 (KeyboardInterrupt and SystemExit exceptions being caught)
* Issue 250 (`updated` can be a 9-tuple or a string, depending on context)
* Issue 252 (running setup.py in Python 3 fails due to missing sgmllib)
* Issue 253 (document that text/plain content isn't sanitized)
* Issue 260 (Python 3 doesn't decompress gzip'ed or deflate'd content)
* Issue 261 (popping from empty tag list)
* Issue 262 (docs are missing from distribution files)
* Issue 264 (vcard parser crashes on non-ascii characters)
* Issue 265 (http header comparisons are case sensitive)
* Issue 271 (monkey-patching sgmllib breaks other libraries)
* Issue 272 (can't pass bytes or str to `parse()` in Python 3)
* Issue 275 (`_parse_date()` doesn't catch OverflowError)
* Issue 276 (mutable types used as default values in `parse()`)
* Issue 277 (`python3 setup.py install` fails)
* Issue 281 (`_parse_date()` doesn't catch ValueError)
* Issue 282 (`_parse_date()` crashes when passed `None`)
* Issue 285 (crash on empty xmlns attribute)
* Issue 286 ('apos' character entity not handled properly)
* Issue 289 (add an option to disable microformat parsing)
* Issue 290 (Blogger's invalid img tags are unparseable)
* Issue 292 (atom id element not explicitly supported)
* Issue 294 ('categories' key exists but raises KeyError)
* Issue 297 (unresolvable external doctype causes crash)
* Issue 298 (nested nodes clobber actual values)
* Issue 300 (performance improvements)
* Issue 303 (unicode characters cause crash during relative uri resolution)
* Remove "Hot RSS" support since the format doesn't actually exist
* Remove the old feedparser.org website files from the source
* Remove the feedparser command line interface
* Remove the Zope interoperability hack
* Remove extraneous whitespace
2012-01-11 16:50:52 +00:00
drochner
9b91067581 -update to 5.0.1
changes: fixes for issues:
 -invalid text in XML declaration causes sanitizer to crash
 -sanitization can be bypassed by malformed XML comments
 -sanitizer doesn't strip unsafe URI schemes
-add test target
2011-03-16 16:43:35 +00:00
schmonz
45bd8ccdd0 Update to 5.0. From the changelog:
* Improved MathML support
* Support microformats (rel-tag, rel-enclosure, xfn, hcard)
* Support IRIs
* Allow safe CSS through sanitization
* Allow safe HTML5 through sanitization
* Support SVG
* Support inline XML entity declarations
* Support unescaped quotes and angle brackets in attributes
* Support additional date formats
* Added the request_headers argument to parse()
* Added the response_headers argument to parse()
* Support multiple entry, feed, and source authors
* Officially make Python 2.4 the earliest supported version
* Support Python 3
* Bug fixes, bug fixes, bug fixes
2011-01-28 01:41:06 +00:00
schmonz
8d58805fd6 Update to 4.2-pre-294. From the commit log:
* Handle Jacques Distler's nested svg/mathml
* Render correctly when item description contains <code> with <br />
* Add "controls" attribute for video
* Strip "autoplay" attribute from video
2009-08-05 05:01:14 +00:00
schmonz
2ba1971180 Update to a prerelease 4.2 snapshot, as 4.1 no longer copes
particularly well with many feeds and there's no indication that
a release is imminent. From the changelog:

* Support for parsing microformats, including rel=enclosure, rel=tag,
  XFN, and hCard.
* Updated the whitelist of acceptable HTML elements and attributes based
  on the latest draft of the HTML 5 specification.
* Support for CSS Sanitization. (Previous versions of Universal Feed
  Parser simply stripped all inline styles.) Many thanks to Sam Ruby for
  implementing this, despite my insistence that it was impossible.
* Support for SVG Sanitization.
* Support for MathML Sanitization. Many thanks to Jacques Distler for
  patiently debugging this feature.
* IRI support for every element that can contain a URI.
* Ability to disable relative URI resolution.
* Command-line arguments and alternate serializers, for manipulating
  Universal Feed Parser from shell scripts or other non-Python sources.
* More robust parsing of author email addresses, misencoded win-1252
  content, rel=self links, and better detection of HTML content in
  elements with ambiguous content types.
2008-08-07 14:56:11 +00:00
joerg
ba171a91fa Add DESTDIR support. 2008-06-12 02:14:13 +00:00
joerg
a77e7015fe Update PYTHON_VERSIONS_COMPATIBLE
- assume that Python 2.4 and 2.5 are compatible and allow checking for
fallout.
- remove PYTHON_VERSIONS_COMPATIBLE that are obsoleted by the 2.3+
default. Modify the others to deal with the removals.
2008-04-25 20:39:06 +00:00
joerg
5911def816 Recursive revision bump / recommended bump for gettext ABI change. 2006-02-05 23:08:03 +00:00
schmonz
68d0fdc412 Update to 4.1. From the changelog:
* removed socket timeout
* added support for chardet library
2006-01-24 07:01:06 +00:00
schmonz
9d98d4ded6 Update to 4.0.2. From the changelog:
* bug fixes for Python 2.1 compatibility
* cleared _debug flag
2005-12-28 06:59:27 +00:00
schmonz
c9be47ee83 Update to 4.0. From the changelog:
* Support for Atom 1.0.
* Support for iTunes extensions.
* Support for dc:contributor.
* Universal Feed Parser now captures the feed's namespaces. See
  Namespace Handling for details.
* Lots of things have been renamed to match Atom 1.0 terminology.
  issued is now entries[i].published, modified is now entries[i].updated,
  and url is now href everywhere. You can still access these elements
  with the old names, so you shouldn't need to change any existing
  code, but don't be surprised if you can't find them during
  debugging.
* category and categories have been replaced by tags, see feed.tags
  and entries[i].tags. The old names still work.
* mode is gone from all detail and content dictionaries. It was
  never terribly useful, since Universal Feed Parser unescapes
  content automatically.
* entries[i].source is now a dictionary of feed metadata as per
  section 4.2.11 of RFC 4287. Universal Feed Parser no longer
  supports the RSS 2.0's source element.
* Content in unknown namespaces is no longer discarded (bug 993305)
* Lots of other bug fixes.
2005-12-27 14:33:22 +00:00
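
The 4.0 changelog above says the renamed keys remain reachable under their old names but will not show up when inspecting the result. A minimal sketch of how such backward-compatible aliases could behave follows. `AliasDict` is a hypothetical class written for this illustration, not Universal Feed Parser's actual implementation; only the three old/new name pairs come from the changelog.

```python
class AliasDict(dict):
    """Hypothetical dict that resolves legacy key names to their new names."""

    # Old name -> new name, taken from the renames listed in the changelog.
    ALIASES = {
        "issued": "published",
        "modified": "updated",
        "url": "href",
    }

    def _resolve(self, key):
        # Map a legacy name to its replacement; other keys pass through.
        return self.ALIASES.get(key, key)

    def __getitem__(self, key):
        return super().__getitem__(self._resolve(key))

    def __contains__(self, key):
        return super().__contains__(self._resolve(key))

    def get(self, key, default=None):
        # dict.get() bypasses __getitem__, so it needs its own override.
        return super().get(self._resolve(key), default)


entry = AliasDict(published="2005-12-27", updated="2005-12-28",
                  href="http://example.org/")
```

Lookups such as `entry["issued"]` succeed, while `entry.keys()` lists only the new names, which matches the warning that the old names cannot be found during debugging.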
tv
f816d81489 Remove USE_BUILDLINK3 and NO_BUILDLINK; these are no longer used. 2005-04-11 21:44:48 +00:00
agc
c71cac836a Add RMD160 digests to the SHA1 ones. 2005-02-24 14:48:39 +00:00
schmonz
8ee5c315f0 Accept Python 2.4. 2005-01-27 03:46:30 +00:00
recht
367eed19fe Build Python with thread support by default and turn the existing
python*-pth packages into meta-packages which will install the non-pth
packages. Bump PKGREVISIONs on the non-pth versions to propagate the
thread change, but leave the *-pth versions untouched to not affect
existing installations.
Sync all PYTHON_VERSIONS_AFFECTED lines in package Makefiles.
2005-01-23 20:41:45 +00:00
schmonz
ea1406e9da Fix PLIST. 2004-08-28 14:53:01 +00:00
recht
4150812b27 add python as category
ok'd a while back at pkgsrcCon by agc and wiz
2004-07-22 09:15:59 +00:00
schmonz
010c97ae5a Update to 3.3.
Changes in 3.2:

* use cjkcodecs and iconv_codec if available
* always convert feed to UTF-8 before passing to XML parser
* completely revamped logic for determining character encoding and
    attempting XML parsing (much faster)
* increased default timeout to 20 seconds
* test for presence of Location header on redirects
* added tests for many alternate character encodings
* support various EBCDIC encodings
* support UTF-16BE and UTF-16LE with or without a BOM
* support UTF-8 with a BOM
* support UTF-32BE and UTF-32LE with or without a BOM
* fixed crashing bug if no XML parsers are available
* added support for "Content-encoding: deflate"
* send blank "Accept-encoding: " header if neither gzip nor zlib
    modules are available

Changes in 3.3:

* optimize EBCDIC to ASCII conversion
* fix obscure problem tracking xml:base and xml:lang if element
    declares it, child doesn't, first grandchild redeclares it,
    and second grandchild doesn't
* refactored date parsing
* defined public registerDateHandler so callers can add support
    for additional date formats at runtime
* added support for OnBlog, Nate, MSSQL, Greek, and Hungarian dates (ytrewq1)
* added zopeCompatibilityHack() which turns FeedParserDict into a
    regular dictionary, required for Zope compatibility, and also
    makes command-line debugging easier because pprint module
    formats real dictionaries better than dictionary-like objects
* added NonXMLContentType exception, which is stored in bozo_exception
    when a feed is served with a non-XML media type such as
    "text/plain"
* respect Content-Language as default language if no xml:lang is present
* cloud dict is now FeedParserDict
* generator dict is now FeedParserDict
* better tracking of xml:lang, including support for xml:lang=""
    to unset the current language
* recognize RSS 1.0 feeds even when RSS 1.0 namespace is not the
    default namespace
* don't overwrite final status on redirects (scenarios: redirecting
    to a URL that returns 304, redirecting to a URL that redirects
    to another URL with a different type of redirect)
* add support for HTTP 303 redirects
2004-07-17 16:28:29 +00:00
schmonz
028a3f112a Update to 3.1. From the changelog:
* added and passed tests for converting HTML entities to Unicode
    equivalents in illformed feeds (aaronsw)
* added and passed tests for converting character entities to
    Unicode equivalents in illformed feeds (aaronsw)
* test for valid parsers when setting XML_AVAILABLE
* make version and encoding available when server returns a 304
* add handlers parameter to pass arbitrary urllib2 handlers (like
    digest auth or proxy support)
* add code to parse username/password out of url and send as basic
    authentication
* expose downloading-related exceptions in bozo_exception (aaronsw)
* added __contains__ method to FeedParserDict (aaronsw)
* added publisher_detail (aaronsw)
2004-06-30 20:17:35 +00:00
schmonz
7ee5e70b41 Import Universal Feed Parser 3.0.1.
Universal Feed Parser is a Python module for downloading and parsing
syndicated feeds. It can handle RSS 0.90, Netscape RSS 0.91, Userland
RSS 0.91, RSS 0.92, RSS 0.93, RSS 0.94, RSS 1.0, RSS 2.0, Atom,
and CDF feeds.

To use Universal Feed Parser, you will need Python 2.1 or later.
Universal Feed Parser is not meant to run standalone; it is a module
for you to use as part of a larger Python program.

Universal Feed Parser is easy to use; the module is self-contained
in a single file, feedparser.py, and it has only one public function,
parse. parse takes a number of arguments, but only one is required,
and it can be a URL, a local filename, or a raw string containing
feed data in any format.
2004-06-27 06:31:20 +00:00