e3d2f24c5a
4.6.1:

* Stop data loss when encountering an empty numeric entity, and possibly in other cases.
* Preserve XML namespaces introduced inside an XML document, not just the ones introduced at the top level.
* Added a new formatter, "html5", which represents void elements as "<element>" rather than "<element/>".
* Fixed a problem where the html.parser tree builder interpreted a string like "&foo " as the character entity "&foo;".
* Correctly handle invalid HTML numeric character entities which reference code points that are not Unicode code points. Note that this is only fixed when Beautiful Soup is used with the html.parser parser -- html5lib already worked and I couldn't fix it with lxml.
* Improved the warning given when no parser is specified.
* When markup contains duplicate elements, a select() call that includes multiple match clauses will match all relevant elements.
* Fixed code that was causing deprecation warnings in recent Python 3 versions.
* Fixed a Windows crash in diagnose() when checking whether a long markup string is a filename.
* Stopped HTMLParser from raising an exception in very rare cases of bad markup.
* Fixed a bug where find_all() was not working when asked to find a tag with a namespaced name in an XML document that was parsed as HTML.
* You can get finer control over formatting by subclassing bs4.element.Formatter and passing a Formatter instance into (e.g.) encode().
* You can pass a dictionary of `attrs` into BeautifulSoup.new_tag. This makes it possible to create a tag with an attribute like 'name' that would otherwise be masked by another argument of new_tag.
* Clarified the deprecation warning when accessing tag.fooTag, to cover the possibility that you might really have been looking for a tag called 'fooTag'.
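
Two of the user-facing additions in this release, the "html5" formatter and the `attrs` dictionary for new_tag, can be tried with a short script. This is a minimal sketch, assuming Beautiful Soup 4.6.1 or later is installed and using the built-in html.parser tree builder. The sample markup and the `email` attribute value are made up for illustration.

```python
from bs4 import BeautifulSoup

# Name the parser explicitly so the "no parser specified" warning is not raised.
soup = BeautifulSoup("<p>one<br/>two</p>", "html.parser")

# The default "minimal" formatter writes void elements as <br/>.
minimal_output = soup.decode(formatter="minimal")

# The "html5" formatter writes the same void element as <br>.
html5_output = soup.decode(formatter="html5")

# Passing attributes as a dictionary lets 'name' become an HTML attribute
# instead of colliding with new_tag's own 'name' argument (the tag name).
tag = soup.new_tag("input", attrs={"name": "email"})

print(minimal_output)  # <p>one<br/>two</p>
print(html5_output)    # <p>one<br>two</p>
print(tag.name, tag["name"])  # input email
```

The same `formatter` argument is also accepted by encode() and prettify(), so the choice between the two void-element styles can be made at output time without reparsing.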

| |
|---|
| DESCR |
| distinfo |
| Makefile |
| PLIST |