5 commits

Author SHA1 Message Date
gls
4db336045e Adjust HOMEPAGE 2012-01-21 16:52:44 +00:00
adam
c2f7be2c04 Changes 0.90:
* Parses valid and invalid HTML documents to a tree
* Support for minidom, ElementTree (including cElementTree and lxml.etree),
  BeautifulSoup (deprecated) and custom simpletree output formats
* DOM to SAX converter
* Reports parse errors
* Character encoding detection
* Filtering and serializing of trees
* HTML+CSS sanitizer
* Many unit tests
2011-04-15 08:42:03 +00:00
joerg
f3e005ba5e Update to html5lib-0.11.1. No detailed changes. 2009-10-19 10:57:40 +00:00
joerg
39c828b6a6 Remove @dirrm entries from PLISTs 2009-06-14 18:17:11 +00:00
joerg
74b1897174 Import py-html5lib-0.11:
html5lib is a pure-python library for parsing HTML. The parser is
designed to handle all flavours of HTML and parses invalid documents
using well-defined error handling rules compatible with the behaviour of
major desktop web browsers.

Output is to a tree structure; the current release supports output to
DOM, ElementTree, lxml and BeautifulSoup tree formats as well as a
simple custom format.
2009-01-27 17:27:07 +00:00