15 lines
768 B
Text
15 lines
768 B
Text
This module is a bare-bones HTML parser. It is similar in concept to
|
|
HTML::Parser, but it differs in a couple of important ways.
|
|
|
|
First, HTML::SimpleParse just finds tags and text in the HTML you give it;
|
|
it does not care about the specific content of these tags (though it does
|
|
distinguish between different _types_ of tags, such as comments, starting
|
|
tags like <b>, ending tags like </b>, and so on).
|
|
|
|
Second, HTML::SimpleParse does not create a hierarchical tree of HTML
|
|
content, but rather a simple linear list. It does not pay any attention to
|
|
balancing start tags with corresponding end tags, or which pairs of tags
|
|
are inside other pairs of tags.
|
|
|
|
Because of these characteristics, you can make a very effective HTML filter
|
|
by sub-classing HTML::SimpleParse.
|