HTML to Text
The gs.group.list.base.html2txt.HTMLConverter class is a subclass of HTMLParser.HTMLParser (or html.parser.HTMLParser in Python 3) that produces a plain-text version of an HTML document. It is fairly simple, returning the text of the HTML as Unicode, and it is used in the rare case that a plain-text body is absent from an email message.
The convert_to_txt() function is a wrapper for convenience.
Example
>>> from gs.group.list.base.html2txt import HTMLConverter
>>> converter = HTMLConverter()
>>> html = '<p>Je ne ecrit pas français.</p>'
>>> converter.feed(html)
>>> converter.close()
>>> print(converter)
Je ne ecrit pas français.
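For readers unfamiliar with the standard-library parser that HTMLConverter builds on, the following is a minimal sketch of the general technique: subclass html.parser.HTMLParser and collect the text passed to handle_data(). It is not the actual HTMLConverter implementation. The names TextCollector and to_text are hypothetical, it assumes Python 3, and the real class may handle whitespace, links, and block elements differently.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical stand-in for illustration; not the real HTMLConverter."""

    def __init__(self):
        # convert_charrefs=True makes the parser decode entities such as
        # &amp; before the text reaches handle_data().
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called by the parser for each run of text between tags.
        self.chunks.append(data)

    def __str__(self):
        # Join the collected text and trim surrounding whitespace.
        return ''.join(self.chunks).strip()


def to_text(html):
    """Hypothetical convenience wrapper: feed, close, return the text."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    return str(collector)
```

Calling to_text('<p>Je ne ecrit pas français.</p>') with this sketch returns the sentence without the surrounding paragraph tags, mirroring the feed(), close(), print() sequence shown above.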