Extract text from HTML.

html_text is a library for extracting text from HTML, with a few handy
features:
- It removes leading and trailing whitespace from the extracted text
- It decodes HTML entities (for example, &amp; becomes &)
- It uses lxml for parsing
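
To make the features above concrete, here is a minimal sketch of the first two
behaviours (whitespace stripping and entity decoding). It is not html_text's
implementation: it assumes Python's standard-library html.parser in place of
lxml, and the names _TextCollector and extract_text_sketch are hypothetical,
chosen only for illustration.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Hypothetical helper: gathers the text nodes of an HTML document."""

    def __init__(self):
        # convert_charrefs=True asks the parser to decode entities
        # such as &amp; before passing text to handle_data().
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        # Called for each run of text between tags.
        self._chunks.append(data)

    def text(self):
        # Join the text nodes and drop leading/trailing whitespace.
        return "".join(self._chunks).strip()


def extract_text_sketch(html):
    """Hypothetical stand-in: return the stripped, entity-decoded text."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


if __name__ == "__main__":
    print(extract_text_sketch("  <p>Fish &amp; chips</p>  "))
```

A real extractor would also need to decide how to treat block-level tags,
scripts, and stylesheets, which this sketch deliberately ignores.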
