To make a "rough" comparison of word counts between the output of html2text and a screen scrape of the same page as displayed in a browser, I used the file /usr/share/synaptic/html/apa.html.
The text scraped from the browser's display of the .html file was pasted into the file htmlword.txt, using the text editor...
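
For anyone who wants to repeat this kind of rough comparison without html2text, here is a minimal sketch in Python. It is not what I ran: it uses the standard library's html.parser as a stand-in for html2text, so its counts may differ from html2text's. The names TextExtractor, count_words_in_html, count_words_in_text and compare are made up for this illustration, and a "word" is assumed to be any run of characters separated by whitespace.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def count_words_in_html(html):
    """Count whitespace-separated words in the visible text of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    # Joining chunks with a space keeps adjacent blocks from running together,
    # at the cost of splitting a word that is interrupted by an inline tag.
    return len(" ".join(parser.chunks).split())


def count_words_in_text(text):
    """Count whitespace-separated words in plain text."""
    return len(text.split())


def compare(html_path, text_path):
    """Return (count from the HTML file, count from the pasted text file)."""
    # Assumption: both files decode as UTF-8; undecodable bytes are replaced.
    html = Path(html_path).read_text(encoding="utf-8", errors="replace")
    text = Path(text_path).read_text(encoding="utf-8", errors="replace")
    return count_words_in_html(html), count_words_in_text(text)
```

Calling compare("/usr/share/synaptic/html/apa.html", "htmlword.txt") would return the two counts side by side. Small differences are to be expected, since a browser and a parser do not agree exactly on what counts as visible text.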