| word2x is a Word document text stripper on steroids that tries to restore the formatting of the original document. Understanding of the format and OLE document extraction are on the TODO list. Supported output formats include text (with fancy table handling), *TeX, and HTML. Currently, English documents work best, and certain specific hints in the document help a lot too. |