Deduplicate characters, fix encoding.
## What's Changed
* Add word level deduplication by iammosespaulr in https://github.com/VikParuchuri/pdftext/pull/35
* Dev by VikParuchuri in https://github.com/VikParuchuri/pdftext/pull/36
**Full Changelog**: https://github.com/VikParuchuri/pdftext/compare/v0.5.1...v0.6.0