- Fix bug where hyphens didn't show up at the end of lines
- Improve wrapping for hyphens: words split by a hyphen before a newline are now joined (disable by passing `keep_hyphens`)
- Restructure output to avoid redundant info in the JSON blob: track text spans with similar font info instead of individual characters
- Update model to predict blocks more accurately
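
As a rough illustration of the hyphen joining and span grouping described above, here is a minimal sketch. It is not the library's implementation: the function names, the `char`/`font` dictionary keys, and the exact effect of `keep_hyphens` are assumptions made for the example.

```python
import re
from itertools import groupby


def join_hyphenated(text, keep_hyphens=False):
    """Merge words split by a hyphen directly before a newline.

    Hypothetical helper: with keep_hyphens=True the text is returned
    unchanged, which is only an assumed reading of that option.
    """
    if keep_hyphens:
        return text
    # "hyphen-\nated" -> "hyphenated"; other newlines are left alone.
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)


def group_chars_into_spans(chars):
    """Collapse consecutive characters that share font info into spans.

    `chars` is a list of dicts with assumed keys "char" and "font".
    Storing the font once per span avoids repeating it per character.
    """
    spans = []
    for font, run in groupby(chars, key=lambda c: c["font"]):
        spans.append({"font": font, "text": "".join(c["char"] for c in run)})
    return spans
```

Note that a regex like this also merges genuine compounds broken at a line end (for example `well-` followed by `known`), which is one reason an opt-out such as `keep_hyphens` is useful.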