We’ve removed pdftext's reliance on the decision tree for segmenting spans, lines, and blocks. Segmentation now uses simpler heuristics, which are more efficient and more accurate.
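To give a feel for what a simple segmentation heuristic can look like, here is a minimal sketch that groups characters into lines by vertical overlap. It is an illustration only and not pdftext's actual implementation: the `Char` class, the `group_into_lines` function, and the `min_overlap` threshold are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Char:
    """Hypothetical character record: text plus a bounding box."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def group_into_lines(chars, min_overlap=0.5):
    """Group characters (already in reading order) into lines.

    A character joins the current line when its vertical extent overlaps
    the previous character's by at least `min_overlap` of the shorter
    of the two heights. Otherwise it starts a new line.
    """
    lines = []
    for ch in chars:
        if lines:
            last = lines[-1][-1]
            overlap = min(ch.y1, last.y1) - max(ch.y0, last.y0)
            height = min(ch.y1 - ch.y0, last.y1 - last.y0)
            if height > 0 and overlap / height >= min_overlap:
                lines[-1].append(ch)
                continue
        lines.append([ch])
    return lines


def line_text(line):
    """Join the characters of one line into a string."""
    return "".join(c.text for c in line)
```

A rule like this needs no trained model and runs in a single pass over the characters, which is the general trade-off this kind of heuristic offers over a learned classifier.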
0.3.20
Special characters don't work well with the loose charbox. We'll remove loose entirely soon, but this is an interim fix for an annoying issue with misplaced quotes.
0.3.19
Close the PDF documents properly to avoid warnings and memory leaks.
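The general pattern behind this kind of fix is to tie a document's lifetime to a `with` block so that it is closed even when an error occurs. The sketch below shows the idea with the standard library's `contextlib.closing`. `FakeDocument` and `count_pages` are hypothetical stand-ins for this example and are not pdftext or PDF library APIs.

```python
from contextlib import closing


class FakeDocument:
    """Hypothetical stand-in for a PDF handle that owns native resources."""

    def __init__(self, path):
        self.path = path
        self.closed = False

    def page_count(self):
        if self.closed:
            raise ValueError("document is closed")
        return 1

    def close(self):
        # A real handle would release file descriptors and native memory here.
        self.closed = True


def count_pages(doc):
    """Use the document, then close it even if page_count() raises."""
    with closing(doc):
        return doc.page_count()
```

Any object with a `close()` method can be wrapped this way, so the same approach should apply to a real document handle that exposes one.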
0.3.18
Ensure the PDF is flattened when multiprocessing is used.
0.3.17
There were some cases where visual and text coordinates didn't align. This fixes that issue.