0.16.21
Enhancements
- Use password to load PDF with all modes
- Use vectorized logic to merge inferred and extracted layouts. The layout merging logic is refactored around the new `LayoutElements` data structure and the numpy library, which improves compute performance and makes the logic clearer.
- Add PDF Miner configuration. PDF Miner can now be configured via the `pdfminer_line_overlap`, `pdfminer_word_margin`, `pdfminer_line_margin`, and `pdfminer_char_margin` parameters added to the partition method.
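
As a rough illustration of the PDF Miner configuration entry, the sketch below passes the four parameters through a small wrapper. It assumes the partition method in question is `partition_pdf` from `unstructured.partition.pdf` and that it accepts these names as keyword arguments. The wrapper name `partition_with_pdfminer_options` and the numeric values are hypothetical and chosen only for illustration, not as recommended settings.

```python
# Illustrative values only; tune them for your own documents.
PDFMINER_OPTIONS = {
    "pdfminer_line_overlap": 0.5,
    "pdfminer_word_margin": 0.1,
    "pdfminer_line_margin": 0.5,
    "pdfminer_char_margin": 2.0,
}


def partition_with_pdfminer_options(path, **overrides):
    """Hypothetical helper: partition a PDF with explicit PDF Miner settings.

    Assumes `unstructured` (with PDF support) is installed and that
    `partition_pdf` accepts the four pdfminer_* keyword arguments.
    """
    # Imported lazily so this module loads even without unstructured installed.
    from unstructured.partition.pdf import partition_pdf

    # Caller-supplied overrides take precedence over the defaults above.
    options = {**PDFMINER_OPTIONS, **overrides}
    return partition_pdf(filename=path, **options)
```

A call might look like `partition_with_pdfminer_options("report.pdf", pdfminer_word_margin=0.2)`, where `report.pdf` is a placeholder path.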
Features
Fixes
- Fix file type detection for NDJSON files. NDJSON files were being detected as JSON because the two formats share the same MIME type.
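
To show why the MIME type alone cannot separate the two formats, here is a minimal content-based check using only the standard library. It is a sketch of one possible approach, not the detection logic the library actually uses. The function names `is_valid_json` and `sniff_json_flavor` are hypothetical.

```python
import json


def is_valid_json(text):
    """Return True if `text` parses as a single JSON document."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def sniff_json_flavor(text):
    """Classify text as 'json', 'ndjson', or 'unknown' by its content."""
    # A file that parses as one document is plain JSON, even if it spans
    # several lines (for example, pretty-printed output).
    if is_valid_json(text):
        return "json"
    # Otherwise treat it as NDJSON only when it has more than one non-empty
    # line and every such line is a JSON document on its own.
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) > 1 and all(is_valid_json(line) for line in lines):
        return "ndjson"
    return "unknown"
```

Under this scheme a single-line file such as `{"a": 1}` is reported as JSON, since it is valid in both formats and the content gives no way to tell them apart.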