Releases
0.4.0
Added a generic partition brick that detects the file type and routes the file to the appropriate
partitioning brick.
Added a file type detection module.
Updated partition_html and partition_eml to support file-like objects in 'rb' mode.
Added cleaning brick clean_ordered_bullets for removing ordered bullets.
Added extraction brick extract_ordered_bullets for extracting ordered bullets.
Added test for clean_ordered_bullets.
Added test for extract_ordered_bullets.
Added partition_docx for pre-processing Word Documents.
Added new regex patterns to extract email header information.
Added new functions parse_received_data and partition_header to extract header information.
Added new function partition_text to parse plain text files.
Added new cleaner functions extract_ip_address, extract_ip_address_name, extract_mapi_id, and extract_datetimetz.
Added new Image element and function find_embedded_images to find embedded images.
Added get_directory_file_info for summarizing information about source documents.
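
To illustrate what the ordered-bullet bricks listed above are meant to do, here is a minimal standalone sketch. It is not the library's implementation. The names clean_ordered_bullets_sketch and extract_ordered_bullets_sketch, the regex, and the return types are all assumptions made for this example; the real clean_ordered_bullets and extract_ordered_bullets may behave differently.

```python
import re
from typing import List

# Assumed bullet shape (hypothetical, not taken from the library): one or more
# segments, each a short number or a single letter followed by a dot, with an
# optional trailing segment, then whitespace. Examples: "1.", "1.1", "a.", "2.3.4".
_SEGMENT = r"(?:\d{1,3}|[A-Za-z])"
ORDERED_BULLET_RE = re.compile(rf"^\s*((?:{_SEGMENT}\.)+{_SEGMENT}?)\s+")


def clean_ordered_bullets_sketch(text: str) -> str:
    """Remove a leading ordered bullet, if one is present."""
    return ORDERED_BULLET_RE.sub("", text, count=1)


def extract_ordered_bullets_sketch(text: str) -> List[str]:
    """Return the segments of a leading ordered bullet, or [] if there is none."""
    match = ORDERED_BULLET_RE.match(text)
    if match is None:
        return []
    # Split "2.3.4" into ["2", "3", "4"]; drop the empty piece after a trailing dot.
    return [part for part in match.group(1).split(".") if part]


if __name__ == "__main__":
    print(clean_ordered_bullets_sketch("1.1 This is a very important point"))
    # prints: This is a very important point
    print(extract_ordered_bullets_sketch("2.3.4 Nested item"))
    # prints: ['2', '3', '4']
```

A pattern this simple has false positives: a sentence starting with "3.14 is" or an initial such as "A. Smith" would be treated as a bullet, so a production version would need stricter rules.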