2024-02-27: HTML parsing via Azure Document Intelligence
We updated prepdocs.py so that HTML files are now processed by Azure Document Intelligence. Here's a stream demonstrating ingestion of HTML docs. To use it, update to the latest version and put your HTML files in the data/ folder; they will be picked up during ingestion.
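To check which files in data/ would be affected by this change, a small standalone script can list them. This is only a sketch, not the code in prepdocs.py: the function name and the extension set are hypothetical, and the only detail taken from this release is that `.html` files are handled by Azure Document Intelligence.

```python
from pathlib import Path

# Assumption: only ".html" is taken from the release note. The set itself is a
# hypothetical structure and is not how prepdocs.py stores this mapping.
DOCUMENT_INTELLIGENCE_EXTENSIONS = {".html"}


def find_files_for_document_intelligence(data_dir: str) -> list[Path]:
    """Return files under data_dir whose extension is in the set above."""
    root = Path(data_dir)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file()
        and path.suffix.lower() in DOCUMENT_INTELLIGENCE_EXTENSIONS
    )
```

Calling `find_files_for_document_intelligence("data")` from the repository root returns the HTML files found under data/, including those in subfolders.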
What's Changed
- Add HTML parsing via Azure Document Intelligence by @pamelafox in #1325
Full Changelog: 2024-02-23...2024-02-27