How to read html files #14939
Unanswered
SarahSohana27 asked this question in Q&A
Replies: 1 comment 6 replies
To properly read and load HTML files into LlamaIndex, you can use UnstructuredFileLoader:

    from pathlib import Path

    from llama_index.readers.file import UnstructuredFileLoader

    # Path to the HTML file to load
    file_path = Path('./output1.html')

    # Create the loader; continue_on_failure=False raises on the first error
    loader = UnstructuredFileLoader(
        file_path=file_path, continue_on_failure=False, headers={"User-Agent": "value"}
    )
    documents = loader.load_data()

    # Process the documents as needed
    for doc in documents:
        print(doc.text)

This code initializes the loader with the path to your HTML file and loads its contents as documents. Install the reader package first:

    pip install llama-index-readers-file
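If the reader classes fail to import or parse the file, one fallback is to extract the text yourself and hand it to LlamaIndex afterwards. The sketch below is not a LlamaIndex API: it uses only Python's standard-library `html.parser`, and the names `html_to_text` and `load_html_file` are hypothetical helpers introduced here for illustration. It assumes the file is UTF-8 encoded and that dropping `<script>` and `<style>` contents is acceptable.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    _SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html: str) -> str:
    """Hypothetical helper: return the visible text of an HTML string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def load_html_file(path) -> str:
    """Hypothetical helper: read an HTML file (assumed UTF-8) and return its text."""
    raw = Path(path).read_text(encoding="utf-8", errors="replace")
    return html_to_text(raw)
```

The returned string can then be wrapped in a document yourself, for example with `Document(text=load_html_file("./output1.html"))` from `llama_index.core`, assuming that package is installed.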
I have some documents in .html files. How can I load them into LlamaIndex? I tried UnstructuredReader, but it's not working.