-
👋 On Ubuntu 22.04, Python 3.8, this works:

```python
from haystack.components.routers import FileTypeRouter

file_type_router = FileTypeRouter(mime_types=["text/plain", "application/pdf", "text/markdown", "application/vnd.openxmlformats-officedocument.wordprocessingml.document"])
path = "MYFILE.docx"
print(file_type_router.run([path]))
# >>> {'application/vnd.openxmlformats-officedocument.wordprocessingml.document': [PosixPath('MYFILE.docx')]}
```

Can you try it on your system? Am I missing something?
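To check whether the difference comes from the platform's MIME database rather than from the router itself, one can query the standard library directly. This is only a diagnostic sketch: it assumes `FileTypeRouter` resolves types through Python's `mimetypes` module, which this thread does not confirm, and `guess_all` is a hypothetical helper name.

```python
import mimetypes


def guess_all(paths):
    """Return {path: guessed MIME type or None} from the stdlib database."""
    return {p: mimetypes.guess_type(p)[0] for p in paths}


# Output depends on the machine's MIME database, so none is shown here.
print(guess_all(["MYFILE.docx", "slides.pptx", "notes.txt"]))
```

A `None` value for an extension means the local database has no mapping for it, which would be consistent with the file ending up as unclassified.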
-
Hey @jlonge4, @anakin87 and I spoke about this and adding some […]. Would you mind opening a PR for this, @jlonge4? We'll review and integrate it soon after.
-
I have noticed that in a Linux environment, for instance an AWS Lambda (Python 3.12), the FileTypeRouter will output docx and pptx files (or other Microsoft-based file formats) as unclassified unless you first run:
How could we make sure that the MIME types specified at init time are also registered at init time, reducing unclassified outputs for legitimate MIME types?
Any ideas, @anakin87?
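One way the init-time registration asked about here could look, sketched outside Haystack. It assumes the standard library's `mimetypes` module is what decides the type. `EXTRA_EXTENSIONS` and `ensure_mime_types` are hypothetical names, not part of `FileTypeRouter`, and the mapping covers only the two Office formats mentioned above.

```python
import mimetypes

# Hypothetical mapping from MIME type to file extension; extend as needed.
EXTRA_EXTENSIONS = {
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document": ".docx",
    "application/vnd.openxmlformats-officedocument.presentationml.presentation": ".pptx",
}


def ensure_mime_types(mime_types):
    """Register requested MIME types that the local database does not resolve.

    Returns the list of MIME types that had to be added.
    """
    added = []
    for mime_type in mime_types:
        ext = EXTRA_EXTENSIONS.get(mime_type)
        if ext is None:
            continue  # no known extension for this type, nothing to register
        # Only register when the platform does not already map ext to this type.
        if mimetypes.guess_type("file" + ext)[0] != mime_type:
            mimetypes.add_type(mime_type, ext)
            added.append(mime_type)
    return added
```

Calling a helper like this with the `mime_types` argument before routing would be one option. Where it belongs inside the component is a design decision for the PR.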