Replies: 1 comment
Hi @AndyWDev Microsoft Foundry can store Parquet files as data assets, but it cannot crack and chunk Parquet files for indexing. The indexing pipeline only supports document-style formats (PDF, TXT, HTML, DOCX, CSV, etc.) and fails purely based on the .parquet extension even if the contents are JSON. Workaround Recommended patterns: Parquet → JSONL (1 row = 1 document) → index (best for structured data) So guidance is treat Parquet as a staging/analytics format, not an indexable document format, until Foundry adds first‑class support. |
Hi Folks
I'm just learning Microsoft Foundry and have created a connection to pick up a Parquet file from an Azure Data Lake Gen2 share. The connection brings in the file fine, but when I try to create an index from it, the 'crack and chunk' process fails with an unsupported file format error (the extension is not recognized, even though the file just contains JSON in the appropriate format).
Searching the documentation online, I can find examples of importing a Parquet file and separate examples of creating an index from, say, a PDF or CSV file, but I can't find any examples of creating an index from a Parquet file; there seems to be a missing conversion step somewhere.
Does anyone have an example of how to handle the parquet file in order to create an index?