11---
2- title: Converters
2+ title: "Converters"
33id: converters-api
4- description: Various converters to transform data from one format to another.
4+ description: "Various converters to transform data from one format to another."
5+ slug: "/converters-api"
56---
67
78<a id="azure"></a>
@@ -144,30 +145,32 @@ The deserialized component.
144145
145146Converts CSV files to Documents.
146147
147- By default, it uses UTF-8 encoding when converting files but
148- you can also set a custom encoding.
149- It can attach metadata to the resulting documents.
148+ By default, it uses UTF-8 encoding when converting files but
149+ you can also set a custom encoding.
150+ It can attach metadata to the resulting documents.
150151
151- ### Usage example
152+ ### Usage example
152153
153- ```python
154- from haystack.components.converters.csv import CSVToDocument
155- converter = CSVToDocument()
156- results = converter.run(sources=["sample.csv"], meta={"date_added": datetime.now().isoformat()})
157- documents = results["documents"]
158- print(documents[0].content)
159- # 'col1,col2
160- ow1,row1
161- row2row2
162- '
163- ```
154+ ```python
155+ from haystack.components.converters.csv import CSVToDocument
156+ converter = CSVToDocument()
157+ results = converter.run(sources=["sample.csv"], meta={"date_added": datetime.now().isoformat()})
158+ documents = results["documents"]
159+ print(documents[0].content)
160+ # 'col1,col2\nrow1,row1\nrow2,row2\n'
161+ ```
164162
165163<a id="csv.CSVToDocument.__init__"></a>
166164
167165#### CSVToDocument.\_\_init\_\_
168166
169167```python
170- def __init__(encoding: str = "utf-8", store_full_path: bool = False)
168+ def __init__(encoding: str = "utf-8",
169+              store_full_path: bool = False,
170+              *,
171+              conversion_mode: Literal["file", "row"] = "file",
172+              delimiter: str = ",",
173+              quotechar: str = '"')
171174```
172175
173176Creates a CSVToDocument component.
@@ -179,6 +182,10 @@ If the encoding is specified in the metadata of a source ByteStream,
179182it overrides this value.
180183- `store_full_path`: If True, the full path of the file is stored in the metadata of the document.
181184If False, only the file name is stored.
185+ - `conversion_mode`: - "file" (default): one Document per CSV file whose content is the raw CSV text.
186+ - "row": convert each CSV row to its own Document (requires `content_column` in `run()`).
187+ - `delimiter`: CSV delimiter used when parsing in row mode (passed to ``csv.DictReader``).
188+ - `quotechar`: CSV quote character used when parsing in row mode (passed to ``csv.DictReader``).
182189
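The row mode described above hands `delimiter` and `quotechar` to `csv.DictReader`. The following is a minimal standard-library sketch of what that parsing step might look like; it is not the component's implementation. The helper name `rows_to_records` and the choice to place the remaining columns under a `meta` key are assumptions made for illustration only.

```python
import csv
import io


def rows_to_records(csv_text, content_column, delimiter=",", quotechar='"'):
    """Hypothetical helper: turn each CSV row into a content/meta record."""
    reader = csv.DictReader(
        io.StringIO(csv_text), delimiter=delimiter, quotechar=quotechar
    )
    # The content column must be present in the CSV header.
    if reader.fieldnames is None or content_column not in reader.fieldnames:
        raise ValueError(f"Column {content_column!r} not found in CSV header")

    records = []
    for row in reader:
        # Assumption: columns other than the content column are kept as metadata.
        meta = {key: value for key, value in row.items() if key != content_column}
        records.append({"content": row[content_column], "meta": meta})
    return records


# A semicolon-delimited sample with a quoted field that contains the delimiter.
sample = 'id;text\n1;"hello; world"\n2;second row\n'
records = rows_to_records(sample, content_column="text", delimiter=";")
for record in records:
    print(record)
```

The quoted field shows why `quotechar` matters: without quoting, the semicolon inside the first row's text would be read as a column separator.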
183190<a id="csv.CSVToDocument.run"></a>
184191
@@ -187,14 +194,19 @@ If False, only the file name is stored.
187194```python
188195@component.output_types(documents=list[Document])
189196def run(sources: list[Union[str, Path, ByteStream]],
197+         *,
198+         content_column: Optional[str] = None,
190199        meta: Optional[Union[dict[str, Any], list[dict[str, Any]]]] = None)
191200```
192201
193- Converts a CSV file to a Document.
202+ Converts CSV files to a Document (file mode) or to one Document per row (row mode).
194203
195204**Arguments**:
196205
197206- `sources`: List of file paths or ByteStream objects.
207+ - `content_column`: **Required when** ``conversion_mode="row"``.
208+ The column name whose values become ``Document.content`` for each row.
209+ The column must exist in the CSV header.
198210- ` meta ` : Optional metadata to attach to the documents.
199211This value can be either a list of dictionaries or a single dictionary.
200212If it's a single dictionary, its content is added to the metadata of all produced documents.
@@ -1618,3 +1630,4 @@ If `sources` contains ByteStream objects, their `meta` will be added to the outp
16181630
16191631A dictionary with the following keys:
16201632- `documents`: Created documents
1633+