docs/reference/source-config.md
Lines changed: 117 additions & 1 deletion
@@ -3,4 +3,120 @@ title: Source configuration
position: 4
---

-WIP on Notion.

Quickwit can insert data into an index from one or more sources. When creating an index, sources are declared in the [index config](index-config.md). Additional sources can be added later using the [CLI command](cli.md#source) `quickwit source add`.

A source is declared using an object called a source config. A source config uniquely identifies and defines a source. It consists of three parameters:

- source ID
- source type
- source parameters

*Source ID*

The source ID is a string that uniquely identifies the source within an index. It may only contain uppercase or lowercase ASCII letters, digits, hyphens (`-`), periods (`.`) and underscores (`_`). The source ID must start with a letter and must not be longer than 255 characters.

*Source type*

The source type designates the kind of source being configured. As of version 0.2, the available source types are `file` and `kafka`.

*Source parameters*

The source parameters indicate how to connect to a data store and are specific to the type of source.
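
To illustrate how the three parameters fit together, here is a minimal sketch of a single source config entry. The key names follow the file source example shown later on this page; the ID and the file path are made-up placeholder values, and the comments restate the constraints described above.

```yaml
# Hypothetical source config entry; all values are placeholders.
- source_id: logs-2022.file_01   # starts with a letter; only letters, digits, `-`, `.`, `_`; at most 255 characters
  source_type: file              # `file` or `kafka` as of version 0.2
  params:                        # keys depend on the source type
    filepath: path/to/local/file.json
```

An ID such as `2022-logs` would presumably be rejected, since it starts with a digit rather than a letter.
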
## File source

A file source reads data from a local file. The file must consist of JSON objects separated by newlines. As of version 0.2, compressed files (bz2, gzip, ...) and remote files (Amazon S3, HTTP, ...) are not supported.

### File source parameters

| Property | Description | Default value |
| --- | --- | --- |
| filepath | Path to a local file consisting of JSON objects separated by a newline. | |

*Declaring a file source in an [index config](index-config.md) (YAML)*

```yaml
# Version of the index config file format
version: 0

# Sources
sources:
  - source_id: my-source-id
    source_type: file
    params:
      filepath: path/to/local/file.json

# The rest of your index config here
# ...
```
*Adding a file source to an index with the [CLI](cli.md#source)*

Finally, note that the [CLI command](cli.md#index) `quickwit index ingest` allows ingesting data directly from a file or the standard input without creating a source beforehand.

## Kafka source

A Kafka source reads data from a Kafka stream. Each message in the stream must hold a JSON object.

### Kafka source parameters

The Kafka source consumes a `topic` using the client library [librdkafka](https://github.com/edenhill/librdkafka) and forwards the key-value pairs carried by the parameter `client_params` to the underlying librdkafka consumer. Common `client_params` options include bootstrap servers (`bootstrap.servers`), consumer group ID (`group.id`), and security protocol (`security.protocol`). Please refer to the [Kafka](https://kafka.apache.org/documentation/#consumerconfigs) and [librdkafka](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md) documentation pages for more advanced options.

| Property | Description | Default value |
| --- | --- | --- |
| topic | Name of the topic to consume. | |
| client_log_level | librdkafka client log level. Possible values are: debug, info, warn, error. | info |
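
As a rough sketch of how these parameters might be combined, the following index config fragment declares a Kafka source in the same layout as the file source example above. It assumes that `client_params` is written as a nested mapping under `params`; the source ID, topic name, broker address, and consumer group are hypothetical placeholder values.

```yaml
# Version of the index config file format
version: 0

# Sources
sources:
  - source_id: my-kafka-source              # hypothetical ID
    source_type: kafka
    params:
      topic: my-topic                       # hypothetical topic name
      client_log_level: info                # defaults to `info` when omitted
      client_params:                        # assumed layout; forwarded to the librdkafka consumer
        bootstrap.servers: localhost:9092   # assumed broker address
        group.id: my-consumer-group         # hypothetical consumer group ID

# The rest of your index config here
# ...
```

Any other key placed under `client_params` would be passed to librdkafka in the same way, so the librdkafka configuration reference is the place to check option names.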