1 | | - |
2 | 1 | ## BanyanDB |
3 | | -[BanyanDB](https://github.com/apache/skywalking-banyandb) is a dedicated storage implementation developed by the SkyWalking Team and the community. |
4 | | -Activate BanyanDB as the storage, and set storage provider to **banyandb**. |
5 | 2 |
|
6 | | -The OAP requires BanyanDB 0.7 server. From this version, BanyanDB provides general compatibility. |
| 3 | +[BanyanDB](https://github.com/apache/skywalking-banyandb) is a dedicated storage implementation developed by the SkyWalking Team and the community. Activate BanyanDB as the storage by setting the storage provider to **banyandb**. |
| 4 | + |
| 5 | +The OAP requires BanyanDB version **0.8** or later. From this version onwards, BanyanDB provides general compatibility. |
| 6 | + |
| 7 | +### Configuration |
7 | 8 |
|
8 | 9 | ```yaml |
9 | 10 | storage: |
10 | 11 | banyandb: |
11 | 12 | # Targets is the list of BanyanDB servers, separated by commas. |
12 | | - # Each target is a BanyanDB server in the format of `host:port` |
13 | | - # If the BanyanDB is deployed as a standalone server, the target should be the IP address or domain name and port of the BanyanDB server. |
14 | | - # If the BanyanDB is deployed in a cluster, the targets should be the IP address or domain name and port of the `liaison` nodes, separated by commas. |
| 13 | + # Each target is a BanyanDB server in the format of `host:port`. |
| 14 | + # If BanyanDB is deployed as a standalone server, the target should be the IP address or domain name and port of the BanyanDB server. |
| 15 | + # If BanyanDB is deployed in a cluster, the targets should be the IP address or domain name and port of the `liaison` nodes, separated by commas. |
15 | 16 | targets: ${SW_STORAGE_BANYANDB_TARGETS:127.0.0.1:17912} |
16 | | - # The max number of records in a bulk write request. |
17 | | - # Bigger value can improve the write performance, but also increase the OAP and BanyanDB Server memory usage. |
| 17 | + |
| 18 | + # The maximum number of records in a bulk write request. |
| 19 | + # A larger value can improve write performance but also increases OAP and BanyanDB Server memory usage. |
18 | 20 | maxBulkSize: ${SW_STORAGE_BANYANDB_MAX_BULK_SIZE:10000} |
| 21 | + |
19 | 22 | # The minimum seconds between two bulk flushes. |
20 | 23 | # If the data in a bulk is less than maxBulkSize, the data will be flushed after this period. |
21 | | - # If the data in a bulk is more than maxBulkSize, the data will be flushed immediately. |
22 | | - # Bigger value can reduce the write pressure on BanyanDB Server, but also increase the latency of the data. |
| 24 | + # If the data in a bulk exceeds maxBulkSize, the data will be flushed immediately. |
| 25 | + # A larger value can reduce write pressure on BanyanDB Server but increase data latency. |
23 | 26 | flushInterval: ${SW_STORAGE_BANYANDB_FLUSH_INTERVAL:15} |
24 | | - # The timeout seconds of a bulk flush. |
| 27 | + |
| 28 | + # The timeout in seconds for a bulk flush. |
25 | 29 | flushTimeout: ${SW_STORAGE_BANYANDB_FLUSH_TIMEOUT:10} |
26 | | - # The shard number of `measure` groups that store the metrics data. |
27 | | - metricsShardsNumber: ${SW_STORAGE_BANYANDB_METRICS_SHARDS_NUMBER:1} |
28 | | - # The shard number of `stream` groups that store the trace, log and profile data. |
29 | | - recordShardsNumber: ${SW_STORAGE_BANYANDB_RECORD_SHARDS_NUMBER:1} |
30 | | - # The multiplier of the number of shards of the super dataset. |
31 | | - # Super dataset is a special dataset that stores the trace or log data that is too large to be stored in the normal dataset. |
32 | | - # If the normal dataset has `n` shards, the super dataset will have `n * superDatasetShardsFactor` shards. |
33 | | - # For example, supposing `recordShardsNumber` is 3, and `superDatasetShardsFactor` is 2, |
34 | | - # `segment-default` is a normal dataset that has 3 shards, and `segment-minute` is a super dataset that has 6 shards. |
35 | | - superDatasetShardsFactor: ${SW_STORAGE_BANYANDB_SUPERDATASET_SHARDS_FACTOR:2} |
| 30 | + |
36 | 31 | # The number of threads that write data to BanyanDB concurrently. |
37 | | - # Bigger value can improve the write performance, but also increase the OAP and BanyanDB Server CPU usage. |
| 32 | + # A higher value can improve write performance but also increases CPU usage on both OAP and BanyanDB Server. |
38 | 33 | concurrentWriteThreads: ${SW_STORAGE_BANYANDB_CONCURRENT_WRITE_THREADS:15} |
39 | | - # The maximum size of dataset when the OAP loads cache, such as network aliases. |
| 34 | + |
| 35 | + # The maximum size of the dataset when the OAP loads cache, such as network aliases. |
40 | 36 | resultWindowMaxSize: ${SW_STORAGE_BANYANDB_QUERY_MAX_WINDOW_SIZE:10000} |
| 37 | + |
41 | 38 | # The maximum size of metadata per query. |
42 | 39 | metadataQueryMaxSize: ${SW_STORAGE_BANYANDB_QUERY_MAX_SIZE:10000} |
43 | | - # The maximum size of trace segments per query. |
| 40 | + |
| 41 | + # The maximum number of trace segments per query. |
44 | 42 | segmentQueryMaxSize: ${SW_STORAGE_BANYANDB_QUERY_SEGMENT_SIZE:200} |
45 | | - # The max number of profile task query in a request. |
| 43 | + |
| 44 | + # The maximum number of profile task queries in a request. |
46 | 45 | profileTaskQueryMaxSize: ${SW_STORAGE_BANYANDB_QUERY_PROFILE_TASK_SIZE:200} |
47 | | - # The batch size of query profiling data. |
| 46 | + |
| 47 | + # The batch size for querying profile data. |
48 | 48 | profileDataQueryBatchSize: ${SW_STORAGE_BANYANDB_QUERY_PROFILE_DATA_BATCH_SIZE:100} |
49 | | - # Data is stored in BanyanDB in segments. A segment is a time range of data. |
50 | | - # The segment interval is the time range of a segment. |
51 | | - # The value should be less or equal to data TTL relevant settings. |
52 | | - segmentIntervalDays: ${SW_STORAGE_BANYANDB_SEGMENT_INTERVAL_DAYS:1} |
53 | | - # The super dataset segment interval is the time range of a segment in the super dataset. |
54 | | - superDatasetSegmentIntervalDays: ${SW_STORAGE_BANYANDB_SUPER_DATASET_SEGMENT_INTERVAL_DAYS:1} |
55 | | - # Specific groups settings. |
56 | | - # For example, {"group1": {"blockIntervalHours": 4, "segmentIntervalDays": 1}} |
57 | | - # Please refer to https://github.com/apache/skywalking-banyandb/blob/${BANYANDB_RELEASE}/docs/crud/group.md#create-operation |
58 | | - # for group setting details. |
59 | | - specificGroupSettings: ${SW_STORAGE_BANYANDB_SPECIFIC_GROUP_SETTINGS:""} |
60 | | - # If the BanyanDB server is configured with TLS, config the TLS cert file path and open tls connection. |
| 49 | + |
| 50 | + # If the BanyanDB server is configured with TLS, configure the TLS cert file path and enable TLS connection. |
61 | 51 | sslTrustCAPath: ${SW_STORAGE_BANYANDB_SSL_TRUST_CA_PATH:""} |
| 52 | + |
| 53 | + # The group settings of record. |
| 54 | + # `gr` is the short name of the group settings of record. |
| 55 | + # |
| 56 | + # The "normal" section defines settings for datasets not specified in "super". |
| 57 | + # Each dataset will be grouped under a single group named "normal". |
| 58 | + grNormalShardNum: ${SW_STORAGE_BANYANDB_GR_NORMAL_SHARD_NUM:1} |
| 59 | + grNormalSIDays: ${SW_STORAGE_BANYANDB_GR_NORMAL_SI_DAYS:1} |
| 60 | + grNormalTTLDays: ${SW_STORAGE_BANYANDB_GR_NORMAL_TTL_DAYS:3} |
| 61 | + # "super" is a special dataset designed to store trace or log data that is too large for normal datasets. |
| 62 | + # Each super dataset will be a separate group in BanyanDB, following the settings defined in the "super" section. |
| 63 | + grSuperShardNum: ${SW_STORAGE_BANYANDB_GR_SUPER_SHARD_NUM:2} |
| 64 | + grSuperSIDays: ${SW_STORAGE_BANYANDB_GR_SUPER_SI_DAYS:1} |
| 65 | + grSuperTTLDays: ${SW_STORAGE_BANYANDB_GR_SUPER_TTL_DAYS:3} |
| 66 | + |
| 67 | + # The group settings of metrics. |
| 68 | + # `gm` is the short name of the group settings of metrics. |
| 69 | + # |
| 70 | + # OAP stores metrics based on their granularity.
| 71 | + # Valid values are "day", "hour", and "minute", so metrics are stored in three separate groups.
| 72 | + # The non-"minute" groups are governed by the "core.downsampling" setting.
| 73 | + # For example, if "core.downsampling" is set to "hour", the "hour" group is used, while the "day" group is ignored.
| 74 | + gmMinuteShardNum: ${SW_STORAGE_BANYANDB_GM_MINUTE_SHARD_NUM:2} |
| 75 | + gmMinuteSIDays: ${SW_STORAGE_BANYANDB_GM_MINUTE_SI_DAYS:1} |
| 76 | + gmMinuteTTLDays: ${SW_STORAGE_BANYANDB_GM_MINUTE_TTL_DAYS:7} |
| 77 | + gmHourShardNum: ${SW_STORAGE_BANYANDB_GM_HOUR_SHARD_NUM:1} |
| 78 | + gmHourSIDays: ${SW_STORAGE_BANYANDB_GM_HOUR_SI_DAYS:1} |
| 79 | + gmHourTTLDays: ${SW_STORAGE_BANYANDB_GM_HOUR_TTL_DAYS:15} |
| 80 | + gmDayShardNum: ${SW_STORAGE_BANYANDB_GM_DAY_SHARD_NUM:1} |
| 81 | + gmDaySIDays: ${SW_STORAGE_BANYANDB_GM_DAY_SI_DAYS:1} |
| 82 | + gmDayTTLDays: ${SW_STORAGE_BANYANDB_GM_DAY_TTL_DAYS:30} |
| 83 | + # If a metric is marked as "index_mode", it is stored in the "index" group.
| 84 | + # The "index" group is designed to store metrics that are used for indexing and have no value columns,
| 85 | + # such as `service_traffic` and `network_address_alias`.
| 86 | + # "index_mode" requires BanyanDB *0.8.0* or later. |
| 87 | + gmIndexShardNum: ${SW_STORAGE_BANYANDB_GM_INDEX_SHARD_NUM:1} |
| 88 | + gmIndexSIDays: ${SW_STORAGE_BANYANDB_GM_INDEX_SI_DAYS:1} |
| 89 | + gmIndexTTLDays: ${SW_STORAGE_BANYANDB_GM_INDEX_TTL_DAYS:30} |
| 90 | + |
62 | 91 | ``` |
63 | 92 |
|
64 | | -BanyanDB Server supports two installation modes: standalone and cluster. The standalone mode is suitable for small-scale deployments, while the cluster mode is suitable for large-scale deployments. |
| 93 | +### Installation Modes |
| 94 | + |
| 95 | +BanyanDB Server supports two installation modes: |
| 96 | + |
| 97 | +- **Standalone Mode**: Suitable for small-scale deployments. |
| 98 | + - **Configuration**: `targets` is the IP address/hostname and port of the BanyanDB server. |
| 99 | + |
| 100 | +- **Cluster Mode**: Suitable for large-scale deployments. |
| 101 | + - **Configuration**: `targets` is the IP address/hostname and port of the `liaison` nodes, separated by commas. The `liaison` nodes are the entry points of the BanyanDB cluster.
| 102 | + |
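As a minimal sketch of the cluster-mode case, the fragment below points `targets` at two `liaison` nodes. The host names are hypothetical placeholders, and the port simply reuses the `17912` default shown in the configuration above; adjust both to match your deployment.

```yaml
storage:
  banyandb:
    # Hypothetical liaison node addresses, separated by commas.
    # Replace them with the entry points of your own BanyanDB cluster.
    targets: ${SW_STORAGE_BANYANDB_TARGETS:liaison-1.example.internal:17912,liaison-2.example.internal:17912}
```

Because the value is read from `SW_STORAGE_BANYANDB_TARGETS` when that variable is set, the same list can be supplied through the environment instead of editing the file.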
| 103 | +### Group Settings |
| 104 | + |
| 105 | +BanyanDB supports **group settings** to configure storage groups, shards, segment intervals, and TTL (Time-To-Live). The group settings file is a YAML file required when using BanyanDB as the storage. |
| 106 | + |
| 107 | +#### Basic Group Settings |
| 108 | + |
| 109 | +- `ShardNum`: Number of shards in the group. Shards are the basic units of data storage in BanyanDB. Data is distributed across shards based on the hash value of the series ID. Refer to the [BanyanDB Shard](https://skywalking.apache.org/docs/skywalking-banyandb/latest/concept/clustering/#52-data-sharding) documentation for more details. |
| 110 | +- `SIDays`: Interval in days for creating a new segment. Segments are time-based, allowing efficient data retention and querying. `SI` stands for Segment Interval. |
| 111 | +- `TTLDays`: Time-to-live for the data in the group, in days. Data exceeding the TTL will be deleted. |
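To illustrate how the three suffixes combine, here is a hedged sketch that overrides the "super" record group only. The numbers are illustrative assumptions, not recommendations; the keys and environment variable names are the ones listed in the configuration above. The earlier revision of this page advised keeping the segment interval less than or equal to the TTL, and the sketch follows that.

```yaml
storage:
  banyandb:
    # Illustrative values only: more shards and a longer retention for super datasets.
    grSuperShardNum: ${SW_STORAGE_BANYANDB_GR_SUPER_SHARD_NUM:4}
    # A new segment is created every day (SI = Segment Interval).
    grSuperSIDays: ${SW_STORAGE_BANYANDB_GR_SUPER_SI_DAYS:1}
    # Data older than 7 days is deleted; the segment interval stays below the TTL.
    grSuperTTLDays: ${SW_STORAGE_BANYANDB_GR_SUPER_TTL_DAYS:7}
```

Settings that are not overridden keep the defaults shown in the full configuration.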
65 | 112 |
|
66 | | -* Standalone mode: `targets` is the IP address/host name and port of the BanyanDB server. |
67 | | -* Cluster mode: `targets` is the IP address/host name and port of the `liaison` nodes, separated by commas. `liaison` nodes are the entry points of the BanyanDB cluster. |
| 113 | +For more details on choosing the segment interval (`SIDays`) and TTL (`TTLDays`), refer to the [BanyanDB Rotation](https://skywalking.apache.org/docs/skywalking-banyandb/latest/concept/rotation/) documentation.
68 | 114 |
|
69 | | -For more details, please refer to the documents of [BanyanDB](https://skywalking.apache.org/docs/skywalking-banyandb/latest/readme/) |
70 | | -and [BanyanDB Java Client](https://github.com/apache/skywalking-banyandb-java-client) subprojects. |
| 115 | +For more details, refer to the documentation of [BanyanDB](https://skywalking.apache.org/docs/skywalking-banyandb/latest/readme/) and the [BanyanDB Java Client](https://github.com/apache/skywalking-banyandb-java-client) subprojects. |