4 changes: 2 additions & 2 deletions .github/workflows/skywalking.yaml
@@ -1019,12 +1019,12 @@ jobs:
found=false
for i in {1..60}; do
# check if segment files exist
if docker exec $CONTAINER_ID sh -c '[ -n "$(ls /tmp/measure-data/measure/data/day/seg* 2>/dev/null)" ]'; then
if docker exec $CONTAINER_ID sh -c '[ -n "$(ls /tmp/measure-data/measure/data/metricsDay/seg* 2>/dev/null)" ]'; then
echo "✅ found segment files"
sleep 180
# create and copy files
docker cp $CONTAINER_ID:/tmp ${BANYANDB_DATA_GENERATE_ROOT}
docker cp $CONTAINER_ID:/tmp/measure-data/measure/data/index ${BANYANDB_DATA_GENERATE_ROOT}
docker cp $CONTAINER_ID:/tmp/measure-data/measure/data/metadata ${BANYANDB_DATA_GENERATE_ROOT}
found=true
break
else
2 changes: 2 additions & 0 deletions docs/en/changes/changes.md
@@ -22,6 +22,8 @@
* Adapt the mesh metrics when the ambient mesh is detected in the eBPF access log receiver.
* Add JSON format support for the `/debugging/config/dump` status API.
* Enhance status APIs to support multiple `accept` header values, e.g. `Accept: application/json; charset=utf-8`.
* Storage: separate `SpanAttachedEventRecord` for SkyWalking trace and Zipkin trace.
* [Breaking Change] BanyanDB: Set up new group policy.

#### UI

224 changes: 136 additions & 88 deletions docs/en/setup/backend/configuration-vocabulary.md

Large diffs are not rendered by default.

141 changes: 77 additions & 64 deletions docs/en/setup/backend/storages/banyandb.md
@@ -32,21 +32,6 @@ storage:
Since 10.2.0, the BanyanDB configuration is separated into an independent configuration file: `bydb.yaml`:

```yaml
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

global:
# Targets is the list of BanyanDB servers, separated by commas.
# Each target is a BanyanDB server in the format of `host:port`.
@@ -82,9 +67,15 @@ global:

groups:
# The group settings of record.
# - "ShardNum": Number of shards in the group. Shards are the basic units of data storage in BanyanDB. Data is distributed across shards based on the hash value of the series ID.
# Refer to the [BanyanDB Shard](https://skywalking.apache.org/docs/skywalking-banyandb/latest/concept/clustering/#52-data-sharding) documentation for more details.
# - "SIDays": Interval in days for creating a new segment. Segments are time-based, allowing efficient data retention and querying. `SI` stands for Segment Interval.
# - "TTLDays": Time-to-live for the data in the group, in days. Data exceeding the TTL will be deleted.
#
# The "normal" section defines settings for datasets not specified in "super".
# Each dataset will be grouped under a single group named "normal".
# For more details on setting `segmentIntervalDays` and `ttlDays`, refer to the [BanyanDB TTL](https://skywalking.apache.org/docs/main/latest/en/banyandb/ttl) documentation.

# The "recordsNormal" section defines settings for datasets not specified in records.
# Each dataset will be grouped under a single group named "recordsNormal".
recordsNormal:
# The settings for the default "hot" stage.
shardNum: ${SW_STORAGE_BANYANDB_GR_NORMAL_SHARD_NUM:1}
@@ -108,24 +99,72 @@ groups:
segmentInterval: ${SW_STORAGE_BANYANDB_GR_NORMAL_COLD_SI_DAYS:3}
ttl: ${SW_STORAGE_BANYANDB_GR_NORMAL_COLD_TTL_DAYS:30}
nodeSelector: ${SW_STORAGE_BANYANDB_GR_NORMAL_COLD_NODE_SELECTOR:"type=cold"}
# "super" is a special dataset designed to store trace or log data that is too large for normal datasets.
# Each super dataset will be a separate group in BanyanDB, following the settings defined in the "super" section.
recordsSuper:
shardNum: ${SW_STORAGE_BANYANDB_GR_SUPER_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_SUPER_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_SUPER_TTL_DAYS:3}
enableWarmStage: ${SW_STORAGE_BANYANDB_GR_SUPER_ENABLE_WARM_STAGE:false}
enableColdStage: ${SW_STORAGE_BANYANDB_GR_SUPER_ENABLE_COLD_STAGE:false}
# The group settings of super datasets.
# Super datasets are used to store trace or log data that is too large for normal datasets.
recordsTrace:
shardNum: ${SW_STORAGE_BANYANDB_GR_TRACE_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_TRACE_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_TRACE_TTL_DAYS:3}
enableWarmStage: ${SW_STORAGE_BANYANDB_GR_TRACE_ENABLE_WARM_STAGE:false}
enableColdStage: ${SW_STORAGE_BANYANDB_GR_TRACE_ENABLE_COLD_STAGE:false}
warm:
shardNum: ${SW_STORAGE_BANYANDB_GR_TRACE_WARM_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_TRACE_WARM_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_TRACE_WARM_TTL_DAYS:7}
nodeSelector: ${SW_STORAGE_BANYANDB_GR_TRACE_WARM_NODE_SELECTOR:"type=warm"}
cold:
shardNum: ${SW_STORAGE_BANYANDB_GR_TRACE_COLD_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_TRACE_COLD_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_TRACE_COLD_TTL_DAYS:30}
nodeSelector: ${SW_STORAGE_BANYANDB_GR_TRACE_COLD_NODE_SELECTOR:"type=cold"}
recordsZipkinTrace:
shardNum: ${SW_STORAGE_BANYANDB_GR_ZIPKIN_TRACE_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_ZIPKIN_TRACE_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_ZIPKIN_TRACE_TTL_DAYS:3}
enableWarmStage: ${SW_STORAGE_BANYANDB_GR_ZIPKIN_TRACE_ENABLE_WARM_STAGE:false}
enableColdStage: ${SW_STORAGE_BANYANDB_GR_ZIPKIN_TRACE_ENABLE_COLD_STAGE:false}
warm:
shardNum: ${SW_STORAGE_BANYANDB_GR_ZIPKIN_TRACE_WARM_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_ZIPKIN_TRACE_WARM_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_ZIPKIN_TRACE_WARM_TTL_DAYS:7}
nodeSelector: ${SW_STORAGE_BANYANDB_GR_ZIPKIN_TRACE_WARM_NODE_SELECTOR:"type=warm"}
cold:
shardNum: ${SW_STORAGE_BANYANDB_GR_ZIPKIN_TRACE_COLD_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_ZIPKIN_TRACE_COLD_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_ZIPKIN_TRACE_COLD_TTL_DAYS:30}
nodeSelector: ${SW_STORAGE_BANYANDB_GR_ZIPKIN_TRACE_COLD_NODE_SELECTOR:"type=cold"}
recordsLog:
shardNum: ${SW_STORAGE_BANYANDB_GR_LOG_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_LOG_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_LOG_TTL_DAYS:3}
enableWarmStage: ${SW_STORAGE_BANYANDB_GR_LOG_ENABLE_WARM_STAGE:false}
enableColdStage: ${SW_STORAGE_BANYANDB_GR_LOG_ENABLE_COLD_STAGE:false}
warm:
shardNum: ${SW_STORAGE_BANYANDB_GR_LOG_WARM_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_LOG_WARM_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_LOG_WARM_TTL_DAYS:7}
nodeSelector: ${SW_STORAGE_BANYANDB_GR_LOG_WARM_NODE_SELECTOR:"type=warm"}
cold:
shardNum: ${SW_STORAGE_BANYANDB_GR_LOG_COLD_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_LOG_COLD_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_LOG_COLD_TTL_DAYS:30}
nodeSelector: ${SW_STORAGE_BANYANDB_GR_LOG_COLD_NODE_SELECTOR:"type=cold"}
recordsBrowserErrorLog:
shardNum: ${SW_STORAGE_BANYANDB_GR_BROWSER_ERROR_LOG_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_BROWSER_ERROR_LOG_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_BROWSER_ERROR_LOG_TTL_DAYS:3}
enableWarmStage: ${SW_STORAGE_BANYANDB_GR_BROWSER_ERROR_LOG_ENABLE_WARM_STAGE:false}
enableColdStage: ${SW_STORAGE_BANYANDB_GR_BROWSER_ERROR_LOG_ENABLE_COLD_STAGE:false}
warm:
shardNum: ${SW_STORAGE_BANYANDB_GR_SUPER_WARM_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_SUPER_WARM_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_SUPER_WARM_TTL_DAYS:7}
nodeSelector: ${SW_STORAGE_BANYANDB_GR_SUPER_WARM_NODE_SELECTOR:"type=warm"}
shardNum: ${SW_STORAGE_BANYANDB_GR_BROWSER_ERROR_LOG_WARM_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_BROWSER_ERROR_LOG_WARM_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_BROWSER_ERROR_LOG_WARM_TTL_DAYS:7}
nodeSelector: ${SW_STORAGE_BANYANDB_GR_BROWSER_ERROR_LOG_WARM_NODE_SELECTOR:"type=warm"}
cold:
shardNum: ${SW_STORAGE_BANYANDB_GR_SUPER_COLD_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_SUPER_COLD_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_SUPER_COLD_TTL_DAYS:30}
nodeSelector: ${SW_STORAGE_BANYANDB_GR_SUPER_COLD_NODE_SELECTOR:"type=cold"}
shardNum: ${SW_STORAGE_BANYANDB_GR_BROWSER_ERROR_LOG_COLD_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GR_BROWSER_ERROR_LOG_COLD_SI_DAYS:1}
ttl: ${SW_STORAGE_BANYANDB_GR_BROWSER_ERROR_LOG_COLD_TTL_DAYS:30}
nodeSelector: ${SW_STORAGE_BANYANDB_GR_BROWSER_ERROR_LOG_COLD_NODE_SELECTOR:"type=cold"}
# The group settings of metrics.
#
# OAP stores metrics based on their granularity.
@@ -180,14 +219,14 @@ groups:
segmentInterval: ${SW_STORAGE_BANYANDB_GM_DAY_COLD_SI_DAYS:15}
ttl: ${SW_STORAGE_BANYANDB_GM_DAY_COLD_TTL_DAYS:120}
nodeSelector: ${SW_STORAGE_BANYANDB_GM_DAY_COLD_NODE_SELECTOR:"type=cold"}
# If a metric is marked as "index_mode", it will be stored in the "index" group.
# The "index" group is designed to store metrics that are used for indexing without value columns.
# If a metric is marked as "index_mode", it will be stored in the "metadata" group.
# The "metadata" group is designed to store metrics that are used for indexing without value columns.
# For example, `service_traffic`, `network_address_alias`, etc.
# "index_mode" requires BanyanDB *0.8.0* or later.
metadata:
shardNum: ${SW_STORAGE_BANYANDB_GM_INDEX_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GM_INDEX_SI_DAYS:15}
ttl: ${SW_STORAGE_BANYANDB_GM_INDEX_TTL_DAYS:15}
shardNum: ${SW_STORAGE_BANYANDB_GM_METADATA_SHARD_NUM:2}
segmentInterval: ${SW_STORAGE_BANYANDB_GM_METADATA_SI_DAYS:15}
ttl: ${SW_STORAGE_BANYANDB_GM_METADATA_TTL_DAYS:15}

# The group settings of property, such as UI and profiling.
property:
@@ -232,30 +271,4 @@ docker run -d \
- **Cluster Mode**: Suitable for large-scale deployments.
- **Configuration**: `targets` is the IP address/hostname and port of the `liaison` nodes, separated by commas. `Liaison` nodes are the entry points of the BanyanDB cluster.
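
For reference, a minimal sketch of the `global.targets` setting for each mode, following the `bydb.yaml` conventions above (the `SW_STORAGE_BANYANDB_TARGETS` placeholder name and the liaison port `17912` are assumptions, not taken from this diff):

```yaml
global:
  # Standalone mode: a single BanyanDB server (assumed default address).
  targets: ${SW_STORAGE_BANYANDB_TARGETS:127.0.0.1:17912}
  # Cluster mode: list every liaison node, separated by commas, e.g.
  # targets: ${SW_STORAGE_BANYANDB_TARGETS:liaison-1:17912,liaison-2:17912}
```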

### Group Settings

BanyanDB supports **group settings** to configure storage groups, shards, segment intervals, and TTL (Time-To-Live). The group settings file is a YAML file required when using BanyanDB as the storage.

#### Basic Group Settings

- `ShardNum`: Number of shards in the group. Shards are the basic units of data storage in BanyanDB. Data is distributed across shards based on the hash value of the series ID. Refer to the [BanyanDB Shard](https://skywalking.apache.org/docs/skywalking-banyandb/latest/concept/clustering/#52-data-sharding) documentation for more details.
- `SIDays`: Interval in days for creating a new segment. Segments are time-based, allowing efficient data retention and querying. `SI` stands for Segment Interval.
- `TTLDays`: Time-to-live for the data in the group, in days. Data exceeding the TTL will be deleted.

For more details on setting `segmentIntervalDays` and `ttlDays`, refer to the [BanyanDB TTL](../../../banyandb/ttl.md) documentation.

#### Record Group Settings

The `gr` prefix is used for record group settings. The `normal` and `super` sections are used to define settings for normal and super datasets, respectively.

Super datasets are used to store trace or log data that is too large for normal datasets. Each super dataset is stored in a separate group in BanyanDB. The settings defined in the `super` section are applied to all super datasets.

Normal datasets are stored in a single group named `normal`. The settings defined in the `normal` section are applied to all normal datasets.

#### Metrics Group Settings

The `gm` prefix is used for metrics group settings. The `minute`, `hour`, and `day` sections are used to define settings for metrics stored based on granularity.

The `index` group is designed to store metrics used for indexing without value columns. For example, `service_traffic`, `network_address_alias`, etc.

For more details, refer to the documentation of [BanyanDB](https://skywalking.apache.org/docs/skywalking-banyandb/latest/readme/) and the [BanyanDB Java Client](https://github.com/apache/skywalking-banyandb-java-client) subprojects.
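
Every group setting above is exposed as an environment variable, so retention can be tuned without editing `bydb.yaml`. A minimal sketch of overriding the trace group's retention at deployment time (docker-compose syntax; the image tag and the values are illustrative assumptions, while the variable names come from the configuration above):

```yaml
services:
  oap:
    image: apache/skywalking-oap-server:latest  # assumed tag
    environment:
      SW_STORAGE: banyandb
      # Keep hot trace data for 10 days, cold trace data for 30 days.
      SW_STORAGE_BANYANDB_GR_TRACE_TTL_DAYS: "10"
      SW_STORAGE_BANYANDB_GR_TRACE_ENABLE_COLD_STAGE: "true"
      SW_STORAGE_BANYANDB_GR_TRACE_COLD_TTL_DAYS: "30"
```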
50 changes: 31 additions & 19 deletions docs/en/status/query_ttl_setup.md
@@ -13,6 +13,7 @@ This API is used to get the unified and effective TTL configurations.
```shell
> curl -X GET "http://127.0.0.1:12800/status/config/ttl"
# Metrics TTL includes the definition of the TTL of the metrics-ish data in the storage,
# e.g.
# 1. The metadata of the service, instance, endpoint, topology map, etc.
# 2. Generated metrics data from OAL and MAL engines.
@@ -34,12 +35,17 @@ metrics.day.cold=-1
# Super datasets of records are traces and logs, whose volume should be much larger.
#
# Cover hot and warm data for BanyanDB.
records.default=3
records.superDataset=3
records.normal=3
records.trace=10
records.zipkinTrace=3
records.log=3
records.browserErrorLog=3
# Cold data, '-1' represents no cold stage data.
records.default.cold=-1
records.superDataset.cold=-1

records.normal.cold=-1
records.trace.cold=30
records.zipkinTrace.cold=-1
records.log.cold=-1
records.browserErrorLog.cold=-1
```

This API also provides the response in JSON format, which is more friendly for programmatic usage.
@@ -49,19 +55,25 @@ This API also provides the response in JSON format, which is more friendly for p
-H "Accept: application/json"

{
"metrics": {
"minute": 7,
"hour": 15,
"day": 15,
"coldMinute": -1,
"coldHour": -1,
"coldDay": -1
},
"records": {
"default": 3,
"superDataset": 3,
"coldValue": -1,
"coldSuperDataset": -1
}
"metrics": {
"minute": 7,
"hour": 15,
"day": 15,
"coldMinute": -1,
"coldHour": -1,
"coldDay": -1
},
"records": {
"normal": 3,
"trace": 10,
"zipkinTrace": 3,
"log": 3,
"browserErrorLog": 3,
"coldNormal": -1,
"coldTrace": 30,
"coldZipkinTrace": -1,
"coldLog": -1,
"coldBrowserErrorLog": -1
}
}
```
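
The effective TTLs reported by this API are derived from the group settings in `bydb.yaml`. As a cross-reference, a sketch of the group settings that would yield the `records.trace=10` / `records.trace.cold=30` pair shown above (placeholders repeated from the BanyanDB section of this diff; the defaults here are illustrative, not the shipped ones):

```yaml
recordsTrace:
  ttl: ${SW_STORAGE_BANYANDB_GR_TRACE_TTL_DAYS:10}
  enableColdStage: ${SW_STORAGE_BANYANDB_GR_TRACE_ENABLE_COLD_STAGE:true}
  cold:
    ttl: ${SW_STORAGE_BANYANDB_GR_TRACE_COLD_TTL_DAYS:30}
```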
@@ -36,6 +36,7 @@
@Stream(name = LogRecord.INDEX_NAME, scopeId = DefaultScopeDefine.LOG, builder = LogRecord.Builder.class, processor = RecordStreamProcessor.class)
@SQLDatabase.ExtraColumn4AdditionalEntity(additionalTable = AbstractLogRecord.ADDITIONAL_TAG_TABLE, parentColumn = TIME_BUCKET)
@BanyanDB.TimestampColumn(AbstractLogRecord.TIMESTAMP)
@BanyanDB.Group(streamGroup = BanyanDB.StreamGroup.RECORDS_LOG)
public class LogRecord extends AbstractLogRecord {

public static final String INDEX_NAME = "log";
@@ -43,6 +43,7 @@
@Stream(name = SegmentRecord.INDEX_NAME, scopeId = DefaultScopeDefine.SEGMENT, builder = SegmentRecord.Builder.class, processor = RecordStreamProcessor.class)
@SQLDatabase.ExtraColumn4AdditionalEntity(additionalTable = SegmentRecord.ADDITIONAL_TAG_TABLE, parentColumn = TIME_BUCKET)
@BanyanDB.TimestampColumn(SegmentRecord.START_TIME)
@BanyanDB.Group(streamGroup = BanyanDB.StreamGroup.RECORDS_TRACE)
public class SegmentRecord extends Record {

public static final String INDEX_NAME = "segment";