|
| 1 | +## 10.3.0 |
| 2 | + |
| 3 | +#### Project |
| 4 | + |
| 5 | +* Bump up BanyanDB dependency version (server and java-client) to 0.9.0. |
| 6 | +* Fix CVE-2025-54057: restrict and validate URLs for widgets. |
| 7 | +* Fix `MetricsPersistentWorker`: remove the DataCarrier queue from the `Hour/Day` dimension metrics persistence process. |
| 8 | + This reduces memory cost and the persistence latency of `Hour/Day` dimension metrics. |
| 9 | +* [Break Change] BanyanDB: support new Trace model. |
| 10 | + |
| 11 | +#### OAP Server |
| 12 | + |
| 13 | +* Implement self-monitoring for BanyanDB via OAP Server. |
| 14 | +* BanyanDB: Support `hot/warm/cold` stages configuration. |
| 15 | +* Fix the error when querying continuous profiling policies while the policy is already in the cache. |
| 16 | +* Support `hot/warm/cold` stages TTL query in the status API and graphQL API. |
| 17 | +* PromQL Service: traffic queries support `limit` and regex match. |
| 18 | +* Fix an edge case of `HashCodeSelector` (`Integer#MIN_VALUE` causes `ArrayIndexOutOfBoundsException`). |
| 19 | +* Support Flink monitoring. |
| 20 | +* BanyanDB: Support `@ShardingKey` for Measure tags. |
| 21 | +* BanyanDB: Support cold stage data query for metrics/traces/logs. |
| 22 | +* Increase the idle check interval of the message queue to 200ms to reduce CPU usage under low load conditions. |
| 23 | +* Limit max attempts of DNS resolution of Istio ServiceEntry to 3, and do not wait for first resolution result in case the DNS is not resolvable at all. |
| 24 | +* Support analyzing waypoint metrics in the Envoy ALS receiver. |
| 25 | +* Add Ztunnel component in the topology. |
| 26 | +* [Break Change] Change `compomentId` to `componentIds` in the K8SServiceRelation Scope. |
| 27 | +* Adapt the mesh metrics when the ambient mesh is detected in the eBPF access log receiver. |
| 28 | +* Add JSON format support for the `/debugging/config/dump` status API. |
| 29 | +* Enhance status APIs to support multiple `accept` header values, e.g. `Accept: application/json; charset=utf-8`. |
| 30 | +* Storage: separate `SpanAttachedEventRecord` for SkyWalking trace and Zipkin trace. |
| 31 | +* [Break Change] BanyanDB: Set up new Group policy. |
| 32 | +* Bump up commons-beanutils to 1.11.0. |
| 33 | +* Refactor: simplify the `Accept` http header process. |
| 34 | +* [Break Change] Storage: Move `event` from metrics to records. |
| 35 | +* Remove string limitation in Jackson deserializer for ElasticSearch client. |
| 36 | +* Fix `disable.oal` not working. |
| 37 | +* Enhance the stability of e2e PHP tests and update the PHP agent version. |
| 38 | +* Add component ID for the `dameng` JDBC driver. |
| 39 | +* BanyanDB: Support custom `TopN pre-aggregation` rules configuration in file `bydb-topn.yml`. |
| 40 | +* refactor: implement OTEL handler with SPI for extensibility. |
| 41 | +* chore: add `toString` implementation for `StorageID`. |
| 42 | +* chore: add a warning log when connecting to ES takes too long. |
| 43 | +* Fix the query time range in the metadata API. |
| 44 | +* OAP gRPC-Client supports `Health Check`. |
| 45 | +* [Break Change] `health_check_xx` metrics: a response of 1 represents healthy and 0 represents unhealthy. |
| 46 | +* Bump up grpc to 1.70.0. |
| 47 | +* BanyanDB: support new Index rule type `SKIPPING/TREE`, and update the record `log`'s `trace_id` indexType to `SKIPPING`. |
| 48 | +* BanyanDB: remove `index-only` from tag setting. |
| 49 | +* Fix analysis tracing profiling span failure in ES storage. |
| 50 | +* Add UI dashboard for Ruby runtime metrics. |
| 51 | +* Tracing Query Execution HTTP APIs: make the argument `service layer` optional. |
| 52 | +* GraphQL API: metadata, topology, log and trace support query by name. |
| 53 | +* [Break Change] MQE function `sort_values` sorts according to the aggregation result and labels rather than the simple time series values. |
| 54 | +* Self Observability: add `metrics_aggregation_queue_used_percentage` and `metrics_persistent_collection_cached_size` metrics for the OAP server. |
| 55 | +* Optimize metrics aggregate/persistent worker: separate `OAL` and `MAL` workers and consume pools. The dataflow signal drives the new MAL consumer. |
| 56 | + The following table shows the pool size, driven mode, and queue size for each worker. |
| 57 | + |
| 58 | +| Worker | poolSize | isSignalDrivenMode | queueChannelSize | queueBufferSize | |
| 59 | +|-------------------------------|------------------------------------------|--------------------|------------------|-----------------| |
| 60 | +| MetricsAggregateOALWorker | Math.ceil(availableProcessors * 2 * 1.5) | false | 2 | 10000 | |
| 61 | +| MetricsAggregateMALWorker | availableProcessors * 2 / 8, at least 1 | true | 1 | 1000 | |
| 62 | +| MetricsPersistentMinOALWorker | availableProcessors * 2 / 8, at least 1 | false | 1 | 2000 | |
| 63 | +| MetricsPersistentMinMALWorker | availableProcessors * 2 / 16, at least 1 | true | 1 | 1000 | |
| 64 | + |
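As a rough illustration of the table above, the pool-size formulas can be evaluated for a given processor count. This is a minimal sketch, not the OAP implementation: the class and method names are hypothetical, and integer division for the `/ 8` and `/ 16` formulas is an assumption.

```java
// Hypothetical helper; names are illustrative and not taken from the OAP code base.
final class WorkerPoolSizes {
    // MetricsAggregateOALWorker: Math.ceil(availableProcessors * 2 * 1.5)
    static int aggregateOal(int processors) {
        return (int) Math.ceil(processors * 2 * 1.5);
    }

    // MetricsAggregateMALWorker: availableProcessors * 2 / 8, at least 1.
    // Integer division is assumed here.
    static int aggregateMal(int processors) {
        return Math.max(1, processors * 2 / 8);
    }

    // MetricsPersistentMinOALWorker: availableProcessors * 2 / 8, at least 1.
    static int persistentMinOal(int processors) {
        return Math.max(1, processors * 2 / 8);
    }

    // MetricsPersistentMinMALWorker: availableProcessors * 2 / 16, at least 1.
    static int persistentMinMal(int processors) {
        return Math.max(1, processors * 2 / 16);
    }

    public static void main(String[] args) {
        int processors = Runtime.getRuntime().availableProcessors();
        System.out.println("MetricsAggregateOALWorker     = " + aggregateOal(processors));
        System.out.println("MetricsAggregateMALWorker     = " + aggregateMal(processors));
        System.out.println("MetricsPersistentMinOALWorker = " + persistentMinOal(processors));
        System.out.println("MetricsPersistentMinMALWorker = " + persistentMinMal(processors));
    }
}
```

Under these assumptions, 8 processors give 24, 2, 2 and 1 threads respectively, while on 2 processors the three smaller pools all fall back to the minimum of 1.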
| 65 | +* Bump up netty to 4.2.4.Final. |
| 66 | +* Bump up commons-lang to 3.18.0. |
| 67 | +* BanyanDB: support group `replicas` and `user/password` for basic authentication. |
| 68 | +* BanyanDB: fix Zipkin query missing tag `QUERY`. |
| 69 | +* Fix `IllegalArgumentException: Incorrect number of labels`: tags in `LogReportServiceHTTPHandler` and `LogReportServiceGrpcHandler` were inconsistent with `LogHandler`. |
| 70 | +* BanyanDB: fix Zipkin query by `annotationQuery`. |
| 71 | +* HTTP Server: Use the default shared thread pool rather than creating a new event loop thread pool for each server. Remove the `MAX_THREADS` from each server config. |
| 72 | +* Optimize all Armeria HTTP Server(s) to share the `CommonPools` for the whole JVM. |
| 73 | + In the `CommonPools`, the max threads for `EventLoopGroup` is `processor * 2`; for `BlockingTaskExecutor` it is `200`, and idle threads can be recycled once they exceed `keepAliveTimeMillis` (60000L by default). |
| 74 | + Here is a summary of the thread dump without UI query in a simple Kind env deployed by SkyWalking showcase: |
| 75 | + |
| 76 | +| **Thread Type** | **Count** | **Main State** | **Description** | |
| 77 | +|---------------------------------|-----------|-----------------------------|---------------------------------------------------------------------------------------------------------------------------------------| |
| 78 | +| **JVM System Threads** | 12 | RUNNABLE/WAITING | Includes Reference Handler, Finalizer, Signal Dispatcher, Service Thread, C2/C1 CompilerThreads, Sweeper thread, Common-Cleaner, etc. | |
| 79 | +| **Netty I/O Worker Threads** | 32 | RUNNABLE | Threads named "armeria-common-worker-epoll-*", handling network I/O operations. | |
| 80 | +| **gRPC Worker Threads** | 16 | RUNNABLE | Threads named "grpc-default-worker-*". | |
| 81 | +| **HTTP Client Threads** | 4 | RUNNABLE | Threads named "HttpClient-*-SelectorManager". | |
| 82 | +| **Data Consumer Threads** | 47 | TIMED_WAITING (sleeping) | Threads named "DataCarrier.*", used for metrics data consumption. | |
| 83 | +| **Scheduled Task Threads** | 10 | TIMED_WAITING (parking) | Threads named "pool-*-thread-*". | |
| 84 | +| **ForkJoinPool Worker Threads** | 2 | WAITING (parking) | Threads named "ForkJoinPool-*". | |
| 85 | +| **BanyanDB Processor Threads** | 2 | TIMED_WAITING (parking) | Threads named "BanyanDB BulkProcessor". | |
| 86 | +| **gRPC Executor Threads** | 3 | TIMED_WAITING (parking) | Threads named "grpc-default-executor-*". | |
| 87 | +| **JVM GC Threads** | 13 | RUNNABLE | Threads named "GC Thread#*" for garbage collection. | |
| 88 | +| **Other JVM Internal Threads** | 3 | RUNNABLE | Includes VM Thread, G1 Main Marker, VM Periodic Task Thread. | |
| 89 | +| **Attach Listener** | 1 | RUNNABLE | JVM attach listener thread. | |
| 90 | +| **Total** | **158** | - | - | |
| 91 | + |
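The limits described for the shared `CommonPools` (event loop threads at `processor * 2`, up to `200` blocking threads, idle threads recycled after 60000 ms) can be approximated with the JDK's own executor. This is only a stdlib analogy under those stated numbers, not Armeria's or the OAP's actual code; the class and method names are hypothetical.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: mimics the limits quoted in the changelog with java.util.concurrent.
final class SharedPoolSketch {
    static final int MAX_BLOCKING_THREADS = 200;
    static final long KEEP_ALIVE_MILLIS = 60_000L;

    // Event loop size as described above: processor * 2.
    static int eventLoopThreads(int processors) {
        return processors * 2;
    }

    // A blocking-task pool capped at 200 threads. Threads are created on demand,
    // and allowCoreThreadTimeOut(true) lets idle ones exit after the keep-alive time.
    static ThreadPoolExecutor newBlockingExecutor() {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
            MAX_BLOCKING_THREADS, MAX_BLOCKING_THREADS,
            KEEP_ALIVE_MILLIS, TimeUnit.MILLISECONDS,
            new LinkedBlockingQueue<>());
        executor.allowCoreThreadTimeOut(true);
        return executor;
    }

    public static void main(String[] args) {
        ThreadPoolExecutor executor = newBlockingExecutor();
        int processors = Runtime.getRuntime().availableProcessors();
        System.out.println("event loop threads = " + eventLoopThreads(processors));
        System.out.println("max blocking threads = " + executor.getMaximumPoolSize());
        executor.shutdown();
    }
}
```

Sharing one such pool per JVM, rather than creating one per server, is what keeps the thread counts in the table above bounded regardless of how many HTTP servers are started.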
| 92 | +* BanyanDB: make `BanyanDBMetricsDAO` output the `scan all blocks` info log only when the model is not in `IndexMode`. |
| 93 | +* BanyanDB: fix `BanyanDBMetricsDAO.multiGet` not working properly in `IndexMode`. |
| 94 | +* BanyanDB: remove `@StoreIDAsTag`, and automatically create a virtual String tag `id` for the SeriesID in `IndexMode`. |
| 95 | +* Remove method `appendMutant` from StorageID. |
| 96 | +* Fix OTLP log handler response error and OTLP span conversion error. |
| 97 | +* Fix the `service_relation` source layer in MQ entry span analysis. |
| 98 | +* Fix metrics comparison in PromQL with the `bool` modifier. |
| 99 | +* Add rate limiter for Zipkin trace receiver to limit maximum spans per second. |
| 100 | +* Open `health-checker` module by default due to latest UI changes. Change the default check period to 30s. |
| 101 | +* Refactor Kubernetes coordinator to be more accurate about node readiness. |
| 102 | +* Bump up netty to 4.2.5.Final. |
| 103 | +* BanyanDB: fix log query missing order by condition, and fix missing service id condition when query by instance id or endpoint id. |
| 104 | +* Fix potential NPE in the `AlarmStatusQueryHandler`. |
| 105 | +* Aggregate TopN Slow SQL by service dimension. |
| 106 | +* BanyanDB: support adding a group prefix (namespace) for BanyanDB groups. |
| 107 | +* BanyanDB: fix the column being indexed when `@BanyanDB.TimestampColumn` is set; it should not be indexed. |
| 108 | +* OAP Self Observability: make Trace analysis metrics separate by label `protocol`, add Zipkin span dropped metrics. |
| 109 | +* BanyanDB: Move data write logic from BanyanDB Java Client to OAP and support observing metrics for write operations. |
| 110 | +* Self Observability: add write latency metrics for BanyanDB and ElasticSearch. |
| 111 | +* Fix the malfunctioning alarm feature of MAL metrics due to unknown metadata in L2 aggregate worker. |
| 112 | +* Make MAL percentile align with OAL percentile calculation. |
| 113 | +* Update Grafana dashboards for OAP observability. |
| 114 | +* BanyanDB: fix query `getInstance` by instance ID. |
| 115 | +* Support the pprof profiling feature bundled in the Go agent (0.7.0 release). |
| 116 | +* Service and TCPService sources support analyzing TLS mode. |
| 117 | +* Library-pprof-parser: feat: add PprofSegmentParser. |
| 118 | +* Storage: feat: add languageType column to ProfileThreadSnapshotRecord. |
| 119 | +* Feat: add Go profile analyzer. |
| 120 | +* Get Alarm Runtime Status: support querying the running status for the whole cluster. |
| 121 | + |
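The `HashCodeSelector` edge case listed above comes from a general Java pitfall: `Math.abs(Integer.MIN_VALUE)` is still `Integer.MIN_VALUE`, so an index derived from it can be negative. The sketch below reproduces the pitfall and shows one common remedy. The class and method names are hypothetical, and `Math.floorMod` is an assumed fix for illustration, not necessarily the change made in the OAP.

```java
// Hypothetical sketch of the pitfall; not the actual HashCodeSelector code.
final class HashIndexSketch {
    // Unsafe: Math.abs(Integer.MIN_VALUE) overflows and stays negative,
    // so the remainder can be negative and break an array lookup.
    static int absIndex(int hash, int size) {
        return Math.abs(hash) % size;
    }

    // Safer: Math.floorMod returns a value in [0, size) for any int hash
    // when size is positive.
    static int floorModIndex(int hash, int size) {
        return Math.floorMod(hash, size);
    }

    public static void main(String[] args) {
        // Integer#hashCode returns the int value itself, so this hash is Integer.MIN_VALUE.
        int hash = Integer.valueOf(Integer.MIN_VALUE).hashCode();
        System.out.println("abs-based index:      " + absIndex(hash, 3));
        System.out.println("floorMod-based index: " + floorModIndex(hash, 3));
    }
}
```

Using the abs-based index to read from an array of consumers would throw `ArrayIndexOutOfBoundsException` for this one hash value, which is why the bug only shows up as a rare edge case.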
| 122 | +#### UI |
| 123 | + |
| 124 | +* Implement self-monitoring for BanyanDB via UI. |
| 125 | +* Enhance the trace `List/Tree/Table` graph to support displaying multiple refs of spans and distinguishing different parents. |
| 126 | +* Fix: correct the same labels for metrics. |
| 127 | +* Refactor: use the Fetch API instead of Axios. |
| 128 | +* Support cold stage data for metrics, trace and log. |
| 129 | +* Add route to status API `/debugging/config/dump` in the UI. |
| 130 | +* Implement the Status API on Settings page. |
| 131 | +* Bump vite from 6.2.6 to 6.3.6. |
| 132 | +* Enhance async profiling by adding shorter and custom duration options. |
| 133 | +* Fix selecting the wrong span to analyze in trace profiling. |
| 134 | +* Correct the service list for legends in trace graphs. |
| 135 | +* Correct endpoint topology data to avoid undefined. |
| 136 | +* Fix the snapshot charts failing to display. |
| 137 | +* Bump vue-i18n from 9.14.3 to 9.14.5. |
| 138 | +* Fix split queries for topology to avoid page crash. |
| 139 | +* Self Observability ui-template: Add new panels for monitoring `metrics aggregation queue used percentage` and `metrics persistent collection cached size`. |
| 140 | +* test: introduce and set up unit tests in the UI. |
| 141 | +* test: implement comprehensive unit tests for components. |
| 142 | +* refactor: optimize data types for widgets and dashboards. |
| 143 | +* fix: avoid showing the wrong pop-up prompt in HTTP environments when using the copy function. |
| 144 | +* refactor: rework the configuration view and implement the optional config for displaying timestamps in the Log widget. |
| 145 | +* test: implement unit tests for hooks and refactor some types. |
| 146 | +* fix: share OAP proxy services for different endpoints and use the health-checked endpoints group. |
| 147 | +* Optimize buttons in time picker component. |
| 148 | +* Optimize the router system and implement unit tests for router. |
| 149 | +* Bump element-plus from 2.9.4 to 2.11.0. |
| 150 | +* Adapt new trace protocol and implement new trace view. |
| 151 | +* Implement Trace page. |
| 152 | +* Support collapsing and expanding for the event widget. |
| 153 | +* UI-template: add BanyanDB and Elasticsearch write latency dashboards for OAP self observability. |
| 154 | + |
| 155 | +#### Documentation |
| 156 | + |
| 157 | +* BanyanDB: Add `Data Lifecycle Stages(Hot/Warm/Cold)` documentation. |
| 158 | +* Add `SWIP-9 Support flink monitoring`. |
| 159 | +* Fix `Metrics Attributes` menu link. |
| 160 | +* Implement the Status API on Settings page. |
| 161 | +* Fix: Add the prefix for http url. |
| 162 | +* Enhance the async-profiling duration options. |
| 163 | +* Enhance the TTL Tab on Setting page. |
| 164 | +* Fix the snapshot charts in alarm page. |
| 165 | +* Fix `Fluent Bit` dead links. |
| 166 | + |
| 167 | +All issues and pull requests are [here](https://github.com/apache/skywalking/milestone/230?closed=1) |
| 168 | + |