@ndr-ds ndr-ds commented Sep 22, 2025

Motivation

This backports a few PRs:

Proposal

Backport the PRs

Test Plan

CI

## Motivation

We're adding Pyroscope support back, this time using eBPF.

## Proposal

* Add Pyroscope to our helm chart
* Add a Grafana dashboard with the profiling info

## Test Plan

Deployed a network and saw the Pyroscope profiling data both on the
Pyroscope UI and on Grafana, as well as logs from Pyroscope on Loki:

![Screenshot 2025-09-02 at 19.31.14.png](https://app.graphite.dev/user-attachments/assets/e02b7de5-3359-462c-8eec-f4ccf7b46456.png)

![Screenshot 2025-09-02 at 19.31.04.png](https://app.graphite.dev/user-attachments/assets/cbbd7e08-34c3-4152-bd77-68525734021a.png)

![Screenshot 2025-09-02 at 19.30.55.png](https://app.graphite.dev/user-attachments/assets/00435910-5842-4c09-b4aa-634c6e51eace.png)

## Release Plan

- Nothing to do / These changes follow the usual release cycle.

## Motivation

Optimize RocksDB performance for prefix scans

## Proposal

Enhance the RocksDB backend with several performance optimizations:

- Add optimized `ReadOptions` for prefix scans with async I/O enabled
- Set precise upper bounds for iterators to minimize key traversal
- Improve iterator validity checking with a more robust loop structure
- Configure bloom filters for prefix iteration optimization
- Increase block size from 4KB to 32KB to reduce iterator seeks
- Set up prefix extraction for bloom filter optimization
- Enable memory-mapped files for faster reads
- Configure memtable bloom filters and other performance settings
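To illustrate the upper-bound item in the list above, here is a minimal sketch of how an exclusive upper bound for a prefix scan can be derived. It is not the code from the PR: it uses a standard-library `BTreeMap` as a stand-in for the RocksDB keyspace, and the names `prefix_upper_bound` and `scan_prefix` are hypothetical. With the `rocksdb` crate, a bound like this would presumably be handed to the iterator through `ReadOptions::set_iterate_upper_bound`.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Returns the smallest key that is strictly greater than every key starting
/// with `prefix`, or `None` when no such key exists (empty or all-0xFF prefix).
/// Hypothetical helper, for illustration only.
fn prefix_upper_bound(prefix: &[u8]) -> Option<Vec<u8>> {
    let mut bound = prefix.to_vec();
    while let Some(&last) = bound.last() {
        if last == 0xFF {
            // A trailing 0xFF byte cannot be incremented, so drop it.
            bound.pop();
        } else {
            // Incrementing the last remaining byte gives the exclusive bound.
            *bound.last_mut().unwrap() = last + 1;
            return Some(bound);
        }
    }
    None
}

/// Collects all keys with the given prefix. The scan stops at the upper bound
/// instead of comparing every visited key against the prefix.
/// `BTreeMap` stands in for the ordered key-value store here.
fn scan_prefix(store: &BTreeMap<Vec<u8>, Vec<u8>>, prefix: &[u8]) -> Vec<Vec<u8>> {
    let lower = Bound::Included(prefix.to_vec());
    let upper = match prefix_upper_bound(prefix) {
        Some(bound) => Bound::Excluded(bound),
        None => Bound::Unbounded,
    };
    store.range((lower, upper)).map(|(key, _)| key.clone()).collect()
}
```

With a tight bound, the iterator never visits keys past the end of the prefix range, which is the point of the "minimize key traversal" item.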

## Test Plan

Ran the benchmarks and saw a performance improvement: not a step
change, but significant enough to warrant a PR.

## Release Plan

Nothing to do / These changes follow the usual release cycle.

## Links

- [reviewer checklist](https://github.com/linera-io/linera-protocol/blob/main/CONTRIBUTING.md#reviewer-checklist)

## Motivation

To keep investigations centered on Grafana, it would be useful to also
have a memory-profiling dashboard containing the flamegraphs.

## Proposal

Add a dashboard containing the memory profile flamegraphs to Grafana

## Test Plan

Will update with a screenshot

## Release Plan

- Nothing to do / These changes follow the usual release cycle.

## Motivation

Distributed tracing is a great way to debug many kinds of issues,
latency issues in particular. We want it in general, and probably
enabled by default in production as well.

## Proposal

Implement distributed tracing using Grafana Tempo. Because Tempo is a
Grafana product, it integrates well with our Grafana setup, and the
visualizations seem decent.

## Test Plan

Deployed a network with this code and the `linera-infra` portion of
this change. Everything works as expected, and I can see the latency
breakdowns (this example is a really high latency outlier):

![Screenshot 2025-09-16 at 13.48.59.png](https://app.graphite.dev/user-attachments/assets/98f49272-d04a-4b7e-aa83-c04f90ec7347.png)

I also chose this example because it shows we might be waiting in the
chain worker channel's queue for a while 🤔, which is worth
investigating. I'll do that next.

## Release Plan

- Nothing to do / These changes follow the usual release cycle.

## Motivation

Now that we have distributed tracing (after #4556), we need more
instrumentation so that the breakdowns include data about more functions.

## Proposal

Instrument more functions with `telemetry_only`, so that the spans are
still sent to Tempo without spamming our logs.

## Test Plan

Tested this with #4556 and saw the spans show up properly in the
breakdowns.

## Release Plan

- Nothing to do / These changes follow the usual release cycle.

@ndr-ds ndr-ds self-assigned this Sep 22, 2025
@ndr-ds ndr-ds marked this pull request as ready for review September 22, 2025 22:03
@ndr-ds ndr-ds requested review from Twey, afck, deuszx and ma2bd September 22, 2025 22:03

@deuszx deuszx left a comment

LGTM.

I wouldn't put the RocksDB optimization in the same backport PR though.

ma2bd commented Sep 23, 2025

Let me enable landing without squashing on this branch

@ma2bd ma2bd merged commit 84aecfb into testnet_conway Sep 23, 2025
31 checks passed
@ma2bd ma2bd deleted the performance_and_tooling_related_backports branch September 23, 2025 08:58