@scotwells (Contributor) commented Jan 9, 2026

Summary

Optimize the ClickHouse database schema for platform-wide and user-specific querying of multi-tenant audit log data.

Details

As we began running performance tests against the activity apiserver, we noticed that platform-wide queries were performing drastically worse than tenant-level queries. See datum-cloud/enhancements#536 (comment) for a comparison.

This was a result of our initial schema being designed to order data by tenant, which meant platform-wide queries had to scan the entire data set instead of being able to skip over irrelevant rows.

This change makes several adjustments to the schema to improve querying performance.

  • Moved to daily partitions so that data is TTL'd at a finer granularity and expired partitions are dropped sooner. Since all queries are time-bound, daily partitions also let queries scan fewer partitions.
  • Removes unnecessary skip indexes on fields already present in the table's sort order. Skip indexes provide little benefit when queries can already use the ordering.
  • Adds skip indexes for fields used for common querying patterns to help skip over irrelevant rows.
  • Creates new projections that are designed to efficiently query audit logs across all tenants.
    • The platform-wide query projection is designed to support platform administrators querying across all tenants. Queries are most performant when they filter by a specific API group and resource, which we expect to be the most common pattern for cross-tenant queries.
    • Also introduced a query projection for user-specific queries to help platform administrators query for all audit logs related to a specific user.
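To make these ideas concrete, here is a minimal ClickHouse DDL sketch. All table, column, projection names, and the TTL are assumptions for illustration only; they are not the actual contents of the `001_initial_schema.sql` migration:

```sql
-- Illustrative only: names, columns, and TTL are assumptions,
-- not the actual migration.
CREATE TABLE audit_logs
(
    tenant_id  String,
    audit_id   UUID,
    event_time DateTime,
    api_group  LowCardinality(String),
    resource   LowCardinality(String),
    user_name  String,
    verb       LowCardinality(String),

    -- Skip index on a commonly filtered field not in the sort order.
    INDEX idx_user user_name TYPE bloom_filter GRANULARITY 4,

    -- Projection ordered for platform-wide queries by API group/resource.
    PROJECTION platform_wide
    (
        SELECT * ORDER BY (api_group, resource, event_time)
    ),
    -- Projection ordered for user-specific cross-tenant queries.
    PROJECTION by_user
    (
        SELECT * ORDER BY (user_name, event_time)
    )
)
ENGINE = ReplacingMergeTree
PARTITION BY toDate(event_time)              -- daily partitions
ORDER BY (tenant_id, event_time, audit_id)   -- tenant-level queries use the base order
TTL event_time + INTERVAL 90 DAY             -- expired daily partitions drop quickly
SETTINGS deduplicate_merge_projection_mode = 'rebuild';
```

The `deduplicate_merge_projection_mode` setting matters here because ReplacingMergeTree deduplicates rows during background merges, which would otherwise leave projections inconsistent with the base table.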

I've modified the 001_initial_schema.sql migration instead of adding a new migration because this service has not been released yet.

This PR also contains a few other related changes:

  • Removed `stage` from the schema and the querying interface, since we only collect the `ResponseComplete` stage.
  • Adjusted the apiserver to intelligently adapt the ORDER BY clause used when querying ClickHouse, ensuring the appropriate projection is used for the query being performed by the end user.
  • Updated the performance tests to better reflect real-world querying behavior, where the API group / resource are present in the queries.
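As a sketch of that ORDER BY behavior, a platform-wide query emitted by the apiserver might look like the following. All table, column, and projection names here are hypothetical; the key point is that the WHERE and ORDER BY clauses line up with a cross-tenant projection's sort order rather than the base table's tenant-first order:

```sql
-- Hypothetical platform-wide query: filtering and ordering by
-- (api_group, resource, event_time) lets ClickHouse serve it from a
-- projection with that sort order instead of scanning tenant-ordered parts.
SELECT event_time, tenant_id, user_name, verb
FROM audit_logs
WHERE api_group = 'apps'
  AND resource  = 'deployments'
  AND event_time >= now() - INTERVAL 1 DAY
ORDER BY api_group, resource, event_time DESC
LIMIT 100;

-- EXPLAIN indexes = 1 <query> can be used to confirm which projection
-- (if any) ClickHouse chose for a given query.
```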

I also included a few unrelated changes:

  • Upgraded to v1.9.0 of our shared actions to resolve an issue with the wrong tag being injected into the kustomize builds.
  • Moved to the ReplacingMergeTree database engine to ensure that all audit logs are unique. Removing duplicates is a background operation, so users may see duplicates if a merge operation hasn't run yet. To help prevent duplicates at the source, I adjusted the NATS and Vector configurations to de-duplicate audit logs based on the audit ID, which is guaranteed to be unique since we only collect the `ResponseComplete` stage.
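As a sketch of the de-duplication guardrail on the Vector side (Vector's `dedupe` transform is real, but the source/transform names and field path here are assumptions, not this repo's actual config):

```yaml
# Illustrative Vector config: drop repeated audit events by audit ID
# before they reach ClickHouse. Names and field paths are assumptions.
transforms:
  dedupe_audit_logs:
    type: dedupe
    inputs: ["audit_log_source"]
    fields:
      match: ["auditID"]   # Kubernetes audit events carry a unique auditID
```

On the NATS side, publishing each event with its audit ID as the message ID (the `Nats-Msg-Id` header) lets JetStream's duplicate-tracking window drop redundant deliveries before they ever reach Vector.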

Performance test results

Previous ClickHouse schema

This shows a performance test that was run against the activity system, focused on tenant-level querying. The graphs show that the activity API would struggle with even a small number of platform-wide queries (~4 RPS), and queries would immediately begin timing out.

(screenshot: performance graphs for the previous schema)

New optimized ClickHouse schema

This performance test demonstrates the improvements after the new schema was applied. The graphs show that the performance test was able to reach significantly higher throughput (~40 RPS) before queries began to time out.

(screenshot: performance graphs for the new schema)

Resources


Relates to datum-cloud/enhancements#536


Commit notes

  • We need to configure the merge behavior of projections since we swapped to the ReplacingMergeTree engine. See: https://clickhouse.com/docs/data-modeling/projections
  • Adjusted the NATS stream configuration to support a 10-minute de-duplication window. The NATS message ID is set to the audit log ID, since the ID is unique across all audit logs. JetStream has to be enabled to take advantage of the message_id option.