
WIP: feature: OTel resource attributes ingested as native metadata #14195

Draft
aknuds1 wants to merge 20 commits into main from arve/parquet-metadata-resource-attributes

Contributor

aknuds1 commented Jan 29, 2026

What this PR does

Work in progress: adds support for ingesting OTel resource attributes and persisting them as native metadata.

Which issue(s) this PR fixes or relates to

Fixes #

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]. If changelog entry is not needed, please add the changelog-not-needed label to the PR.
  • about-versioning.md updated with experimental features.

@aknuds1 aknuds1 changed the title from "WIP: feature: OTel resource attributes persisted as native metadata" to "WIP: feature: OTel resource attributes ingested as native metadata" Jan 29, 2026
@aknuds1 aknuds1 added the enhancement (New feature or request) and area/opentelemetry (Everything related to OpenTelemetry OTLP OTel) labels Jan 29, 2026
@aknuds1 aknuds1 force-pushed the arve/parquet-metadata-resource-attributes branch 2 times, most recently from 3c6c65b to cafff44 January 30, 2026 08:31
aknuds1 and others added 12 commits January 30, 2026 13:19
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
The previous session incorrectly moved ResourceAttributes types to a
separate file without proper protobuf support. This broke wire
serialization between distributor and ingester.

Restored the correct mimir.pb.go from the remote branch which has:
- ResourceAttributes field with protobuf tag for wire serialization
- Marshal/Unmarshal methods for the types
- Getter methods for accessing fields

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implement the distributed query infrastructure for querying OTel resource
attributes from ingesters and store-gateways:

- Add Ingester.ResourceAttributes RPC to query TSDB head for resource
  attributes associated with series matching the given matchers
- Add BucketStore.ResourceAttributes stub (returns empty for now)
- Add Distributor.ResourceAttributes to fan out queries to ingesters
  and merge/deduplicate results
- Add /api/v1/resources HTTP endpoint to expose resource attributes
- Add proto definitions for ResourceAttributesRequest/Response messages
- Update Distributor interface and mock implementations for tests
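
The fan-out and merge step in the distributor could look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's code: the `version` and `seriesResources` types, the `mergeResponses` function, and the choice of (series hash, version start time) as the deduplication key are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// version is a hypothetical stand-in for one versioned set of resource
// attributes attached to a series.
type version struct {
	MinTimeMs int64
	Attrs     map[string]string
}

// seriesResources is a hypothetical per-series item in an ingester response.
type seriesResources struct {
	SeriesHash uint64 // assumed to be a hash of the series labels
	Versions   []version
}

// mergeResponses merges responses from several ingesters. Replicas hold
// copies of the same series, so versions are deduplicated by
// (series hash, version start time); the first copy seen wins.
func mergeResponses(responses ...[]seriesResources) []seriesResources {
	type key struct {
		hash uint64
		minT int64
	}
	seen := map[key]bool{}
	merged := map[uint64]*seriesResources{}

	for _, resp := range responses {
		for _, sr := range resp {
			out, ok := merged[sr.SeriesHash]
			if !ok {
				out = &seriesResources{SeriesHash: sr.SeriesHash}
				merged[sr.SeriesHash] = out
			}
			for _, v := range sr.Versions {
				k := key{sr.SeriesHash, v.MinTimeMs}
				if seen[k] {
					continue
				}
				seen[k] = true
				out.Versions = append(out.Versions, v)
			}
		}
	}

	// Return a deterministic order: series by hash, versions by start time.
	result := make([]seriesResources, 0, len(merged))
	for _, sr := range merged {
		sort.Slice(sr.Versions, func(i, j int) bool {
			return sr.Versions[i].MinTimeMs < sr.Versions[j].MinTimeMs
		})
		result = append(result, *sr)
	}
	sort.Slice(result, func(i, j int) bool {
		return result[i].SeriesHash < result[j].SeriesHash
	})
	return result
}

func main() {
	a := []seriesResources{{SeriesHash: 1, Versions: []version{{MinTimeMs: 100}}}}
	b := []seriesResources{{SeriesHash: 1, Versions: []version{{MinTimeMs: 100}, {MinTimeMs: 200}}}}
	merged := mergeResponses(a, b)
	fmt.Println(len(merged), len(merged[0].Versions)) // 1 2
}
```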

The endpoint accepts match[] selectors and optional start/end time range,
returning series labels with their versioned resource attributes including
identifying attributes, descriptive attributes, and entity information.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Add resourceQuerierCache to wrap queriers and implement the
storage.ResourceQuerier interface required by the info() PromQL function.

The cache lazily fetches resource attributes on first access and provides
point-in-time lookups via GetResourceAt(). Both distributorQuerier and
blocksStoreQuerier are now wrapped to support info() queries.

Key changes:
- Add resource_querier_cache.go with ResourceAttributesFetcher interface
- Add distributorResourceFetcher for ingester queries
- Add blocksResourceFetcher for store-gateway queries
- Wrap queriers in NewResourceQuerierCache in Querier() methods
- Add unit tests for cache functionality
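
The lazy-fetch and point-in-time lookup behaviour described above can be sketched as follows. This is a simplified model, not the PR's `resourceQuerierCache`: the `fetcher` type, the `lazyResourceCache` struct, and the `resourceAt` signature are hypothetical, and the real `GetResourceAt()` signature is not shown in the commit message.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// resourceVersion is a hypothetical versioned attribute set, valid from MinTimeMs.
type resourceVersion struct {
	MinTimeMs int64
	Attrs     map[string]string
}

// fetcher is a hypothetical stand-in for a resource attributes fetcher.
type fetcher func() (map[uint64][]resourceVersion, error)

// lazyResourceCache fetches all versions once, on first lookup.
type lazyResourceCache struct {
	fetch fetcher
	once  sync.Once
	data  map[uint64][]resourceVersion
	err   error
}

func (c *lazyResourceCache) load() {
	c.once.Do(func() {
		c.data, c.err = c.fetch()
		// Sort each series' versions by start time so lookups can binary search.
		for _, vs := range c.data {
			sort.Slice(vs, func(i, j int) bool { return vs[i].MinTimeMs < vs[j].MinTimeMs })
		}
	})
}

// resourceAt returns the version in effect at tsMs: the latest version whose
// start time is not after tsMs.
func (c *lazyResourceCache) resourceAt(seriesHash uint64, tsMs int64) (resourceVersion, bool) {
	c.load()
	if c.err != nil {
		return resourceVersion{}, false
	}
	vs := c.data[seriesHash]
	// i is the index of the first version that starts after tsMs.
	i := sort.Search(len(vs), func(i int) bool { return vs[i].MinTimeMs > tsMs })
	if i == 0 {
		return resourceVersion{}, false
	}
	return vs[i-1], true
}

func main() {
	calls := 0
	cache := &lazyResourceCache{fetch: func() (map[uint64][]resourceVersion, error) {
		calls++
		return map[uint64][]resourceVersion{
			42: {
				{MinTimeMs: 200, Attrs: map[string]string{"host.name": "node-b"}},
				{MinTimeMs: 100, Attrs: map[string]string{"host.name": "node-a"}},
			},
		}, nil
	}}
	v, ok := cache.resourceAt(42, 150)
	fmt.Println(v.Attrs["host.name"], ok) // node-a true
	_, ok = cache.resourceAt(42, 50)
	fmt.Println(ok, calls) // false 1
}
```
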
Add a demo script to showcase how Mimir persists OTel resource attributes
from OTLP metrics and makes them queryable via /api/v1/resources endpoint.

The demo demonstrates:
- OTLP metric ingestion with resource attributes
- Querying attributes from ingester head (in-memory)
- Flushing to blocks with series_metadata.parquet
- Querying attributes from store-gateways
- Versioned attributes tracking infrastructure changes

Changes:
- Enable otel_persist_resource_attributes in runtime.yaml
- Add scripts/otlp-resource-attrs-demo.sh demo script
- Add scripts/README.md documentation
- Update development/README.md with demo instructions
The ResourceAttributes query was returning empty results because it used
labels.Hash() to look up stored resource attributes, but TSDB Head.SeriesMetadata
stores them using labels.StableHash(). These hash functions produce different
values for the same labels, causing the lookup to fail.
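
The failure mode can be reproduced in miniature. The sketch below does not use the Prometheus `labels` package; `hashA` and `hashB` are hypothetical stand-ins (FNV-1a and FNV-1 over the same canonical label encoding) for two hash functions that disagree on the same label set.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strings"
)

type labelSet map[string]string

// canonical encodes labels in sorted order so both hashes see identical bytes.
func canonical(ls labelSet) []byte {
	names := make([]string, 0, len(ls))
	for name := range ls {
		names = append(names, name)
	}
	sort.Strings(names)
	var sb strings.Builder
	for _, name := range names {
		sb.WriteString(name)
		sb.WriteByte(0xff)
		sb.WriteString(ls[name])
		sb.WriteByte(0xff)
	}
	return []byte(sb.String())
}

// hashA stands in for the hash used when storing (FNV-1a here).
func hashA(ls labelSet) uint64 {
	h := fnv.New64a()
	h.Write(canonical(ls))
	return h.Sum64()
}

// hashB stands in for a different hash used by mistake on lookup (FNV-1 here).
func hashB(ls labelSet) uint64 {
	h := fnv.New64()
	h.Write(canonical(ls))
	return h.Sum64()
}

func main() {
	series := labelSet{"__name__": "http_requests_total", "job": "api"}

	// The writer keys stored attributes by hashA.
	store := map[uint64]string{hashA(series): "service.name=checkout"}

	// A reader using hashB computes a different key and misses the entry.
	_, foundWithB := store[hashB(series)]
	// A reader using the same hash as the writer finds it.
	_, foundWithA := store[hashA(series)]

	fmt.Println("lookup with mismatched hash:", foundWithB)
	fmt.Println("lookup with matching hash:", foundWithA)
}
```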

Also registers the /api/v1/resources endpoint in the API and adds demo tenant
config for testing resource attributes persistence.
The resource attributes handler now queries both the ingester (for recent
data in TSDB head) and the store-gateway (for data flushed to blocks),
allowing resource attributes to remain queryable after data ages out of
the ingester.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Updates the OTLP resource attributes demo to match the Prometheus
documentation example at documentation/examples/otlp-resource-attributes.

Changes:
- Add entity_refs to OTLP payloads (service + host entities)
- Add Phase 6: info() function demonstration with time-varying attributes
- Add Phase 7: full API response format display
- Update phase numbering to 8 phases matching Prometheus
- Add query_promql() helper for PromQL instant queries
- Update README with entity_refs documentation and architecture diagram

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
…ce attributes

This commit fixes several issues preventing the info() PromQL function from
enriching query results with OTel resource attributes (service.name, host.name,
cloud.region, etc.):

1. Hash consistency: Use labels.StableHash consistently across the distributed
   query path (distributor, store-gateway, querier cache) to match the hash
   used when storing resource attributes in the ingester.

2. Context propagation: Add stored context mechanism to resourceQuerierCache
   to capture the tenant ID from Select() calls, since the ResourceQuerier
   interface's GetResourceAt() method doesn't accept a context parameter.

3. Interface delegation: Implement ResourceQuerier interface on all wrapper
   types in the query path (errorTranslateQuerier, multiQuerier, LazyQuerier)
   to enable the interface to propagate through the wrapper chain.

4. Matcher fix: Use an "all series" matcher ({__name__!=""}) in cache fetchers
   instead of nil matchers, as nil matchers cause getPostings to return empty
   results.

5. Config: Enable otel_persist_resource_attributes in monolithic mode config.
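
Item 3 above relies on a common Go pattern: an optional interface is discovered by type assertion, so it is lost as soon as one wrapper in the chain fails to forward it. A minimal sketch follows, with hypothetical `querier`, `resourceQuerier`, and wrapper types standing in for the real ones.

```go
package main

import "fmt"

// querier is a hypothetical minimal base interface.
type querier interface {
	Name() string
}

// resourceQuerier is a hypothetical optional interface, discovered by type assertion.
type resourceQuerier interface {
	ResourceAt(seriesHash uint64, tsMs int64) (map[string]string, bool)
}

// baseQuerier implements both interfaces.
type baseQuerier struct {
	attrs map[uint64]map[string]string
}

func (b *baseQuerier) Name() string { return "base" }

func (b *baseQuerier) ResourceAt(h uint64, _ int64) (map[string]string, bool) {
	a, ok := b.attrs[h]
	return a, ok
}

// plainWrapper forwards only the base interface, so the optional one is hidden.
type plainWrapper struct{ inner querier }

func (w plainWrapper) Name() string { return "plain(" + w.inner.Name() + ")" }

// delegatingWrapper also forwards the optional interface to its inner querier.
type delegatingWrapper struct{ inner querier }

func (w delegatingWrapper) Name() string { return "delegating(" + w.inner.Name() + ")" }

func (w delegatingWrapper) ResourceAt(h uint64, ts int64) (map[string]string, bool) {
	if rq, ok := w.inner.(resourceQuerier); ok {
		return rq.ResourceAt(h, ts)
	}
	return nil, false
}

// supportsResources is how a caller would probe for the optional interface.
func supportsResources(q querier) bool {
	_, ok := q.(resourceQuerier)
	return ok
}

func main() {
	base := &baseQuerier{attrs: map[uint64]map[string]string{
		1: {"service.name": "checkout"},
	}}
	fmt.Println(supportsResources(plainWrapper{inner: base}))                                  // false
	fmt.Println(supportsResources(delegatingWrapper{inner: delegatingWrapper{inner: base}})) // true
}
```

Note that a delegating wrapper placed around a non-delegating one still satisfies the type assertion but returns no data, which is why every wrapper in the chain needs the forwarding method.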

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add support for uploading series_metadata.parquet files (containing
metric metadata and OTel resource attributes) to object storage during
block upload.

Changes:
- Add SeriesMetadataFilename constant for the parquet filename
- Add FileTypeSeriesMetadata for error reporting
- Update GatherFileStats() to include series_metadata.parquet in stats
- Update Upload() to upload series_metadata.parquet (when present)
- Update tests with correct expected file counts

This enables the store-gateway to serve resource attributes from
persisted blocks, allowing the info() PromQL function to work for
historical data (not just TSDB head data).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Reorder imports to comply with gci linter requirements:
- Standard library imports first
- Third-party imports (sorted alphabetically)
- Internal imports (github.com/grafana/mimir)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update the disabled info.test to match the upstream Prometheus version
which now uses resource attributes instead of info metric joining.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@aknuds1 aknuds1 force-pushed the arve/parquet-metadata-resource-attributes branch from b8adc97 to 6a6c21e January 30, 2026 12:27
aknuds1 and others added 4 commits January 30, 2026 14:33
The import was accidentally changed from gopkg.in/yaml.v3 to
go.yaml.in/yaml/v3 which broke the yaml replace directive in
go.mod, causing mimirtool rules check to fail.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update the expected diff file to include the ResourceAttributes
code that was manually added to mimir.pb.go.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace gopkg.in/yaml.v3 (colega fork) with go.yaml.in/yaml/v3 (Grafana
fork) across the entire codebase. Both forks include the same patches
from PRs #691 and #876, but this consolidates on a single YAML library.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Contributor

github-actions bot commented Jan 30, 2026

aknuds1 and others added 4 commits January 31, 2026 10:24
Add UpdateResource call to TSDBBuilder.PushToStorageAndReleaseRequest
to persist OTel resource attributes when processing timeseries from
Kafka. This ensures resource attributes from OTLP ingestion are written
to the series_metadata.parquet file during block compaction.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add a demo script for the ingest storage architecture that showcases
OTel resource attribute persistence. This mirrors the monolithic mode
demo but adapts for the Kafka -> Block Builder flow:

- Enable otel_persist_resource_attributes in ingest storage runtime config
- Create demo script that waits for block builder instead of flushing
- Add X-Scope-OrgID headers for multi-tenant gateway compatibility
- Update messaging to explain ingest storage architecture

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Allow the OTel resource attributes demo to run fully non-interactively
by adding optional flags to manage the docker-compose stack lifecycle.

--start-stack: Starts the stack, waits for services to be ready, and
handles the nginx/grafana port conflict by starting nginx separately.

--stop-stack: Registers a cleanup trap to stop the stack on exit,
including when the script fails or is interrupted.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Port the stack management flags from the ingest storage demo to the
monolithic mode demo script. This enables fully automated demo runs.

Also fix macOS compatibility by replacing `head -n -1` with `sed '$d'`.

Labels

area/opentelemetry (Everything related to OpenTelemetry OTLP OTel)
enhancement (New feature or request)

1 participant