WIP: feature: OTel resource attributes ingested as native metadata #14195
Draft
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
The previous session incorrectly moved ResourceAttributes types to a separate file without proper protobuf support. This broke wire serialization between distributor and ingester.
Restored the correct mimir.pb.go from the remote branch, which has:
- ResourceAttributes field with protobuf tag for wire serialization
- Marshal/Unmarshal methods for the types
- Getter methods for accessing fields
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implement the distributed query infrastructure for querying OTel resource attributes from ingesters and store-gateways:
- Add Ingester.ResourceAttributes RPC to query TSDB head for resource attributes associated with series matching the given matchers
- Add BucketStore.ResourceAttributes stub (returns empty for now)
- Add Distributor.ResourceAttributes to fan out queries to ingesters and merge/deduplicate results
- Add /api/v1/resources HTTP endpoint to expose resource attributes
- Add proto definitions for ResourceAttributesRequest/Response messages
- Update Distributor interface and mock implementations for tests
The endpoint accepts match[] selectors and an optional start/end time range, returning series labels with their versioned resource attributes, including identifying attributes, descriptive attributes, and entity information.
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Add resourceQuerierCache to wrap queriers and implement the storage.ResourceQuerier interface required by the info() PromQL function. The cache lazily fetches resource attributes on first access and provides point-in-time lookups via GetResourceAt(). Both distributorQuerier and blocksStoreQuerier are now wrapped to support info() queries.
Key changes:
- Add resource_querier_cache.go with ResourceAttributesFetcher interface
- Add distributorResourceFetcher for ingester queries
- Add blocksResourceFetcher for store-gateway queries
- Wrap queriers in NewResourceQuerierCache in Querier() methods
- Add unit tests for cache functionality
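The lazy-fetch and point-in-time lookup behaviour described above can be sketched roughly as follows. This is not the PR's resourceQuerierCache: the `version`, `fetcher`, and `lazyCache` types, the millisecond `MinTime` field, and the `resourceAt` signature are all hypothetical stand-ins chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// version is a hypothetical attribute set that becomes valid at MinTime.
type version struct {
	MinTime int64
	Attrs   map[string]string
}

// fetcher stands in for a ResourceAttributesFetcher; the real interface's
// method set is not shown in the PR conversation.
type fetcher func() (map[uint64][]version, error)

type lazyCache struct {
	once  sync.Once
	fetch fetcher
	data  map[uint64][]version
	err   error
}

// resourceAt returns the newest version with MinTime <= ts for a series hash.
// The backing fetch runs at most once, on first access.
func (c *lazyCache) resourceAt(hash uint64, ts int64) (map[string]string, bool) {
	c.once.Do(func() {
		c.data, c.err = c.fetch()
		for _, vs := range c.data {
			sort.Slice(vs, func(i, j int) bool { return vs[i].MinTime < vs[j].MinTime })
		}
	})
	if c.err != nil {
		return nil, false
	}
	vs := c.data[hash]
	// Index of the first version that starts strictly after ts.
	i := sort.Search(len(vs), func(i int) bool { return vs[i].MinTime > ts })
	if i == 0 {
		return nil, false // no version was valid yet at ts
	}
	return vs[i-1].Attrs, true
}

// newDemoCache builds a cache over fixed data and counts fetch calls.
func newDemoCache() (*lazyCache, *int) {
	calls := 0
	fetch := func() (map[uint64][]version, error) {
		calls++
		return map[uint64][]version{
			1: {
				{MinTime: 200, Attrs: map[string]string{"host.name": "host-b"}},
				{MinTime: 100, Attrs: map[string]string{"host.name": "host-a"}},
			},
		}, nil
	}
	return &lazyCache{fetch: fetch}, &calls
}

func main() {
	c, calls := newDemoCache()
	a, _ := c.resourceAt(1, 150)
	b, _ := c.resourceAt(1, 250)
	fmt.Println(a["host.name"], b["host.name"], "fetches:", *calls)
}
```

Using sync.Once keeps the first-access fetch safe under concurrent lookups, at the cost of caching a fetch error for the lifetime of the cache; whether the PR makes the same trade-off is not stated.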
Add a demo script to showcase how Mimir persists OTel resource attributes from OTLP metrics and makes them queryable via the /api/v1/resources endpoint.
The demo demonstrates:
- OTLP metric ingestion with resource attributes
- Querying attributes from ingester head (in-memory)
- Flushing to blocks with series_metadata.parquet
- Querying attributes from store-gateways
- Versioned attributes tracking infrastructure changes
Changes:
- Enable otel_persist_resource_attributes in runtime.yaml
- Add scripts/otlp-resource-attrs-demo.sh demo script
- Add scripts/README.md documentation
- Update development/README.md with demo instructions
The ResourceAttributes query was returning empty results because it used labels.Hash() to look up stored resource attributes, but TSDB Head.SeriesMetadata stores them using labels.StableHash(). These hash functions produce different values for the same labels, causing the lookup to fail. Also registers the /api/v1/resources endpoint in the API and adds demo tenant config for testing resource attributes persistence.
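The failure mode above (writing under one hash and reading under another) can be reproduced in miniature. The sketch below does not use the Prometheus labels package; FNV-1 and FNV-1a from the Go standard library stand in for labels.Hash() and labels.StableHash() purely to show two hash functions disagreeing on the same input, and all function names are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashStandIn plays the role of the hash used on the read path (FNV-1 here).
func hashStandIn(s string) uint64 {
	h := fnv.New64()
	_, _ = h.Write([]byte(s))
	return h.Sum64()
}

// stableHashStandIn plays the role of the hash used when storing (FNV-1a here).
func stableHashStandIn(s string) uint64 {
	h := fnv.New64a()
	_, _ = h.Write([]byte(s))
	return h.Sum64()
}

// store records a value under the hash of the series' label string.
func store(m map[uint64]string, series, value string, hashFn func(string) uint64) {
	m[hashFn(series)] = value
}

// lookup reads a value back using the supplied hash function.
func lookup(m map[uint64]string, series string, hashFn func(string) uint64) (string, bool) {
	v, ok := m[hashFn(series)]
	return v, ok
}

func main() {
	m := map[uint64]string{}
	series := `{job="demo"}`
	store(m, series, "host-a", stableHashStandIn)

	_, okMismatch := lookup(m, series, hashStandIn)
	_, okMatch := lookup(m, series, stableHashStandIn)
	fmt.Println("mismatched hash finds entry:", okMismatch)
	fmt.Println("matching hash finds entry:", okMatch)
}
```

The lookup silently returns nothing rather than an error, which is consistent with the symptom described in the commit message: an empty result set instead of a visible failure.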
The resource attributes handler now queries both the ingester (for recent data in TSDB head) and the store-gateway (for data flushed to blocks), allowing resource attributes to remain queryable after data ages out of the ingester. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
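One way to combine results from the two sources is to deduplicate versions that appear in both the ingester and the store-gateway responses. The sketch below is a guess at the general shape only: the `attrVersion` type, the choice of (series hash, MinTime) as the deduplication key, and the preference for the first source on conflict are assumptions, not the PR's actual merge logic.

```go
package main

import (
	"fmt"
	"sort"
)

// attrVersion is a hypothetical flattened result row.
type attrVersion struct {
	SeriesHash uint64
	MinTime    int64
	Attrs      map[string]string
}

// mergeVersions concatenates results from several sources, keeps the first
// occurrence of each (SeriesHash, MinTime) pair, and returns them sorted.
func mergeVersions(sources ...[]attrVersion) []attrVersion {
	type key struct {
		hash    uint64
		minTime int64
	}
	seen := map[key]struct{}{}
	var out []attrVersion
	for _, src := range sources {
		for _, v := range src {
			k := key{v.SeriesHash, v.MinTime}
			if _, dup := seen[k]; dup {
				continue // already provided by an earlier source
			}
			seen[k] = struct{}{}
			out = append(out, v)
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].SeriesHash != out[j].SeriesHash {
			return out[i].SeriesHash < out[j].SeriesHash
		}
		return out[i].MinTime < out[j].MinTime
	})
	return out
}

func main() {
	ingester := []attrVersion{{SeriesHash: 1, MinTime: 200}, {SeriesHash: 1, MinTime: 100}}
	storeGateway := []attrVersion{{SeriesHash: 1, MinTime: 100}, {SeriesHash: 2, MinTime: 50}}
	for _, v := range mergeVersions(ingester, storeGateway) {
		fmt.Println(v.SeriesHash, v.MinTime)
	}
}
```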
Updates the OTLP resource attributes demo to match the Prometheus documentation example at documentation/examples/otlp-resource-attributes.
Changes:
- Add entity_refs to OTLP payloads (service + host entities)
- Add Phase 6: info() function demonstration with time-varying attributes
- Add Phase 7: full API response format display
- Update phase numbering to 8 phases matching Prometheus
- Add query_promql() helper for PromQL instant queries
- Update README with entity_refs documentation and architecture diagram
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
…ce attributes
This commit fixes several issues preventing the info() PromQL function from
enriching query results with OTel resource attributes (service.name, host.name,
cloud.region, etc.):
1. Hash consistency: Use labels.StableHash consistently across the distributed
query path (distributor, store-gateway, querier cache) to match the hash
used when storing resource attributes in the ingester.
2. Context propagation: Add stored context mechanism to resourceQuerierCache
to capture the tenant ID from Select() calls, since the ResourceQuerier
interface's GetResourceAt() method doesn't accept a context parameter.
3. Interface delegation: Implement ResourceQuerier interface on all wrapper
types in the query path (errorTranslateQuerier, multiQuerier, LazyQuerier)
to enable the interface to propagate through the wrapper chain.
4. Matcher fix: Use an "all series" matcher ({__name__!=""}) in cache fetchers
instead of nil matchers, as nil matchers cause getPostings to return empty
results.
5. Config: Enable otel_persist_resource_attributes in monolithic mode config.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
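Item 3 above concerns a common Go pitfall: a wrapper that only forwards the base interface hides any optional interface the wrapped value implements. A minimal sketch under simplified assumptions follows; the `Querier` and `ResourceQuerier` interfaces here are cut-down stand-ins, the `GetResourceAt` signature is hypothetical (the source only says it takes no context), and the wrapper names are invented for illustration.

```go
package main

import "fmt"

// Querier is a simplified stand-in for the storage querier interface.
type Querier interface {
	Select(selector string) []string
}

// ResourceQuerier is the optional interface; this signature is an assumption.
type ResourceQuerier interface {
	GetResourceAt(hash uint64, ts int64) (map[string]string, bool)
}

// baseQuerier implements both interfaces.
type baseQuerier struct{}

func (baseQuerier) Select(selector string) []string { return []string{selector} }
func (baseQuerier) GetResourceAt(hash uint64, ts int64) (map[string]string, bool) {
	return map[string]string{"service.name": "demo"}, true
}

// opaqueWrapper forwards only Select, so the optional interface is lost.
type opaqueWrapper struct{ next Querier }

func (w opaqueWrapper) Select(selector string) []string { return w.next.Select(selector) }

// delegatingWrapper also forwards GetResourceAt when the inner querier supports it.
type delegatingWrapper struct{ next Querier }

func (w delegatingWrapper) Select(selector string) []string { return w.next.Select(selector) }
func (w delegatingWrapper) GetResourceAt(hash uint64, ts int64) (map[string]string, bool) {
	if rq, ok := w.next.(ResourceQuerier); ok {
		return rq.GetResourceAt(hash, ts)
	}
	return nil, false
}

// supportsResources mimics how a caller would probe for the optional interface.
func supportsResources(q Querier) bool {
	_, ok := q.(ResourceQuerier)
	return ok
}

func main() {
	fmt.Println("opaque wrapper:", supportsResources(opaqueWrapper{baseQuerier{}}))
	chain := delegatingWrapper{delegatingWrapper{baseQuerier{}}}
	fmt.Println("delegating chain:", supportsResources(chain))
}
```

Every wrapper in the chain has to delegate for the probe at the top to succeed, which matches the commit's note that the interface was implemented on all wrapper types in the query path.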
Add support for uploading series_metadata.parquet files (containing metric metadata and OTel resource attributes) to object storage during block upload.
Changes:
- Add SeriesMetadataFilename constant for the parquet filename
- Add FileTypeSeriesMetadata for error reporting
- Update GatherFileStats() to include series_metadata.parquet in stats
- Update Upload() to upload series_metadata.parquet (when present)
- Update tests with correct expected file counts
This enables the store-gateway to serve resource attributes from persisted blocks, allowing the info() PromQL function to work for historical data (not just TSDB head data).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Reorder imports to comply with gci linter requirements:
- Standard library imports first
- Third-party imports (sorted alphabetically)
- Internal imports (github.com/grafana/mimir)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update the disabled info.test to match the upstream Prometheus version which now uses resource attributes instead of info metric joining. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The import was accidentally changed from gopkg.in/yaml.v3 to go.yaml.in/yaml/v3 which broke the yaml replace directive in go.mod, causing mimirtool rules check to fail. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update the expected diff file to include the ResourceAttributes code that was manually added to mimir.pb.go. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Add UpdateResource call to TSDBBuilder.PushToStorageAndReleaseRequest to persist OTel resource attributes when processing timeseries from Kafka. This ensures resource attributes from OTLP ingestion are written to the series_metadata.parquet file during block compaction. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add a demo script for the ingest storage architecture that showcases OTel resource attribute persistence. This mirrors the monolithic mode demo but adapts it for the Kafka -> Block Builder flow:
- Enable otel_persist_resource_attributes in ingest storage runtime config
- Create demo script that waits for block builder instead of flushing
- Add X-Scope-OrgID headers for multi-tenant gateway compatibility
- Update messaging to explain ingest storage architecture
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Allow the OTel resource attributes demo to run fully non-interactively by adding optional flags to manage the docker-compose stack lifecycle.
--start-stack: Starts the stack, waits for services to be ready, and handles the nginx/grafana port conflict by starting nginx separately.
--stop-stack: Registers a cleanup trap to stop the stack on exit, including when the script fails or is interrupted.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Port the stack management flags from the ingest storage demo to the monolithic mode demo script. This enables fully automated demo runs. Also fix macOS compatibility by replacing `head -n -1` with `sed '$d'`.
What this PR does
Work in progress: support for ingesting OTel resource attributes and persisting them as native metadata.
Which issue(s) this PR fixes or relates to
Fixes #
Checklist
- CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]. If changelog entry is not needed, please add the changelog-not-needed label to the PR.
- about-versioning.md updated with experimental features.