[storage] Add Elasticsearch data stream support#7768
[storage] Add Elasticsearch data stream support#7768SoumyaRaikwar wants to merge 14 commits intojaegertracing:mainfrom
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #7768 +/- ##
==========================================
- Coverage 95.48% 95.29% -0.19%
==========================================
Files 316 316
Lines 16756 16833 +77
==========================================
+ Hits 15999 16041 +42
- Misses 592 622 +30
- Partials 165 170 +5
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
c5f1ae9 to
1b79e3a
Compare
|
PR implements data stream templates aligned with @Manik2708's design doc:
Templates are ready for review. |
|
Thanks @SoumyaRaikwar! But I would like to suggest a little break. The design doc is not ready for implementation yet. Currently it lacks evidence, we might have to manually update the templates and then see how it behaves. Moreover we have to see whether we are missing something over backward compatibility. Currently it has my research from docs but I still have to test and provide the steps for reviewer. If you can help with this, then I would be very grateful to you. |
|
@Manik2708, I've reviewed your proposal and the current template implementation. I'm keeping my PR as draft for now. I can help with the manual testing you mentioned. I'll set up a local ES cluster, create the ingest pipelines, apply the data stream templates, and verify the behavior. I'll document the steps and results here for the design doc evidence. Let me know if there's a specific scenario you'd like me to prioritize |
|
I've completed initial testing and evidence gathering. Here's what I verified: Design Alignment Confirmed
Ingest Pipeline TestedCreated and tested
Manual Verification EvidencePosted test document to data stream:
Full evidence report: https://gist.github.com/SoumyaRaikwar/519a98bcc81dc2df04308ae4a66b702b Next StepsReady to work on:
Which should I prioritize? |
internal/storage/v1/elasticsearch/mappings/fixtures/jaeger-ds-span-8.json
Outdated
Show resolved
Hide resolved
internal/storage/v1/elasticsearch/mappings/jaeger-ds-dependencies-8.json
Outdated
Show resolved
Hide resolved
|
I've updated the design document and the Gist with the proposed Index Lifecycle Management (ILM) policy for data streams. Updates include:
Please review the updated proposal. Once approved, I can proceed with applying the ILM policy to the templates. |
ca6e88d to
3cdd9aa
Compare
Metrics Comparison Summary❌ ERROR: No summary files were generated. Expected at least 8 diff files from CI. This indicates a failure in the E2E test execution or metrics collection process. |
|
@yurishkuro i have updated PR to reflect changes as per doc, could you review? |
|
@SoumyaRaikwar you have embedded a lot of features into a single PR. I would suggest to just support Datastream in this PR that too only for span, we can work on these other features once datastream is enabled without any backward compatibility issue. |
0cefcaa to
7a26f99
Compare
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Soumya Raikwar <164396577+SoumyaRaikwar@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Soumya Raikwar <164396577+SoumyaRaikwar@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Soumya Raikwar <164396577+SoumyaRaikwar@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Soumya Raikwar <164396577+SoumyaRaikwar@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Soumya Raikwar <164396577+SoumyaRaikwar@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Soumya Raikwar <164396577+SoumyaRaikwar@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 22 out of 22 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 26 out of 26 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (5)
internal/storage/elasticsearch/config/config.go:808
- The IndexWithDate function has inconsistent behavior when the indexName already ends with a dash. On line 804-805, if the index name ends with "-", it appends the date directly (e.g., "jaeger-span-" + "2025-01-01" = "jaeger-span-2025-01-01"). However, on line 807, if it doesn't end with a dash, it adds one before the date (e.g., "jaeger-span" + "-" + "2025-01-01" = "jaeger-span-2025-01-01"). While both produce the same result for typical usage, this creates an inconsistency where the function behaves differently based on whether the caller includes a trailing dash. Consider standardizing this by always trimming trailing dashes and then adding one, or documenting this behavior clearly.
func IndexWithDate(indexName, dateLayout string, date time.Time) string {
if indexName == "" {
return date.UTC().Format(dateLayout)
}
if strings.HasSuffix(indexName, "-") {
return indexName + date.UTC().Format(dateLayout)
}
return indexName + "-" + date.UTC().Format(dateLayout)
}
internal/storage/v1/elasticsearch/spanstore/writer.go:101
- This code comment block (lines 92-99) discusses uncertainty about client readiness and version checking, with multiple questions that appear to be stream-of-consciousness thinking. This should either be removed or converted into a concise explanation of the actual design decision. The final comment on line 100-101 explains the actual behavior, making the preceding comments unnecessary.
// We can't check the version here because the client might not be ready.
// However, SpanWriter is lazy, so we can check it when we need it?
// Actually, NewSpanWriter is not lazy about creating ServiceOperationStorage.
// But p.Client is a factory function.
// Let's assume we can get a client instance here to check version?
// p.Client() creates a NEW client or returns existing?
// Looking at factory.go: f.getClient returns the stored client.
// We rely on factory to populate p.UseDataStream based on config or version detection.
useDataStream := p.UseDataStream
internal/storage/elasticsearch/config/config.go:794
- The validation logic prevents UseDataStream from being used with explicit span aliases, but doesn't validate against UseReadWriteAliases. According to the code in writer.go line 138, UseDataStream and UseReadWriteAliases are mutually exclusive for spans (the condition is
p.UseReadWriteAliases && !p.UseDataStream). Consider adding validation to prevent confusion: if UseDataStream is true, UseReadWriteAliases should probably also be validated or at least documented as incompatible.
// Data streams are used for spans, so explicit span aliases are incompatible
// with UseDataStream. Service aliases remain valid since services don't use data streams.
if c.UseDataStream && hasSpanAliases {
return errors.New("UseDataStream cannot be enabled together with explicit span aliases (span_read_alias, span_write_alias)")
}
internal/storage/v1/elasticsearch/spanstore/service_operation.go:68
- The opType variable is set to an empty string and then immediately passed to Add(). This appears to be intentional to maintain backward compatibility for service writes (which use upsert behavior with explicit IDs rather than create-only behavior). However, this would be clearer with a comment explaining why services don't use "create" opType like spans do, or using a named constant like
opTypeUpsert = ""to make the intent explicit.
il := s.client().Index().Index(indexName).Type(serviceType).BodyJson(service)
opType := ""
il.Id(cacheKey)
il.Add(opType)
test_output.txt:364
- These test output files appear to be accidentally committed and should not be part of the codebase. They show test failures and are likely temporary debugging artifacts. Please remove test_output.txt, test_output_v2.txt, and test_mappings_full.txt from the PR.
=== RUN TestMappingBuilderGetMapping
=== RUN TestMappingBuilderGetMapping/jaeger-span
mapping_test.go:89:
Error Trace: /media/soumya/DATA/open-source/jaeger/internal/storage/v1/elasticsearch/mappings/mapping_test.go:89
Error: Not equal:
expected: "{\n \"index_patterns\": [\n \"test-jaeger-span*\"\n ],\n \"priority\": 500,\n \"template\": {\n \"settings\": {\n \"index\": {\n \"number_of_shards\": 3,\n \"number_of_replicas\": 3,\n \"mapping\": {\n \"total_fields\": {\n \"limit\": 3000\n },\n \"nested_objects\": {\n \"limit\": 50\n }\n },\n \"default_pipeline\": \"jaeger-trace-time-to-timestamp\",\n \"lifecycle\": {\n \"name\": \"jaeger-test-policy\",\n \"rollover_alias\": \"test-jaeger-span-write\"\n }\n }\n },\n \"mappings\": {\n \"dynamic_templates\": [\n {\n \"span_tags_map\": {\n \"path_match\": \"tag.*\",\n \"mapping\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n }\n }\n },\n {\n \"process_tags_map\": {\n \"path_match\": \"process.tag.*\",\n \"mapping\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n }\n }\n }\n ],\n \"properties\": {\n \"traceID\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"spanID\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"operationName\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"parentSpanID\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"startTime\": {\n \"type\": \"date\",\n \"format\": \"epoch_millis\"\n },\n \"startTimeMillis\": {\n \"type\": \"date\",\n \"format\": \"epoch_millis\"\n },\n \"duration\": {\n \"type\": \"long\"\n },\n \"flags\": {\n \"type\": \"integer\"\n },\n \"logs\": {\n \"type\": \"nested\",\n \"dynamic\": false,\n \"properties\": {\n \"timestamp\": {\n \"type\": \"date\",\n \"format\": \"epoch_millis\"\n },\n \"fields\": {\n \"type\": \"nested\",\n \"dynamic\": false,\n \"properties\": {\n \"key\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"value\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"type\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n }\n }\n }\n }\n },\n \"process\": {\n \"properties\": {\n \"serviceName\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"tag\": {\n \"type\": \"object\"\n },\n \"tags\": {\n \"type\": \"nested\",\n \"dynamic\": false,\n \"properties\": {\n \"key\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"value\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"type\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n }\n }\n }\n }\n },\n \"references\": {\n \"type\": \"nested\",\n \"dynamic\": false,\n \"properties\": {\n \"refType\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"traceID\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"spanID\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n }\n }\n },\n \"tag\": {\n \"type\": \"object\"\n },\n \"tags\": {\n \"type\": \"nested\",\n \"dynamic\": false,\n \"properties\": {\n \"key\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"value\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"type\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n }\n }\n }\n }\n }\n }\n}"
actual : "{\n \"index_patterns\": [\"test-jaeger-span*\"],\n \"priority\": 500,\n \"template\": {\n \"settings\": {\n \"index\": {\n \"number_of_shards\": 3,\n \"number_of_replicas\": 3,\n \"mapping\": {\n \"total_fields\": {\n \"limit\": 3000\n },\n \"nested_objects\": {\n \"limit\": 50\n }\n },\n \"default_pipeline\": \"jaeger-trace-time-to-timestamp\",\n \"lifecycle\": {\n \"name\": \"jaeger-test-policy\",\n \"rollover_alias\": \"test-jaeger-span-write\"\n }\n }\n },\n \"mappings\": {\n \"dynamic_templates\": [\n {\n \"span_tags_map\": {\n \"path_match\": \"tag.*\",\n \"mapping\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n }\n }\n },\n {\n \"process_tags_map\": {\n \"path_match\": \"process.tag.*\",\n \"mapping\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n }\n }\n }\n ],\n \"properties\": {\n \"traceID\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"spanID\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"operationName\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"parentSpanID\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"startTime\": {\n \"type\": \"date\",\n \"format\": \"epoch_millis\"\n },\n \"startTimeMillis\": {\n \"type\": \"date\",\n \"format\": \"epoch_millis\"\n },\n \"duration\": {\n \"type\": \"long\"\n },\n \"flags\": {\n \"type\": \"integer\"\n },\n \"logs\": {\n \"type\": \"nested\",\n \"dynamic\": false,\n \"properties\": {\n \"timestamp\": {\n \"type\": \"date\",\n \"format\": \"epoch_millis\"\n },\n \"fields\": {\n \"type\": \"nested\",\n \"dynamic\": false,\n \"properties\": {\n \"key\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"value\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"type\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n }\n }\n }\n }\n },\n \"process\": {\n \"properties\": {\n \"serviceName\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"tag\": {\n \"type\": \"object\"\n },\n \"tags\": {\n \"type\": \"nested\",\n \"dynamic\": false,\n \"properties\": {\n \"key\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"value\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"type\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n }\n }\n }\n }\n },\n \"references\": {\n \"type\": \"nested\",\n \"dynamic\": false,\n \"properties\": {\n \"refType\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"traceID\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"spanID\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n }\n }\n },\n \"tag\": {\n \"type\": \"object\"\n },\n \"tags\": {\n \"type\": \"nested\",\n \"dynamic\": false,\n \"properties\": {\n \"key\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"value\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n },\n \"type\": {\n \"type\": \"keyword\",\n \"ignore_above\": 256\n }\n }\n }\n }\n }\n }\n}\n"
Diff:
--- Expected
+++ Actual
@@ -1,177 +1,176 @@
{
- "index_patterns": [
- "test-jaeger-span*"
- ],
- "priority": 500,
- "template": {
- "settings": {
- "index": {
- "number_of_shards": 3,
- "number_of_replicas": 3,
- "mapping": {
- "total_fields": {
- "limit": 3000
- },
- "nested_objects": {
- "limit": 50
- }
+ "index_patterns": ["test-jaeger-span*"],
+ "priority": 500,
+ "template": {
+ "settings": {
+ "index": {
+ "number_of_shards": 3,
+ "number_of_replicas": 3,
+ "mapping": {
+ "total_fields": {
+ "limit": 3000
+ },
+ "nested_objects": {
+ "limit": 50
+ }
+ },
+ "default_pipeline": "jaeger-trace-time-to-timestamp",
+ "lifecycle": {
+ "name": "jaeger-test-policy",
+ "rollover_alias": "test-jaeger-span-write"
+ }
+ }
+ },
+ "mappings": {
+ "dynamic_templates": [
+ {
+ "span_tags_map": {
+ "path_match": "tag.*",
+ "mapping": {
+ "type": "keyword",
+ "ignore_above": 256
+ }
+ }
+ },
+ {
+ "process_tags_map": {
+ "path_match": "process.tag.*",
+ "mapping": {
+ "type": "keyword",
+ "ignore_above": 256
+ }
+ }
+ }
+ ],
+ "properties": {
+ "traceID": {
+ "type": "keyword",
+ "ignore_above": 256
+ },
+ "spanID": {
+ "type": "keyword",
+ "ignore_above": 256
+ },
+ "operationName": {
+ "type": "keyword",
+ "ignore_above": 256
+ },
+ "parentSpanID": {
+ "type": "keyword",
+ "ignore_above": 256
+ },
+ "startTime": {
+ "type": "date",
+ "format": "epoch_millis"
+ },
+ "startTimeMillis": {
+ "type": "date",
+ "format": "epoch_millis"
+ },
+ "duration": {
+ "type": "long"
+ },
+ "flags": {
+ "type": "integer"
+ },
+ "logs": {
+ "type": "nested",
+ "dynamic": false,
+ "properties": {
+ "timestamp": {
+ "type": "date",
+ "format": "epoch_millis"
+ },
+ "fields": {
+ "type": "nested",
+ "dynamic": false,
+ "properties": {
+ "key": {
+ "type": "keyword",
+ "ignore_above": 256
},
- "default_pipeline": "jaeger-trace-time-to-timestamp",
- "lifecycle": {
- "name": "jaeger-test-policy",
- "rollover_alias": "test-jaeger-span-write"
+ "value": {
+ "type": "keyword",
+ "ignore_above": 256
+ },
+ "type": {
+ "type": "keyword",
+ "ignore_above": 256
}
+ }
}
+ }
},
- "mappings": {
- "dynamic_templates": [
- {
- "span_tags_map": {
- "path_match": "tag.*",
- "mapping": {
- "type": "keyword",
- "ignore_above": 256
- }
- }
+ "process": {
+ "properties": {
+ "serviceName": {
+ "type": "keyword",
+ "ignore_above": 256
+ },
+ "tag": {
+ "type": "object"
+ },
+ "tags": {
+ "type": "nested",
+ "dynamic": false,
+ "properties": {
+ "key": {
+ "type": "keyword",
+ "ignore_above": 256
},
- {
- "process_tags_map": {
- "path_match": "process.tag.*",
- "mapping": {
- "type": "keyword",
- "ignore_above": 256
- }
- }
+ "value": {
+ "type": "keyword",
+ "ignore_above": 256
+ },
+ "type": {
+ "type": "keyword",
+ "ignore_above": 256
}
- ],
- "properties": {
- "traceID": {
- "type": "keyword",
- "ignore_above": 256
- },
- "spanID": {
- "type": "keyword",
- "ignore_above": 256
- },
- "operationName": {
- "type": "keyword",
- "ignore_above": 256
- },
- "parentSpanID": {
- "type": "keyword",
- "ignore_above": 256
- },
- "startTime": {
- "type": "date",
- "format": "epoch_millis"
- },
- "startTimeMillis": {
- "type": "date",
- "format": "epoch_millis"
- },
- "duration": {
- "type": "long"
- },
- "flags": {
- "type": "integer"
- },
- "logs": {
- "type": "nested",
- "dynamic": false,
- "properties": {
- "timestamp": {
- "type": "date",
- "format": "epoch_millis"
- },
- "fields": {
- "type": "nested",
- "dynamic": false,
- "properties": {
- "key": {
- "type": "keyword",
- "ignore_above": 256
- },
- "value": {
- "type": "keyword",
- "ignore_above": 256
- },
- "type": {
- "type": "keyword",
- "ignore_above": 256
- }
- }
- }
- }
- },
- "process": {
- "properties": {
- "serviceName": {
- "type": "keyword",
- "ignore_above": 256
- },
- "tag": {
- "type": "object"
- },
- "tags": {
- "type": "nested",
- "dynamic": false,
- "properties": {
- "key": {
- "type": "keyword",
- "ignore_above": 256
- },
- "value": {
- "type": "keyword",
- "ignore_above": 256
- },
- "type": {
- "type": "keyword",
- "ignore_above": 256
- }
- }
- }
- }
- },
- "references": {
- "type": "nested",
- "dynamic": false,
- "properties": {
- "refType": {
- "type": "keyword",
- "ignore_above": 256
- },
- "traceID": {
- "type": "keyword",
- "ignore_above": 256
- },
- "spanID": {
- "type": "keyword",
- "ignore_above": 256
- }
- }
- },
- "tag": {
- "type": "object"
- },
- "tags": {
- "type": "nested",
- "dynamic": false,
- "properties": {
- "key": {
- "type": "keyword",
- "ignore_above": 256
- },
- "value": {
- "type": "keyword",
- "ignore_above": 256
- },
- "type": {
- "type": "keyword",
- "ignore_above": 256
- }
- }
- }
+ }
}
+ }
+ },
+ "references": {
+ "type": "nested",
+ "dynamic": false,
+ "properties": {
+ "refType": {
+ "type": "keyword",
+ "ignore_above": 256
+ },
+ "traceID": {
+ "type": "keyword",
+ "ignore_above": 256
+ },
+ "spanID": {
+ "type": "keyword",
+ "ignore_above": 256
+ }
+ }
+ },
+ "tag": {
+ "type": "object"
+ },
+ "tags": {
+ "type": "nested",
+ "dynamic": false,
+ "properties": {
+ "key": {
+ "type": "keyword",
+ "ignore_above": 256
+ },
+ "value": {
+ "type": "keyword",
+ "ignore_above": 256
+ },
+ "type": {
+ "type": "keyword",
+ "ignore_above": 256
+ }
+ }
}
+ }
}
+ }
}
+
Test: TestMappingBuilderGetMapping/jaeger-span
=== RUN TestMappingBuilderGetMapping/jaeger-span#01
=== RUN TestMappingBuilderGetMapping/jaeger-span#02
--- FAIL: TestMappingBuilderGetMapping (0.00s)
--- FAIL: TestMappingBuilderGetMapping/jaeger-span (0.00s)
--- PASS: TestMappingBuilderGetMapping/jaeger-span#01 (0.00s)
--- PASS: TestMappingBuilderGetMapping/jaeger-span#02 (0.00s)
FAIL
FAIL github.com/jaegertracing/jaeger/internal/storage/v1/elasticsearch/mappings 0.008s
FAIL
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Populate IngestPipelineName in factory.go - Update TestSpanReaderIndices expectation for default data stream behavior Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
f4402fa to
b2105f2
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (4)
internal/storage/v1/elasticsearch/factory.go:190
- The default ingest pipeline name here ("jaeger-trace-to-timestamp") doesn’t match the name used in the mapping fixtures/tests ("jaeger-trace-time-to-timestamp") and the PR description. If the template renders
default_pipelinewith this value, indexing will fail unless a pipeline with the exact name exists. Consider aligning the default pipeline name (and any docs/scripts) to a single canonical value.
const (
defaultILMPolicyName = "jaeger-ilm-policy"
defaultIngestPipelineName = "jaeger-trace-to-timestamp"
)
internal/storage/elasticsearch/config/config.go:795
UseDataStreamis documented as requiring ES 7.9+/OpenSearch 2.0+ (and the PR description mentions validation), butValidate()currently only checks alias incompatibility. Consider adding a validation that rejectsUseDataStream=truewhen an explicitVersionis configured below the minimum supported version (and/or when the detected backend is known not to support data streams) to fail fast on unsupported setups.
// Data streams are used for spans, so explicit span aliases are incompatible
// with UseDataStream. Service aliases remain valid since services don't use data streams.
if c.UseDataStream && hasSpanAliases {
return errors.New("UseDataStream cannot be enabled together with explicit span aliases (span_read_alias, span_write_alias)")
}
internal/storage/v1/elasticsearch/spanstore/writer.go:101
- The block comment here reads like in-line development notes/questions (e.g., speculation about whether
p.Client()returns a new client) rather than stable code documentation. This makes the constructor harder to read and can become stale quickly. Consider replacing it with a concise explanation of the intended version/config responsibility (or removing it if it’s no longer needed).
// We can't check the version here because the client might not be ready.
// However, SpanWriter is lazy, so we can check it when we need it?
// Actually, NewSpanWriter is not lazy about creating ServiceOperationStorage.
// But p.Client is a factory function.
// Let's assume we can get a client instance here to check version?
// p.Client() creates a NEW client or returns existing?
// Looking at factory.go: f.getClient returns the stored client.
// We rely on factory to populate p.UseDataStream based on config or version detection.
useDataStream := p.UseDataStream
docs/adr/004-elasticsearch-data-streams.md:90
- This ADR’s ingest pipeline example copies
startTimeinto@timestamp, but Jaeger’s ES span model usesstartTimeas microseconds andstartTimeMillisas the millis/date field. CopyingstartTimedirectly would yield incorrect@timestampvalues unless the pipeline converts units. Consider updating the example to copy fromstartTimeMillis(or to include a script/convert processor) to reflect the actual stored fields.
### Handling @timestamp
Data Streams require a `@timestamp` field. An ingest pipeline copies `startTime` to `@timestamp`:
```json
{
"description": "Copy startTime to @timestamp for Data Stream compatibility",
"processors": [
{ "set": { "field": "@timestamp", "copy_from": "startTime" } }
]
}
</details>
---
💡 <a href="/jaegertracing/jaeger/new/main/.github/instructions?filename=*.instructions.md" class="Link--inTextBlock" target="_blank" rel="noopener noreferrer">Add Copilot custom instructions</a> for smarter, more guided reviews. <a href="https://docs.github.com/en/copilot/customizing-copilot/adding-repository-custom-instructions-for-github-copilot" class="Link--inTextBlock" target="_blank" rel="noopener noreferrer">Learn how to get started</a>.
- Fix service index logic in data stream mode to support aliases. - Update index templates to use long for timestamps to match Jaeger model. Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
Corrected startTime precision to long in ES8 mapping templates. Fixed flaky anonymizer tests by using a robust directory trigger instead of chmod. Verified backward compatibility for legacy index trace lookups. Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
|
@yurishkuro Regarding your concerns about backward compatibility, here is a detailed breakdown of how the Data Stream implementation is designed to be completely backward compatible:
In summary, for users who do not enable the feature, there is no change. For users who do enable it, the transition is seamless as Jaeger reads from both old and new data sources. (Note: I've also reverted the unrelated |
511b1c6 to
ef062c9
Compare
ef062c9 to
9303539
Compare
Description
This PR implements Elasticsearch Data Stream support specifically for Spans, following the design in ADR-004.
Part of #4708
Key Changes
Data Stream Implementation
jaeger-span-ds).IndexService.Add()interface toAdd(opType string). This allows the writer to explicitly setop_type=create, which is required for Data Streams.client.go,mocks,samplingstore, anddepstore.rollover_alias,index.requests.cache). Addsjaeger-span-8-ds.jsonfixture for testing.Configuration & cleanup
UseDataStreamflag with validation logic.config.IndexWithDateand removes unused code indepstore.Deferred to Follow-up PRs
UseISM/EnableLogsDBconfigsVerification
spanstore,config, andmappingstests pass.op_type=createwhenUseDataStream=true.