Export OOM reduce the surrogateIdRangeSize and increase range count#5540
Draft
apurvabhaleMS wants to merge 6 commits intomainfrom
Draft
Export OOM reduce the surrogateIdRangeSize and increase range count#5540apurvabhaleMS wants to merge 6 commits intomainfrom
apurvabhaleMS wants to merge 6 commits intomainfrom
Conversation
| Assert.Equal(OperationStatus.Completed, _exportJobRecord.Status); | ||
|
|
||
| // Verify all 6 resources were written (2 per range * 3 ranges) | ||
| Assert.True(_exportJobRecord.Output.ContainsKey(KnownResourceTypes.Patient)); |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #5540 +/- ##
==========================================
+ Coverage 77.02% 77.06% +0.03%
==========================================
Files 983 984 +1
Lines 36007 36110 +103
Branches 5469 5484 +15
==========================================
+ Hits 27736 27828 +92
- Misses 6927 6932 +5
- Partials 1344 1350 +6 🚀 New features to boost your workflow:
|
Contributor
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR aims to reduce SQL export out-of-memory risk by splitting each export job’s surrogate ID range into many smaller sub-ranges, then processing those sub-ranges sequentially within a single processing job.
Changes:
- Split orchestrated SQL export ranges into smaller sub-ranges and attach them to
ExportJobRecord. - Update export execution to iterate sub-ranges sequentially, defer final file commit, and add memory diagnostics.
- Add supporting model/config changes and happy-path unit tests for sub-range handling.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
src/Microsoft.Health.Fhir.SqlServer/Features/Operations/Export/SqlExportOrchestratorJob.cs |
Splits larger export ranges into smaller batches and queues grouped processing records. |
src/Microsoft.Health.Fhir.Core/Features/Operations/JobRecordProperties.cs |
Adds serialized property name for sub-range metadata. |
src/Microsoft.Health.Fhir.Core/Features/Operations/Export/Models/ExportSurrogateIdRange.cs |
Introduces the serialized start/end range model. |
src/Microsoft.Health.Fhir.Core/Features/Operations/Export/Models/ExportJobRecord.cs |
Stores sub-range lists on export job records. |
src/Microsoft.Health.Fhir.Core/Features/Operations/Export/ExportProcessingJob.cs |
Refactors execution/result handling around export task invocation. |
src/Microsoft.Health.Fhir.Core/Features/Operations/Export/ExportJobTask.cs |
Processes multiple sub-ranges sequentially and changes commit/logging behavior. |
src/Microsoft.Health.Fhir.Core/Configs/ExportJobConfiguration.cs |
Adds the shared range-reduction constant. |
src/Microsoft.Health.Fhir.Core.UnitTests/Features/Operations/Export/ExportProcessingJobTests.cs |
Adds tests for passing sub-ranges through processing jobs. |
src/Microsoft.Health.Fhir.Core.UnitTests/Features/Operations/Export/ExportJobTaskTests.cs |
Adds happy-path tests for executing sub-ranges in one task. |
| } | ||
|
|
||
| ExportJobProgress progress = _exportJobRecord.Progress; | ||
| if (_exportJobRecord.SurrogateIdRanges != null && _exportJobRecord.SurrogateIdRanges.Count > 0) |
| // Override _count per sub-range to match the range size, ensuring SQL returns at most | ||
| // that many resources per call. This prevents OOM for large resources by capping | ||
| // how many rows are materialized in memory at once. | ||
| for (int i = 0; i < _exportJobRecord.SurrogateIdRanges.Count; i++) |
| rangeQueryParams.Add(Tuple.Create(KnownQueryParameterNames.EndSurrogateId, range.EndId)); | ||
| rangeQueryParams.Add(Tuple.Create(KnownQueryParameterNames.StartSurrogateId, range.StartId)); | ||
|
|
||
| _exportJobRecord.Progress = new ExportJobProgress(continuationToken: null, page: 0); |
| // how many rows are materialized in memory at once. | ||
| for (int i = 0; i < _exportJobRecord.SurrogateIdRanges.Count; i++) | ||
| { | ||
| var range = _exportJobRecord.SurrogateIdRanges[i]; |
Comment on lines
+250
to
+251
| rangeQueryParams.Add(Tuple.Create(KnownQueryParameterNames.EndSurrogateId, range.EndId)); | ||
| rangeQueryParams.Add(Tuple.Create(KnownQueryParameterNames.StartSurrogateId, range.StartId)); |
Comment on lines
+253
to
+254
| _exportJobRecord.Progress = new ExportJobProgress(continuationToken: null, page: 0); | ||
| ExportJobProgress rangeProgress = _exportJobRecord.Progress; |
Comment on lines
+237
to
+248
| var rangeQueryParams = new List<Tuple<string, string>>(queryParametersList); | ||
|
|
||
| // Replace the _count parameter with the per-range value | ||
| for (int p = 0; p < rangeQueryParams.Count; p++) | ||
| { | ||
| if (rangeQueryParams[p].Item1 == KnownQueryParameterNames.Count) | ||
| { | ||
| rangeQueryParams[p] = Tuple.Create(KnownQueryParameterNames.Count, rangeCount.ToString(CultureInfo.InvariantCulture)); | ||
| break; | ||
| } | ||
| } | ||
|
|
Comment on lines
+108
to
+109
| var rangeBatches = new List<List<(long StartId, long EndId, int Count)>>(); | ||
| var currentBatch = new List<(long StartId, long EndId, int Count)>(); |
Comment on lines
+117
to
+122
| currentBatch.Add(range); | ||
| if (currentBatch.Count >= ExportJobConfiguration.RangeReductionFactor) | ||
| { | ||
| rangeBatches.Add(currentBatch); | ||
| currentBatch = new List<(long StartId, long EndId, int Count)>(); | ||
| } |
Comment on lines
+130
to
+133
| foreach (var batch in rangeBatches) | ||
| { | ||
| _logger.LogJobInformation(jobInfo, "Creating export record (1)."); | ||
| var processingRecord = CreateExportRecord(record, jobInfo.GroupId, resourceType: type, startSurrogateId: range.StartId.ToString(), endSurrogateId: range.EndId.ToString(), globalStartSurrogateId: globalStartId.ToString(), globalEndSurrogateId: globalEndId.ToString()); | ||
| var processingRecord = CreateExportRecord( |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Describe the changes in this PR.
Related issues
Addresses [issue #].
Testing
Describe how this change was tested.
FHIR Team Checklist
Semver Change (docs)
Patch|Skip|Feature|Breaking (reason)