Open
Labels
area:data (Data import/export/migration), performance (Performance improvement), priority:high (High priority)
Description
Problem
ParallelExporter parallelizes across entities, not within a single entity. For single-entity exports (or multi-entity exports dominated by one large table), the work is purely sequential paging:
```csharp
// src/PPDS.Migration/Export/ParallelExporter.cs:118
// Parallelism is at entity level only
await Parallel.ForEachAsync(schema.Entities, ...)
```

Each entity is exported with sequential paging:

```
Page 1 (5000) → wait → Page 2 (5000) → wait → ... → Page N → done
```
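To make the dependency concrete, here is a minimal simulation of that loop. It does not call the Dataverse SDK: `FetchAllSequential` and the `fetchPage` delegate are hypothetical stand-ins, with the delegate playing the role of one page request that needs the cookie returned by the previous one.

```csharp
using System;
using System.Collections.Generic;

// Drains a paged source one page at a time.
// fetchPage receives the previous page's cookie (null for page 1) and returns
// that page's records, the cookie for the next page, and a "more pages" flag.
static List<T> FetchAllSequential<T>(
    Func<string?, (IReadOnlyList<T> Records, string? NextCookie, bool More)> fetchPage)
{
    var all = new List<T>();
    string? cookie = null;
    while (true)
    {
        // This call cannot be issued until the previous page has returned,
        // because its cookie is an input here.
        var page = fetchPage(cookie);
        all.AddRange(page.Records);
        if (!page.More) return all;
        cookie = page.NextCookie;
    }
}
```

Because each iteration consumes the output of the one before it, wrapping this loop in `Parallel.ForEachAsync` gains nothing; the parallelism has to come from splitting the query itself.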
Performance Comparison
| Tool | Time for 269K records | Throughput | Method |
|---|---|---|---|
| SQL4CDS | 26-36 sec | ~8,000 rec/s | 48-thread partitioned |
| PPDS | 2:43 (163 sec) | ~1,650 rec/s | Sequential paging |
PPDS is roughly 5x slower than SQL4CDS for single-entity exports (163 sec vs 26-36 sec, which is 4.5x to 6.3x by wall-clock time).
Root Cause
FetchXML paging requires cookies from previous pages, preventing simple parallelization of pages. But SQL4CDS demonstrates alternatives exist.
Possible Solutions
1. Range-based partitioning (like SQL4CDS)
Split by primary key ranges and fetch in parallel:
```csharp
var ranges = await GetPrimaryKeyRanges(entityName, partitionCount);
// Parallel.ForEachAsync passes each item plus a CancellationToken to the body
await Parallel.ForEachAsync(ranges, async (range, ct) => {
    var fetchXml = BuildFetchXmlWithKeyRange(entity, range.Min, range.Max);
    // Fetch all records in this range with sequential paging
});
```
2. Offset paging
Get the total count first, then fetch pages in parallel by offset:
```csharp
var total = await GetTotalCount(entity);
var pageCount = (total + pageSize - 1) / pageSize;
// Parallel.ForEachAsync passes each item plus a CancellationToken to the body
await Parallel.ForEachAsync(Enumerable.Range(0, pageCount), async (pageNum, ct) => {
    var fetchXml = BuildFetchXmlWithOffset(entity, pageNum * pageSize, pageSize);
});
```
3. Hybrid approach
Use sequential paging for small exports (<10K records) and parallel partitioning for large exports.
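A rough sketch of how options 1 and 3 might fit together, not the proposed implementation. It assumes the partition key is a date attribute such as `createdon` rather than the primary key, and that the caller already knows the minimum and an exclusive maximum value. `ChoosePartitionCount`, `SplitRange` and `BuildRangeFilter` are hypothetical names, not existing PPDS or SQL4CDS APIs. The 5000 page size and 10K threshold come from the text above.

```csharp
using System;
using System.Collections.Generic;

// Hybrid rule: a single partition (plain sequential paging) below the threshold,
// otherwise one partition per page of records, capped at maxPartitions.
static int ChoosePartitionCount(long recordCount, int maxPartitions,
                                long threshold = 10_000, int pageSize = 5_000)
{
    if (recordCount < threshold) return 1;
    long pages = (recordCount + pageSize - 1) / pageSize;
    return (int)Math.Max(1, Math.Min(maxPartitions, pages));
}

// Splits [minInclusive, maxExclusive) into up to `count` contiguous half-open slices.
// Slices are equal in time span, not in record count, so skewed data gives uneven work.
static List<(DateTime Lower, DateTime Upper)> SplitRange(
    DateTime minInclusive, DateTime maxExclusive, int count)
{
    if (count < 1) throw new ArgumentOutOfRangeException(nameof(count));
    if (maxExclusive <= minInclusive) throw new ArgumentException("Empty range.");

    long start = minInclusive.Ticks;
    long span = maxExclusive.Ticks - start;
    long whole = span / count, rest = span % count;
    var slices = new List<(DateTime Lower, DateTime Upper)>(count);
    for (int i = 0; i < count; i++)
    {
        // Boundary i is start + floor(span * i / count), computed without overflow.
        long lower = start + whole * i + rest * i / count;
        long upper = start + whole * (i + 1) + rest * (i + 1) / count;
        if (upper > lower) // skip empty slices when count exceeds the span in ticks
            slices.Add((new DateTime(lower, DateTimeKind.Utc),
                        new DateTime(upper, DateTimeKind.Utc)));
    }
    return slices;
}

// FetchXML filter for one slice: attribute >= lower AND attribute < upper.
// The attribute name is inserted as-is, so it must be a trusted schema name.
static string BuildRangeFilter(string attribute, DateTime lower, DateTime upper) =>
    "<filter type=\"and\">" +
    $"<condition attribute=\"{attribute}\" operator=\"ge\" value=\"{lower:o}\" />" +
    $"<condition attribute=\"{attribute}\" operator=\"lt\" value=\"{upper:o}\" />" +
    "</filter>";
```

With these defaults, `ChoosePartitionCount(269_000, maxPartitions: 48)` yields 48 and `ChoosePartitionCount(8_000, maxPartitions: 48)` yields 1. Each slice would then be paged sequentially inside its own `Parallel.ForEachAsync` iteration. Half-open ranges keep a record that sits exactly on a boundary from being exported twice.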
Files
- src/PPDS.Migration/Export/ParallelExporter.cs
- src/PPDS.Migration/Export/ExportOptions.cs (add partition settings)
Acceptance Criteria
- Single-entity export of 269K records should complete in <1 minute (vs current 2:43)
- Throughput should be >5,000 rec/s (vs current ~1,650 rec/s)
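The two criteria can be checked mechanically from a record count and an elapsed time. The helper below is only an illustration, assuming both criteria must hold; `Throughput` and `MeetsCriteria` are hypothetical names, not existing PPDS code.

```csharp
using System;

// Records per second for a finished export.
static double Throughput(long records, TimeSpan elapsed) =>
    records / elapsed.TotalSeconds;

// Both acceptance criteria: under one minute AND above 5,000 rec/s.
static bool MeetsCriteria(long records, TimeSpan elapsed) =>
    elapsed < TimeSpan.FromMinutes(1) && Throughput(records, elapsed) > 5_000;
```

For 269K records the throughput criterion is the stricter one: 269,000 / 5,000 is 53.8 sec, so a 55-second run finishes under a minute but still misses the rec/s target.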
Projects
Status
Todo