
@warrenzhu25
Contributor

What changes were proposed in this pull request?

This adds detailed shuffle migration statistics to block manager decommissioning. A new
MigrationStat case class, reported through MigrationInfo, exposes counts and sizes for
discovered, migrated, remaining, and deleted shuffle blocks, and the executor shutdown path
now uses it. A unit test verifies the stats and size accounting.

Why are the changes needed?

Decommissioning currently exposes only a boolean for “all blocks migrated.” The additional
stats provide visibility into shuffle migration progress and help diagnose slow/failed
migrations.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

./build/sbt "testOnly org.apache.spark.storage.BlockManagerDecommissionUnitSuite"
(Note: the suite passed, but the overall sbt run exited with code 1 in sql-api / Antlr4 / antlr4Generate.)

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Codex (GPT-5)

This commit enhances the decommissioning process by adding detailed shuffle
migration tracking to the `MigrationInfo` case class.

The new `MigrationStat` case class tracks:
- `numBlocksLeft`: Number of shuffle blocks remaining to be migrated
- `totalMigratedSize`: Total size in bytes of migrated shuffle blocks
- `numMigratedBlock`: Number of shuffle blocks successfully migrated
- `totalBlocks`: Total number of shuffle blocks discovered
- `totalSize`: Total size in bytes of all shuffle blocks (migrated + remaining)
- `deletedBlocks`: Number of shuffle blocks deleted during migration
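
For readers who want a concrete picture, the fields listed above could be modeled roughly as follows. This is a minimal sketch, not the code in this PR: the field names come from the list, while the field types, the `migratedFraction` helper, and the `MigrationStatExample.summarize` function are assumptions added for illustration.

```scala
// Hypothetical sketch: field names follow the list above; the types
// (Int for counts, Long for byte sizes) are assumptions, not the PR's code.
final case class MigrationStat(
    numBlocksLeft: Int,       // shuffle blocks remaining to be migrated
    totalMigratedSize: Long,  // bytes of shuffle blocks already migrated
    numMigratedBlock: Int,    // shuffle blocks successfully migrated
    totalBlocks: Int,         // shuffle blocks discovered
    totalSize: Long,          // bytes of all shuffle blocks (migrated + remaining)
    deletedBlocks: Int) {     // shuffle blocks deleted during migration

  // Hypothetical helper: fraction of discovered bytes already migrated.
  // An empty executor (totalSize == 0) is treated as fully migrated.
  def migratedFraction: Double =
    if (totalSize == 0L) 1.0 else totalMigratedSize.toDouble / totalSize
}

object MigrationStatExample {
  // Hypothetical helper: one-line progress summary suitable for a log message.
  def summarize(s: MigrationStat): String =
    s"migrated ${s.numMigratedBlock}/${s.totalBlocks} blocks, " +
      s"${s.totalMigratedSize}/${s.totalSize} bytes, " +
      s"${s.numBlocksLeft} left, ${s.deletedBlocks} deleted"
}
```

A caller on the shutdown path could log `MigrationStatExample.summarize(stat)` on each poll instead of only checking a single "all blocks migrated" boolean, which is the kind of visibility this change is meant to provide.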

This provides comprehensive visibility into shuffle block migration progress
and helps with monitoring decommissioning efficiency.

A unit test with enhanced size verification is included to ensure all
statistics are calculated and reported correctly.

Change-Id: I21a8913343ca92e353243c5a34b144a2e43f828c
@github-actions

github-actions bot commented Jan 7, 2026

JIRA Issue Information

=== Improvement SPARK-54928 ===
Summary: Add comprehensive shuffle migration statistics to MigrationInfo
Assignee: None
Status: Open
Affected: ["4.1.0"]


This comment was automatically generated by GitHub Actions

@github-actions github-actions bot added the CORE label Jan 7, 2026