Commit 745d09d

Rename "Inconsistent TPU program" to "Fingerprint mismatch for HLO module".
PiperOrigin-RevId: 834022813
1 parent 28d611e commit 745d09d
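
The playbook section touched by this commit asks readers to search their logs for the hang message. A minimal sketch of that search follows. The substrings come from the old and new message wording in this diff. The function name `find_fingerprint_mismatch` and the idea of passing log file paths on the command line are illustrative assumptions, not part of Megascale or XLA tooling.

```python
import sys
from typing import Iterable, List, Tuple

# Substrings copied from the message text in this commit (old and new wording).
HANG_PATTERNS = (
    "likely caused by inconsistent TPU programs",
    "likely caused by inconsistent HLO module compilation across workers",
)
# Prefix of the follow-up line that lists example hosts.
FINGERPRINT_PREFIX = "Example hosts that have different HLO fingerprints:"


def find_fingerprint_mismatch(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for each line that matches the hang message.

    Hypothetical helper: it does plain substring matching and assumes
    nothing about the rest of the log format.
    """
    hits = []
    for lineno, line in enumerate(lines, start=1):
        if any(p in line for p in HANG_PATTERNS) or FINGERPRINT_PREFIX in line:
            hits.append((lineno, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Usage (paths are placeholders): python find_mismatch.py worker0.log worker1.log
    for path in sys.argv[1:]:
        with open(path, errors="replace") as f:
            for lineno, text in find_fingerprint_mismatch(f):
                print(f"{path}:{lineno}: {text}")
```

Matching both wordings keeps the search usable on logs produced before and after the rename.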

1 file changed: +3 -3 lines changed

docs/guides/megascale_hang_playbook.md

Lines changed: 3 additions & 3 deletions
@@ -29,12 +29,12 @@ This message will often provide the potential cause of a hang. Please provide Go

 ## Common Issues

-### 1. Inconsistent TPU Programs
+### 1. Fingerprint mismatch

-Occasionally, different programs can run on TPU workers within the same system. This can lead to errors. Search your logs for a message like the following:
+Occasionally, an HLO module can be compiled differently across TPU workers within the same system. This can lead to errors. Search your logs for a message like the following:

 ```
-Megascale detects a hang that is likely caused by inconsistent TPU programs. This can be caused by some workers running with different JIT functions or a bug in the XLA compiler. Please inspect the HLO dumps to confirm the root cause.
+Megascale detects a hang that is likely caused by inconsistent HLO module compilation across workers. This can be caused by some workers running with different JIT functions or a bug in the XLA compiler. Please inspect the HLO dumps to confirm the root cause.

 Example hosts that have different HLO fingerprints: ...
