
Commit 4c9a22f

Add GitHub Action workflow for TPM corruption testing
This workflow provides a reproducible test case for the TPM corruption bug that occurs when filling the TPM with objects until storage exhaustion.

The workflow:
1. Builds commit 1a7f7d7 (buggy version)
2. Runs corruption test by filling TPM with AES keys
3. Captures corrupted NVChip as artifact for analysis
4. Builds PR version of wolfPKCS11
5. Tests PR version against corrupted state

This serves as a foundation for developing and testing the TPM corruption repair function.

Co-Authored-By: andrew@wolfssl.com <andrew@wolfssl.com>
1 parent 09786e3 commit 4c9a22f


2 files changed

+588
-0
lines changed

Lines changed: 173 additions & 0 deletions
@@ -0,0 +1,173 @@
# TPM Corruption Test Workflow

## Purpose

This GitHub Action workflow provides a reproducible test case for the TPM corruption bug that occurs when filling the TPM with objects until storage exhaustion. It serves as a foundation for developing and testing the TPM corruption repair function.

## What This Workflow Does

### Phase 1: Create Corrupted State (Old Commit)
1. **Build Environment Setup**
   - Builds wolfSSL with required flags for PKCS#11 support
   - Builds and starts IBM Software TPM simulator (ibmswtpm2)
   - Builds wolfTPM with SWTPM support

2. **Build Old wolfPKCS11 (Buggy Version)**
   - Checks out commit `1a7f7d71b98dbffbfd4ad77f0c77c8c573a2c5d2`
   - Builds with TPM storage backend enabled (`WOLFPKCS11_TPM_STORE`)
   - Initializes token with user PIN

3. **Create Corruption**
   - Fills TPM with AES keys until storage exhaustion
   - This triggers the bug where metadata writes succeed but object writes fail
   - Results in corrupted TPM state where token appears uninitialized after restart

4. **Capture Corrupted State**
   - Stops TPM server to flush NVChip file to disk
   - Captures the corrupted NVChip file as a GitHub Actions artifact
   - Artifact is retained for 30 days for analysis

### Phase 2: Test PR Version Against Corrupted State
1. **Restart TPM with Corrupted State**
   - Restarts TPM server with the corrupted NVChip
   - This preserves the corrupted state for testing

2. **Build PR Version**
   - Builds the PR version of wolfPKCS11 with same configuration
   - This version should contain fixes or repair functions

3. **Test Access to Corrupted State**
   - Attempts to initialize library with corrupted TPM state
   - Attempts to login (expected to fail with current PR versions)
   - Attempts to enumerate objects with C_FindObjects
   - Documents the failure mode for repair function development

## Expected Behavior

### With Buggy Version (Old Commit)
- Successfully creates 60-64 AES keys before storage exhaustion
- TPM NV storage expands from ~196 to ~620 bytes
- Corruption occurs silently during storage exhaustion

### With PR Version (Current/Fixed)
- **Without Repair Function**: Login fails with `CKR_USER_PIN_NOT_INITIALIZED` (0x00000102)
- **With Repair Function**: Should detect corruption and repair the TPM state
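
A script that post-processes the access test's output could map the `C_Login` return value to one of these outcomes. The following is a minimal sketch and not part of the workflow: the function name `classify_login_rv` is hypothetical, and it assumes the test prints the return value as an 8-digit hex string. The numeric values for `CKR_OK` and `CKR_USER_PIN_NOT_INITIALIZED` are the ones defined by PKCS#11.

```bash
# classify_login_rv RV
# Map a C_Login return value (hex string, e.g. 0x00000102) to a verdict.
# Hypothetical helper for illustration only.
classify_login_rv() {
    case "$1" in
        0x00000000)
            # CKR_OK
            echo "ok: login succeeded"
            ;;
        0x00000102)
            # CKR_USER_PIN_NOT_INITIALIZED: the symptom described above
            echo "corrupted: CKR_USER_PIN_NOT_INITIALIZED"
            ;;
        *)
            echo "other: unexpected return value $1"
            ;;
    esac
}

classify_login_rv 0x00000102
```

The last line prints `corrupted: CKR_USER_PIN_NOT_INITIALIZED`, which is the result expected from a PR version without a repair function.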

## Artifacts

The workflow produces one artifact:

- **corrupted-nvchip**: The NVChip file containing the corrupted TPM state
  - Size: ~620 bytes
  - Retention: 30 days
  - Can be downloaded and used for local testing

## Usage

### Automatic Trigger
The workflow runs automatically on pull requests to any branch. It can also be started by manual workflow dispatch, as described under Manual Trigger.

### Manual Trigger
To run manually:
1. Go to the Actions tab in GitHub
2. Select the "TPM Corruption Test" workflow
3. Click "Run workflow"
4. Select the branch to test

### Local Testing with Artifact
To test locally with the corrupted NVChip:

```bash
# Download the corrupted-nvchip artifact from GitHub Actions

# Stop any running TPM server
pkill -f tpm_server

# Replace NVChip with corrupted version
cd ibmswtpm2/src
cp /path/to/corrupted_NVChip ./NVChip

# Start TPM server
./tpm_server &

# Test your repair function
# (adjust the path: wolfpkcs11 is not located under ibmswtpm2/src)
cd /path/to/wolfpkcs11
./your_repair_test
```

## Development Workflow

### For Repair Function Development
1. Create PR with repair function implementation
2. Workflow automatically runs and creates corrupted state
3. PR version is tested against corrupted state
4. Review test output to verify repair function works
5. Download corrupted NVChip artifact for local debugging if needed

### Expected Test Results
- **Before Repair Function**: Test should fail at C_Login with error 0x00000102
- **After Repair Function**: Test should succeed or provide clear repair instructions

## Technical Details

### Build Configuration
All builds use:
- `--enable-singlethreaded`: Single-threaded mode
- `--enable-wolftpm`: wolfTPM integration
- `--disable-dh`: DH disabled (as per GitHub Actions workflow)
- `CFLAGS="-DWOLFPKCS11_TPM_STORE"`: TPM storage backend
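
Assuming these flags are all passed to wolfPKCS11's `configure` script (the list above does not say which flag applies to which component), the invocation might look like the following dry run. The variable names are illustrative, and the command is printed rather than executed so it can be reviewed first.

```bash
# Flags taken from the list above.
CONFIGURE_FLAGS="--enable-singlethreaded --enable-wolftpm --disable-dh"
TPM_STORE_CFLAGS="-DWOLFPKCS11_TPM_STORE"

# Dry run: print the command instead of executing it.
# Remove "echo" to run it from the top of a wolfPKCS11 source tree.
echo ./configure $CONFIGURE_FLAGS "CFLAGS=$TPM_STORE_CFLAGS"
```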

### Corruption Mechanism
The bug occurs when:
1. TPM NV storage is nearly full
2. New object creation attempts to write metadata first
3. Metadata write succeeds
4. Object data write fails due to insufficient storage
5. Metadata now points to non-existent object
6. Token state becomes corrupted
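
The sequence above can be illustrated with a toy model. This sketch does not use wolfPKCS11 or a TPM: the 100-byte capacity, the 4-byte metadata and 20-byte object sizes, and all function names are made up for illustration. It only shows how a metadata-first write with no rollback leaves the metadata counting one more object than is actually stored.

```bash
# Toy model of the failure sequence; not wolfPKCS11 code.
# All names and sizes are hypothetical.

capacity=100    # total bytes in the simulated NV store
used=0          # bytes consumed so far
meta_count=0    # number of objects the metadata claims exist
stored_count=0  # number of objects whose data was actually written

# nv_write SIZE: succeed only if SIZE more bytes still fit.
nv_write() {
    if [ $(( used + $1 )) -gt "$capacity" ]; then
        return 1
    fi
    used=$(( used + $1 ))
}

# create_object: write metadata first, then object data, with no rollback.
create_object() {
    nv_write 4 || return 1       # metadata update (small, likely to fit)
    meta_count=$(( meta_count + 1 ))
    nv_write 20 || return 2      # object data (larger, may not fit)
    stored_count=$(( stored_count + 1 ))
}

# Keep creating objects until the first failure.
while create_object; do :; done

echo "metadata claims $meta_count objects, $stored_count actually stored"
```

With these numbers, the fifth `create_object` call writes its metadata into the last 4 free bytes and then fails on the object data, so the script prints `metadata claims 5 objects, 4 actually stored`. In this model, writing the object data before the metadata, or undoing the metadata update when the data write fails, would keep the two counts equal.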

### Test Programs
The workflow creates two test programs:

1. **corruption_test.c**: Creates corrupted state by filling TPM
2. **access_test.c**: Tests accessing corrupted state with PR version

Both programs are compiled inline during workflow execution.

## Troubleshooting

### Workflow Fails at Corruption Step
- Check TPM server is running (look for "TPM command server listening" in logs)
- Verify wolfTPM and wolfSSL built successfully
- Check that token initialization succeeded

### Workflow Fails at Access Step
- This is expected behavior without repair function
- Check error code: 0x00000102 indicates corruption was successfully reproduced
- Download NVChip artifact to verify corruption locally

### Artifact Not Created
- Check that TPM server was stopped before artifact capture
- Verify NVChip file exists in ibmswtpm2/src directory
- Check workflow permissions for artifact upload
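
The file-existence check can be scripted ahead of the upload step. A minimal sketch, assuming a POSIX shell; the function name `check_nvchip` is hypothetical, and the path to pass is whatever location the workflow uses for the NVChip file.

```bash
# check_nvchip PATH
# Report whether PATH exists and is non-empty, and print its size.
# Hypothetical helper for illustration only.
check_nvchip() {
    if [ ! -f "$1" ]; then
        echo "missing: $1"
        return 1
    fi
    if [ ! -s "$1" ]; then
        echo "empty: $1"
        return 1
    fi
    # wc may pad its output with spaces; $(( )) normalizes the number.
    size=$(wc -c < "$1")
    echo "found: $1 ($(( size )) bytes)"
}
```

For example, `check_nvchip ibmswtpm2/src/NVChip` could be run after the TPM server has been stopped; a non-zero exit status means there is nothing to upload.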

## Future Enhancements

1. **Repair Function Testing**: Once repair function is implemented, update access_test.c to call repair function
2. **Multiple Corruption Scenarios**: Add tests for different object types (RSA keys, certificates)
3. **Corruption Severity Levels**: Test different levels of corruption (partial vs complete)
4. **Automated Repair Verification**: Add assertions to verify repair function restores all objects

## Related Files

- `.github/workflows/tpm-corruption-test.yml`: Main workflow file
- `tpm_corruption_test.c`: Original local test program (in repository root)
- `tpm_corruption_reproduction_report.md`: Detailed bug analysis and reproduction report

## References

- Original bug report: Commit 1a7f7d71b98dbffbfd4ad77f0c77c8c573a2c5d2
- wolfTPM documentation: https://github.com/wolfSSL/wolfTPM
- IBM Software TPM: https://github.com/kgoldman/ibmswtpm2
