|
| 1 | +# V8 Crash Retry Solution for CI |
| 2 | + |
| 3 | +## Problem |
| 4 | + |
| 5 | +CI jobs occasionally fail with a transient V8 bytecode deserialization crash during the Node.js setup phase. The error manifests as: |
| 6 | + |
| 7 | +``` |
| 8 | +Fatal error in , line 0 |
| 9 | +Check failed: ReadSingleBytecodeData( source_.Get(), SlotAccessorForHandle<IsolateT>(&ret, isolate())) == 1. |
| 10 | +``` |
| 11 | + |
| 12 | +This error occurs during the `yarn cache dir` command execution within the `actions/setup-node@v4` action. |
| 13 | + |
| 14 | +## Root Cause |
| 15 | + |
| 16 | +This is a known bug in Node.js/V8 that occurs sporadically: |
| 17 | + |
| 18 | +- **Node.js Issue**: https://github.com/nodejs/node/issues/56010 |
| 19 | +- **Setup-node Issue**: https://github.com/actions/setup-node/issues/1028 |
| 20 | + |
| 21 | +The crash happens when V8 attempts to deserialize cached bytecode and encounters corrupted or incompatible data. It's a transient issue that typically resolves on retry. |
| 22 | + |
| 23 | +## Previous Workarounds |
| 24 | + |
| 25 | +Before this fix, the codebase used two workarounds: |
| 26 | + |
| 27 | +1. **Completely disable yarn caching** in `examples.yml`: |
| 28 | + |
| 29 | + ```yaml |
| 30 | + # TODO: Re-enable yarn caching once Node.js V8 cache crash is fixed |
| 31 | + # Tracking: https://github.com/actions/setup-node/issues/1028 |
| 32 | + # cache: yarn |
| 33 | + # cache-dependency-path: '**/yarn.lock' |
| 34 | + ``` |
| 35 | + |
| 36 | +2. **Conditionally disable caching for Node 22** in `integration-tests.yml`: |
| 37 | + ```yaml |
| 38 | + cache: ${{ matrix.node-version != '22' && 'yarn' || '' }} |
| 39 | + ``` |
| 40 | +
|
| 41 | +Both workarounds significantly slowed down CI by preventing yarn dependency caching. |
| 42 | +
|
| 43 | +## Solution |
| 44 | +
|
| 45 | +Created a custom composite GitHub action at `.github/actions/setup-node-with-retry/` that: |
| 46 | + |
| 47 | +### Key Features |
| 48 | + |
| 49 | +1. **Pre-validation**: Tests `yarn cache dir` works before running `setup-node` |
| 50 | +2. **Automatic retry**: Retries up to 3 times when V8 crashes are detected |
| 51 | +3. **Smart error detection**: Only retries on V8 crashes, fails fast on other errors |
| 52 | +4. **Clear diagnostics**: Provides warning annotations in CI logs |
| 53 | +5. **Configurable**: Allows customizing max retries (defaults to 3) |
| 54 | +6. **Backward compatible**: Drop-in replacement for `actions/setup-node@v4` |
| 55 | + |
| 56 | +### How It Works |
| 57 | + |
| 58 | +```yaml |
| 59 | +- name: Setup Node.js with retry |
| 60 | + shell: bash |
| 61 | + run: | |
| 62 | + # Pre-validate yarn cache dir works |
| 63 | + if timeout 30 yarn cache dir > "$TEMP_OUTPUT" 2>&1; then |
| 64 | + echo "Yarn cache dir command succeeded" |
| 65 | + else |
| 66 | + # Check for V8 crash signature |
| 67 | + if grep -q "Fatal error in.*Check failed: ReadSingleBytecodeData" "$TEMP_OUTPUT"; then |
| 68 | + echo "::warning::V8 bytecode deserialization error detected" |
| 69 | + # Retry logic... |
| 70 | + fi |
| 71 | + fi |
| 72 | +
|
| 73 | +- name: Actually setup Node.js |
| 74 | + uses: actions/setup-node@v4 |
| 75 | + # ... standard setup-node configuration |
| 76 | +``` |
| 77 | + |
| 78 | +### Usage |
| 79 | + |
| 80 | +```yaml |
| 81 | +- name: Setup Node |
| 82 | + uses: ./.github/actions/setup-node-with-retry |
| 83 | + with: |
| 84 | + node-version: 22 |
| 85 | + cache: yarn |
| 86 | + cache-dependency-path: '**/yarn.lock' |
| 87 | + max-retries: 3 # Optional, defaults to 3 |
| 88 | +``` |
| 89 | + |
| 90 | +## Changes Made |
| 91 | + |
| 92 | +Updated all 8 CI workflow files to use the new action: |
| 93 | + |
| 94 | +1. ✅ `examples.yml` - **Re-enabled yarn caching** |
| 95 | +2. ✅ `integration-tests.yml` - **Re-enabled yarn caching for Node 22** |
| 96 | +3. ✅ `lint-js-and-ruby.yml` |
| 97 | +4. ✅ `package-js-tests.yml` |
| 98 | +5. ✅ `playwright.yml` |
| 99 | +6. ✅ `pro-integration-tests.yml` |
| 100 | +7. ✅ `pro-lint.yml` |
| 101 | +8. ✅ `pro-test-package-and-gem.yml` |
| 102 | + |
| 103 | +## Benefits |
| 104 | + |
| 105 | +1. **Improved reliability**: CI no longer fails due to transient V8 crashes |
| 106 | +2. **Better performance**: Yarn caching re-enabled across all workflows |
| 107 | +3. **Clear diagnostics**: Warning annotations show when retries occur |
| 108 | +4. **Maintainable**: Centralized retry logic in a reusable action |
| 109 | +5. **Future-proof**: Can be updated independently if V8 crash patterns change |
| 110 | + |
| 111 | +## Monitoring |
| 112 | + |
| 113 | +To verify the retry logic is working when V8 crashes occur: |
| 114 | + |
| 115 | +1. Watch CI logs for these warning messages: |
| 116 | + |
| 117 | + ``` |
| 118 | + ::warning::V8 bytecode deserialization error detected (attempt 1/3) |
| 119 | + Retrying in 5 seconds... |
| 120 | + ``` |
| 121 | + |
| 122 | +2. Check that jobs succeed after retry instead of failing |
| 123 | + |
| 124 | +3. If a job exhausts all retries, it will show: |
| 125 | + ``` |
| 126 | + ::error::All 3 retry attempts failed |
| 127 | + ``` |
| 128 | + |
| 129 | +## Implementation Details |
| 130 | + |
| 131 | +- **Timeout**: Each retry attempt has a 30-second timeout for `yarn cache dir` |
| 132 | +- **Retry delay**: 5 seconds between attempts to allow transient issues to clear |
| 133 | +- **Max retries**: Defaults to 3, configurable via input |
| 134 | +- **Error detection**: Regex pattern matches V8 crash signature in stderr/stdout |
| 135 | + |
| 136 | +## Future Improvements |
| 137 | + |
| 138 | +If the V8 crash persists even with retries, consider: |
| 139 | + |
| 140 | +1. Updating Node.js to a version with the fix (when available) |
| 141 | +2. Increasing max-retries for particularly flaky environments |
| 142 | +3. Adding exponential backoff between retries |
| 143 | +4. Implementing cache clearing before retry |
| 144 | + |
| 145 | +## Pull Request |
| 146 | + |
| 147 | +- **PR**: https://github.com/shakacode/react_on_rails/pull/2082 |
| 148 | +- **Branch**: `jg-/ci-retry-v8-crash` |
0 commit comments