A Python tool to patch Windows PE (Portable Executable) files to make MSVC builds reproducible by normalizing timestamps, GUIDs, and other non-deterministic debug metadata.
When compiling Windows executables with Microsoft Visual C++ (MSVC), even with the /Brepro flag enabled, builds are not fully reproducible. The same source code compiled twice produces different binaries due to non-deterministic debug information:
- COFF Header TimeDateStamp: Build timestamp in PE header
- Debug Directory Timestamps: 4 separate timestamps in debug entries (CODEVIEW, VC_FEATURE, POGO, REPRO)
- CODEVIEW GUID: Random GUID linking .exe to .pdb file
- CODEVIEW Age: Incremental counter that varies between builds
- REPRO Hash: Composite hash containing the GUID and timestamps
This makes binary verification in CI impossible - you can't verify that committed binaries match the source code because every rebuild produces different bytes, even though the executable code is identical.
This tool patches all non-deterministic fields in PE files to fixed, deterministic values:
- All timestamps → 0x00000001 (one second after the Unix epoch, 1970-01-01 00:00:01 UTC)
- CODEVIEW GUID → 00000000-0000-0000-0000-000000000000
- CODEVIEW Age → 1
- REPRO Hash → All zeros
After patching, identical source code produces byte-for-byte identical binaries, enabling reproducible builds and CI verification.
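One way to check the byte-for-byte claim is to hash two independently built and patched binaries and compare the digests. The sketch below is illustrative only: the helper names `sha256_of` and `builds_match` are hypothetical and are not part of msvcpp-normalize-pe.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read 64 KiB at a time so large binaries are not loaded whole.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(first: Path, second: Path) -> bool:
    """True when both files have identical contents (hypothetical helper)."""
    return sha256_of(first) == sha256_of(second)
```

Calling `builds_match(Path("build1/program.exe"), Path("build2/program.exe"))` on two patched outputs should return True if the normalization covered every varying field; the paths here are placeholders.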
Patched fields:
- PE COFF Header TimeDateStamp (offset varies, typically 0xC0-0x100)
- Debug CODEVIEW Entry Timestamp
- Debug CODEVIEW GUID (16 bytes)
- Debug CODEVIEW Age (4 bytes)
- Debug VC_FEATURE Entry Timestamp
- Debug POGO Entry Timestamp
- Debug REPRO Entry Timestamp
- Debug REPRO Hash (36 bytes)
Left unchanged:
- All executable code (.text section)
- All program data (.data, .rdata sections)
- Import/Export tables
- Section headers
- Relocations
The binary behaves identically at runtime - only metadata used for debugging is normalized.
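To see for yourself that only a handful of metadata bytes change, you can diff the file contents before and after patching. This is a minimal sketch with a hypothetical helper, `differing_ranges`; it assumes the patch is applied in place, so the file length stays the same.

```python
def differing_ranges(before: bytes, after: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) byte ranges where the two buffers differ."""
    if len(before) != len(after):
        # Assumption: in-place patching never changes the file size.
        raise ValueError("buffers have different lengths")
    ranges = []
    start = None
    for offset, (old, new) in enumerate(zip(before, after)):
        if old != new:
            if start is None:
                start = offset  # a new run of differing bytes begins here
        elif start is not None:
            ranges.append((start, offset))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, len(before)))  # run extends to the end of the file
    return ranges
```

Reading a copy of the original and the patched file with `Path.read_bytes()` and passing both to this function should yield a short list of small ranges, which can then be compared against the patched fields listed above.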
# From PyPI
pip install msvcpp-normalize-pe

# From source
git clone https://github.com/mithro/msvcpp-normalize-pe.git
cd msvcpp-normalize-pe
pip install .

# With uv
uv pip install msvcpp-normalize-pe

After installation, the msvcpp-normalize-pe command is available:
# Basic usage
msvcpp-normalize-pe program.exe
# Custom timestamp
msvcpp-normalize-pe program.exe 1234567890
# Verbose output
msvcpp-normalize-pe --verbose program.exe
# See all options
msvcpp-normalize-pe --help

You can also use msvcpp-normalize-pe as a library in your Python code:
from pathlib import Path
from msvcpp_normalize_pe import patch_pe_file
result = patch_pe_file(Path("program.exe"), timestamp=1, verbose=True)
if result.success:
    print(f"Patched {result.patches_applied} fields")
else:
    print(f"Errors: {result.errors}")

Example output:
[1/1] COFF header: 0x829692a8 -> 0x00000001
[2/?] Debug CODEVIEW timestamp: 0x829692a8 -> 0x00000001
[3/?] Debug CODEVIEW GUID: e97b6ac706ea9b2dd577392d2bf08df7 -> 00000000000000000000000000000000
[4/?] Debug CODEVIEW Age: 7 -> 1
[5/?] Debug VC_FEATURE timestamp: 0x829692a8 -> 0x00000001
[6/?] Debug POGO timestamp: 0x829692a8 -> 0x00000001
[7/?] Debug REPRO timestamp: 0x829692a8 -> 0x00000001
[8/?] Debug REPRO hash: 20000000e97b6ac7... -> 000000000000000000...
Total: 8 timestamp(s) patched in program.exe
# Native MSVC builds
ifeq ($(USE_NATIVE_MSVC),1)
program.exe: program.cpp
	cl.exe /O2 /Zi program.cpp /link /DEBUG:FULL /Brepro
	msvcpp-normalize-pe program.exe 1
endif

name: Verify Binary Reproducibility
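# Trigger is an assumption; adjust to your workflow.
on: [push, pull_request]
jobs:
  verify:
    runs-on: windows-latest
    steps:
      - name: Build from source
        run: |
          cl.exe /O2 program.cpp /link /DEBUG:FULL /Brepro
          msvcpp-normalize-pe program.exe 1
      - name: Compare with committed binary
        # In PowerShell, fc is an alias for Format-Custom, so run this step under cmd.
        shell: cmd
        run: |
          fc /b program.exe committed/program.exe

- Python 3.9+ (type hints, dataclasses)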
- Target files: Windows PE executables (.exe) or DLLs (.dll)
- Architecture: Works with both 32-bit (PE32) and 64-bit (PE32+) binaries
No runtime dependencies - uses only Python standard library (struct, sys, pathlib, dataclasses).
- ✅ Makes PE executables reproducible (timestamps, GUIDs)
- ✅ Works with native MSVC (cl.exe + link.exe)
- ✅ Preserves debugging capability (PDB files still work)
- ❌ PDB files remain non-deterministic (~11% of PDB content varies)
  - PDB files contain thousands of small differences (padding, internal offsets, GUIDs)
  - Microsoft's PDB format has fundamental non-determinism issues
  - Industry solution: Use clang-cl + lld-link instead of native MSVC
- ❌ Does not work with stripped binaries (no debug directory to patch)
For fully reproducible builds including PDB files, use LLVM's Windows toolchain:
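clang-cl /O2 /std:c++17 -fuse-ld=lld program.cpp /link /DEBUG:FULL /Brepro /TIMESTAMP:1

The /TIMESTAMP: flag is only supported by lld-link, not native MSVC link.exe. clang-cl invokes link.exe unless told otherwise, so -fuse-ld=lld is needed to select lld-link.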
The tool parses the PE file structure to locate and patch:
- DOS Header (offset 0x3C) → PE signature offset
- PE Signature (offset varies) → Verify "PE\0\0"
- COFF Header (after PE sig) → TimeDateStamp at +4
- Optional Header (after COFF) → Contains Data Directories
- Data Directory #6 → Debug Directory (RVA + Size)
- Debug Directory Entries → 28-byte structures with timestamps
- CODEVIEW RSDS Structure → GUID at +4, Age at +20
- REPRO Hash → Full hash data
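The walk described above can be sketched with the standard `struct` module. This is not the tool's actual implementation: the function names are hypothetical, the optional-header offsets (96 for PE32, 112 for PE32+) come from the published PE layout, and translating the debug directory RVA to a file offset through the section table is left out for brevity.

```python
import struct

DEBUG_DIRECTORY_INDEX = 6  # position of the debug entry among the data directories
DEBUG_ENTRY_SIZE = 28      # size of one debug directory entry in bytes


def locate_header_fields(data: bytes) -> dict:
    """Find the COFF TimeDateStamp offset and the debug data directory (RVA, size)."""
    pe_offset = struct.unpack_from("<I", data, 0x3C)[0]  # DOS header field at 0x3C
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff_offset = pe_offset + 4          # COFF header follows the 4-byte signature
    optional_offset = coff_offset + 20   # COFF header is 20 bytes long
    magic = struct.unpack_from("<H", data, optional_offset)[0]
    if magic == 0x10B:                   # PE32
        directories = optional_offset + 96
    elif magic == 0x20B:                 # PE32+
        directories = optional_offset + 112
    else:
        raise ValueError(f"unexpected optional header magic {magic:#x}")
    rva, size = struct.unpack_from("<II", data, directories + 8 * DEBUG_DIRECTORY_INDEX)
    return {"timestamp_offset": coff_offset + 4, "debug_rva": rva, "debug_size": size}


def iter_debug_entries(data: bytes, directory_offset: int, directory_size: int):
    """Yield type, timestamp offset, and raw data location for each 28-byte entry.

    directory_offset is a FILE offset; mapping the RVA to it is not shown here.
    """
    end = directory_offset + directory_size
    for entry in range(directory_offset, end, DEBUG_ENTRY_SIZE):
        (_chars, _stamp, _major, _minor, entry_type,
         data_size, _rva, raw_pointer) = struct.unpack_from("<IIHHIIII", data, entry)
        yield {"type": entry_type, "timestamp_offset": entry + 4,
               "raw_pointer": raw_pointer, "data_size": data_size}


def parse_rsds(blob: bytes) -> tuple[bytes, int]:
    """Return (GUID bytes, Age) from a CODEVIEW RSDS record."""
    if blob[:4] != b"RSDS":
        raise ValueError("not an RSDS record")
    guid = blob[4:20]                            # GUID at +4, 16 bytes
    age = struct.unpack_from("<I", blob, 20)[0]  # Age at +20, 4 bytes
    return guid, age
```

Patching a field then amounts to writing the fixed value at the reported offset, for example `struct.pack_into("<I", buffer, fields["timestamp_offset"], 1)` on a `bytearray` copy of the file.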
MSVC's /Brepro flag:
- ✅ Removes some non-determinism
- ✅ Uses hash-based timestamps instead of wall clock time
- ❌ Still produces different hashes for each build
- ❌ GUID remains random
- ❌ Age field increments
This is because /Brepro computes a hash of build inputs, but includes random/variable data in that hash.
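A quick way to recognise a hash-derived timestamp is to decode it as a Unix time and see whether the result is a plausible build date. The helper below is a hypothetical illustration, not part of the tool; it uses the value 0x829692a8 from the sample output in this README.

```python
from datetime import datetime, timezone


def describe_timestamp(value: int) -> str:
    """Interpret a 32-bit TimeDateStamp as seconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(value, tz=timezone.utc).isoformat()


# The sample /Brepro value decodes to a date in 2039, so it is clearly
# not a wall-clock build time.
print(describe_timestamp(0x829692A8))

# The normalized value written by the tool is one second past the epoch.
print(describe_timestamp(1))
```

A decoded date far in the future or past is only a hint that the field holds a hash; it does not show which inputs went into that hash.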
ducible is an older tool with similar goals:
- ❌ Unmaintained (last update 2018)
- ❌ Only patches COFF header timestamp
- ❌ Does not patch Debug Directory timestamps
- ❌ Does not patch GUIDs or Age fields
Using LLVM's toolchain:
- ✅ Fully reproducible (including PDB files)
- ✅ Supports /TIMESTAMP: flag
- ❌ Not always possible (may need native MSVC for compatibility)
This tool fills the gap when you must use native MSVC but still want reproducible .exe files.
The non-determinism of MSVC builds with debug symbols is well-documented:
- Microsoft PDB Repository Issue #9: PDB non-determinism issues (GUIDs, padding, uninitialized buffers)
- Chromium Project: Uses clang-cl + lld-link specifically for reproducible builds
- Bazel Team: Marked /experimental:deterministic as "not planned" because "PDBs are not deterministic"
- Reproducible Builds Mailing List (Dec 2024): "there is no way to really solve this issue" with MSVC
- Stack Overflow (Nov 2024): "No complete solution currently exists for achieving fully reproducible MSVC builds with debug symbols"
Apache License 2.0 - See LICENSE file
Contributions welcome! Please test thoroughly with your build system before submitting PRs.
Developed as part of the ghidra-optimized-stdvector-decompiler project to enable CI verification of demo binaries compiled with multiple MSVC versions.