
Enable hardware-accelerated CRC32C in Abseil and TCMalloc builds #342

Draft
kai-franz wants to merge 1 commit into yugabyte:master from kai-franz:enable-abseil-crc32c-hw-acceleration

@kai-franz kai-franz commented Feb 21, 2026

Summary

  • Add -msse4.2 and -mpclmul compiler flags to the Abseil and TCMalloc build definitions on x86_64, enabling Abseil's hardware-accelerated CRC32C implementation.
  • Without these flags, Abseil's CRC32C code paths gated on __SSE4_2__ and __PCLMUL__ are compiled out entirely: TryNewCRC32AcceleratedX86ARMCombined() compiles to a stub that returns 0, and all CRC32C operations fall back to a slow software table-based implementation.
  • With these flags, Abseil uses SSE4.2 crc32 hardware instructions for small buffers and PCLMULQDQ carry-less multiply for fast large-buffer folding, which is expected to match or exceed the performance of the existing crcutil library's 3-way-striped approach.
  • TCMalloc needs the same flags because it builds its own copy of Abseil source via Bazel local_repository.
  • No aarch64 changes needed: the existing GRAVITON_COMPILER_FLAGS (-march=armv8.2-a+...+crypto) already define __ARM_FEATURE_CRC32 and __ARM_FEATURE_CRYPTO, activating Abseil's ARM SIMD CRC path.
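
The macro gating described above can be checked without building Abseil. The following is a minimal sketch, assuming a GCC- or Clang-compatible compiler that accepts `-dM -E` to dump its predefined macros. The helper names (`parse_defined_macros`, `hw_crc_macros_present`, `dump_macros`) are hypothetical and not part of Abseil or this repository, and the check is a simplification: it only looks at the macros named in this PR, not at every condition Abseil may test.

```python
import subprocess

# Macros that, per the PR description, gate Abseil's accelerated CRC32C paths.
X86_MACROS = {"__SSE4_2__", "__PCLMUL__"}
ARM_MACROS = {"__ARM_FEATURE_CRC32", "__ARM_FEATURE_CRYPTO"}


def parse_defined_macros(preprocessor_output: str) -> set[str]:
    """Collect macro names from `#define NAME [VALUE]` lines."""
    names = set()
    for line in preprocessor_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "#define":
            names.add(parts[1])
    return names


def hw_crc_macros_present(macros: set[str]) -> bool:
    """True if the x86 macro set or the ARM macro set is fully defined."""
    return X86_MACROS <= macros or ARM_MACROS <= macros


def dump_macros(compiler: str, flags: list[str]) -> str:
    """Ask the compiler which macros it predefines under the given flags."""
    cmd = [compiler, *flags, "-dM", "-E", "-x", "c", "/dev/null"]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

On an x86_64 build host, `hw_crc_macros_present(parse_defined_macros(dump_macros("gcc", ["-msse4.2", "-mpclmul"])))` would be expected to return `True`, while the same call with an empty flag list shows whether the compiler's default target already defines the macros.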

Test plan

  • Verify Abseil builds successfully on x86_64 with the new flags
  • Verify TCMalloc builds successfully on x86_64 with the new flags
  • Confirm the compiled libabsl.a / libabsl.so contains crc32q and pclmulqdq instructions (e.g. via objdump -d <library> | grep -c pclmulqdq, and likewise for crc32q)
  • Confirm TryNewCRC32AcceleratedX86ARMCombined() no longer returns 0 in the compiled library
  • Verify aarch64 builds are unaffected (no flag changes on that architecture)
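
The disassembly check in the test plan could be scripted along these lines. This is a sketch, assuming `objdump` is on the PATH and that the library path is supplied by the caller. The function names are hypothetical, and the counts only show that the instructions exist somewhere in the binary, not that Abseil's CRC code is what emits them.

```python
import re
import subprocess
from collections import Counter

# Instruction mnemonics the test plan expects to find on x86_64.
MNEMONICS = ("crc32q", "pclmulqdq")


def count_mnemonics(disassembly: str, mnemonics=MNEMONICS) -> Counter:
    """Count whole-word occurrences of each mnemonic in `objdump -d` output."""
    counts = Counter()
    for mnemonic in mnemonics:
        pattern = rf"\b{re.escape(mnemonic)}\b"
        counts[mnemonic] = len(re.findall(pattern, disassembly))
    return counts


def disassemble(library_path: str) -> str:
    """Run `objdump -d` on the given library and return its text output."""
    result = subprocess.run(
        ["objdump", "-d", library_path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

A build check could then assert that `count_mnemonics(disassemble(path))` is non-zero for both mnemonics, where `path` points at whichever Abseil library the build produces.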

Made with Cursor

Abseil's CRC32C implementation supports SSE4.2 CRC32 instructions and
PCLMULQDQ carry-less multiply for fast large-buffer folding, but these
code paths are gated on compile-time checks for __SSE4_2__ and __PCLMUL__.
Without -msse4.2 and -mpclmul, the hardware acceleration is compiled out
and Abseil falls back to a software table-based CRC32C — significantly
slower than even the existing crcutil 3-way-striped SSE4.2 implementation.
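
For readers unfamiliar with the fallback, a table-based CRC32C processes one byte per table lookup, which is why it trails the hardware crc32 instruction. The following is an illustrative sketch of that general technique, not Abseil's actual fallback code (which may use a different table layout). It uses the reflected Castagnoli polynomial 0x82F63B78.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected


def _build_table() -> list[int]:
    """Precompute the CRC of every possible byte value."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED
            else:
                crc >>= 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c(data: bytes, crc: int = 0) -> int:
    """Byte-at-a-time CRC32C; pass a previous result as `crc` to continue."""
    crc ^= 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF
```

The standard check value applies: `crc32c(b"123456789")` is `0xE3069283`.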

Add -msse4.2 and -mpclmul to the Abseil and TCMalloc build definitions on
x86_64. TCMalloc needs the same flags because it builds its own copy of
Abseil from source via Bazel local_repository.

No aarch64 changes are needed: the existing GRAVITON_COMPILER_FLAGS
(-march=armv8.2-a+...+crypto) already define __ARM_FEATURE_CRC32 and
__ARM_FEATURE_CRYPTO, which activate Abseil's ARM SIMD CRC path.

Co-authored-by: Cursor <cursoragent@cursor.com>
@kai-franz kai-franz marked this pull request as draft February 21, 2026 00:37