-
Notifications
You must be signed in to change notification settings - Fork 703
Fixed uuid first bit flipping issue. #3105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a critical UUID data corruption bug where UUIDs with leading '00' bytes were being incorrectly converted to '80' during string conversion. The root cause was DuckDB's internal storage format flipping the top bit of UUIDs for ordering consistency, which wasn't being reversed during conversion to PostgreSQL format.
Key changes:
- Added bit-flip reversal logic in UUID-to-string conversion to correctly restore original UUID values
- Added comprehensive test coverage for UUID edge cases (0x00, 0x0F, 0x10, 0x80, 0xFF byte prefixes)
- Updated format specifiers from
%xto%lxfor correct handling of 64-bit values
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
cpp/deeplake_pg/duckdb_pg_convert.cpp |
Implements bit-flip reversal and corrects format specifiers in UUID conversion logic |
postgres/tests/py_tests/test_uuid_corruption.py |
Adds comprehensive test suite validating UUID integrity across edge cases and NULL handling |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "%08lx-%04lx-%04lx-%04lx-%012lx", | ||
| (unsigned long)((upper_flipped >> 32) & 0xFFFFFFFFUL), | ||
| (unsigned long)((upper_flipped >> 16) & 0xFFFFUL), | ||
| (unsigned long)(upper_flipped & 0xFFFFUL), | ||
| (unsigned long)((uuid_val.lower >> 48) & 0xFFFFUL), | ||
| (unsigned long)(uuid_val.lower & 0xFFFFFFFFFFFFUL)); |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The format specifiers use %lx which expects unsigned long, but the size of unsigned long is platform-dependent (32-bit on Windows, 64-bit on most Unix systems). Since we're working with explicit 64-bit values from uint64_t, consider using %016llx format specifier with unsigned long long casts, or use PRIx64 from <inttypes.h> for guaranteed portability across platforms.
| "%08lx-%04lx-%04lx-%04lx-%012lx", | |
| (unsigned long)((upper_flipped >> 32) & 0xFFFFFFFFUL), | |
| (unsigned long)((upper_flipped >> 16) & 0xFFFFUL), | |
| (unsigned long)(upper_flipped & 0xFFFFUL), | |
| (unsigned long)((uuid_val.lower >> 48) & 0xFFFFUL), | |
| (unsigned long)(uuid_val.lower & 0xFFFFFFFFFFFFUL)); | |
| "%08llx-%04llx-%04llx-%04llx-%012llx", | |
| (unsigned long long)((upper_flipped >> 32) & 0xFFFFFFFFULL), | |
| (unsigned long long)((upper_flipped >> 16) & 0xFFFFULL), | |
| (unsigned long long)(upper_flipped & 0xFFFFULL), | |
| (unsigned long long)((uuid_val.lower >> 48) & 0xFFFFULL), | |
| (unsigned long long)(uuid_val.lower & 0xFFFFFFFFFFFFULL)); |



🚀 🚀 Pull Request
Impact
Description
Things to be aware of
Things to worry about
Additional Context