Skip to content

Conversation

ashm-dev
Copy link
Contributor

@ashm-dev ashm-dev commented Sep 23, 2025

This PR fixes a series of misaligned memory access errors in the JIT engine that were causing segmentation faults when running tests with AddressSanitizer (ASan) and UndefinedBehaviorSanitizer (UBSan) enabled.

The unsafe direct pointer casts in patch_32, patch_32r, patch_64, and patch_x86_64_32rx have been replaced with memcpy to ensure safe, alignment-agnostic memory operations.

@ashm-dev
Copy link
Contributor Author

Lol, when I tried to restart the tests with these changes, I now got a Segmentation Fault, now I'm starting a new bug.

Copy link
Contributor

@sergey-miryanov sergey-miryanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@ashm-dev ashm-dev requested a review from sobolevn September 24, 2025 08:46
@ashm-dev
Copy link
Contributor Author

#139288
This problem occurs after fixes

Copy link
Contributor Author

@ashm-dev ashm-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@picnixz
Copy link
Member

picnixz commented Sep 24, 2025

I'll repeat my question as it's now hidden (I should have made a standalone comment in the first place):

Don't we also have other unaligned stores for AArch64?

@ashm-dev
Copy link
Contributor Author

I'll repeat my question as it's now hidden (I should have made a standalone comment in the first place):

Don't we also have other unaligned stores for AArch64?

The sanitizers aren't flagging any more alignment issues after my recent fixes, which suggests the code paths covered by the tests are now clean. Is there a specific spot in the AArch64 code you think I should look at?

@picnixz
Copy link
Member

picnixz commented Sep 24, 2025

The sanitizers aren't flagging any more alignment issues after my recent fixes, which suggests the code paths covered by the tests are now clean.

FTR, AArch64 patches are only used for AArch64 so be sure to test this on the corresponding architecture as well (in the reported issue, I see that the architecture is x86_64, but I don't know if you have access to an Apple Darwin with ARM64). Note that patch_32 is unused on an x86_64 architecture and is (I think) only used for Windows 32bit.

Is there a specific spot in the AArch64 code you think I should look at

Not really, just that all aarch64-based functions use set_bits and this one isn't used on x86_64 so we should also have someone check it (I don't have such machines).

@diegorusso
Copy link
Contributor

FTR, AArch64 patches are only used for AArch64 so be sure to test this on the corresponding architecture as well (in the reported issue, I see that the architecture is x86_64, but I don't know if you have access to an Apple Darwin with ARM64). Note that patch_32 is unused on an x86_64 architecture and is (I think) only used for Windows 32bit.

The Arm runners should cover Linux, Windows and macOS on Arm. They seem all to pass but also we are not running

            --with-address-sanitizer \
            --with-undefined-behavior-sanitizer

with the JIT. Maybe we should have some coverage for that?

Besides I'm pondering the cost of introducing memcpy for all build as they will be frequently executed.

@ashm-dev
Copy link
Contributor Author

ashm-dev commented Sep 29, 2025

@picnixz

my concern was more about Windows & MSVC which doesn't seem to optimize memcpy (or does it?)

Godbolt shows that MSVC also optimizes memcpy when the /O2 optimization flag is enabled.

P.S. The /O2 flag is the MSVC counterpart to -O3 in GCC/Clang.

@picnixz
Copy link
Member

picnixz commented Sep 29, 2025

P.S. The /O2 flag is the MSVC counterpart to -O3 in GCC/Clang.

Oh, that's why what I did had no effect. I used -O3 :') Ok. So, all compilers and supported archs seem to be optimized properly, except when hints are not given sometimes but that's only on an old architecture.

On my side, I would say it's fine. However, I'd like to know the opinion of JIT experts and maybe some benchmarks as well to see if this is true in practice. JIT+PGO would suffice (LTO takes some time...).

picnixz
picnixz previously approved these changes Sep 29, 2025
@ashm-dev
Copy link
Contributor Author

And since we accept these changes, this issue is already becoming relevant)

@picnixz
Copy link
Member

picnixz commented Sep 29, 2025

And since we accept these changes, this #139288 is already becoming relevant)

Oh right, I forgot about this one. However, this is annoying then. Should we first care about fixing the possible tests before actually fixing this specific UB? Because if we merge this PR, then we will have main in an even less stable state IMO.

Do you have a fix for that issue in particular? Because I would prefer not fixing the UB here rather than having a real crash on ASAN.

@picnixz picnixz dismissed their stale review September 29, 2025 09:49

I want to avoid merging a fix that can make ASAN fail.

@ashm-dev
Copy link
Contributor Author

Do you have a fix for that issue in particular?

Yes, so far the description of the problem is in the form of a hypothesis, now I'm looking for options to fix it.

A preceding test in test_asyncio is corrupting some internal state (e.g., memory layout, an object's type, or internal JIT caches). This corruption does not cause an immediate crash but leads to a fatal error later when the JIT compiler attempts to process code during the "victim" test.

@ashm-dev
Copy link
Contributor Author

@picnixz The JIT optimizer had a memory leak because it wasn't tracking dependencies on global and builtin namespaces when encountering LOAD_GLOBAL opcodes. This caused cached executors to become stale without proper invalidation when globals were modified. The fix adds explicit dependency tracking by registering frame->f_globals and frame->f_builtins in the dependencies bloom filter, resolving memory leaks in release builds. The issue still persists in debug builds and I'm currently working on fixing that.

@bedevere-app
Copy link

bedevere-app bot commented Sep 29, 2025

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@ashm-dev
Copy link
Contributor Author

I have made the requested changes; please review again

@bedevere-app
Copy link

bedevere-app bot commented Sep 29, 2025

Thanks for making the requested changes!

@picnixz: please review the changes made to this pull request.

@bedevere-app bedevere-app bot requested a review from picnixz September 29, 2025 16:37
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue still persists in debug builds and I'm currently working on fixing that.

If this is the case, then the fix should be instead complete. I don't want a fix that is only working for release builds. I'll try to review the complete fix later. Also, is it possible to have a reproducer that shows the leak without having to enable UBSan? UBs here shouldn't be critical I guess, but if there is a crash because of the memcpy() fixes, then the issue might already exist, in which case, I would like a reproducer and an independent fix.

Comment on lines +740 to +743
case LOAD_GLOBAL:
_Py_BloomFilter_Add(dependencies, frame->f_globals);
_Py_BloomFilter_Add(dependencies, frame->f_builtins);
_Py_FALLTHROUGH;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not well-versed enough here to be sure at first glance that this is correct. So I'd wait for Mark's input, or anyone else that can confirm this is correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks wrong to me.

@ashm-dev why did you make this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The release build was crashing with a segfault without this change. There’s a related issue linked in the PR, although it’s currently closed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't explain why you made this change. It seems unrelated to the issue.
This change is incorrect: LOAD_GLOBAL doesn't introduce a dependency on the globals or builtins. A trace containing LOAD_GLOBAL remains valid if either are modified.

@bedevere-app
Copy link

bedevere-app bot commented Sep 29, 2025

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants