-
Notifications
You must be signed in to change notification settings - Fork 601
sv_setsv_cow: only succeed if sv_setsv() would also COW #22120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: blead
Are you sure you want to change the base?
Conversation
f078ccd to
b2a4d54
Compare
iabyn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks plausible. It would be nice for the sv_setsv_cow() fn to have a few lines of code comments at the top explaining what it's for/does.
|
@tonycoz this p.r. was filed back in April, but is still in draft status and has acquired merge conflicts. Assuming you want to move forward with it, could you resolve those conflicts and un-draft the ticket? Thanks. |
|
I'll rebase it, I'm looking into some behaviour I didn't expect with a variant of this change. |
|
Your performance problem is related to the ticket above, and HeapReAlloc() either refuses to shrink the memory allocation as a MS perf optimization, or all "freed" memory by HeapReAlloc() from large (any alloc too large for Low Fragmention Heap), is donated to the Low Fragmentation Heap. HeapReAlloc() does a I agree with logic, is SvCUR vs SvLEN gap is too big at some threshold, do not COW. Or shrink the PV buffer in the parent SV * (maybe a very bad perf idea for PAD vars). edit: very little can be found on this bug, and im using win7, but look at this 2011 post, nobody in that thread knows why the code exists, but someone in the mid 1990s knew the secret bug/problem but it was lost to time |
https://stackoverflow.com/questions/9164837/realloc-does-not-correctly-free-memory-in-windows Above shows the bug with Windows architecture. I found the correct explanation, NT Kernel has no function called The only way to shrink the reserved address space of kernel memory handle is to VirtualAlloc a new kernel memory object, then call Side trivia |
5717741 to
e531053
Compare
The behaviour I was concerned about was a leak if sv_setsv_cow() didn't cow copy an already COW SV, I eventually tracked this down to pp_subst, which seems to assume force_normal() will always free the buffer of a CoW SV: it doesn't, if the SV is the final owner the SV retains the buffer and CoW is turned off. Fixing that leak (freeing the memory in pp_subst) lead to a use after free in the regular expression engine that I'll be tracking down next. |
This will actually fail in some cases in the next commit.
Previously if you had a successful match against an SV with a
SvLEN() large relative to the SvCUR() the regexp engine would
use sv_setsv_cow() to make a COW copy of the matched SV,
extending the life of the large allocation buffer.
A normal sv_setsv() normally didn't do such a COW copy, but the above
also marked the source SV as COW, so further copies of the SV
could even further extend the lifetime of the buffer, eg:
while (<>) { # readline tends to make large SvLEN()
/something/; # some sort of match
push @save, $_; # with a successful match, the large $_ buffer
# survives until @save is released
}
Fixes part of Perl#21877
e531053 to
d185828
Compare
|
The problem I ran into was #20206, if sv_setsv_cow() didn't CoW copy a previously CoW SV the code from pp_subst would receive a CoW SV with only a single refcount, and that code doesn't correctly handle that case. Unfortunately fixing that results in use after free in code like: (this failed in a test, but I've lost of track of where the test was) which I believe is a known problem in some cases with the regexp engine not updating its pointers to match changes in the underlying SV. |
A max 255 COW SVPV with a refcount of 1 shouldn't ever exist in the first place. Whatever RC--ed RC # 2 down to RC # 1 on the PV buffer shouldve stripped the SV head flags saying this SVPV is a COW, and shouldve Someone correct me if this thinking is wrong. |
The problem is if you have two SVs pointing at the same PV, if sv1 releases its reference to the PV, sv1 gets marked as non-COW, but it has no way to mark sv2 as non-COW, so you're left with a COW PV with a reference count of 1. |
Can't some random location inside libperl |
How does the modification of sv1 find sv2? What triggers this Or are you thinking every COW SV is periodically checked to see if its PV has more than one reference? But there's no real need, S_sv_uncow() will just turn the flag off if something needs to modify the PV and the SV has the only reference to it. |
It can't and doesn't need to. When sv1 RC-- and discards the COW PVX, sv2 is on its own.
Natural runloop code. sv_catpvn()/etc. SVfIsCOW + SvLEN() +last byte==RC==0 is logically the same as
Probably. A no COW Newx() buf can always upgrade to a COW Newx() RC == 1 buf, and back down to a no COW Newx() buf at any time no side effects.
Adding a variant of |
This improved performance of the test case I was using from:
to
on cygwin.
The problem that is fixed here is that sv_setsv_cow() would COW SVs even though they had a very short string in a very large buffer, which had some sort of unclear to me interaction with the Win32 virtual memory system that had a painful performance impact when the large SV was made COW.
It may also improve memory handling on non-Win32 systems, since the SV was made COW, other SV copying functions wouldn't check if it was suitable for COW, so further copies would retain a COW reference for the large buffer. In the example test case from #21877 if we pushed the SV to an array after a successful match memory use would balloon out, running even a system with a generous amount of memory, out of memory.
Fixes part of #21877