Commit 48bda52

bulk88 authored and mauke committed
fix CHostPerl's class design & remove CHostPerl malloc-ed fnc vtbls bloat
Note: non-standard use of "static link" below. I am using it to refer to statically importing function/data symbols from another DLL through the PE import table, the opposite of "LoadLibrary()"/"GetProcAddress()" linking. I am NOT using "static link" in the typical sense of fully including a copy of a library at compile time through a .a/.lib/.o/.obj file.

Since commit af2f850 (10/19/2015 5:47:16 PM, "const vtables in win32/perlhost.h") the vtables have been stored in read-only memory. There have been no bug tickets or complaints since then from any users wanting or depending on this runtime instrumentation system.

All Win32 perl builds are static-DLL-linked to a specific MSVCRT (libc) at interp C compile/build time, no matter the name of the CRT DLL (msvcrt.dll, msvcrt120.dll, ucrtbase.dll, etc.). Runtime swapping of the libperl MSVCRT DLL by an embedder to his favorite CRT DLL has never been supported, and wouldn't even work, since perlhost.h's hooking isn't perfect, and Win32 Perl often uses "win32_*()" functions, by accident or explicitly, and those static-link call into the hard-coded CRT. Plus, prototypes of CRT POSIX-ish functions have changed over the years while the func symbol name stays the same. What is time_t? stat_t? etc.

The original commit for all this complexity was from the 5.0/5.6 era, when it was assumed that perl 5 maint/stable releases would be abandoned by P5P in favor of Perl 6. All this complexity was provisions and APIs to fix, upgrade and improve Win32 Perl on Microsoft's/ActiveState's rapid release schedule, without any dependency on P5P devs/pumpking/P5P policy about releasing a new perl5 .tar.gz.

Commit 0f4eea8 (6/19/1998 6:59:50 AM), commit title "applied patch, along with many changes:", says:

"The features of this are:
1. All OS dependant code is in the Perl Host and not the Perl Core. (At least this is the holy grail goal of this work)
2. The Perl Host (see perl.h for description) can provide a new and improved interface to OS functionality if required.
3. Developers can easily hook into the OS calls for instrumentation or diagnostic purposes."

None of these provisions and APIs have ever been used. CPerlHost:: never became a separate DLL. Perl >= 5.12 has a "rapid release" policy. ActiveState dropped sponsorship/product interest in Win32 Perl many years ago. Strawberry Perl took over the market. CPerlHost:: is way too over-engineered for perl's ithreads/pseudofork, which only requires "1 OS process", 2 %ENVs and 2 CWDs of functionality. Most of the CPerlHost::* methods are jump stubs to "win32_*();" anyway, and the hooking is redundant runtime overhead, but that is for another commit.

This commit is about removing the pointless malloc() and memcpy() of the plain C to C++ "thunk funcs" vtables from the RO const master copy in perl5**.dll to each "my_perl" instance at runtime.

On x64, copying the vtables to malloc memory wasted the following amounts of malloc() memory. These are the actual integers passed to malloc() by CPerlHost::/perl. malloc()'s secret internal headers are not included in these numbers.

perlMem        0x38
perlMemShared  0x38
perlMemParse   0x38
perlEnv        0x70
perlStdIO      0x138
perlLIO        0xE0
perlDir        0x58
perlSock       0x160
perlProc       0x108

The old design of malloc-ed vtables seems to have come from the original devs not knowing, or poorly understanding, how MS COM (C++ objs in plain C) and MSVC ISO C++ objects (almost the same ABI) are laid out in memory. The original devs realized that if they used a ptr to a global vtable struct, they couldn't "cast" from a child class like VDir:: or VMem:: back to a CPerlHost:: obj, which is a design requirement here. But they wanted to pass around child class ptrs like VMem::* instead of one CPerlHost:: obj ptr, and those VMem:: ptrs must be visible in 'extern "C"' land to plain C perl, since my_perl keeps 9 of these C++ obj *s as separate ptrs in the my_perl "plain C" struct.
So instead they made malloc-ed copies of the vtables and put those copies in the CPerlHost:: obj, so that from a child class ptr they can C++ cast to the base class CPerlHost:: obj if needed. This is just wrong. Almost universally, vtables are stored in const RO memory. Monkey-patching at runtime is a Perl lang thing, and rare to never in C/C++ land.

The ptr to the "plain C to C++ func thunk vtable" CAN NOT be stored per VDir::* or per VMem::* ptr. You can't store it, per C++ tradition, as the 1st member/field of a VDir::/VMem:: object. The reason is that VDir::/VMem:: objects can have refcounts, and multiple CPerlHost:: ptrs can hold refs to one VMem:: ptr. So there is no way to reverse a random VMem:: ptr back to a CPerlHost:: ptr. The main examples are VMem:: "MemShared" and VMem:: "MemParse".

Also, the C->C++ thunk funcs must pick between 3 VMem:: obj ptrs, which are "Mem", "MemShared" and "MemParse", stored at different offsets in CPerlHost::*, but all 3 VMem::-derived "classes" must have the same plain-C vtable layout with 7 extern "C" func thunk ptrs. Because of my minimal C++ knowledge, and not wanting to add even more C++ classes to iperlsys.h, perlhost.h and perllib.c (classes which may or may not inline away), I did not fix this with more C++ classes.

So fix all of this by having each CPerlHost:: obj store a ptr to the RO vtable instead of a huge RW inlined copy of the vtable. To keep all previous design requirements, just use "&cperlhost_obj->vmem_whatever_vtable" as the plain-C representation of a VMem::* ptr, instead of "&cperlhost_obj->IPerlWhateverMem.pMalloc".

The 1 extra pointer de-ref CPU machine op, executed by the "iperlsys.h" family of macros in each perl core and perl XS caller, is I think irrelevant compared to the savings of having RO vtables. It is the same machine code length on x86/x64 in each caller, comparing old vs new. This extra ptr deref to reach the vtable can be removed, and I will probably do it in a future commit.
Not done here for bisect/small-patch reasons.

An example of the "iperlsys.h" family of macros is the macro "PerlEnv_getenv(str);". Specific example, for macro PerlMem_free() in Perl_safesysfree():

old, before this commit
----
mov rax, [rax+0CE8h]
mov rcx, rax
call qword ptr [rax+10h]
----
new, after this commit
----
mov rcx, [rax+0CE8h]
mov rax, [rcx]
call qword ptr [rax+10h]
----

"mov rcx, rax" is "0x48 0x8B 0xC8" compared to "mov rax, [rcx]" which is "0x48 0x8B 0x01". No extra machine code "bloat" in any callers. The extra 1 memory read is irrelevant if we are about to call malloc() or any of these other WinOS kernel32.dll syscalls. iperlsys.h/perlhost.h does NOT hook anything super perf critical such as "memcmp()" or "memcpy()".
1 parent 569c7cd commit 48bda52

8 files changed: +720 -760 lines changed

embed.fnc

Lines changed: 18 additions & 18 deletions
@@ -4206,28 +4206,28 @@ Admp |GV * |gv_SVadd |NULLOK GV *gv
 #endif
 #if defined(PERL_IMPLICIT_SYS)
 CTo |PerlInterpreter *|perl_alloc_using \
-    |NN struct IPerlMem *ipM \
-    |NN struct IPerlMem *ipMS \
-    |NN struct IPerlMem *ipMP \
-    |NN struct IPerlEnv *ipE \
-    |NN struct IPerlStdIO *ipStd \
-    |NN struct IPerlLIO *ipLIO \
-    |NN struct IPerlDir *ipD \
-    |NN struct IPerlSock *ipS \
-    |NN struct IPerlProc *ipP
+    |NN const struct IPerlMem **ipM \
+    |NN const struct IPerlMem **ipMS \
+    |NN const struct IPerlMem **ipMP \
+    |NN const struct IPerlEnv **ipE \
+    |NN const struct IPerlStdIO **ipStd \
+    |NN const struct IPerlLIO **ipLIO \
+    |NN const struct IPerlDir **ipD \
+    |NN const struct IPerlSock **ipS \
+    |NN const struct IPerlProc **ipP
 # if defined(USE_ITHREADS)
 CTo |PerlInterpreter *|perl_clone_using \
     |NN PerlInterpreter *proto_perl \
     |UV flags \
-    |NN struct IPerlMem *ipM \
-    |NN struct IPerlMem *ipMS \
-    |NN struct IPerlMem *ipMP \
-    |NN struct IPerlEnv *ipE \
-    |NN struct IPerlStdIO *ipStd \
-    |NN struct IPerlLIO *ipLIO \
-    |NN struct IPerlDir *ipD \
-    |NN struct IPerlSock *ipS \
-    |NN struct IPerlProc *ipP
+    |NN const struct IPerlMem **ipM \
+    |NN const struct IPerlMem **ipMS \
+    |NN const struct IPerlMem **ipMP \
+    |NN const struct IPerlEnv **ipE \
+    |NN const struct IPerlStdIO **ipStd \
+    |NN const struct IPerlLIO **ipLIO \
+    |NN const struct IPerlDir **ipD \
+    |NN const struct IPerlSock **ipS \
+    |NN const struct IPerlProc **ipP
 # endif
 #else
 Adp |I32 |my_pclose |NULLOK PerlIO *ptr

intrpvar.h

Lines changed: 9 additions & 9 deletions
@@ -874,15 +874,15 @@ PERLVAR(I, psig_ptr, SV **)
 PERLVAR(I, psig_name, SV **)
 
 #if defined(PERL_IMPLICIT_SYS)
-PERLVAR(I, Mem, struct IPerlMem *)
-PERLVAR(I, MemShared, struct IPerlMem *)
-PERLVAR(I, MemParse, struct IPerlMem *)
-PERLVAR(I, Env, struct IPerlEnv *)
-PERLVAR(I, StdIO, struct IPerlStdIO *)
-PERLVAR(I, LIO, struct IPerlLIO *)
-PERLVAR(I, Dir, struct IPerlDir *)
-PERLVAR(I, Sock, struct IPerlSock *)
-PERLVAR(I, Proc, struct IPerlProc *)
+PERLVAR(I, Mem, const struct IPerlMem **)
+PERLVAR(I, MemShared, const struct IPerlMem **)
+PERLVAR(I, MemParse, const struct IPerlMem **)
+PERLVAR(I, Env, const struct IPerlEnv **)
+PERLVAR(I, StdIO, const struct IPerlStdIO **)
+PERLVAR(I, LIO, const struct IPerlLIO **)
+PERLVAR(I, Dir, const struct IPerlDir **)
+PERLVAR(I, Sock, const struct IPerlSock **)
+PERLVAR(I, Proc, const struct IPerlProc **)
 #endif
 
 PERLVAR(I, ptr_table, PTR_TBL_t *)

iperlsys.h

Lines changed: 369 additions & 370 deletions
Large diffs are not rendered by default.

perl.c

Lines changed: 6 additions & 6 deletions
@@ -168,18 +168,18 @@ Perl_sys_term(void)
 
 #ifdef PERL_IMPLICIT_SYS
 PerlInterpreter *
-perl_alloc_using(struct IPerlMem* ipM, struct IPerlMem* ipMS,
-                 struct IPerlMem* ipMP, struct IPerlEnv* ipE,
-                 struct IPerlStdIO* ipStd, struct IPerlLIO* ipLIO,
-                 struct IPerlDir* ipD, struct IPerlSock* ipS,
-                 struct IPerlProc* ipP)
+perl_alloc_using(const struct IPerlMem** ipM, const struct IPerlMem** ipMS,
+                 const struct IPerlMem** ipMP, const struct IPerlEnv** ipE,
+                 const struct IPerlStdIO** ipStd, const struct IPerlLIO** ipLIO,
+                 const struct IPerlDir** ipD, const struct IPerlSock** ipS,
+                 const struct IPerlProc** ipP)
 {
     PerlInterpreter *my_perl;
 
     PERL_ARGS_ASSERT_PERL_ALLOC_USING;
 
     /* Newx() needs interpreter, so call malloc() instead */
-    my_perl = (PerlInterpreter*)(*ipM->pCalloc)(ipM, 1, sizeof(PerlInterpreter));
+    my_perl = (PerlInterpreter*)((*ipM)->pCalloc)(ipM, 1, sizeof(PerlInterpreter));
     S_init_tls_and_interp(my_perl);
     PL_Mem = ipM;
     PL_MemShared = ipMS;

proto.h

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default.

sv.c

Lines changed: 6 additions & 6 deletions
@@ -15708,11 +15708,11 @@ perl_clone(PerlInterpreter *proto_perl, UV flags)
 
 PerlInterpreter *
 perl_clone_using(PerlInterpreter *proto_perl, UV flags,
-                 struct IPerlMem* ipM, struct IPerlMem* ipMS,
-                 struct IPerlMem* ipMP, struct IPerlEnv* ipE,
-                 struct IPerlStdIO* ipStd, struct IPerlLIO* ipLIO,
-                 struct IPerlDir* ipD, struct IPerlSock* ipS,
-                 struct IPerlProc* ipP)
+                 const struct IPerlMem** ipM, const struct IPerlMem** ipMS,
+                 const struct IPerlMem** ipMP, const struct IPerlEnv** ipE,
+                 const struct IPerlStdIO** ipStd, const struct IPerlLIO** ipLIO,
+                 const struct IPerlDir** ipD, const struct IPerlSock** ipS,
+                 const struct IPerlProc** ipP)
 {
     /* XXX many of the string copies here can be optimized if they're
      * constants; they need to be allocated as common memory and just
@@ -15722,7 +15722,7 @@ perl_clone_using(PerlInterpreter *proto_perl, UV flags,
     CLONE_PARAMS clone_params;
     CLONE_PARAMS* const param = &clone_params;
 
-    PerlInterpreter * const my_perl = (PerlInterpreter*)(*ipM->pMalloc)(ipM, sizeof(PerlInterpreter));
+    PerlInterpreter * const my_perl = (PerlInterpreter*)((*ipM)->pMalloc)(ipM, sizeof(PerlInterpreter));
 
     PERL_ARGS_ASSERT_PERL_CLONE_USING;
 #else /* !PERL_IMPLICIT_SYS */
