meta-bug whiteboard MSVC turning on builtins memcpy memset memcmp strlen #23384

@bulk88

Description

This might eventually become a PR by me or someone else who has incentive to tackle this. But I need to archive the backstory and engineering details somewhere.

Research notes to myself about turning on the memset/memcpy/memcmp CPU intrinsics feature inside WinPerl built with MSVC.

A naive person would say: "Are you stupid? It takes 20 seconds to add -Oi to /win32/GNUmakefile and this job/bug/task is done."

-Oi docs: https://learn.microsoft.com/en-us/cpp/build/reference/oi-generate-intrinsic-functions?view=msvc-170

I do not want to hit the -Oi button globally for many technical engineering reasons. If I do it, I know I am invasively altering the entire WinPerl+MSVC ecosystem forever with that button.

Also, Perl in XS/C is not written in the C89/C99 language. "Perl in XS/C" is written in Peroost Framework (jkjk), which is P5P's clone of https://www.boost.org/libraries/latest/grid/ . Peroost Framework's API docs are located here -> https://perldoc.perl.org/perlclib .

P5P can modify Peroost, and has, and a lot of time and engineering went into creating Peroost's/XS's .h files. P5P, libperl.so.dll, ./Configure, metaconfig, miniperl.exe, and the general toolchain can do, automate, improve, or correct a lot of things (defects, mousetraps, flaws, poor dev decisions in the C language) that you don't get with a stock CC toolchain.

I'm not going to go into details on why I'm not hitting the -Oi button right now.

Research notes:

A line of code somewhere in mro.xs.

AV* const isa_lin = newAV_alloc_xz(4);

expands to

#define newAV_alloc_xz(size) av_new_alloc(size,1)

expands to

PERL_STATIC_INLINE AV *
Perl_av_new_alloc(pTHX_ SSize_t size, bool zeroflag)
{
    AV * const av = newAV();
    SV** ary;
    PERL_ARGS_ASSERT_AV_NEW_ALLOC;
    assert(size > 0);

    Newx(ary, size, SV*); /* Newx performs the memwrap check */
    AvALLOC(av) = ary;
    AvARRAY(av) = ary;
    AvMAX(av) = size - 1;

    if (zeroflag)
        Zero(ary, size, SV*);

    return av;
}

Now let's analyze Zero(ary, size, SV*); in detail. After inlining, with size = 4 and 8-byte pointers on x64 (4 * 8 = 0x20 bytes), it becomes

memset(ary, 0, 0x20);
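To make the expansion concrete, here is a minimal sketch of how a Zero-style macro turns an element count and a type into a byte-count memset. SKETCH_ZERO, SketchSV, and both helper functions are hypothetical stand-ins, not Perl's real definitions, and the sketch leaves out the memory-wrap check that the real macro performs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Perl's SV type; only pointers to it are used. */
typedef struct sketch_sv SketchSV;

/* Simplified stand-in for Zero(dest, count, type). Assumption: the real
 * macro also performs a memory-wrap check, which is omitted here. */
#define SKETCH_ZERO(d, n, t) ((void)memset((char *)(d), 0, (n) * sizeof(t)))

/* Number of bytes that reach memset for `count` pointer-sized slots. */
static size_t sketch_zero_bytes(size_t count)
{
    assert(count > 0);
    return count * sizeof(SketchSV *);
}

/* Zero a 4-slot pointer array, mirroring the size = 4, zeroflag = 1 case.
 * Returns 1 if every slot reads back as NULL, 0 otherwise. */
static int sketch_zero_four_slots(void)
{
    SketchSV *ary[4];
    size_t i;

    /* Poison the array first so the zeroing is observable. */
    memset(ary, 0xAB, sizeof ary);
    SKETCH_ZERO(ary, 4, SketchSV *);

    for (i = 0; i < 4; i++) {
        if (ary[i] != NULL)
            return 0;
    }
    return 1;
}
```

Because both the count and the element size are compile-time constants after inlining, the compiler sees a fixed-size memset, which is the case where an intrinsic expansion is most likely to pay off.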

Default MSVC WinPerl, for the last 25 years, emits

00000001800015EF 33 D2                                   xor     edx, edx        ; Val
00000001800015F5 44 8D 42 20                             lea     r8d, [rdx+20h]  ; Size
0000000180001608 48 8B C8                                mov     rcx, rax        ; Dst
000000018000160B FF 15 87 2A 00 00                       call    cs:__imp_memset

Now I hack the Perl core headers with

#pragma intrinsic(memcpy)
#pragma intrinsic(memset)
#pragma intrinsic(memcmp)
#pragma intrinsic(strcat)
#pragma intrinsic(strcmp)
#pragma intrinsic(strcpy)
#pragma intrinsic(strlen)
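A minimal sketch of how such pragmas could be scoped so that only MSVC-compatible compilers see them. The _MSC_VER guard, the choice of four functions, and the helper name sketch_copy_matches are assumptions for illustration, not what any Perl header currently does. Whether inline code is actually emitted still depends on the optimization flags in effect.

```c
#include <assert.h>
#include <string.h>   /* declare the functions before naming them in the pragma */

/* Assumption: limit the pragma to compilers that define _MSC_VER, so that
 * other compilers do not warn about an unknown pragma. */
#if defined(_MSC_VER)
#  pragma intrinsic(memcpy, memset, memcmp, strlen)
#endif

/* Exercise the four functions with compile-time-constant sizes, the case
 * where an intrinsic expansion is most likely to be chosen.
 * Returns 1 if the copy matches the source, 0 otherwise. */
static int sketch_copy_matches(void)
{
    char src[16] = "intrinsic";
    char dst[16];

    memset(dst, 0, sizeof dst);
    memcpy(dst, src, sizeof src);

    return memcmp(dst, src, sizeof src) == 0 && strlen(dst) == 9;
}
```

The code behaves the same with or without the pragma; only the generated machine code differs, so the effect has to be confirmed in the disassembly rather than by a test.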

and recompile my XS module, and I get

0000000180001926 0F 57 C0                                xorps   xmm0, xmm0
0000000180001945 0F 11 00                                movups  xmmword ptr [rax], xmm0
0000000180001948 0F 11 40 10                             movups  xmmword ptr [rax+10h], xmm0

Let's count the bytes:

before: 2 + 4 + 3 + 6 = 15
after: 3 + 3 + 4 = 10

Result: inlined memset() with basic SSE 1.0 ops wins the game.

It was a win performance-wise, since it skipped the formalities of a C call stack frame, and the large switch tree that lives inside the libc function memset() never executed. That switch tree creates a bounds-checked aligned pointer from an unaligned pointer, with something like

switch (ptr & 0xf) {
    case 15:
    case 14:
    case 13:
    case 12:
    //etc
}
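To illustrate the kind of alignment work a general-purpose memset has to do for arbitrary pointers and sizes, here is a hypothetical sketch. It is not the real CRT implementation, and it uses mask arithmetic and loops rather than a switch tree. The names sketch_memset and sketch_matches_libc are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* NOT the real CRT memset: a hypothetical sketch of head/body/tail handling. */
static void *sketch_memset(void *dst, int val, size_t n)
{
    unsigned char *p = (unsigned char *)dst;
    const unsigned char byte = (unsigned char)val;
    unsigned char block[16];
    size_t head, i;

    /* Pre-fill one 16-byte block with the fill value. */
    for (i = 0; i < sizeof block; i++)
        block[i] = byte;

    /* Head: bytes needed to reach the next 16-byte boundary (0..15). */
    head = (size_t)(16 - ((uintptr_t)p & 0xf)) & 0xf;
    if (head > n)
        head = n;
    for (i = 0; i < head; i++)
        *p++ = byte;
    n -= head;

    /* Body: whole 16-byte blocks written to the now-aligned pointer. */
    while (n >= 16) {
        memcpy(p, block, 16);
        p += 16;
        n -= 16;
    }

    /* Tail: the remaining 0..15 bytes. */
    for (i = 0; i < n; i++)
        *p++ = byte;

    return dst;
}

/* Compare the sketch against the library memset on a 64-byte buffer.
 * The caller must keep offset + n <= 64. Returns 1 on identical results. */
static int sketch_matches_libc(size_t offset, size_t n)
{
    unsigned char a[64], b[64];

    memset(a, 0x55, sizeof a);
    memset(b, 0x55, sizeof b);
    sketch_memset(a + offset, 0, n);
    memset(b + offset, 0, n);

    return memcmp(a, b, sizeof a) == 0;
}
```

None of the head or tail logic is needed when the size is a known constant 0x20, which is why the inlined version can collapse to two 16-byte stores.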

It was also a win machine-code-bloat-wise: 5 bytes shorter to do the same thing.

Steps to Reproduce

Disassemble a WinPerl compiled with MSVC, or, in the VS IDE, right click on your source code -> left click "Go To Disassembly", then press F11 a couple hundred times, or just press and hold F11 for a while with a podcast playing or a TV in the background.

Expected behavior

Codegen that is more like current LinPerl, or codegen that is more GCC- or LLVM-like than what MSVC currently produces for WinPerl.

Perl configuration

N/A
