win32_isatty() dont call a mostly failing syscall, NT->WIN err conv is slow #23375
base: blead
bulk88
commented
Jun 17, 2025
- This set of changes does not require a perldelta entry.
It definitely speeds up the test against file type file handles, from ~1.2µs to 0.39µs per call. It slows down the test against tty handles a little, from ~23.8-24.5µs to 23.4-25.0µs. |
Good to know it was a verified benchmark improvement. How did you measure it? Just curious; maybe I can do it as a routine. I didn't benchmark it myself, but that RtlNtErr2WinErr() function made me furious while single-stepping its asm code. It's a 0-65000
It's probably worse than
I have another patch somewhere that cuts down the number of win32_isatty() calls coming from the POSIX-y PerlIO .c code by an order of magnitude (90%), but I want to get this patch in first. It makes the win32_isatty() impl better regardless of the goodness or badness of whatever the caller frame's code is doing. The patch that removes
The Win32 Console APIs aren't known for being I/O speed demons. Plus waking up
I could've done Native API/asm style optimizations inside win32_isatty(), but I decided that is a bad idea. MS in the late Win 10/Win 11 era has done heavy refactoring on
Safer and easier and less thinking and less work to offload responsibility for all the shortcut tricks to
Remember MS specifically says Win RT Apps/MS Mobile App Store walled garden Apps, probably are forbidden from C linking against I'd have to double check, but I believe the MS's API Sets and
random facts: |
It wasn't anything too rigorous: tested 3 times each with blead and with your change. blead results (Ryzen 7 2700): with your change: Debian (i7-10700F): Debian (WSL2, Ryzen 7 2700, same hardware as Windows above) |
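For readers who want to reproduce per-call figures like the ones quoted above, here is a minimal sketch in C. It is not the harness used for the results in this thread; `usec_per_call` and `check_tty` are hypothetical names, and `clock()` is assumed to be a good enough timer for a rough comparison.

```c
#include <time.h>

#ifdef _WIN32
#include <io.h>
/* Hypothetical wrapper: the CRT's _isatty() stands in for the function under test. */
int check_tty(int fd) { return _isatty(fd); }
#else
#include <unistd.h>
/* Hypothetical wrapper: POSIX isatty() stands in for the function under test. */
int check_tty(int fd) { return isatty(fd); }
#endif

/* Average cost of fn(arg), in microseconds, over `iters` calls.
 * Returns 0.0 when iters is not positive. */
double usec_per_call(int (*fn)(int), int arg, long iters)
{
    volatile int sink = 0;   /* keeps the calls from being optimized away */
    clock_t start, end;
    long i;

    if (iters <= 0)
        return 0.0;

    start = clock();
    for (i = 0; i < iters; i++)
        sink += fn(arg);
    end = clock();

    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC * 1e6 / (double)iters;
}
```

A call such as `usec_per_call(check_tty, 0, 1000000)` gives one sample; as in the thread, repeating the run a few times and comparing the ranges is more informative than a single number, because console timings are noisy.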
I would merge this if the commit message is improved enough, but I object to merging it as-is. The code itself looks fine to me.
First, the commit message title is too long. GH was forced to wrap it, using ellipses. And what is there doesn't really help me understand what's happening. It is actually factually wrong: this commit doesn't stop calling any syscall. It inserts another syscall first and avoids calling the original one if the new one fails. It also assumes a more intimate knowledge of Windows internals than I possess, and I'm sure I'm not alone. "NT->WIN err conv is slow" is something whose meaning I can guess at, but it shouldn't be in a commit title.
Second, the commit message body is non-existent, and the comments refer the reader to the p.r. for details. The comments should not refer to an unspecified GH p.r. that someone would have to take steps to track down. It is ok to refer to the commit message that created them, but making a later reader go through the extra level of indirection is unacceptable.
Third, the p.r. description isn't very helpful. People reading this want to know what is changing and why. Starting off with a description of the internals of a Windows library function does not meet that need. I myself would not have included it, but if you feel that background is helpful to people more attuned to Windows internals than I and most of the people who will ever read the message, then it should be placed in a separate paragraph later. The p.r. description should be copied into the commit message in this case. It is also a non sequitur with respect to its title: its first sentence needs to expand on what the title says, and it currently doesn't.
What it looks like to me is that the commit basically detects many failures using a faster but incomplete syscall before falling back to the slow, complete one. But I wouldn't have figured that out from any of your descriptions. The "Let them read code" attitude is not a principle compatible with this project.
(Today I learned that Marie Antoinette did not say the similar phrase attributed to her; that claim was first made 50 years after her execution.) Writing a good commit message for anything but trivial changes takes some effort. We require that pull requests have had sufficient effort not just in the code but in its comments and descriptions. |
        return 0;
    }

    if (GetConsoleMode(fh, &mode))
And the description says there is a syscall that mostly fails. There is nothing that explains that statement, so we are left to guess about it.
The description is below. GetConsoleMode() only accepts CONSOLE handles: not file handles, not serial ports, not TCP/IP sockets, not process handles, not thread handles, not mutexes.
What Perl calls a Pack::age/HV*/stash, and every other language calls a class, also exists inside the NT kernel. This is a Pack::age/HV*/stash inside the NT kernel:
https://doxygen.reactos.org/d7/dca/ndk_2obtypes_8h_source.html#l00379
NT has 6 built-in classes that I am calling "I/O" objects. They accept create/read/write/seek/close/delete IOCTLs; Unix calls these things file descriptors.
https://doxygen.reactos.org/da/d7d/iomgr_8c_source.html#l00254
but unlike Unix, with its unreaped, memory-leaking zombie PID API, where by some bizarre design decision you can't select() on an EXT2/EXT3 file on a 5400 RPM disk, and where Linux/BSD/RIP Solaris still can't figure out how to do select() on a disk file after 20 years
https://www.upwind.io/feed/io_uring-linux-performance-boost-or-security-headache
my meatspace dev friend won an iouring bug bounty last year but s/he is not a public figure and is not googleable
https://doxygen.reactos.org/df/d04/cmsysini_8c_source.html#l00980 the registry classes register themselves with HV* and *main::`
https://doxygen.reactos.org/d1/d6e/ntoskrnl_2ex_2callback_8c_source.html#l00256 here comes the APC/C function pointer closure class (posix calls them signals)
https://doxygen.reactos.org/d4/deb/ntoskrnl_2ex_2event_8c_source.html#l00039 Now we get the Kernel and User mode "Event" object, so now a Ring 0 or 3 thread, can de-schedule itself off the CPU, unlike MSDOS 8086/286 era, where there is no way on earth to stop the CPU from sucking in machine code and executing that code
https://doxygen.reactos.org/de/d7a/ntoskrnl_2ex_2timer_8c_source.html#l00223 Now we have wall time objects!
https://doxygen.reactos.org/d9/d6e/win32k_8c_source.html#l00259 And welcome to the Windows in Windows, now we have a VGA/DP/DVI port and can show things to humans
Shells, terminals, TUIs, and GUIs are ridiculous concepts to bake into an OS's public API. Windows's late 1980s architecture made them end user plugins. In the original Windows NT kernel GUI design, in 3.1-3.51, the VGA driver/mouse/keyboard/screen/GUI was a ring 3 userland process that probably used kernel named pipes to talk to other processes and the VGA adapter. The performance of that design was so horrible that win32k.sys was invented in NT 4, and it still exists in Win 10, basically unmodified, to paint the GUI pixels. win32k.sys is the ONLY DRIVER/only disk file that is allowed to have a range of precious hardware/CPU syscall constants, specifically the x64 syscall instruction constants, or i386 interrupt 2Eh constants.
Every other kernel "Class", whether written by MS and burned into the kernel or shipped as a driver file written by the general public, must accept the rule that "IRP"s (I'm calling them asynchronous ioctl packets, or event queue packets) are the only way to communicate with userland. It's very organized. There is a second, discouraged way to talk to userland, which I believe has been banned forever by MS's signed kernel driver program, probably around the Win 8 or Win 10 era.
Windows Subsystem for Linux was technically impossible to build until all third-party Windows hardware or software vendors who write kernel drivers aged out, got banned by MS, left the market (no 64-bit drivers available), went bankrupt, or actually PAID humans to rewrite and recompile their Win NT kernel drivers for a USB webcam, or for a $1.99 Ethernet card with a Realtek chip from hell that an army of Linux HW devs haven't been able to get stable in 10 years.
WSL and unmodified ELF files executing on WinNT, only became possible, once ALL, and I mean ALL, lines of code, written by anyone, in the NT kernel, agreed to never ever ever again open a mmap portal to a process and inspect its address space for "known C structs at fixed userland virtual addresses".
This post is too long, I'll let someone else do the talking.




So let's go back to my PERL scripts on NT and this PERL.EXE program I have.
Why would passing any NT Handle number but a Console handle number into GetConsoleMode(), make GetConsoleMode() return TRUE/SUCCESS?
I can reverse the question, why are syscalls tell() and seek() returning -1 on my Linux VM for FD 1 between PERL and xterm?
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/tty/tty_io.c
Will the ext3 FS driver give me /usr/bin/perl's baud rate and ECHO and vertical tab delay?
help!!!! I'm a real estate lawyer who accidentally took on a murder trial client. What do I do now?
what is a TERMINAL HANGUP SLAVE in Linux? so not PC src code lol
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/tty/tty_io.c#n2306
What is the window size of /lib/strict.pm?
LETS FIND OUT!!!!
IM GOING TO SPEAK PENGUINESE FOR ONCE
Because most of P5P only speaks PENGUIN or TUX, maybe I can say what the problem with WINPERL is in PENGUIN.
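The Linux-side analogy in the questions above (asking a non-terminal for terminal properties) can be made concrete with a small POSIX sketch. It assumes a POSIX system with `TIOCGWINSZ` available; `window_size_errno` is a hypothetical helper name, not something from Perl or the kernel sources linked above.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <termios.h>
#include <unistd.h>

/* Ask the kernel for the terminal window size of fd.
 * Returns 0 if fd is a terminal that reports a size,
 * otherwise the errno set by ioctl(). */
int window_size_errno(int fd)
{
    struct winsize ws;

    if (ioctl(fd, TIOCGWINSZ, &ws) == 0)
        return 0;
    return errno;
}
```

For a pipe or a regular file such as /lib/strict.pm, the request is expected to fail with ENOTTY, because only the tty driver knows how to answer it. That is the same shape of failure as GetConsoleMode() being handed a handle that is not a console handle.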
The post above is not technically accurate. On Windows, Console Handles/Console File Descriptors are 100% fake, just userland Ring 3 magic tricks, exactly like WinPerl's pseudo-fork is 100% fake userland magic tricks. A Windows console handle is an illegal unaligned pointer ending with the digit 0x01. All Windows kernel handles are aligned U32 * offsets into an array somewhere in Ring 0. So a U32 ending with 0x00, 0x04, 0x08, or 0x0C is a real NT/POSIX kernel handle; anything ending with 0x01, 0x02, or 0x03 is illegal. Console handles in MS's public API always end with 0x01. Reverse engineering the Windows OS will show STDIN, STDOUT, and STDERR are https://en.wikipedia.org/wiki/WebSocket packets over a TCP/IP socket to another process. That's why it's so slow, and even TonyC proved it with benchmarks. The only way to know if a Console Handle (a U32 int between 0-4GB) is "real", or to learn its "buffer mode" or "code page", is over a TCP/IP socket to another process called csrss.exe that is a daemon/root privileges/dark magic.
This was excessive.
Your commit message (and the brain dump just above, much more so) goes into irrelevant technical detail, when it would be enough to indicate that GetConsoleMode() is slow even for non-console handles and that GetFileType() shortcuts that, and maybe to include a benchmark.
So your commit message might be something like:
win32_isatty(): only call expensive GetConsoleMode() for character devices
win32_isatty() is called from win32_read() so it is called often and often called on
non-TTY handles, so performance for non-TTY handles is important.
GetConsoleMode() is expensive even for non-character handles, but we can use
GetFileType() to cheaply distinguish the common case of non-character device
handles from character device handles, since only character devices can be TTYs.
For a rough benchmark, this improved performance from roughly 1.24µs per call
to 0.39µs per call on a non-character handle on a Ryzen 7 2700 Windows 10 system.
Calls for console device handles were slightly slowed, from 24.0µs per call to 24.5µs
per call, though these results were fairly noisy.
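The approach described in the suggested commit message can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: `is_console_fd` is a hypothetical name, the Windows branch uses only the documented `_get_osfhandle()`, `GetFileType()` and `GetConsoleMode()` calls, and the non-Windows branch is an assumed POSIX analogue (`fstat()` then `isatty()`) included so the same idea can be tried on Linux.

```c
#ifdef _WIN32
#include <windows.h>
#include <io.h>

/* Returns 1 if fd refers to a console, 0 otherwise.
 * Assumes fd is either negative or a valid CRT descriptor. */
int is_console_fd(int fd)
{
    HANDLE h;
    DWORD mode;

    if (fd < 0)
        return 0;
    h = (HANDLE)_get_osfhandle(fd);
    if (h == INVALID_HANDLE_VALUE)
        return 0;
    /* Cheap filter: disk files, pipes and sockets are not character
     * devices, so they can never be consoles. */
    if (GetFileType(h) != FILE_TYPE_CHAR)
        return 0;
    /* Expensive check, reached only for character devices. It still
     * rejects non-console character devices such as NUL. */
    return GetConsoleMode(h, &mode) ? 1 : 0;
}
#else
#include <sys/stat.h>
#include <unistd.h>

/* POSIX analogue of the same two-step test. */
int is_console_fd(int fd)
{
    struct stat st;

    /* Cheap filter: only character devices can be terminals. */
    if (fstat(fd, &st) != 0 || !S_ISCHR(st.st_mode))
        return 0;
    /* Full check: /dev/null is a character device but not a tty. */
    return isatty(fd) ? 1 : 0;
}
#endif
```

The ordering matters only for speed: the result is the same as calling the full check alone, but the common case of a file or pipe handle returns after the cheap type query.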
Interesting callstack I happened to see. WinPerl's |
The UCRT just checks a flag (and that flag only indicates a character device from what I can see) |




