- 
                Notifications
    You must be signed in to change notification settings 
- Fork 8k
          Fix GH-18753: file_get_contents() and file_put_contents() fail with data >=2GB on macOS & BSD
          #18755
        
          New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: PHP-8.3
Are you sure you want to change the base?
Conversation
| Please target the  | 
…h data >=2GB on macOS & BSD
| I just realized the problem also affects  | 
file_get_contents() and file_put_contents() fail with data >=2GB on macOS & BSD
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little surprising that they do limit it to INT_MAX, but it is documented (at least in FreeBSD), nor is there an interface that lets you do larger (other than switching to kqueue for everything?). Weird.
| #endif | ||
|  | ||
| #if defined(PHP_WIN32) | ||
| #if defined(PHP_WIN32) || defined(__APPLE__) || defined(__MACH__) || defined(BSD) || defined(__DragonFly__) || defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to do a configure-time check for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be just better to invert and allow it only for Linux
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really sure if Solaris can handle it either (can't be bother to check) so safest option should be to just do #ifdef __linux__ and invert the logic...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would probably be salient to also add a comment explaining why then. (the BSD and Windows cause should explain pretty much all cases).
(FWIW, AIX does seem to allow reads as big as off_t on 64-bit systems...)
| #endif | ||
|  | ||
| #if defined(PHP_WIN32) | ||
| #if defined(PHP_WIN32) || defined(__APPLE__) || defined(__MACH__) || defined(BSD) || defined(__DragonFly__) || defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be just better to invert and allow it only for Linux
| --FILE-- | ||
| <?php | ||
| echo "-- file_put_contents() --\n"; | ||
| echo file_put_contents('bigfile', str_repeat('a', 2 ** 31)); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it's quite a lot memory, I wonder if we should add some skip if there is not enough system memory. But it needs to work on Mac to be actually useful and not like https://github.com/php/php-src/blob/91becb304286942337ad65ec614cc0d1a3f79cd6/ext/standard/tests/file/file_get_contents_file_put_contents_5gb.phpt which is actually the reason why we didn't notice this (that test is always skipped on OSX and not sure if it works on FreeBSD either.
| --FILE-- | ||
| <?php | ||
| echo "-- file_put_contents() --\n"; | ||
| echo file_put_contents('bigfile', str_repeat('a', 2 ** 31)); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name needs to be test specific - something like gh18753.data for example
| #endif | ||
|  | ||
| #if defined(PHP_WIN32) | ||
| #if defined(PHP_WIN32) || defined(__APPLE__) || defined(__MACH__) || defined(BSD) || defined(__DragonFly__) || defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really sure if Solaris can handle it either (can't be bother to check) so safest option should be to just do #ifdef __linux__ and invert the logic...
| I did bother to try on illumos yesterday, I forgot to tell but it works without the change. However I can t vouch for solaris which starts to be more and more behind as the time goes so I would be in favor to do what @bukka said, much simpler too. | 
Problem
On Linux
read(3)syscall returns as many bytes as possible, up to requested maximum. Darwin/XNU and BSD kernels instead immediately returnsEINVALif the requested value is larger than nativeINT_MAX. This causesfile_get_contents()to fail with files larger than 2147483647 bytes.Fix
This patch applies the same clamping of buffer/read chunk to 2GB as previously on Windows. This fixes the
file_get_contents(). There's still a way to trigger the problem usingfread()withstream_set_read_buffer()set to 0/unbuffered, but I think this is more of an expected behavior. After this patch the later returns withEBADFinstead ofEINVAL.Affected version
The bug stems affects 8.2 onwards d/t 6beee1a from #8547. However, as this is not a security issue I'm targeting 8.3 as it's latest actively supported version.