hasufell
Member

@hasufell hasufell commented Mar 8, 2025

No description provided.

@hasufell hasufell force-pushed the long-paths branch 12 times, most recently from a3ec45b to cb96457 Compare March 8, 2025 12:01
-- | Open a file and return the 'Handle'.
openFile :: WindowsPath -> IOMode -> IO Handle
openFile fp iomode = bracketOnError
openFile fp' iomode = (`ioeSetWsPath` fp') `modifyIOError` do

Looks like this function is going to have different semantics than the base one even with this PR included.

As far as I can see, base's path-to-Handle counterpart for Windows is defined at https://github.com/ghc/ghc/blob/fd40eaa17c6ce8716ec2eacc95beae194a935352/libraries/ghc-internal/src/GHC/Internal/IO/Windows/Handle.hsc#L853.

There are still going to be extra things that base does and this package doesn't, such as some sort of file access optimization (I have no idea what it does, though): https://github.com/ghc/ghc/blob/fd40eaa17c6ce8716ec2eacc95beae194a935352/libraries/ghc-internal/src/GHC/Internal/IO/Windows/Handle.hsc#L875.

There's also handling of locked files that this package doesn't do as far as I can see, though I haven't checked the code above the openFile function, so maybe there's something somewhere. Locking is checked after the aforementioned optimizations: https://github.com/ghc/ghc/blob/fd40eaa17c6ce8716ec2eacc95beae194a935352/libraries/ghc-internal/src/GHC/Internal/IO/Windows/Handle.hsc#L877.

There's also an attempt at truncation if we're overwriting a file: https://github.com/ghc/ghc/blob/fd40eaa17c6ce8716ec2eacc95beae194a935352/libraries/ghc-internal/src/GHC/Internal/IO/Windows/Handle.hsc#L890.

All in all this would mean that file-io is not a drop-in replacement for what base does and someone may be able to observe the difference by switching to file-io just like with long paths.

Member Author

> All in all this would mean that file-io is not a drop-in replacement for what base does and someone may be able to observe the difference by switching to file-io just like with long paths.

Honestly, I don't feel like I want this to be a strict drop-in replacement, especially since this package uses proper Win32 functionality and there is no posix emulation layer involved. Even with the native winIO manager in GHC, there may be subtle differences.

Long path support is definitely something major, so supporting it makes sense.

For other invariants, I think we'd need to find a test that demonstrates different behavior first.
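
As a starting point for such a test, here is a minimal sketch that probes one of the behaviors mentioned above, using only base and directory. It checks that base's openFile rejects a second writer on the same file (the single-writer locking). The name secondWriterRejected and the temp-file name are made up for illustration; running the analogous probe against this package's openFile and comparing the results is left as an assumption, not something verified here.

```haskell
import Control.Exception (IOException, try)
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- Hypothetical probe: does opening the same file for writing twice fail?
-- With base's openFile the second open is expected to throw
-- ("resource busy (file is locked)").
secondWriterRejected :: IO Bool
secondWriterRejected = do
  tmp <- getTemporaryDirectory
  (path, h0) <- openTempFile tmp "lock-probe.txt"
  hClose h0
  h1 <- openFile path WriteMode
  r <- try (openFile path WriteMode) :: IO (Either IOException Handle)
  rejected <- case r of
    Left _   -> pure True                 -- second writer was refused
    Right h2 -> hClose h2 >> pure False   -- second writer was allowed
  hClose h1
  removeFile path
  pure rejected
```

If the same probe returned False when written against this package, that would be a concrete, testable difference to discuss.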

Member Author

@hasufell hasufell Mar 10, 2025

Truncation is already tested here when writing over an existing file:

file-io/tests/Properties.hs

Lines 196 to 205 in e2b5ebc

existingFile2' :: Assertion
existingFile2' = do
  withSystemTempDirectory "test" $ \baseDir' -> do
    baseDir <- OSP.encodeFS baseDir'
    let fp = baseDir </> [osp|foo|]
    OSP.writeFile fp "test"
    r <- try @IOException $ do
      OSP.openExistingFile fp WriteMode >>= \h -> BS.hPut h "boo" >> hClose h
      OSP.readFile (baseDir </> [osp|foo|])
    Right "boo" @=? r

And it seems correct to me, as we set it to truncate here:

WriteMode -> Win32.tRUNCATE_EXISTING
AppendMode -> Win32.oPEN_EXISTING

rightOrError (Right a) = a

-- inlined stuff from directory package
furnishPath :: WindowsPath -> IO WindowsPath

@sergv sergv Mar 9, 2025

A good way to implement furnishPath might be to reuse what base does via the C function here: https://github.com/ghc/ghc/blob/fd40eaa17c6ce8716ec2eacc95beae194a935352/libraries/ghc-internal/src/GHC/Internal/IO/Windows/Paths.hs.

Apart from matching semantics, it would mean less copying around. The only drawback is that someone would need to check with each GHC release whether the foreign function still has the same name.

Otherwise all those conversions below to and from lists of characters somewhat undermine the cool idea of sticking with byte arrays.

isPathRegular path =
  not ('/' `elem` (WS.toChar <$> path) ||
       ws "." `elem` WS.splitDirectories (WS.pack path) ||
       ws ".." `elem` WS.splitDirectories (WS.pack path))

I suggest changing

           ws "." `elem` WS.splitDirectories (WS.pack path) ||
           ws ".." `elem` WS.splitDirectories (WS.pack path))

to

           any (\x -> x == ws ".." || x == ws ".") (WS.splitDirectories (WS.pack path)))

so that the path is split only once.
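
To make the shape of that change concrete, here is a self-contained sketch on plain String paths, using System.FilePath.Windows from the filepath package instead of the WS helpers in the diff. The String-based signature and the isDotComponent helper are assumptions for illustration only; the actual code works on WindowsPath.

```haskell
import qualified System.FilePath.Windows as W

-- Sketch: a path is "regular" if it contains no forward slashes and
-- no "." or ".." components. The path is split only once.
isPathRegular :: FilePath -> Bool
isPathRegular path =
  not ('/' `elem` path || any isDotComponent (W.splitDirectories path))
  where
    -- Hypothetical helper, not part of the PR.
    isDotComponent c = c == "." || c == ".."
```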

fromExtendedLengthPath ePath =
case WS.unpack ePath of
c1 : c2 : c3 : c4 : path
| (WS.toChar <$> [c1, c2, c3, c4]) == "\\\\?\\" ->

Isn't there a way to leverage ShortByteString's isPrefixOf here, instead of unpacking to a list?
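
A rough sketch of what I mean, written against Data.ByteString.Short directly (isPrefixOf there needs bytestring >= 0.11.3). It assumes the path is stored as UTF-16LE bytes; utf16leAscii and stripExtendedPrefix are made-up names, and this only covers the plain `\\?\` prefix check, not the rest of fromExtendedLengthPath.

```haskell
import qualified Data.ByteString.Short as SBS
import Data.Char (ord)

-- Encode an ASCII-only string as UTF-16LE bytes (illustration only).
utf16leAscii :: String -> SBS.ShortByteString
utf16leAscii = SBS.pack . concatMap (\c -> [fromIntegral (ord c), 0])

-- The extended-length prefix \\?\ as UTF-16LE bytes.
extendedPrefix :: SBS.ShortByteString
extendedPrefix = utf16leAscii "\\\\?\\"

-- Drop the prefix if present, without going through a list of characters.
-- The comparison starts at offset 0 and the prefix has an even byte
-- length, so it stays aligned to UTF-16 code units.
stripExtendedPrefix :: SBS.ShortByteString -> Maybe SBS.ShortByteString
stripExtendedPrefix p
  | extendedPrefix `SBS.isPrefixOf` p =
      Just (SBS.drop (SBS.length extendedPrefix) p)
  | otherwise = Nothing
```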

then simplifiedPath
else
case WS.toChar <$> simplifiedPath' of
'\\' : '?' : '?' : '\\' : _ -> simplifiedPath

Isn't there a way to leverage ShortByteString's isPrefixOf here and below?
