kikairoya
Contributor

@kikairoya kikairoya commented Oct 17, 2025

lit.util.mkdir and lit.util.mkdir_p were written during the Python 2.x era.
Since modern pathlib functions have similar functionality, we can simply use those instead.

Background:

On Cygwin, a file named file_name.exe can be accessed without the suffix, simply as file_name, as shown below:

$ echo > file_name.exe

$ file file_name.exe
file_name.exe: very short file (no magic)

$ file file_name
file_name: very short file (no magic)

In this situation, while running mkdir file_name works as intended, checking for the existence of the target before calling mkdir incorrectly reports that it already exists and thus skips the directory creation.

$ test -e file_name && echo exists
exists

$ mkdir file_name && echo ok
ok

$ file file_name
file_name: directory

Therefore, the existence pre-check should be skipped on Cygwin. Rather than adding a Cygwin-specific workaround, this PR refactors both helpers.
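
To make the failure mode concrete, here is a minimal sketch. It is not the actual lit code: `mkdir_p_with_precheck` and `mkdir_p_no_precheck` are hypothetical stand-ins, and the Cygwin false positive is only simulated by injecting an `exists` predicate that always answers "yes".

```python
import os
import tempfile


def mkdir_p_with_precheck(path, exists=os.path.exists):
    """Simplified stand-in for the old helper: skip creation if `exists` says so."""
    if not path or exists(path):
        return
    os.makedirs(path)


def mkdir_p_no_precheck(path):
    """Let the OS decide; an already existing directory is not treated as an error."""
    os.makedirs(path, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "file_name")

    # Simulated false positive: the check claims the target already exists.
    mkdir_p_with_precheck(target, exists=lambda p: True)
    skipped = not os.path.isdir(target)

    # Without the pre-check, the directory is created.
    mkdir_p_no_precheck(target)
    created = os.path.isdir(target)
```

In this simulation the pre-checked variant silently does nothing, which matches the symptom described above, while the variant without the pre-check creates the directory.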

@llvmbot
Member

llvmbot commented Oct 17, 2025

@llvm/pr-subscribers-testing-tools

Author: Tomohiro Kashiwada (kikairoya)

Changes

On Cygwin, a file named file_name.exe can be accessed without the suffix, simply as file_name, as shown below:

$ echo > file_name.exe

$ file file_name.exe
file_name.exe: very short file (no magic)

$ file file_name
file_name: very short file (no magic)

In this situation, while running mkdir file_name works as intended, checking for the existence of the target before calling mkdir incorrectly reports that it already exists and thus skips the directory creation.

$ test -e file_name && echo exists
exists

$ mkdir file_name && echo ok
ok

$ file file_name
file_name: directory

Therefore, the existence pre-check should be skipped on Cygwin. If the target actually already exists, such an error will be ignored anyway.


Full diff: https://github.com/llvm/llvm-project/pull/163948.diff

1 Files Affected:

  • (modified) llvm/utils/lit/lit/util.py (+1-1)
diff --git a/llvm/utils/lit/lit/util.py b/llvm/utils/lit/lit/util.py
index ce4c3c2df3436..a5181ab20a7e1 100644
--- a/llvm/utils/lit/lit/util.py
+++ b/llvm/utils/lit/lit/util.py
@@ -164,7 +164,7 @@ def mkdir(path):
 def mkdir_p(path):
     """mkdir_p(path) - Make the "path" directory, if it does not exist; this
     will also make directories for any missing parent directories."""
-    if not path or os.path.exists(path):
+    if not path or (sys.platform != "cygwin" and os.path.exists(path)):
         return
 
     parent = os.path.dirname(path)

@kikairoya
Contributor Author

cc: @jeremyd2019 @mstorsjo

@arichardson
Member

So does that mean any use of os.path.exists() is broken on cygwin? Maybe we could change it to is_dir?

Member

@arichardson arichardson left a comment

Or maybe now that we depend on newer python we can just use the exist_ok=True parameter for mkdir?

@kikairoya
Contributor Author

> So does that mean any use of os.path.exists() is broken on cygwin? Maybe we could change it to is_dir?

I think, in general, most of such checks should be avoided (cf. TOCTOU).

> Or maybe now that we depend on newer python we can just use the exist_ok=True parameter for mkdir?

Sure. I'll make a change to use it.
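
As an illustration of the TOCTOU point, a hand-written "attempt first, inspect the error afterwards" version might look like the sketch below. `ensure_dir` is a hypothetical name used only here; it is not part of lit.

```python
import os
import tempfile


def ensure_dir(path):
    """Create `path`, tolerating only the case where it is already a directory."""
    try:
        os.mkdir(path)
    except FileExistsError:
        # Another process may have created the directory first; that is fine.
        # Anything else occupying the name is still reported as an error.
        if not os.path.isdir(path):
            raise


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "out")
    ensure_dir(target)
    ensure_dir(target)  # second call hits the FileExistsError branch and returns
    target_is_dir = os.path.isdir(target)
```

Because no decision is made before the `os.mkdir` call, there is no window in which a stale existence check can lead to a skipped creation.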

On Cygwin, a file named `file_name.exe` can be accessed without the suffix,
simply as `file_name`, as shown below:

```
$ echo > file_name.exe

$ file file_name.exe
file_name.exe: very short file (no magic)

$ file file_name
file_name: very short file (no magic)
```

In this situation, while running `mkdir file_name` works as intended,
checking for the existence of the target before calling `mkdir`
incorrectly reports that it already exists and thus skips the directory creation.

```
$ test -e file_name && echo exists
exists

$ mkdir file_name && echo ok
ok

$ file file_name
file_name: directory
```

Therefore, the existence pre-check should be skipped on Cygwin.
If the target actually already exists, such an error will be ignored anyway.
@kikairoya
Contributor Author

mkdir_p now simply forwards to os.makedirs.
Should it be inlined? It is only called from two locations in TestRunner.py.

mkdir can also be inlined, as it is now called from only one location, but I'm not sure whether the call to CreateDirectoryW can be replaced with a plain os.mkdir.

mkdir_p(parent)

mkdir(path)
os.makedirs(path, exist_ok=True)
Member

This looks much simpler, nice. Not a Windows expert but do we need to remap long paths using something like this?

Suggested change
os.makedirs(path, exist_ok=True)
if platform.system() == "Windows":
    # A raw string cannot end in a backslash, so the prefix uses escapes.
    if not path.startswith("\\\\?\\"):
        path = "\\\\?\\" + path
os.makedirs(path, exist_ok=True)

Member

Not sure - do our Python scripts do such path mangling anywhere else?

Member

They do in the def mkdir(path): above.

It sounds like Python 3.6 supports long paths if they are enabled in the registry, so there should be no need to work around it here anymore:
https://bugs.python.org/issue27731 -> https://hg.python.org/cpython/rev/26601191b368
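
For reference only, the kind of remapping discussed in this thread could be sketched as a pure string helper like the one below. `to_extended_length` and `EXTENDED_PREFIX` are hypothetical names, UNC and relative paths are deliberately left untouched, and this is not what the existing `mkdir` helper does verbatim.

```python
import ntpath

# The four-character prefix \\?\ ; a raw string cannot end in a backslash,
# so it is spelled with escaped backslashes instead.
EXTENDED_PREFIX = "\\\\?\\"


def to_extended_length(path):
    """Hypothetical helper: add the extended-length prefix to a drive path."""
    if path.startswith(EXTENDED_PREFIX):
        return path  # already prefixed
    if path.startswith("\\\\") or not ntpath.isabs(path):
        return path  # UNC and relative paths are left alone in this sketch
    # Prefixed paths are not normalized by Windows, so collapse ".." up front.
    return EXTENDED_PREFIX + ntpath.normpath(path)


prefixed = to_extended_length("C:\\build\\..\\out\\dir")
```

Since `ntpath` is importable on every platform, the helper can be exercised outside Windows; whether such a prefix is needed at all depends on the long-path setting mentioned above.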

Member

Oh, I see.

@arichardson
Member

From what I've read, pathlib.Path handles long Windows paths correctly, so maybe the best solution would be to migrate callers of these helpers towards pathlib?

@kikairoya
Contributor Author

OK, I'll try to replace both mkdir and mkdir_p with direct calls to the pathlib functions.
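
A rough sketch of what such direct calls could look like is shown below. The paths are placeholders and this is not the final diff; it only shows the two `pathlib.Path.mkdir` forms that correspond to the old helpers.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Counterpart of mkdir_p: create missing parents, tolerate an existing directory.
    nested = Path(tmp) / "a" / "b" / "c"
    nested.mkdir(parents=True, exist_ok=True)
    nested.mkdir(parents=True, exist_ok=True)  # repeating the call is harmless

    # Counterpart of mkdir: single level only, parent must already exist.
    single = Path(tmp) / "d"
    single.mkdir(exist_ok=True)

    results = (nested.is_dir(), single.is_dir())
```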

    os.mkdir(path)
except OSError:
    e = sys.exc_info()[1]
    # ignore EEXIST, which may occur during a race condition
Contributor Author

Previously, all EEXIST errors were ignored.
We should indeed ignore races between concurrent mkdir targets, but I don't see any reason to allow races between touch target and mkdir target.
Since pathlib's exist_ok=True behaves this way, the test shtest-glob.py has been updated to reflect the new behavior.
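
The distinction can be checked with a small standalone sketch (file and directory names are placeholders): with `exist_ok=True`, an existing directory is accepted, while an existing regular file under the same name still raises `FileExistsError`.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # mkdir racing with mkdir: the second call is silently accepted.
    existing_dir = Path(tmp) / "example_dir"
    existing_dir.mkdir()
    existing_dir.mkdir(exist_ok=True)

    # touch racing with mkdir: the name is taken by a regular file.
    existing_file = Path(tmp) / "example_file"
    existing_file.touch()
    try:
        existing_file.mkdir(exist_ok=True)
        file_case = "no error"
    except FileExistsError:
        file_case = "FileExistsError"
```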

@kikairoya kikairoya changed the title from "[LIT][Cygwin] Skip pre-check for existence in mkdir-p" to "[LIT] replace lit.util.mkdir with pathlib.Path.mkdir" on Oct 18, 2025
Contributor

@RoboTux RoboTux left a comment

LGTM otherwise

# RUN: mkdir %S/example_dir1.new
# RUN: mkdir %S/example_dir2.new

## This mkdir should succeed (so RUN should fail) because the `example_dir*.new`s are directories already exist.
Contributor

Nit: remove `are`
