
Fix percent-encoded drive letter colon (%3A) in Windows file URIs #11293

Open
ActaVerba wants to merge 1 commit into microsoft:main from ActaVerba:fix/dos-path-percent-encoded-colon

@ActaVerba

Summary

Some LSP clients send file URIs with the drive letter colon percent-encoded as %3A (e.g., file:///c%3A/Users/...). The existing _dosPathRegex only matches a literal colon, so getFilePath() fails to strip the leading slash and decode the path correctly on Windows.

This results in invalid file paths that break path resolution, particularly when using --threads for parallel type checking (where URIs are serialized between worker threads).

Changes

  • Updated _dosPathRegex to also match %3A/%3a after the drive letter: /^\/[a-zA-Z](?::|%3[aA])\//
  • Added decode step to convert %3A/%3a back to : after stripping the leading slash
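The two changes above can be sketched in isolation as follows. This is a minimal illustration, not the actual pyright code: the pattern is the one quoted in the first bullet, while `normalizeDosPath` and its decode expression are hypothetical names and details chosen for the example.

```typescript
// Pattern quoted in the PR description: drive letter followed by a literal
// colon or a percent-encoded colon (%3A / %3a), then a slash.
const dosPathRegex = /^\/[a-zA-Z](?::|%3[aA])\//;

// Hypothetical helper (not pyright's getFilePath): takes the path component
// of a file URI, e.g. "/c%3A/Users/project/file.py", and returns a DOS path.
function normalizeDosPath(uriPath: string): string {
    if (!dosPathRegex.test(uriPath)) {
        // Not a drive-letter path; leave it untouched.
        return uriPath;
    }
    // Strip the leading slash: "/c%3A/..." becomes "c%3A/...".
    const withoutSlash = uriPath.slice(1);
    // Decode only the drive colon, which sits directly after the drive letter.
    return withoutSlash.replace(/^([a-zA-Z])%3[aA]/, '$1:');
}

const encodedResult = normalizeDosPath('/c%3A/Users/project/file.py');
const literalResult = normalizeDosPath('/c:/Users/project/file.py');
const posixResult = normalizeDosPath('/usr/lib/file.py');
```

Anchoring the decode to the start of the string keeps any later %3A sequences in the path untouched, which matches the narrow scope described in the bullets.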

Test

Before fix:

  • file:///c%3A/Users/project/file.py → /c%3A/Users/project/file.py (broken path)

After fix:

  • file:///c%3A/Users/project/file.py → c:/Users/project/file.py (correct path)

Verified with pyright --threads auto on a real Windows project — paths resolve correctly.
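The before/after behavior can also be checked at the regex level. In the sketch below, `literalColonOnly` is an assumed shape for the previous pattern (the PR only says it matched literal colons), and `colonOrEncoded` is the pattern quoted under Changes; the sample paths are illustrative.

```typescript
// Assumption: the old pattern accepted only a literal colon after the drive letter.
const literalColonOnly = /^\/[a-zA-Z]:\//;
// Pattern quoted in the PR description.
const colonOrEncoded = /^\/[a-zA-Z](?::|%3[aA])\//;

// Illustrative URI path components: literal colon, encoded colon (both hex
// cases), and a non-Windows path that neither pattern should match.
const samples = [
    '/c:/Users/project/file.py',
    '/c%3A/Users/project/file.py',
    '/c%3a/Users/project/file.py',
    '/usr/lib/file.py',
];

// Record whether each pattern recognizes the path as a drive-letter path.
const results = samples.map((path) => ({
    path,
    before: literalColonOnly.test(path),
    after: colonOrEncoded.test(path),
}));

for (const r of results) {
    console.log(`${r.path}  before=${r.before}  after=${r.after}`);
}
```

Cases like these would translate fairly directly into the kind of URI unit tests that guard against regressions.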

@rchiodo
Collaborator

rchiodo commented Feb 23, 2026

Can you add some unit tests for your scenario to make sure this doesn't regress again? There should be a set of unit tests for URIs.

@edvilme

edvilme commented Feb 24, 2026

I wonder if there are other characters being encoded the same way, and if an approach other than string replacement would exist

@rchiodo
Copy link
Collaborator

rchiodo commented Feb 24, 2026

> I wonder if there are other characters being encoded the same way, and if an approach other than string replacement would exist

This is the only character we're looking for. It's specifically for finding drives on Windows. I suppose it's possible it could be translated a different way, but I've not seen that before.
