dotnet-suggest: Don't duplicate registration entries, don't return false matches, and be more careful about casing

dotnet-suggest is pretty simple in its handling of registrations.

It's just a utf-8 text file with one executable path per line that gets scanned line by line from start to end when searching, returning the last "match" found (which is also just a `StartsWith` match, and thus can return false positives), and which blindly appends to the file on registration, so long as the provided path exists. It also performs its search explicitly as `StringComparison.OrdinalIgnoreCase`.

These have a few undesirable consequences:
 1. Entries are duplicated if the same value is given to `dotnet-suggest register --command-path <command-path>`
 2. The search for a match continues even after a match is found.
 3. False matches can be returned, such as if there is a registration for `/path/to/file` and `/path/to/fileThatIsNotTheOneYouWant`, which will return the ***last*** of those two entries, even if it is `/path/to/fileThatIsNotTheOneYouWant` but it was searching for `/path/to/file`.
 4. The `StringComparison.OrdinalIgnoreCase` string comparison can result in invalid completions on case-sensitive file systems, which could lead to failure to launch the target to retrieve the completions for it, if the command line text does not match the case of the actual executable but does match the case of the entry in the list. Further, if two files differ in name only by case (which is also legal on Windows, BTW), or if you have a directory structure that looks like the following example, false matches can also occur:

```
/a/b    <-- This is a file named b in directory /a
/a/B/c  <-- This is a file named c in directory /a/B/
```
Searching for /a/b will, with the current code, return the second entry as the match, due to both the case-insensitive search and the last-one-wins match strategy.

(1) can be avoided at registration time by just making sure the entry doesn't already exist. It doesn't really matter if the registration operation is made more expensive by doing so, because registration isn't being done frequently, and isn't being done in a context where a brief delay to perform that scan is objectionable.

(2) and (3) can be resolved by returning as soon as a match is found.

It seems to me that it is most important for it to be optimized around the search case, since that is what gets invoked more frequently and in a context wherein the user expects as little delay as possible.

(4) just needs to either have an OSPlatform guard to perform case-sensitive searches on not-windows and case-insensitive searches on Windows, or else simply _always_ perform `StringComparison.Ordinal` searches on all platforms.[^SensitiveSubject]

There are plenty of other pretty low-complexity ways to make it even smarter than the above simplistic suggestions, such as maintaining the list in sorted order when registering and then performing searches by binary search, and/or by using fixed-width lines to make seeking within the file possible without having to read the whole thing into memory (which would make doing a binary search even more efficient for example). But even just de-duplicating it and returning the first prefix match rather than scanning the entire file and then returning the last prefix match indiscriminately would be improvements over the current implementation, and actually result in fewer lines of code than are currently there, since that temporary string variable named `completionTarget` in `FileSuggestionRegistration.FindRegistration()` would go away entirely, along with the extra check for that variable being null that is after the loop.

[^SensitiveSubject]: Windows _is_ case-aware and file naming _is_ case sensitive in Windows, so this would be the most _technically_ correct (the best kind), though of course there is the obvious argument that, since everything _else_ about Windows is case-lenient, enforcing case-sensitivity could be confusing, due to user expectations.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

dotnet-suggest: Don't duplicate registration entries, don't return false matches, and be more careful about casing #2749

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

dotnet-suggest: Don't duplicate registration entries, don't return false matches, and be more careful about casing #2749

Description

Footnotes

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions