
fix(printf): redesign %q implementation for gnu compatibility #9640

Open
naoNao89 wants to merge 1 commit into uutils:main from naoNao89:fix-printf-q-9638


@naoNao89
Contributor

Fixes #9638

Redesigned the printf %q implementation to match bash behavior. Previous approach incorrectly used SHELL_ESCAPE (designed for ls). Created dedicated PrintfQuoter with the following algorithm: empty string → '', simple string → unchanged, metacharacters → backslash-escaped, control characters → $'...'. Includes 18 tests and a related apostrophe bug fix.
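The four rules in the description can be sketched roughly as follows. This is a minimal illustration, not the PR's PrintfQuoter: the function name `quote_bash_style`, the set of characters treated as "simple" (`_-./` plus alphanumerics), and the choice of octal escapes for bytes other than tab, newline, and carriage return are all assumptions.

```rust
/// Hypothetical sketch of the rules listed in the PR description.
/// Not the actual PrintfQuoter code.
fn quote_bash_style(input: &[u8]) -> String {
    // Rule 1: the empty string becomes ''.
    if input.is_empty() {
        return "''".to_string();
    }

    // Rule 4: if any byte is a control or non-ASCII byte, wrap the
    // whole string in $'...' (the bash-like behavior).
    let needs_dollar = input.iter().any(|b| b.is_ascii_control() || !b.is_ascii());
    if needs_dollar {
        let mut out = String::from("$'");
        for &b in input {
            match b {
                b'\'' => out.push_str("\\'"),
                b'\\' => out.push_str("\\\\"),
                b'\t' => out.push_str("\\t"),
                b'\n' => out.push_str("\\n"),
                b'\r' => out.push_str("\\r"),
                // Everything else that is not printable ASCII: 3-digit octal.
                _ if b.is_ascii_control() || !b.is_ascii() => {
                    out.push_str(&format!("\\{:03o}", b))
                }
                _ => out.push(b as char),
            }
        }
        out.push('\'');
        return out;
    }

    // Rules 2 and 3: printable ASCII only. "Simple" characters pass
    // through unchanged; anything else gets a backslash. The safe set
    // here is deliberately conservative (an assumption).
    let mut out = String::new();
    for &b in input {
        let c = b as char;
        if !(c.is_ascii_alphanumeric() || "_-./".contains(c)) {
            out.push('\\');
        }
        out.push(c);
    }
    out
}
```

With this sketch, `quote_bash_style(b"\x01'\x01")` yields `$'\001\'\001'`, the same shape as the bash builtin output discussed later in the thread.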

@codspeed-hq

codspeed-hq bot commented Dec 12, 2025

Merging this PR will improve performance by 3.7%

⚡ 1 improved benchmark
✅ 287 untouched benchmarks
⏩ 38 skipped benchmarks [1]

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation cp_large_file[16] 393.3 µs 379.3 µs +3.7%

Comparing naoNao89:fix-printf-q-9638 (afb3f3b) with main (bd58575)


Footnotes

  1. 38 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@github-actions

GNU testsuite comparison:

GNU test failed: tests/printf/printf-quote. tests/printf/printf-quote is passing on 'main'. Maybe you have to rebase?

@naoNao89 naoNao89 force-pushed the fix-printf-q-9638 branch 2 times, most recently from 04050fa to ac19251 on December 12, 2025 01:42
@github-actions

GNU testsuite comparison:

GNU test failed: tests/printf/printf-quote. tests/printf/printf-quote is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)

@oech3
Contributor

oech3 commented Dec 12, 2025

Please ask GNU before opening/merging this. bash is not coreutils.

$ bash-builtin-printf "%q\n" $'\001'\'$'\001'
$'\001\'\001'
$ uu-printf "%q\n" $'\001'\'$'\001'
"'$'\001''''$'\001"
$ archlinux-gnu-printf "%q\n" $'\001'\'$'\001'
''$'\001'\'''$'\001'

@collinfunk

@naoNao89
Contributor Author

💀

@naoNao89
Contributor Author

naoNao89 commented Dec 12, 2025

# bash 5.2.37's builtin printf
prompt$ printf "%q\n" $'\001'\'$'\001'
$'\001\'\001'
prompt$ printf "%s" $'\001\'\001' | od -t x1
0000000 01 27 01
0000003
# GNU Coreutils current main HEAD's printf
prompt$ ./src/printf "%q\n" $'\001'\'$'\001'
''$'\001'\'''$'\001'
prompt$ printf "%s" ''$'\001'\'''$'\001' | od -t x1
0000000 01 27 01
0000003
# uutils coreutils release build's printf
prompt$ ./target/release/coreutils printf "%q\n" $'\001'\'$'\001'
$'\001\'\001'
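The od output above shows that the bash and GNU forms resolve to the same three bytes, which is what "technically correct" means here. A rough way to check that equivalence without a shell is a tiny decoder for the quoting constructs involved. The sketch below is an assumption-laden illustration: `decode_shell_word` is a hypothetical helper that understands only `'...'`, `$'...'` with octal escapes plus `\'` and `\\`, and backslash escapes outside quotes. It is not a full shell parser.

```rust
/// Hypothetical helper: decode a single shell word using only the
/// quoting forms that appear in this thread. Named escapes such as
/// \t and double quotes are intentionally not handled.
fn decode_shell_word(word: &str) -> Vec<u8> {
    let bytes = word.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            // $'...' (ANSI-C style quoting)
            b'$' if bytes.get(i + 1) == Some(&b'\'') => {
                i += 2;
                while i < bytes.len() && bytes[i] != b'\'' {
                    if bytes[i] == b'\\' {
                        i += 1;
                        // Read up to three octal digits.
                        let mut value = 0u32;
                        let mut digits = 0;
                        while digits < 3
                            && i < bytes.len()
                            && (b'0'..=b'7').contains(&bytes[i])
                        {
                            value = value * 8 + u32::from(bytes[i] - b'0');
                            i += 1;
                            digits += 1;
                        }
                        if digits > 0 {
                            out.push(value as u8);
                        } else if i < bytes.len() {
                            // \' or \\ : keep the escaped byte itself.
                            out.push(bytes[i]);
                            i += 1;
                        }
                    } else {
                        out.push(bytes[i]);
                        i += 1;
                    }
                }
                i += 1; // skip the closing quote
            }
            // '...' : everything up to the next quote is literal.
            b'\'' => {
                i += 1;
                while i < bytes.len() && bytes[i] != b'\'' {
                    out.push(bytes[i]);
                    i += 1;
                }
                i += 1;
            }
            // Backslash outside quotes: next byte is literal.
            b'\\' => {
                if let Some(&next) = bytes.get(i + 1) {
                    out.push(next);
                }
                i += 2;
            }
            other => {
                out.push(other);
                i += 1;
            }
        }
    }
    out
}
```

Feeding it the bash form `$'\001\'\001'` and the GNU form `''$'\001'\'''$'\001'` gives the bytes 01 27 01 in both cases, matching the od dumps.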

@github-actions

GNU testsuite comparison:

Skip an intermittent issue tests/printf/printf-quote (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)

@egmontkob

Thanks for working on this!

Previous approach incorrectly used SHELL_ESCAPE (designed for ls). Created dedicated PrintfQuoter [...]

I don't think it's the right approach. The naming of the cmdline option ls --quoting-style=shell-escape suggests to me that its exact purpose is to provide an output format that's properly escaped to be copy-pasteable into a shell command line. That is, escaped identically to printf %q.

I think reusing SHELL_ESCAPE (designed for ls) was the right thing to do; it's just that its implementation is incorrect. You should have a single code path for handling these two cases, not two different ones.

I believe GNU Coreutils also fixed these two hand-in-hand, even though the NEWS entry doesn't specify the exact cmdline option of ls:

  ls and printf fix shell quoted output in the edge case of escaped
  first and last characters, and single quotes in the string.

The story is more complicated than this since there are 4 different relevant formats of ls: shell, shell-escape, shell-always, shell-escape-always; I'm not sure what they all do exactly, nor which ones were exactly affected by that recent fix in GNU Coreutils, but I'm fairly certain that printf %q shouldn't have yet another quoting style, it should share one (shell-escape I think) with ls. Or rather, semantically the other way around: printf %q's goal is clear, and I believe that one (or some?) of ls's shell-related quoting styles should match that.

@naoNao89
Contributor Author

naoNao89 commented Dec 12, 2025

Thanks, found it:

pub fn printf_quote(name: &OsStr) -> OsString {
    escape_name(name, QuotingStyle::SHELL_ESCAPE,

@naoNao89
Contributor Author

naoNao89 commented Dec 12, 2025

Switched to the shared EscapedShellQuoter instead of a separate PrintfQuoter. Result: 12/21 tests pass. EscapedShellQuoter has bugs that affect both tools.

@pixelb

pixelb commented Dec 12, 2025

Some notes from GNU ...

The difference between shell and shell-escape is that "shell-escape" uses POSIX $'\xxx' syntax to output non-printing characters, whereas "shell" just outputs ? for such characters. I.e. shell-escape is the most general, in that its output can always be copied and pasted back to the shell to specify any file name. "shell-escape" is the default mode in ls when outputting to a tty. It is also what printf %q uses, since it's the most general. I'll expand on this a bit in the GNU docs.

So, yes ideally there would not be a separate implementation of ls --quoting-style and printf %q.

As for conciseness of output, the GNU output can have redundant leading single quotes, though that is for subtle alignment reasons. BTW folks can be very sensitive about this. I got threatening personal voicemails when I changed this alignment output slightly 10 years ago now. For example:

$ mkdir ta
$ touch ta/foo$'\001'bar ta/barfoo ta/foobar
$ ls -l ta
-rw-r--r--. 1 padraig padraig 0 Dec 12 15:00  barfoo
-rw-r--r--. 1 padraig padraig 0 Dec 12 15:00 'foo'$'\001''bar'
-rw-r--r--. 1 padraig padraig 0 Dec 12 15:00  foobar

Notice how the file names are aligned, whereas if we used the more concise foo$'\001'bar or even bash's $'foo\001bar', the alignment / visual indication of quoting would be different. But yes, we probably could make the quoting more concise in some cases, like removing redundant leading '' for example; that said, it's an edge case, so we didn't address it.
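The alignment effect in the listing can be illustrated with a small sketch. It assumes, purely for illustration, that a name counts as "quoted" when its rendered form starts with a single quote, and that unquoted names get one leading space whenever any name in the listing is quoted. `align_names` is a hypothetical helper; the thread does not show GNU's actual logic.

```rust
/// Hypothetical sketch: pad unquoted names with one space when any
/// name in the listing starts with a quote, so the first visible
/// letter of every name lines up in the same column.
fn align_names(names: &[&str]) -> Vec<String> {
    // Assumption: "quoted" means the rendered name starts with '.
    let any_quoted = names.iter().any(|n| n.starts_with('\''));
    names
        .iter()
        .map(|n| {
            if any_quoted && !n.starts_with('\'') {
                format!(" {}", n)
            } else {
                n.to_string()
            }
        })
        .collect()
}
```

Given `barfoo`, `'foo'$'\001''bar'`, and `foobar`, the sketch pads the first and last names with a space, as in the ls -l output above. A rendering such as foo$'\001'bar does not start with a quote, so under this simple rule it would no longer trigger the padding, which is the kind of visual difference being described.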

@egmontkob

I got threatening personal voicemails

What the f.ck... I'm so sorry to hear this!

Thanks for the additional info!

There are multiple correct behaviors, and for ls some subjective visual beauty is surely preferable (which doesn't only influence the exact quoting method but also the indentation). For printf %q I'd argue that it doesn't matter since it's supposed to be used in scripts, its output hardly ever appears on the UI, so it should probably just stick to whatever format was picked for ls.

I don't think it's a problem if the two coreutilses (how to say this correctly? :)) use different formats, as long as they are both technically correct (the shell resolves them to the same original string), but surely following GNU Coreutils is one possible reasonable choice.

@naoNao89 naoNao89 force-pushed the fix-printf-q-9638 branch 2 times, most recently from 92a9d4b to dd82476 on December 12, 2025 18:38
@github-actions

GNU testsuite comparison:

GNU test failed: tests/ls/quote-align. tests/ls/quote-align is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/ls/symlink-quote. tests/ls/symlink-quote is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/wc/wc-files0. tests/wc/wc-files0 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/wc/wc-files0-from. tests/wc/wc-files0-from is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/printf/printf-quote (fails in this run but passes in the 'main' branch)

@naoNao89 naoNao89 marked this pull request as draft December 12, 2025 20:49
@naoNao89 naoNao89 changed the title from "fix(printf): redesign %q implementation for bash compatibility" to "fix(printf): redesign %q implementation for gnu compatibility" Dec 13, 2025
@github-actions

GNU testsuite comparison:

Skip an intermittent issue tests/printf/printf-quote (fails in this run but passes in the 'main' branch)

@github-actions

GNU testsuite comparison:

GNU test failed: tests/printf/printf-quote. tests/printf/printf-quote is passing on 'main'. Maybe you have to rebase?
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

@naoNao89 naoNao89 force-pushed the fix-printf-q-9638 branch 3 times, most recently from 1e6ef3e to aec867c on December 15, 2025 20:42
@github-actions

GNU testsuite comparison:

GNU test failed: tests/printf/printf-quote. tests/printf/printf-quote is passing on 'main'. Maybe you have to rebase?
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

@github-actions

GNU testsuite comparison:

GNU test failed: tests/printf/printf-quote. tests/printf/printf-quote is passing on 'main'. Maybe you have to rebase?
Congrats! The gnu test tests/tail/inotify-dir-recreate is now passing!

naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Dec 17, 2025
Fixes CI failure from PR uutils#9640 where GNU test suite update (commit ba3442f)
exposed fundamental design flaws in printf %q shell-quoting implementation.

Problem:
Original code pre-scanned for control characters and wrapped ENTIRE strings
in $'...' if ANY control char was present (e.g., "a\r" → $'a\r').

Solution:
Implemented selective quoting that only wraps control characters themselves
(e.g., "a\r" → a$'\r'), matching GNU coreutils behavior.

Key Changes:
- Removed has_control_chars() pre-scanning logic
- Never start in dollar mode - enter/exit dynamically
- Exit dollar mode when encountering regular chars (selective quoting)
- Keep consecutive control chars in single dollar quote
- Handle apostrophes by exiting dollar mode and using \' escape
- Updated test expectations to match selective quoting behavior

Examples:
- 'a\tb' → a$'\t'b (not $'a\tb')
- '\x01\x02\x03' → $'\001\002\003' (not $'\001'$'\002'$'\003')
- 'hello\x01world' → hello$'\001'world (not $'hello\001world')
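The selective-quoting behavior described in this commit message can be sketched as a small state machine. This is an illustration under assumptions, not the commit's code: `quote_selective` is a hypothetical name, the empty-string rule is borrowed from the PR description, and backslash-escaping of printable characters outside `_-./` and alphanumerics is an assumed choice for metacharacters.

```rust
/// Hypothetical sketch of selective quoting: only runs of control
/// (or non-ASCII) bytes are wrapped in $'...'; regular characters
/// stay outside the dollar quote.
fn quote_selective(input: &[u8]) -> String {
    if input.is_empty() {
        return "''".to_string();
    }
    let mut out = String::new();
    // Never start in dollar mode; enter and exit dynamically.
    let mut in_dollar = false;
    for &b in input {
        if b.is_ascii_control() || !b.is_ascii() {
            // Consecutive control bytes share one $'...' segment.
            if !in_dollar {
                out.push_str("$'");
                in_dollar = true;
            }
            match b {
                b'\t' => out.push_str("\\t"),
                b'\n' => out.push_str("\\n"),
                b'\r' => out.push_str("\\r"),
                _ => out.push_str(&format!("\\{:03o}", b)),
            }
        } else {
            // A regular character closes any open dollar quote.
            if in_dollar {
                out.push('\'');
                in_dollar = false;
            }
            let c = b as char;
            // Apostrophes and other metacharacters get a backslash
            // (assumed safe set: alphanumerics and _-./).
            if !(c.is_ascii_alphanumeric() || "_-./".contains(c)) {
                out.push('\\');
            }
            out.push(c);
        }
    }
    if in_dollar {
        out.push('\'');
    }
    out
}
```

Under these assumptions the three examples from the commit message come out as listed: `a$'\t'b`, `$'\001\002\003'`, and `hello$'\001'world`.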
@github-actions

GNU testsuite comparison:

GNU test failed: tests/printf/printf-quote. tests/printf/printf-quote is passing on 'main'. Maybe you have to rebase?
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/tail/inotify-dir-recreate is now passing!

@sylvestre
Contributor

@naoNao89 is it still a draft? Thanks

@naoNao89 naoNao89 marked this pull request as ready for review January 1, 2026 23:31
@naoNao89
Contributor Author

naoNao89 commented Jan 3, 2026

ready

naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Feb 17, 2026
@naoNao89 naoNao89 force-pushed the fix-printf-q-9638 branch 2 times, most recently from f9b78e2 to 76194d6 on February 17, 2026 01:24
@github-actions

GNU testsuite comparison:

GNU test failed: tests/date/date-locale-hour. tests/date/date-locale-hour is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/printf/printf-quote. tests/printf/printf-quote is passing on 'main'. Maybe you have to rebase?
Congrats! The gnu test tests/tail/retry is no longer failing!
Note: The gnu test tests/rm/many-dir-entries-vs-OOM is now being skipped but was previously passing.

@github-actions

GNU testsuite comparison:

GNU test failed: tests/cut/bounded-memory. tests/cut/bounded-memory is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/printf/printf-quote. tests/printf/printf-quote is passing on 'main'. Maybe you have to rebase?
Congrats! The gnu test tests/misc/io-errors is no longer failing!
Congrats! The gnu test tests/tail/retry is no longer failing!
Note: The gnu test tests/tail/tail-n0f is now being skipped but was previously passing.

@github-actions

GNU testsuite comparison:

GNU test failed: tests/date/date-locale-hour. tests/date/date-locale-hour is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/misc/sleep. tests/misc/sleep is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/printf/printf-quote. tests/printf/printf-quote is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/pr/bounded-memory (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/misc/io-errors is no longer failing!
Congrats! The gnu test tests/tail/retry is no longer failing!
Note: The gnu test tests/cp/link-heap is now being skipped but was previously passing.

Implement printf %q format specifier to match bash behavior for shell-escaping strings.

Changes:
- Fix integer overflow panic in extreme field width parsing
- Update quoting logic for control characters with single quotes
- Adjust test expectations for GNU compatibility

Fixes uutils#9638
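The thread does not show how the field-width overflow was fixed. One common way to avoid a panic when parsing an extremely long width is checked arithmetic, sketched below. `parse_width` is a hypothetical helper, and returning `None` on overflow is an assumed policy rather than the PR's documented behavior.

```rust
/// Hypothetical sketch: parse a decimal field width without
/// panicking on overflow. Returns None for non-digit input or when
/// the value does not fit in usize.
fn parse_width(digits: &str) -> Option<usize> {
    let mut width: usize = 0;
    for c in digits.chars() {
        // to_digit returns None for anything that is not 0-9.
        let d = c.to_digit(10)? as usize;
        // checked_* return None instead of overflowing.
        width = width.checked_mul(10)?.checked_add(d)?;
    }
    Some(width)
}
```

For example, `parse_width("12")` gives `Some(12)`, while a 26-digit width gives `None` and leaves the caller to decide how to report the error.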
@github-actions

GNU testsuite comparison:

GNU test failed: tests/printf/printf-quote. tests/printf/printf-quote is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/pr/bounded-memory (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/misc/io-errors is no longer failing!
Congrats! The gnu test tests/tail/retry is no longer failing!

Development

Successfully merging this pull request may close these issues.

printf: "%q" $'\001'\'$'\001' produces incorrect output

5 participants
