@g-branden-robinson g-branden-robinson commented Nov 10, 2025

src/cmd/tbl: add warn() function

Up to this point, tbl has had no nonfatal diagnostics. A forthcoming
change requires one.
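
As a rough illustration of what a nonfatal diagnostic involves, here is a minimal sketch in portable C rather than Plan 9 C. The names `format_warning` and `warn_sketch` are hypothetical and are not the functions added by this change; the message layout simply mirrors the "After" output shown later in this description.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not the code in this change): build a
 * diagnostic of the form "FILE:LINE: warning: MESSAGE" in buf.
 * Returns the value of snprintf().
 */
int format_warning(char *buf, size_t size, const char *file, int line,
                   const char *msg)
{
    return snprintf(buf, size, "%s:%d: warning: %s", file, line, msg);
}

/*
 * Hypothetical nonfatal reporter: print the diagnostic to standard
 * error and return, so the caller keeps processing its input.  A
 * fatal error routine would terminate the program here instead.
 */
void warn_sketch(const char *file, int line, const char *msg)
{
    char buf[256];

    format_warning(buf, sizeof buf, file, line, msg);
    fprintf(stderr, "%s\n", buf);
}
```

The only difference from a fatal diagnostic that matters here is that the reporter returns to its caller.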

src/cmd/tbl/t4.c: be robust to extensions

DWB 3.3 tbl and GNU tbl support an 'x' column modifier. (Heirloom
Doctools tbl and mandoc(1) also support it.) Plan 9 tbl aborts
processing of the entire input document if it encounters this character
in a table description.

That's overkill; it's safe to ignore that modifier if it really is a
modifier, and the column descriptor has a classifier already, like 'L'
or 'N'. The sawchar local variable, of Boolean sense, in readspec()
appears to indicate that fact.
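
The decision described above can be sketched as follows. This is a simplified model in portable C, not the code in t4.c: `check_x` and `enum spec_action` are hypothetical names, the scanner looks only at classifiers, row separators, parenthesized arguments, and 'x', and it assumes the flag is reset at each ',' or '.' that ends a row of the format.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
enum spec_action { SPEC_NO_X, SPEC_WARN, SPEC_FATAL };

/*
 * Scan a table format specification (the text up to and including
 * the final '.') and report how the first 'x' would be treated:
 * tolerated with a warning if a classifier was already seen in the
 * current row of the format, fatal otherwise.
 */
enum spec_action check_x(const char *spec)
{
    bool sawchar = false;  /* classifier seen in this format row? */

    for (const char *p = spec; *p != '\0'; p++) {
        if (*p == '(') {
            /* Skip a modifier argument such as "(22n)". */
            const char *close = strchr(p, ')');
            if (close == NULL)
                break;
            p = close;
        } else if (strchr("LRCNASlrcnas^_-=", *p) != NULL) {
            sawchar = true;
        } else if (*p == ',' || *p == '.' || *p == '\n') {
            sawchar = false;  /* a new format row begins */
        } else if (*p == 'x') {
            return sawchar ? SPEC_WARN : SPEC_FATAL;
        }
    }
    return SPEC_NO_X;
}
```

Under these assumptions, "Lx." and "L Lx." yield a warning, while "x." and "L,x." remain fatal, matching the transcripts below.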

Many man pages in the wild, such as those of ncurses, use the 'x'
modifier; while this change doesn't make Plan 9 tbl implement the
extension, it does help Plan 9 render such pages on a best-effort basis.
Extending this toleration also permits man page authors to wangle a
portable hack into their pages by specifying an explicit width followed
by the 'x' modifier that implementations supporting it will honor
(example: 'Lw(22n)x').

The GNU tbl(1) man page says:

Column modifiers
Any number of modifiers can follow a column classifier. Modifier
arguments, where accepted, are case‐sensitive. If a given modifier
is applied to a classifier more than once, or if conflicting
modifiers are applied, only the last occurrence has effect. The
modifier x is mutually exclusive with e and w, but e is not
mutually exclusive with w; if these are used in combination,
x unsets both e and w, while either e or w overrides x.

Heirloom Doctools and mandoc(1) behave compatibly with the foregoing
description.
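
For readers who find the precedence rule easier to follow as code, here is a small model of it in C. It is a sketch of the documented behavior only, not the implementation used by GNU tbl, Heirloom Doctools, or mandoc; `struct width_mods` and `resolve_width_mods` are hypothetical names, and only lowercase 'e', 'w', and 'x' are considered.

```c
#include <stdbool.h>

/* Hypothetical record of which width-related modifiers are in effect. */
struct width_mods {
    bool equal;   /* 'e' */
    bool width;   /* 'w' */
    bool expand;  /* 'x' */
};

/*
 * Walk the modifiers that follow one classifier, left to right.
 * 'x' unsets both 'e' and 'w'; either 'e' or 'w' overrides 'x';
 * 'e' and 'w' can coexist.  Parenthesized arguments are skipped.
 */
struct width_mods resolve_width_mods(const char *mods)
{
    struct width_mods m = { false, false, false };
    int depth = 0;

    for (const char *p = mods; *p != '\0'; p++) {
        if (*p == '(') {
            depth++;
        } else if (*p == ')') {
            if (depth > 0)
                depth--;
        } else if (depth > 0) {
            continue;  /* inside an argument such as "(22n)" */
        } else if (*p == 'x') {
            m.expand = true;
            m.equal = false;
            m.width = false;
        } else if (*p == 'e') {
            m.equal = true;
            m.expand = false;
        } else if (*p == 'w') {
            m.width = true;
            m.expand = false;
        }
    }
    return m;
}
```

With the modifiers "w(22n)x" the model ends with only `expand` set, which is why that spelling lets implementations that support 'x' honor it while others fall back to the explicit width.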

Before:

$ printf '.TS\nLx.\ntable cell\n.TE\n' | 9 tbl | 9 nroff -man | col | cat -s

Input:2: bad table specification character x
tbl quits

After:

$ printf '.TS\nLx.\ntable cell\n.TE\n' | 9 tbl | 9 nroff -man | col | cat -s

Input:2: warning: unrecognized column modifier character 'x'
     table cell

$ printf '.TS\nx.\ntable cell\n.TE\n' | 9 tbl | 9 nroff -man | col | cat -s

Input:2: bad table specification character x
tbl quits

$ printf '.TS\ntab(@);\nL Lx.\ncell one@cell two\n.TE\n' | 9 tbl | 9 nroff -man | col | cat -s

Input:3: warning: unrecognized column modifier character 'x'
     cell one   cell two

$ printf '.TS\ntab(@);\nL,x.\ncell one@cell two\n.TE\n' | 9 tbl | 9 nroff -man | col | cat -s

Input:3: bad table specification character x
tbl quits

I confess to using my own terminology rather than Mike Lesk's. See
https://man7.org/linux/man-pages/man1/tbl.1.html.

@g-branden-robinson g-branden-robinson force-pushed the tbl-make-unrecognized-column-mods-less-lethal branch from b0b03f8 to 6ab4f39 Compare November 10, 2025 17:37
bernhard-voelker pushed a commit to bernhard-voelker/findutils that referenced this pull request Jan 6, 2026
Plan 9 and Solaris tbl, like Seventh Edition Unix tbl, do not support
the 'x' column modifier.  This extension appeared in DWB tbl by version
3.3 and early in GNU troff development (both circa 1990).  I suspect,
but do not know, that other System V Unix tbl programs don't support it
either.

These old tbl programs are brutal when they encounter an unsupported
column modifier--they abort the preprocessor altogether ("tbl quits")
without attempting recovery.[1]  Because tbl works as a filter, like
eqn, pic, soelim, or more familiar Unix tools (cat, sed, nl), this means
that tbl truncates the entire remainder of the input at that point.  GNU
tbl is more robust, and discards input only until the next `.TE` token.

Due to this rudeness it's impossible to portably use 'x' without
rewriting the page text, and I know of no good way to parameterize a
table format.  (tbl(1) doesn't have variables or anything like a macro
preprocessor.  *roff strings are no use because tbl is a _pre_processor
for troff.)

To portably use 'x' requires a man page to test the underlying
implementation and potentially rewrite the page prior to installing it.

See a recent patch of mine to ncurses (merged in its 20251115 release)
for an approach potentially adaptable to findutils.

https://lists.gnu.org/archive/html/bug-ncurses/2025-11/msg00035.html

[1] I've proposed a merge request to Plan 9 from User Space to make its
    tbl less intolerant.  Even if accepted, that won't help anyone who
    uses other "legacy" troffs.

    9fans/plan9port#739

* find/find.1 (Functional Changes): Drop 'x' table modifier.

Discussed at:
https://lists.gnu.org/r/bug-findutils/2025-11/msg00094.html

Copyright-paperwork-exempt: Yes