Control characters in text field breaking usql output #509

@trantor

Hello @kenshaw .
I've been dealing with control characters that are present in a text field of a database I need to access.
As far as I can see from this code https://github.com/xo/tblfmt/blame/1af8a162785fd2d26eddb90fbd8ad9d407b3408d/fmt.go#L389 the control characters are not output literally and then properly encoded by the output format (JSON, for example). Instead they are rendered as escape sequences, e.g. \x1c for the U+001C character.
Apart from behaving differently from every other tool I've used on the database in question (they all output the literal character), this completely breaks the JSON output, because \x is not a legal escape sequence in JSON.
The other options in the switch block of the aforementioned code do not seem much better. Is there a way to just get the raw data in the output? I can't find one in the documentation.
There is also the fact that the various output formats, JSON for instance, encode characters in different ways (e.g. the surrogate pairs JSON uses for characters outside the BMP).
Also, silently modifying the contents of the data in an arbitrary way, without the user being aware of it, seems prone to nasty surprises.
