SVG minification corrupts output when unicode attribute contains 4-byte UTF-8 characters #917

@marmooo

Description

When minifying an SVG file that contains elements with 4-byte UTF-8 characters (observed with U+20000 and above) in the unicode attribute, the output is corrupted.

Environment:

The issue is reproducible via the Deno (2.7.1) bindings, but not via the CLI.
It was introduced in version 2.24.8; version 2.24.7 does not have this issue.

To reproduce:

Minify the attached SVG file using the following Deno code:

// problem (2.24.8)
import { minify } from "npm:@tdewolff/minify@2.24.8";
const svg = Deno.readTextFileSync("test.svg");
const minified = await minify("image/svg+xml", svg);
console.log(minified);

The following works correctly:

import { string } from "npm:@tdewolff/minify@2.24.7";
const svg = Deno.readTextFileSync("test.svg");
const minified = await string("image/svg+xml", svg);
console.log(minified);

The output of 2.24.8 will be corrupted around the following attribute:

unicode="𨮓"
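To make "corrupted" checkable in a script, something like the following could be run on the minified string. This is only a sketch: it assumes the damage shows up as a U+FFFD replacement character or as an unpaired UTF-16 surrogate, which this report has not confirmed. findSuspectPositions is a hypothetical helper, not part of @tdewolff/minify.

```typescript
// Hypothetical helper (not part of @tdewolff/minify): returns the UTF-16
// indices in a string that commonly indicate a damaged multi-byte character.
// Assumption: corruption appears as U+FFFD or as an unpaired surrogate.
function findSuspectPositions(text: string): number[] {
  const positions: number[] = [];
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    const isHigh = unit >= 0xd800 && unit <= 0xdbff;
    const isLow = unit >= 0xdc00 && unit <= 0xdfff;
    if (unit === 0xfffd) {
      positions.push(i); // U+FFFD REPLACEMENT CHARACTER
    } else if (isHigh) {
      const next = text.charCodeAt(i + 1); // NaN at the end of the string
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // well-formed surrogate pair: skip the low half
      } else {
        positions.push(i); // high surrogate without a low surrogate
      }
    } else if (isLow) {
      positions.push(i); // low surrogate without a preceding high surrogate
    }
  }
  return positions;
}
```

An empty result for the input and a non-empty result for the minified output would point at the minifier rather than at the source file.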

Observed behavior:

The issue seems to be related to the byte offset of the 4-byte character within the file:

  • Adding or removing 3 or more characters before unicode="𨮓" makes the error disappear
  • Adding characters after unicode="𨮓" (e.g. unicode="𨮓 aaaaaaaaa") does not fix the issue
  • The issue cannot be reproduced with a single element in isolation. It only occurs when the overall file size puts the 4-byte character at a specific offset

This suggests the problem is tied to the specific byte position of the 4-byte UTF-8 character in the file, not the surrounding context.
Since the issue was introduced in 2.24.8, it may be related to changes made in that version.
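To narrow down which offsets trigger the problem, the byte position of each 4-byte character can be listed before minifying. The sketch below uses a hypothetical helper, astralByteOffsets, and assumes the file is valid UTF-8 without a byte order mark, so that offsets computed from the decoded string match offsets in the file.

```typescript
interface AstralHit {
  codePoint: number;  // e.g. 0x20000
  byteOffset: number; // offset of the character's first byte in UTF-8
}

// Hypothetical diagnostic helper: lists every character above U+FFFF
// together with its UTF-8 byte offset.
function astralByteOffsets(text: string): AstralHit[] {
  const hits: AstralHit[] = [];
  let byteOffset = 0;
  for (let i = 0; i < text.length; ) {
    const codePoint = text.codePointAt(i)!;
    // UTF-8 length by code point range; an unpaired surrogate is counted as
    // 3 bytes, which is what TextEncoder emits for it (U+FFFD).
    const byteLength =
      codePoint < 0x80 ? 1 : codePoint < 0x800 ? 2 : codePoint < 0x10000 ? 3 : 4;
    if (codePoint > 0xffff) {
      hits.push({ codePoint, byteOffset });
    }
    byteOffset += byteLength;
    i += codePoint > 0xffff ? 2 : 1; // astral characters use two UTF-16 units
  }
  return hits;
}
```

Comparing the reported offsets between a failing file and a variant with a few characters added before the attribute may show whether the failure lines up with a fixed boundary.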

Workaround:

Replacing the character with a numeric character reference before minifying avoids the issue:

unicode="𨦓"
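The replacement can be automated as a pre-processing step. The following is a minimal sketch, not a fix in the library: escapeAstral is a hypothetical helper, and it rewrites every character above U+FFFF in the whole string, including any inside comments or CDATA sections where references are not interpreted.

```typescript
// Hypothetical pre-processing step: replace each character above U+FFFF
// (stored as a UTF-16 surrogate pair) with a hexadecimal numeric character
// reference, so the minifier only sees ASCII for those characters.
function escapeAstral(svg: string): string {
  return svg.replace(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g, (pair) => {
    const hex = pair.codePointAt(0)!.toString(16).toUpperCase();
    return `&#x${hex};`;
  });
}
```

The result of escapeAstral(svg) would then be passed to the minifier in place of the original string.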

The SVG file that reproduces the issue is attached below.
Note that it is an SVG font file and will not render visually in a browser.

Image
