Incremental normalization? #60

@jgm

According to its documentation, text-icu's collation algorithm uses incremental normalization. This is very helpful in collation: when you compare two strings, their order can usually be decided after the first few characters, so there is no need to normalize the whole of either string.

Could unicode-transforms provide a function that does this? For my purposes, an ideal interface would be

normalizeStreaming :: NormalizationMode -> Text -> [Int]

where the Ints are code points, and the list is produced lazily.
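
To make the intended use concrete, here is a minimal sketch of how a lazy code-point list could be consumed by a comparison. Everything in it is an assumption for illustration: `NormalizationMode` is a local placeholder type, `normalizeStreaming` is a stand-in that performs no normalization at all (it only unpacks code points lazily), and `compareNormalized` is a hypothetical helper. None of this is an existing unicode-transforms API.

```haskell
module Main where

import Data.Char (ord)
import qualified Data.Text as T

-- Local placeholder so the sketch is self-contained; a real version
-- would reuse the library's own mode type.
data NormalizationMode = NFD | NFKD | NFC | NFKC

-- Hypothetical stand-in for the requested function. It does NOT
-- normalize anything: it only yields the code points of the input as a
-- lazily produced list, which is the shape of result asked for above.
normalizeStreaming :: NormalizationMode -> T.Text -> [Int]
normalizeStreaming _mode = map ord . T.unpack

-- Hypothetical helper: compare two texts through their code-point
-- streams. List comparison stops at the first differing element, so
-- only as much of each stream as is needed gets forced.
compareNormalized :: NormalizationMode -> T.Text -> T.Text -> Ordering
compareNormalized mode a b =
  compare (normalizeStreaming mode a) (normalizeStreaming mode b)

main :: IO ()
main = do
  -- 'a' (97) sorts before 'b' (98), so the first code point decides.
  print (compareNormalized NFD (T.pack "apple pie") (T.pack "banana split"))
  -- prints LT

  -- The tails are never evaluated: the heads already differ.
  print (compare (1 : undefined) (2 : undefined :: [Int]))
  -- prints LT
```

With a real incremental normalizer behind the same signature, a collation routine written this way would only pay for normalizing the prefix that the comparison actually inspects.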
