Multi-valued Headers #94

@codello

Suggestion

I'd like to standardize the notion of multi-valued headers, i.e. headers that can have more than one value.

Motivation

Currently some headers support multiple values: #GENRE, #LANGUAGE, #EDITION, #TAGS, #CREATOR. All of them separate multiple values with commas. Other headers currently don't support multiple values, but there might be more such cases in the future (the issue of multiple artists came up in the Discord recently). I think standardizing the notion of multi-valued headers across these cases can simplify the spec and make it easier to introduce additional multi-valued fields in the future.

I have brought up this suggestion in the past (#22 (comment), previous version of spec.md) but I don't think it has been discussed as a general suggestion yet.

Syntax

I'm proposing the following syntax, which I find quite intuitive:

Multi-valued headers can contain multiple values separated by a comma (%x2C). Additionally, multiple occurrences of a multi-valued header are semantically equivalent to a single occurrence where all values are concatenated by commas in order of occurrence.
To include a literal comma in a multi-valued header, use two consecutive commas.
Implementations MAY remove leading and trailing whitespace of individual values of a multi-valued header without changing semantics. Implementations MAY also remove empty values in a multi-valued header without changing semantics.

header-value = single-value / multi-value
multi-value  = single-value *( comma single-value ) [ comma ]
single-value = *( header-char / colon / comma comma )

header-char  = %x00-09 /  ; exclude line feed
               %x0B-0C /  ; exclude carriage return
               %x0E-2B /  ; exclude single comma
               %x2D-39 /  ; exclude colon
               %x3B-10FFFF
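
To make the splitting rule concrete, here is a minimal Python sketch of how an implementation might read a single multi-valued header value. The function name `split_multi_value` is hypothetical, and the left-to-right handling of three or more commas in a row is an assumption, since the grammar above does not say how such runs should be divided.

```python
def split_multi_value(raw: str) -> list[str]:
    """Split a multi-valued header value on single commas.

    A doubled comma (",,") is read as one literal comma. The scan runs
    left to right, so ",,," is taken as a literal comma followed by a
    separator (an assumption, not something the proposal states).
    """
    values: list[str] = []
    current: list[str] = []
    i = 0
    while i < len(raw):
        if raw[i] == ",":
            if i + 1 < len(raw) and raw[i + 1] == ",":
                current.append(",")  # escaped comma stays in the value
                i += 2
                continue
            values.append("".join(current))  # single comma ends the value
            current = []
        else:
            current.append(raw[i])
        i += 1
    values.append("".join(current))
    # Optional clean-up the proposal permits (MAY): trim, drop empties.
    return [v.strip() for v in values if v.strip()]
```

An implementation that wants to preserve whitespace or empty values could simply return `values` unchanged, since both clean-up steps are optional in the proposal.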

Here are some examples of what this could look like (each with an equivalent JSON):

Simple Multi-Valued Header

#TITLE: A Title
#GENRE: Rock, Pop, Country
{"TITLE": "A Title", "GENRE": ["Rock", "Pop", "Country"]}

Repeated Multi-Valued Header

#TITLE: A Title
#GENRE: Rock
#GENRE: Pop,
#GENRE: ,Country
#GENRE:
{"TITLE": "A Title", "GENRE": ["Rock", "Pop", "Country"]}

Literal Comma

#TITLE: A Title
#GENRE: Rock,, Pop
#GENRE: Country
{"TITLE": "A Title", "GENRE": ["Rock, Pop", "Country"]}

Empty Values

#TITLE: A Title
#GENRE: Rock, ,Pop, Country
{"TITLE": "A Title", "GENRE": ["Rock", "Pop", "Country"]}
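
The examples above can be reproduced with a short Python sketch that collects repeated headers into lists. The names `MULTI_VALUED`, `split_values` and `parse_headers` are hypothetical, the set of multi-valued headers is an assumption taken from the headers listed under Motivation, and the line handling is deliberately simplified compared to a full parser.

```python
import json
import re

# Assumption: the spec would define which headers are multi-valued.
MULTI_VALUED = {"GENRE", "LANGUAGE", "EDITION", "TAGS", "CREATOR"}


def split_values(raw: str) -> list[str]:
    """Split on single commas; ',,' is read as a literal comma."""
    values, current = [], ""
    # ',,' is tried before ',' so doubled commas win, left to right.
    for token in re.findall(r",,|,|[^,]+", raw):
        if token == ",":
            values.append(current)
            current = ""
        elif token == ",,":
            current += ","
        else:
            current += token
    values.append(current)
    return [v.strip() for v in values if v.strip()]


def parse_headers(text: str) -> dict:
    """Collect '#NAME: value' lines; repeated multi-valued headers accumulate."""
    headers: dict = {}
    for line in text.splitlines():
        if not line.startswith("#") or ":" not in line:
            continue  # not a header line in this simplified sketch
        name, _, raw = line[1:].partition(":")
        name = name.strip()
        if name in MULTI_VALUED:
            # Split each occurrence on its own, then append in order.
            headers.setdefault(name, []).extend(split_values(raw))
        else:
            headers[name] = raw.strip()
    return headers


example = "#TITLE: A Title\n#GENRE: Rock\n#GENRE: Pop,\n#GENRE: ,Country\n#GENRE:\n"
print(json.dumps(parse_headers(example)))
# {"TITLE": "A Title", "GENRE": ["Rock", "Pop", "Country"]}
```

The sketch splits each occurrence separately instead of first joining the raw lines with commas. Joining `Pop,` and `,Country` literally would produce `Pop,,,Country`, and a left-to-right scan would then read the first two commas as an escaped comma, so per-occurrence splitting seems the safer reading of the equivalence rule.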

Caveats

I think this is a simple yet flexible solution for general support of multi-valued headers. Thinking about where multi-valued headers might be used, there are some cases where conflicts may occur:

  • #ARTIST (currently single-valued): Some artist names contain a comma. However, such names seem to be quite rare, so this will probably not be an issue.
  • #AUDIOURL (currently single-valued): URLs may contain commas. It's not that common, but it happens from time to time. In most cases it should be possible to use URL-escaping to encode the comma natively in the URL. In the remaining cases the double-comma escaping can be used. I think these cases are rare enough that they don't warrant a more complex escaping mechanism.
  • The syntax rules as proposed don't allow a single value to begin or end with a space. This is consistent with other headers.
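
On the writing side, the two escaping options mentioned for commas can be sketched in Python as follows. The function names `join_multi_value` and `percent_encode_commas` and the example URL are hypothetical, and whether a given server accepts `%2C` in place of a comma is an assumption that depends on the URL.

```python
def join_multi_value(values: list[str]) -> str:
    """Serialize values into one header value, doubling literal commas."""
    escaped = [v.strip().replace(",", ",,") for v in values if v.strip()]
    # Joining with ", " (comma + space) keeps the separator from merging
    # with an escaped comma at the start of the following value.
    return ", ".join(escaped)


def percent_encode_commas(url: str) -> str:
    """URL-escape commas so the URL itself contains no comma."""
    return url.replace(",", "%2C")


# Hypothetical URL, used only for illustration.
url = "https://example.com/audio/a,b.mp3"
print(join_multi_value([url]))     # double-comma escaping
print(percent_encode_commas(url))  # URL-escaping
```

Writing a space after each separator is a design choice of this sketch, not part of the proposal: without it, a value that begins with a comma would create a run of three commas, which is ambiguous when read back.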

Backwards Compatibility

This proposal for multi-valued headers would be largely backwards compatible. Older implementations would probably only recognize a single instance of the header. When the comma syntax is used, older implementations would recognize a single value that contains commas, which would probably be a good compromise.

There may be cases where a single comma is currently being used in a value. If such a header became multi-valued, this could technically be a breaking change that would have to be addressed.

Additional Notes

I think the current set of headers (as they are defined) doesn't immediately benefit from this proposal. There is some unification for the existing multi-valued headers, but those work well as they are now. When writing this proposal I was mainly thinking about the following applications:

  • Clean representation of multiple artists
  • Multi-valued headers where a single value can be very long (such as URLs), so that the comma syntax could lead to very long lines.

I think this would be quite an elegant addition that would make the metadata section of the file format a lot more flexible. Let me know what you think!
