Decoders with custom offset

Hi!

First of all: thanks a lot for bringing Elm into the world! This language has kept me motivated to dig deeper and deeper into functional programming, whereas with other languages I felt overwhelmed and discouraged pretty quickly.


### Short
Recently I've been writing a parser for MIDI files in Elm, using elm/bytes. Overall it worked pretty nicely. But there was one thing I was missing: a way to write a custom decoder for which an offset can be provided.

### Issue
Parts of a MIDI file contain a list-like structure. Items are (potentially) compressed in such a way that one byte has to be read in order to know how to process that same byte and the ones following it. Depending on its most significant bit, this byte may contain meta-information. If it doesn't, the current list item has the same meta-information as the previous item, so the meta-information byte is simply omitted. This implies that the byte that was just read cannot always be interpreted the same way: how to read it depends on the previous meta-information.

This leads to a situation where some kind of lookahead would be useful. Something like: "OK, the byte has this structure, so keep it and read it as two nibbles, which define how to read the following bytes." Or: "Oh, the byte has this other structure, so forget about it, use the most recent meta-information and continue as normal, BUT start where the byte we just read started, because it is not the meta-info byte but already part of the data." In the second case we are basically one byte off.
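To make the decision described above concrete, here is a minimal sketch of it as a pure function, independent of elm/bytes. The names `Resolved`, `resolve` and `consumedStatusByte` are made up for this illustration and are not part of my parser or of any library; the only assumption is that a set most significant bit marks a meta-information byte.

```elm
module RunningStatus exposing (Resolved, resolve)

import Bitwise


{-| Result of inspecting one byte (hypothetical helper type).

`consumedStatusByte` is False when the inspected byte is already data,
i.e. when a decoder would have to "give the byte back".

-}
type alias Resolved =
    { status : Int
    , consumedStatusByte : Bool
    }


{-| Decide whether `byte` carries new meta-information or whether the
previous status has to be reused.
-}
resolve : Int -> Int -> Resolved
resolve previousStatus byte =
    if Bitwise.and byte 0x80 == 0 then
        -- Most significant bit is clear: this is a data byte,
        -- so the item reuses the previous meta-information.
        { status = previousStatus, consumedStatusByte = False }

    else
        -- Most significant bit is set: this byte is the new meta-information.
        { status = byte, consumedStatusByte = True }
```

Whenever `consumedStatusByte` is `False`, the decoder is exactly the one byte off described above.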

### Solution (possibly)
I think in this case it would be nice to simply set the offset back, so that we effectively forget about the byte we just decoded. Because there is no way to reset the offset when using `andThen`, I carried the already-read byte around and had to pass it to the following decoders, where I had to prepend it conditionally. This made the code harder to read, understand and reuse.

In the source code of elm/bytes, `andThen` and `mapN` use an offset internally. I guess exposing the data constructor `Decoder (Bytes -> Int -> (Int, a))`, instead of just the type `Decoder a`, would be all that is needed to build custom `map` / `andThen` decoders for this kind of lookahead decoding.
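To show why access to the offset would be enough, here is a self-contained toy model of such a decoder. It is only a sketch under assumptions: the input is an `Array Int` of byte values instead of `Bytes`, failure is modelled with `Maybe`, and `peek` is a hypothetical combinator that does not exist in elm/bytes. Only the `( newOffset, value )` shape of the result is taken from the library.

```elm
module OffsetSketch exposing (Decoder, andThen, peek, run, unsignedInt8)

import Array exposing (Array)


{-| Toy stand-in for the opaque elm/bytes type: a function from the input
and the current offset to the new offset and the decoded value.
-}
type Decoder a
    = Decoder (Array Int -> Int -> Maybe ( Int, a ))


{-| Run a decoder from offset 0 and drop the final offset. -}
run : Decoder a -> Array Int -> Maybe a
run (Decoder step) input =
    step input 0 |> Maybe.map Tuple.second


{-| Read one byte and advance the offset by one. -}
unsignedInt8 : Decoder Int
unsignedInt8 =
    Decoder
        (\input offset ->
            Array.get offset input
                |> Maybe.map (\byte -> ( offset + 1, byte ))
        )


{-| Thread the offset produced by the first decoder into the next one. -}
andThen : (a -> Decoder b) -> Decoder a -> Decoder b
andThen callback (Decoder step) =
    Decoder
        (\input offset ->
            case step input offset of
                Just ( newOffset, value ) ->
                    let
                        (Decoder next) =
                            callback value
                    in
                    next input newOffset

                Nothing ->
                    Nothing
        )


{-| Hypothetical lookahead: run a decoder, keep its value, but report the
original offset so that the next decoder starts at the same position.
-}
peek : Decoder a -> Decoder a
peek (Decoder step) =
    Decoder
        (\input offset ->
            step input offset
                |> Maybe.map (\( _, value ) -> ( offset, value ))
        )
```

In this model, `run (peek unsignedInt8 |> andThen (\_ -> unsignedInt8)) (Array.fromList [ 60, 100 ])` evaluates to `Just 60`: the second decoder reads the same byte again, which is the behaviour I am missing.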

### Illustration

Maybe my explanation was a little confusing, so this code hopefully makes it easier to understand:

```elm
Bytes.Decode.unsignedInt8
    |> Bytes.Decode.andThen
        (\currentPotentialStatusByte ->
            let
                isCompressed =
                    currentPotentialStatusByte < 128

                currentStatusByte =
                    if isCompressed then
                        previousStatusByte

                    else
                        currentPotentialStatusByte

                ( mEventName, channel ) =
                    statusByteToNibbles currentStatusByte

                readFirstByte =
                    if isCompressed then
                        Bytes.Decode.succeed currentPotentialStatusByte

                    else
                        Bytes.Decode.unsignedInt8
...
            readFirstByte
                |> Bytes.Decode.andThen preReadVariableLengthValueDecoder
                |> Bytes.Decode.andThen Bytes.Decode.string
                |> Bytes.Decode.map ((++) "System Exclusive Begin Event" >> NotYetSupportedEvent)
```

And then I need to carry `readFirstByte` around and map all the following decoders. But I think this would be nicer:

```elm
Bytes.Decode.Decoder
    (\bites offset ->
        let
            (Bytes.Decode.Decoder uint8Decode) =
                Bytes.Decode.unsignedInt8

            -- The wrapped function returns ( newOffset, value ), in that order.
            ( newOffset, currentPotentialStatusByte ) =
                uint8Decode bites offset

            isCompressed =
                currentPotentialStatusByte < 128

            currentStatusByte =
                if isCompressed then
                    previousStatusByte

                else
                    currentPotentialStatusByte

            ( mEventName, channel ) =
                statusByteToNibbles currentStatusByte

            nextOffset =
                if isCompressed then
                    offset

                else
                    newOffset
...
        withOffset nextOffset readVariableLengthValueDecoder
            |> Bytes.Decode.andThen Bytes.Decode.string
            |> Bytes.Decode.map ((++) "System Exclusive Begin Event" >> NotYetSupportedEvent)
```

Where `withOffset` would be something like

```elm
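-- Run the given decoder from a fixed offset, ignoring the offset passed in.
withOffset : Int -> Bytes.Decode.Decoder a -> Bytes.Decode.Decoder a
withOffset offset (Bytes.Decode.Decoder decode) =
    Bytes.Decode.Decoder <| \bites _ -> decode bites offset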
```

So here I could just use `readVariableLengthValueDecoder`, which is also used in other places, and I would not need to create a `preReadVariableLengthValueDecoder`, which does the same thing but either reads the first byte or doesn't, depending on the given argument.

Edit: over the past few days I watched a bunch of Elm talks (mostly by Richard Feldman). I now understand much better why the type `Decoder` is opaque. Still, I think having a way to adjust the offset would be very nice, maybe by adding a utility function like `withOffset`.
