Matroska parser does not check return values of io_->read() in decodeBlock() #9281

@MarkLee131

Describe the bug

In src/matroskavideo.cpp, the function decodeBlock() calls io_->read() in four places without checking the return value. When parsing a truncated MKV file, these reads may return fewer bytes than requested. The part of the buffer that was not filled then still holds data from the previous iteration, and the parser computes tag IDs and sizes from that stale data.

The first read in decodeBlock() (line 636) correctly checks for EOF. The four subsequent reads do not:

// line 644-645: tag ID continuation bytes
if (block_size > 0)
    io_->read(buf + 1, block_size - 1);   // not checked

// line 662: size field first byte
io_->read(buf, 1);                         // not checked

// line 665-666: size field continuation bytes
if (block_size > 0)
    io_->read(buf + 1, block_size - 1);   // not checked

// line 686: element data
io_->read(buf2.data(), size);              // not checked

This is similar to the issue fixed in PR #3005, where unchecked io_->read() calls in the ASF parser were replaced with readOrThrow().
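To illustrate the stale-buffer effect in isolation, here is a minimal sketch. MemReader is a hypothetical stand-in written for this example, not the Exiv2 BasicIo class; it only assumes that read() returns the number of bytes actually copied, as described above.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for an I/O object; NOT the Exiv2 BasicIo class.
class MemReader {
 public:
  explicit MemReader(std::vector<std::uint8_t> data) : data_(std::move(data)) {}

  // Copies up to n bytes into buf and returns how many were actually copied.
  std::size_t read(std::uint8_t* buf, std::size_t n) {
    const std::size_t got = std::min(n, data_.size() - pos_);
    std::copy_n(data_.begin() + static_cast<std::ptrdiff_t>(pos_), got, buf);
    pos_ += got;
    return got;
  }

 private:
  std::vector<std::uint8_t> data_;
  std::size_t pos_ = 0;
};

// Reads a 4-byte field twice into the same buffer and ignores the return
// value, then returns the buffer as a parser would see it.
std::array<std::uint8_t, 4> staleAfterShortRead() {
  // 5 bytes of input: one complete 4-byte field followed by a single byte.
  MemReader io({0x1A, 0x45, 0xDF, 0xA3, 0x42});
  std::array<std::uint8_t, 4> buf{};
  io.read(buf.data(), buf.size());  // full read: buf = 1A 45 DF A3
  io.read(buf.data(), buf.size());  // short read: only buf[0] is overwritten
  return buf;                       // buf[1..3] still hold the previous field
}
```

After the second call, only the first byte is fresh. A parser that treats all four bytes as one field would decode a value that never appeared in the file.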

To Reproduce

  1. Take any valid MKV file and truncate it mid-stream (e.g., head -c 100 valid.mkv > truncated.mkv).
  2. Run exiv2 -pa truncated.mkv.
  3. Observe the output. (Reproduced on the main branch at current HEAD.)

The parser does not crash, but it may extract incorrect metadata from the truncated file because it processes stale buffer contents.

Expected behavior

Each io_->read() call should either use readOrThrow() or check the return value and handle short reads. This would make the Matroska parser consistent with the ASF parser after PR #3005.
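As a rough sketch of the "check the return value" option, the helper below reports a short read to the caller instead of handing back a partly filled buffer. MemReader, readExact and countCompleteRecords are hypothetical names made up for this example; none of them are Exiv2 APIs.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an I/O object; NOT the Exiv2 BasicIo class.
class MemReader {
 public:
  explicit MemReader(std::vector<std::uint8_t> data) : data_(std::move(data)) {}

  // Copies up to n bytes into buf and returns how many were actually copied.
  std::size_t read(std::uint8_t* buf, std::size_t n) {
    const std::size_t got = std::min(n, data_.size() - pos_);
    std::copy_n(data_.begin() + static_cast<std::ptrdiff_t>(pos_), got, buf);
    pos_ += got;
    return got;
  }

 private:
  std::vector<std::uint8_t> data_;
  std::size_t pos_ = 0;
};

// Returns exactly n bytes, or std::nullopt if the input ends early.
// The caller never sees a buffer that mixes fresh and stale bytes.
std::optional<std::vector<std::uint8_t>> readExact(MemReader& io, std::size_t n) {
  std::vector<std::uint8_t> out(n);
  if (io.read(out.data(), n) != n)
    return std::nullopt;  // short read: treat as truncated input
  return out;
}

// Counts complete 4-byte records and stops at the first short read.
std::size_t countCompleteRecords(std::vector<std::uint8_t> data) {
  MemReader io(std::move(data));
  std::size_t count = 0;
  while (readExact(io, 4))
    ++count;
  return count;
}
```

With nine input bytes this counts two records and drops the trailing byte, rather than decoding a third record from leftover data.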

Desktop (please complete the following information):

  • OS and version: macOS (Darwin 25.3.0, arm64)
  • Exiv2 version and source: main branch, built from source
  • Compiler and version: Clang 22.1.1 (homebrew llvm)
  • Compilation mode and/or compiler flags: Debug, -fsanitize=address

Additional context

The fix is to replace the four unchecked io_->read() calls with io_->readOrThrow(), matching the pattern from PR #3005. I can submit a PR if helpful.
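For illustration only, the sketch below shows the shape of that change on a simplified element reader. It is not the actual decodeBlock() code. MemReader, its readOrThrow() and remaining() members, vintLength, Element and readElement are all hypothetical, and Exiv2's real readOrThrow() may differ in signature and exception type. The reader accepts up to 8 bytes for both ID and size, which is a simplification.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for an I/O object; NOT the Exiv2 BasicIo class.
class MemReader {
 public:
  explicit MemReader(std::vector<std::uint8_t> data) : data_(std::move(data)) {}

  std::size_t remaining() const { return data_.size() - pos_; }

  // Copies up to n bytes into buf and returns how many were actually copied.
  std::size_t read(std::uint8_t* buf, std::size_t n) {
    const std::size_t got = std::min(n, remaining());
    std::copy_n(data_.begin() + static_cast<std::ptrdiff_t>(pos_), got, buf);
    pos_ += got;
    return got;
  }

  // Throws on a short read instead of returning a smaller count.
  void readOrThrow(std::uint8_t* buf, std::size_t n) {
    if (read(buf, n) != n)
      throw std::runtime_error("truncated input");
  }

 private:
  std::vector<std::uint8_t> data_;
  std::size_t pos_ = 0;
};

// Byte length of an EBML variable-length integer, taken from the position of
// the first set bit in its first byte. Returns 0 for an invalid 0x00 byte.
std::size_t vintLength(std::uint8_t first) {
  for (std::size_t len = 1; len <= 8; ++len)
    if (first & (0x80u >> (len - 1)))
      return len;
  return 0;
}

struct Element {
  std::uint64_t id;
  std::vector<std::uint8_t> data;
};

// Reads one element (ID, size, payload) with every read checked.
Element readElement(MemReader& io) {
  std::uint8_t buf[8] = {};

  io.readOrThrow(buf, 1);  // ID, first byte
  std::size_t len = vintLength(buf[0]);
  if (len == 0) throw std::runtime_error("invalid element ID");
  io.readOrThrow(buf + 1, len - 1);  // ID, continuation bytes
  std::uint64_t id = 0;
  for (std::size_t i = 0; i < len; ++i)
    id = (id << 8) | buf[i];  // IDs keep their length-marker bit

  io.readOrThrow(buf, 1);  // size, first byte
  len = vintLength(buf[0]);
  if (len == 0) throw std::runtime_error("invalid element size");
  io.readOrThrow(buf + 1, len - 1);  // size, continuation bytes
  std::uint64_t size = buf[0] & (0xFFu >> len);  // strip the marker bit
  for (std::size_t i = 1; i < len; ++i)
    size = (size << 8) | buf[i];

  // Reject sizes larger than the remaining input before allocating.
  if (size > io.remaining()) throw std::runtime_error("truncated input");
  std::vector<std::uint8_t> data(static_cast<std::size_t>(size));
  io.readOrThrow(data.data(), data.size());  // element data
  return Element{id, std::move(data)};
}
```

With throwing reads, a truncated file stops parsing at the first incomplete field, so no tag ID or size is ever computed from leftover buffer contents.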
