Memory leaks when decoding multiple corrupted LZMA archives

Moderate
ulikunitz published GHSA-jc7w-c686-c4v9 Aug 28, 2025

Package

gomod github.com/ulikunitz/xz/lzma (Go)

Affected versions

<= v0.5.13

Patched versions

v0.5.15

Description

Summary

Data can be placed in front of an LZMA-encoded byte stream without the situation being detected while the header is read. This can lead to increased memory consumption because the current implementation allocates the full decoding buffer directly after reading the header. According to the specification, the LZMA header contains neither a magic number nor a checksum that would allow such an issue to be detected.

Note that the code recognizes the issue later while reading the stream, but by that time the memory has already been allocated.
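
To make the failure mode concrete, here is a minimal sketch that parses the 13-byte classic LZMA header (one properties byte, a 4-byte little-endian dictionary size, an 8-byte little-endian uncompressed size) using only the standard library. It is not the package's code: parseLZMAHeader, lzmaHeader, and the sample bytes are illustrative assumptions. The point is that a prepended zero byte is still a valid properties byte, while the old properties byte slides into the dictionary size field and turns an 8 MiB dictionary into roughly 2 GiB.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// lzmaHeader holds the three fields of the classic .lzma header.
// The type and its field names are hypothetical, chosen for this sketch.
type lzmaHeader struct {
	Props    byte   // encodes (pb*5+lp)*9+lc, valid range 0..224
	DictSize uint32 // dictionary size, little-endian in the file
	Size     uint64 // uncompressed size, all ones means "unknown"
}

// parseLZMAHeader decodes the first 13 bytes of b. The only validity
// check the format permits is the range of the properties byte.
func parseLZMAHeader(b []byte) (lzmaHeader, error) {
	if len(b) < 13 {
		return lzmaHeader{}, errors.New("lzma header too short")
	}
	if b[0] > 224 {
		return lzmaHeader{}, errors.New("invalid properties byte")
	}
	return lzmaHeader{
		Props:    b[0],
		DictSize: binary.LittleEndian.Uint32(b[1:5]),
		Size:     binary.LittleEndian.Uint64(b[5:13]),
	}, nil
}

func main() {
	// A typical header: lc=3, lp=0, pb=2 (0x5D), 8 MiB dictionary, unknown size.
	good := []byte{0x5D, 0x00, 0x00, 0x80, 0x00,
		0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF}
	// The same header with one zero byte in front of it.
	bad := append([]byte{0x00}, good...)

	for _, b := range [][]byte{good, bad} {
		h, err := parseLZMAHeader(b)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("props=%d dict=%d\n", h.Props, h.DictSize)
	}
	// Prints:
	// props=93 dict=8388608
	// props=0 dict=2147483741
}
```

A decoder that allocates DictSize bytes right after this step would reserve about 2 GiB for the second input before any payload byte has been examined.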

Mitigations

The release v0.5.15 includes the following mitigations:

  • The ReaderConfig DictCap field is now interpreted as a limit for the dictionary size.
  • The default limit is 2 GiB minus 1 byte (2^31-1 bytes).
  • Users can call the Reader.Header method to check the actual values in their LZMA files and set a smaller limit using ReaderConfig.
  • The dictionary size will not exceed the larger of the file size and the minimum dictionary size. This is another measure to prevent huge memory allocations for the dictionary.
  • The code supports stream sizes only up to one pebibyte (1024^5 bytes).

Note that the original v0.5.14 release had a compile error on 32-bit platforms, which is fixed in v0.5.15.
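
The limits listed above can be illustrated with a small standard-library sketch. This is not the package's implementation: planDictSize, the 4 KiB minimum dictionary size, the order of the checks, and the choice to reject (rather than clamp) a dictionary size above the limit are all assumptions made for illustration. Only the 2^31-1 default and the 1024^5 stream size bound come from the advisory.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	minDictSize    uint64 = 1 << 12    // assumed minimum dictionary size (4 KiB)
	defaultDictCap uint64 = 1<<31 - 1  // default limit named in the advisory
	maxStreamSize  uint64 = 1 << 50    // one pebibyte (1024^5 bytes)
	unknownSize    uint64 = ^uint64(0) // header value meaning "size not stored"
)

// planDictSize returns how many bytes a decoder following these rules
// would allocate for the dictionary, or an error if a limit is exceeded.
// A dictCap of 0 selects the default limit.
func planDictSize(headerDict uint32, streamSize, dictCap uint64) (uint64, error) {
	if dictCap == 0 {
		dictCap = defaultDictCap
	}
	if streamSize != unknownSize && streamSize > maxStreamSize {
		return 0, errors.New("stream size exceeds one pebibyte")
	}
	size := uint64(headerDict)
	if size < minDictSize {
		size = minDictSize
	}
	// With a known stream size, the dictionary never needs to be larger
	// than max(streamSize, minDictSize).
	if streamSize != unknownSize {
		limit := streamSize
		if limit < minDictSize {
			limit = minDictSize
		}
		if size > limit {
			size = limit
		}
	}
	if size > dictCap {
		return 0, errors.New("dictionary size exceeds configured limit")
	}
	return size, nil
}

func main() {
	// Intact header: 8 MiB dictionary, 100-byte payload.
	fmt.Println(planDictSize(8<<20, 100, 0))
	// Header shifted by a prepended zero byte: both fields are implausible.
	fmt.Println(planDictSize(0x8000005D, 0xFFFFFFFFFFFFFF00, 0))
}
```

Under these assumptions the first call settles on 4096 bytes and the second is rejected before any large allocation takes place.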

Methods affected

Only software that uses lzma.NewReader or lzma.ReaderConfig.NewReader is affected. There is no issue for software using the xz functionality.

I thank @GregoryBuligin for his report, which is provided below.

Summary

When unpacking a large number of LZMA archives, even in a single goroutine, if the first byte of the archive file is 0 (a zero byte added to the beginning), an error writeMatch: distance out of range occurs. Memory consumption spikes sharply, and the GC clearly cannot handle this situation.

Details

Judging by the error writeMatch: distance out of range, the problems occur in the code around this function.

return errors.New("writeMatch: distance out of range")

PoC

Run a function similar to this one in 1 or several goroutines on a multitude of LZMA archives that have a 0 (a zero byte) added to the beginning.

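// The package name is a placeholder; the report shows only the function.
package poc

import (
	"bufio"
	"crypto/sha256"
	"encoding/hex"
	"io"
	"os"
	"path/filepath"

	"github.com/ulikunitz/xz/lzma"
)

const ProjectLocalPath = "some/path"
const TmpDir = "tmp"

// TmpLZMAPrefix and DirPerm are used below but not defined in the report;
// these values are placeholders.
const TmpLZMAPrefix = "lzma-"
const DirPerm = 0o755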

func UnpackLZMA(lzmaFile string) error {
	file, err := os.Open(lzmaFile)
	if err != nil {
		return err
	}
	defer file.Close()

	reader, err := lzma.NewReader(bufio.NewReader(file))
	if err != nil {
		return err
	}

	tmpFile, err := os.CreateTemp(TmpDir, TmpLZMAPrefix)
	if err != nil {
		return err
	}
	defer func() {
		tmpFile.Close()
		_ = os.Remove(tmpFile.Name())
	}()

	sha256Hasher := sha256.New()
	multiWriter := io.MultiWriter(tmpFile, sha256Hasher)

	if _, err = io.Copy(multiWriter, reader); err != nil {
		return err
	}

	unpackHash := hex.EncodeToString(sha256Hasher.Sum(nil))
	unpackDir := filepath.Join(
		ProjectLocalPath, unpackHash[:2],
	)
	_ = os.MkdirAll(unpackDir, DirPerm)

	unpackPath := filepath.Join(unpackDir, unpackHash)

	return os.Rename(tmpFile.Name(), unpackPath)
}

Impact

Servers with a small amount of RAM that download and unpack a large number of unverified LZMA archives are affected.

Severity

Moderate

CVSS v3 base metrics

Attack vector
Network
Attack complexity
Low
Privileges required
None
User interaction
None
Scope
Unchanged
Confidentiality
None
Integrity
None
Availability
Low

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L

CVE ID

CVE-2025-58058

Weaknesses

Allocation of Resources Without Limits or Throttling

The product allocates a reusable resource or group of resources on behalf of an actor without imposing any restrictions on the size or number of resources that can be allocated, in violation of the intended security policy for that actor.