
Global OOB read in utf8proc when using DECOMPOSE with CHARBOUND in utf8proc_map #310

@hgarrereyn

Hi, there is a potential bug in utf8proc_map reachable with options UTF8PROC_CHARBOUND | UTF8PROC_DECOMPOSE.

This bug was reproduced on 5e52818.

Description

AddressSanitizer reports a global-buffer-overflow read at utf8proc.c:237 in unsafe_get_property(), specifically reading 2 bytes before the start of utf8proc_stage1table. The call stack shows this occurs during utf8proc_decompose_custom called from utf8proc_map_custom. The crash is triggered with a trivial input string "N" and the option combination UTF8PROC_DECOMPOSE | UTF8PROC_CHARBOUND (with UTF8PROC_NULLTERM).

Internally, unsafe_get_property is called with the argument -1. The function assumes its argument satisfies uc >= 0 && uc < 0x110000, so a negative value indexes before the start of utf8proc_stage1table and reads out of bounds.
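
The "2 bytes before" offset in the report can be checked with a little arithmetic. The sketch below is not the library's code: it assumes the first-stage index is computed as uc >> 8 and that table entries are 16 bits wide (both consistent with the 2-byte read and the 8704-byte table size in the ASan output). The names stage1_entry_t, stage1_index, stage1_byte_offset, and in_valid_range are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for the element type of utf8proc_stage1table (assumed 16-bit,
// matching the 2-byte read and 8704-byte table size in the ASan report).
using stage1_entry_t = std::uint16_t;

// Assumed first-stage index: the code point shifted right by 8 bits.
// The shift is arithmetic on mainstream compilers (guaranteed since C++20),
// so -1 >> 8 stays -1.
constexpr std::int32_t stage1_index(std::int32_t uc) { return uc >> 8; }

// Byte offset of the accessed entry relative to the start of the table.
constexpr std::ptrdiff_t stage1_byte_offset(std::int32_t uc) {
    return static_cast<std::ptrdiff_t>(stage1_index(uc)) *
           static_cast<std::ptrdiff_t>(sizeof(stage1_entry_t));
}

// The precondition quoted in the report.
constexpr bool in_valid_range(std::int32_t uc) {
    return uc >= 0 && uc < 0x110000;
}

// -1 violates the precondition and lands 2 bytes before the table.
static_assert(!in_valid_range(-1), "-1 is outside the assumed range");
static_assert(stage1_byte_offset(-1) == -2, "one entry before the table start");

// The largest valid code point needs 0x1100 entries, i.e. 8704 bytes.
static_assert((stage1_index(0x10FFFF) + 1) * sizeof(stage1_entry_t) == 8704,
              "matches the table size reported by ASan");
```

Under these assumptions, any value that fails in_valid_range would need to be filtered out before the table lookup.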

I noticed that the oss-fuzz harness never tests this combination of options. I also couldn't find any documentation that explicitly mentions interactions between the two options, or any indication that they cannot be used together.
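
Until the combination is either fixed or documented as unsupported, a caller could screen its option mask before calling utf8proc_map. This is only a caller-side stopgap sketch, not library behavior. The constants below are local stand-ins whose values are assumptions; real code should use UTF8PROC_NULLTERM, UTF8PROC_DECOMPOSE, and UTF8PROC_CHARBOUND from utf8proc.h. The helper name is_reported_crash_combo is hypothetical.

```cpp
#include <cstdint>

// Local stand-ins for the utf8proc option bits (values are assumptions for
// this illustration; take the real ones from utf8proc.h).
constexpr std::uint32_t OPT_NULLTERM  = 1u << 0;
constexpr std::uint32_t OPT_DECOMPOSE = 1u << 4;
constexpr std::uint32_t OPT_CHARBOUND = 1u << 11;

// Hypothetical helper: true when the mask contains both options that the
// report above combines to trigger the out-of-bounds read.
constexpr bool is_reported_crash_combo(std::uint32_t options) {
    return (options & OPT_DECOMPOSE) != 0 && (options & OPT_CHARBOUND) != 0;
}

// The mask used by the testcase is flagged; each option alone is not.
static_assert(is_reported_crash_combo(OPT_NULLTERM | OPT_CHARBOUND | OPT_DECOMPOSE),
              "testcase mask is flagged");
static_assert(!is_reported_crash_combo(OPT_NULLTERM | OPT_DECOMPOSE),
              "DECOMPOSE alone is not flagged");
static_assert(!is_reported_crash_combo(OPT_NULLTERM | OPT_CHARBOUND),
              "CHARBOUND alone is not flagged");
```

A caller would check the mask first and return its own error instead of invoking utf8proc_map when the helper returns true.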

POC

The following testcase demonstrates the bug:

testcase.cpp

#include <cstdlib>
extern "C" {
#include "/fuzz/install/include/utf8proc.h"
}
int main(){
  const char *s = "N"; // simple valid UTF-8, processed as NUL-terminated
  utf8proc_uint8_t *dst = nullptr;
  utf8proc_option_t opt = (utf8proc_option_t)(UTF8PROC_NULLTERM | UTF8PROC_CHARBOUND | UTF8PROC_DECOMPOSE);
  // Triggers global OOB read in unsafe_get_property via utf8proc_decompose_custom
  utf8proc_ssize_t r = utf8proc_map((const utf8proc_uint8_t*)s, 0, &dst, opt);
  if (r >= 0) free(dst);
  return 0;
}

stdout

=================================================================
==1==ERROR: AddressSanitizer: global-buffer-overflow on address 0x558c5244417e at pc 0x558c523d2e9c bp 0x7ffdab9d9b10 sp 0x7ffdab9d9b08
READ of size 2 at 0x558c5244417e thread T0
    #0 0x558c523d2e9b in utf8proc_decompose_custom (/fuzz/test+0x109e9b) (BuildId: 04b67bec2b7b726c20ddba6fa36181034d7a5d18)
    #1 0x558c523d420b in utf8proc_map_custom (/fuzz/test+0x10b20b) (BuildId: 04b67bec2b7b726c20ddba6fa36181034d7a5d18)
    #2 0x558c523d04ea in main /fuzz/testcase.cpp:10:24
    #3 0x7f5dd2082d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #4 0x7f5dd2082e3f in __libc_start_main csu/../csu/libc-start.c:392:3
    #5 0x558c522f52e4 in _start (/fuzz/test+0x2c2e4) (BuildId: 04b67bec2b7b726c20ddba6fa36181034d7a5d18)

0x558c5244417e is located 2 bytes before global variable 'utf8proc_stage1table' defined in '/fuzz/src/utf8proc.c' (0x558c52444180) of size 8704
0x558c5244417e is located 23166 bytes after global variable 'utf8proc_stage2table' defined in '/fuzz/src/utf8proc.c' (0x558c52427d00) of size 92672
SUMMARY: AddressSanitizer: global-buffer-overflow (/fuzz/test+0x109e9b) (BuildId: 04b67bec2b7b726c20ddba6fa36181034d7a5d18) in utf8proc_decompose_custom
Shadow bytes around the buggy address:
  0x558c52443e80: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x558c52443f00: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x558c52443f80: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x558c52444000: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x558c52444080: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
=>0x558c52444100: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9[f9]
  0x558c52444180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x558c52444200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x558c52444280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x558c52444300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x558c52444380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==1==ABORTING

stderr


Steps to Reproduce

The crash was triaged with the following Dockerfile:

Dockerfile

# Ubuntu 22.04 with some packages pre-installed
FROM hgarrereyn/stitch_repro_base@sha256:3ae94cdb7bf2660f4941dc523fe48cd2555049f6fb7d17577f5efd32a40fdd2c

RUN git clone https://github.com/JuliaStrings/utf8proc /fuzz/src && \
    cd /fuzz/src && \
    git checkout 5e5281850f8603f7c97decca53c0d132f2e4826a && \
    git submodule update --init --remote --recursive

ENV LD_LIBRARY_PATH=/fuzz/install/lib
ENV ASAN_OPTIONS=hard_rss_limit_mb=1024:detect_leaks=0

RUN echo '#!/bin/bash\nexec clang-17 -fsanitize=address -O0 "$@"' > /usr/local/bin/clang_wrapper && \
    chmod +x /usr/local/bin/clang_wrapper && \
    echo '#!/bin/bash\nexec clang++-17 -fsanitize=address -O0 "$@"' > /usr/local/bin/clang_wrapper++ && \
    chmod +x /usr/local/bin/clang_wrapper++

# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends cmake ninja-build make pkg-config && rm -rf /var/lib/apt/lists/*

# Configure, build, and install utf8proc as a static library
WORKDIR /fuzz
RUN cmake -S /fuzz/src -B build -G Ninja \
    -DCMAKE_C_COMPILER=clang_wrapper \
    -DCMAKE_CXX_COMPILER=clang_wrapper++ \
    -DCMAKE_INSTALL_PREFIX=/fuzz/install \
    -DBUILD_SHARED_LIBS=OFF \
    -DUTF8PROC_ENABLE_TESTING=OFF \
    -DUTF8PROC_INSTALL=ON
RUN cmake --build build --target install --config Release

Build Command

clang++-17 -fsanitize=address -g -O0 -o /fuzz/test /fuzz/testcase.cpp -I/fuzz/install/include -L/fuzz/install/lib -lutf8proc && /fuzz/test

Reproduce

  1. Copy Dockerfile and testcase.cpp into a local folder.
  2. Build the repro image:
docker build . -t repro --platform=linux/amd64
  3. Compile and run the testcase in the image:
docker run \
    -it --rm \
    --platform linux/amd64 \
    --mount type=bind,source="$(pwd)/testcase.cpp",target=/fuzz/testcase.cpp \
    repro \
    bash -c "clang++-17 -fsanitize=address -g -O0 -o /fuzz/test /fuzz/testcase.cpp -I/fuzz/install/include -L/fuzz/install/lib -lutf8proc && /fuzz/test"


Additional Info

This testcase was discovered by STITCH, an autonomous fuzzing system. All reports are reviewed manually (by a human) before submission.
