Description
Hi, there is a potential bug in utf8proc_map reachable with options UTF8PROC_CHARBOUND | UTF8PROC_DECOMPOSE.
This bug was reproduced on 5e52818.
AddressSanitizer reports a global-buffer-overflow read at utf8proc.c:237 in unsafe_get_property(), specifically reading 2 bytes before the start of utf8proc_stage1table. The call stack shows this occurs during utf8proc_decompose_custom called from utf8proc_map_custom. The crash is triggered with a trivial input string "N" and the option combination UTF8PROC_DECOMPOSE | UTF8PROC_CHARBOUND (with UTF8PROC_NULLTERM).
Internally, unsafe_get_property is called with the argument -1, but the function assumes its argument satisfies uc >= 0 && uc < 0x110000, so its table lookup reads out of bounds.
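The reported offset is consistent with a two-stage table lookup that right-shifts the code point to pick a stage-1 entry. The following is a minimal sketch, not utf8proc's actual code: the function names, the 8-bit block shift, and the 16-bit entry width are assumptions chosen to match the 2-byte read that AddressSanitizer reports. It shows how an argument of -1 produces a stage-1 index of -1, and how a check of the stated precondition would reject it.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout: 16-bit stage-1 entries, one entry per 256 code points.
// These values are assumptions for illustration, not taken from utf8proc.
constexpr int kBlockShift = 8;
constexpr std::int32_t kCodepointLimit = 0x110000;

// Precondition stated in the report: uc >= 0 && uc < 0x110000.
constexpr bool is_valid_codepoint(std::int32_t uc) {
    return uc >= 0 && uc < kCodepointLimit;
}

// Stage-1 index that an unchecked lookup would compute.
// Right-shifting a negative value is an arithmetic shift on mainstream
// compilers (and guaranteed since C++20), so -1 >> 8 stays -1.
constexpr std::ptrdiff_t stage1_index(std::int32_t uc) {
    return static_cast<std::ptrdiff_t>(uc >> kBlockShift);
}

// Byte offset of that entry relative to the start of the stage-1 table.
constexpr std::ptrdiff_t stage1_byte_offset(std::int32_t uc) {
    return stage1_index(uc) * static_cast<std::ptrdiff_t>(sizeof(std::uint16_t));
}
```

Under these assumptions, stage1_byte_offset(-1) evaluates to -2, which is 2 bytes before the table, while is_valid_codepoint(-1) is false and is_valid_codepoint('N') is true.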
I noticed that the oss-fuzz harness never tests this combination of options. I also couldn't find any documentation that explicitly describes interactions between the two options, or any indication that they cannot be used together.
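One way a harness could reach such combinations is to derive the option mask from the fuzz input itself. The sketch below is hypothetical and not taken from the oss-fuzz harness: select_options and its parameters are illustrative names, and the caller is assumed to pass the real UTF8PROC_* flag values in flags.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: OR together the flags whose position is set in
// `selector`. Bit i of `selector` enables flags[i]; entries past the
// 32nd are ignored because `selector` has only 32 bits.
std::uint32_t select_options(const std::vector<std::uint32_t>& flags,
                             std::uint32_t selector) {
    std::uint32_t mask = 0;
    for (std::size_t i = 0; i < flags.size() && i < 32; ++i) {
        if ((selector >> i) & 1u) {
            mask |= flags[i];
        }
    }
    return mask;
}
```

A harness could read selector from the first bytes of the fuzz input and cast the result to utf8proc_option_t before calling utf8proc_map, so that pairs such as UTF8PROC_CHARBOUND | UTF8PROC_DECOMPOSE are eventually exercised.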
POC
The following testcase demonstrates the bug:
testcase.cpp
#include <cstdlib>
extern "C" {
#include "/fuzz/install/include/utf8proc.h"
}
int main(){
    const char *s = "N"; // simple valid UTF-8, processed as NUL-terminated
    utf8proc_uint8_t *dst = nullptr;
    utf8proc_option_t opt = (utf8proc_option_t)(UTF8PROC_NULLTERM | UTF8PROC_CHARBOUND | UTF8PROC_DECOMPOSE);
    // Triggers global OOB read in unsafe_get_property via utf8proc_decompose_custom
    utf8proc_ssize_t r = utf8proc_map((const utf8proc_uint8_t*)s, 0, &dst, opt);
    if (r >= 0) free(dst);
    return 0;
}
stdout
=================================================================
==1==ERROR: AddressSanitizer: global-buffer-overflow on address 0x558c5244417e at pc 0x558c523d2e9c bp 0x7ffdab9d9b10 sp 0x7ffdab9d9b08
READ of size 2 at 0x558c5244417e thread T0
#0 0x558c523d2e9b in utf8proc_decompose_custom (/fuzz/test+0x109e9b) (BuildId: 04b67bec2b7b726c20ddba6fa36181034d7a5d18)
#1 0x558c523d420b in utf8proc_map_custom (/fuzz/test+0x10b20b) (BuildId: 04b67bec2b7b726c20ddba6fa36181034d7a5d18)
#2 0x558c523d04ea in main /fuzz/testcase.cpp:10:24
#3 0x7f5dd2082d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#4 0x7f5dd2082e3f in __libc_start_main csu/../csu/libc-start.c:392:3
#5 0x558c522f52e4 in _start (/fuzz/test+0x2c2e4) (BuildId: 04b67bec2b7b726c20ddba6fa36181034d7a5d18)
0x558c5244417e is located 2 bytes before global variable 'utf8proc_stage1table' defined in '/fuzz/src/utf8proc.c' (0x558c52444180) of size 8704
0x558c5244417e is located 23166 bytes after global variable 'utf8proc_stage2table' defined in '/fuzz/src/utf8proc.c' (0x558c52427d00) of size 92672
SUMMARY: AddressSanitizer: global-buffer-overflow (/fuzz/test+0x109e9b) (BuildId: 04b67bec2b7b726c20ddba6fa36181034d7a5d18) in utf8proc_decompose_custom
Shadow bytes around the buggy address:
0x558c52443e80: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
0x558c52443f00: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
0x558c52443f80: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
0x558c52444000: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
0x558c52444080: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
=>0x558c52444100: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9[f9]
0x558c52444180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x558c52444200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x558c52444280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x558c52444300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x558c52444380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==1==ABORTING
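The offsets in the report can be cross-checked from the addresses it prints. The sketch below is only arithmetic on values copied from the log above; the constant and function names are illustrative.

```cpp
#include <cstdint>

// Values copied from the AddressSanitizer report.
constexpr std::uint64_t kFaultAddr   = 0x558c5244417eULL;
constexpr std::uint64_t kStage1Start = 0x558c52444180ULL;
constexpr std::uint64_t kStage2Start = 0x558c52427d00ULL;
constexpr std::uint64_t kStage2Size  = 92672;
constexpr std::uint64_t kShadowRow   = 0x558c52444100ULL; // row marked "=>"

// Distance from the faulting address up to the start of utf8proc_stage1table.
constexpr std::uint64_t bytes_before_stage1() {
    return kStage1Start - kFaultAddr;
}

// Distance from the end of utf8proc_stage2table to the faulting address.
constexpr std::uint64_t bytes_after_stage2() {
    return kFaultAddr - (kStage2Start + kStage2Size);
}

// Zero-based position of the shadow byte within the marked row,
// given that one shadow byte represents 8 application bytes.
constexpr std::uint64_t shadow_index_in_row() {
    return (kFaultAddr - kShadowRow) / 8;
}
```

These evaluate to 2, 23166, and 15, matching the "2 bytes before" and "23166 bytes after" lines and the bracketed last byte of the marked shadow row.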
stderr
Steps to Reproduce
The crash was triaged with the following Dockerfile:
Dockerfile
# Ubuntu 22.04 with some packages pre-installed
FROM hgarrereyn/stitch_repro_base@sha256:3ae94cdb7bf2660f4941dc523fe48cd2555049f6fb7d17577f5efd32a40fdd2c
RUN git clone https://github.com/JuliaStrings/utf8proc /fuzz/src && \
cd /fuzz/src && \
git checkout 5e5281850f8603f7c97decca53c0d132f2e4826a && \
git submodule update --init --remote --recursive
ENV LD_LIBRARY_PATH=/fuzz/install/lib
ENV ASAN_OPTIONS=hard_rss_limit_mb=1024:detect_leaks=0
RUN echo '#!/bin/bash\nexec clang-17 -fsanitize=address -O0 "$@"' > /usr/local/bin/clang_wrapper && \
chmod +x /usr/local/bin/clang_wrapper && \
echo '#!/bin/bash\nexec clang++-17 -fsanitize=address -O0 "$@"' > /usr/local/bin/clang_wrapper++ && \
chmod +x /usr/local/bin/clang_wrapper++
# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends cmake ninja-build make pkg-config && rm -rf /var/lib/apt/lists/*
# Configure, build, and install utf8proc as a static library
WORKDIR /fuzz
RUN cmake -S /fuzz/src -B build -G Ninja \
-DCMAKE_C_COMPILER=clang_wrapper \
-DCMAKE_CXX_COMPILER=clang_wrapper++ \
-DCMAKE_INSTALL_PREFIX=/fuzz/install \
-DBUILD_SHARED_LIBS=OFF \
-DUTF8PROC_ENABLE_TESTING=OFF \
-DUTF8PROC_INSTALL=ON
RUN cmake --build build --target install --config Release
Build Command
clang++-17 -fsanitize=address -g -O0 -o /fuzz/test /fuzz/testcase.cpp -I/fuzz/install/include -L/fuzz/install/lib -lutf8proc && /fuzz/test
Reproduce
- Copy Dockerfile and testcase.cpp into a local folder.
- Build the repro image:
docker build . -t repro --platform=linux/amd64
- Compile and run the testcase in the image:
docker run \
-it --rm \
--platform linux/amd64 \
--mount type=bind,source="$(pwd)/testcase.cpp",target=/fuzz/testcase.cpp \
repro \
bash -c "clang++-17 -fsanitize=address -g -O0 -o /fuzz/test /fuzz/testcase.cpp -I/fuzz/install/include -L/fuzz/install/lib -lutf8proc && /fuzz/test"
Additional Info
This testcase was discovered by STITCH, an autonomous fuzzing system. All reports are reviewed manually (by a human) before submission.