75 changes: 75 additions & 0 deletions README.md
@@ -542,6 +542,81 @@ To learn more about model quantization, [read this documentation](tools/quantize
- [Performance troubleshooting](docs/development/token_generation_performance_tips.md)
- [GGML tips & tricks](https://github.com/ggml-org/llama.cpp/wiki/GGML-Tips-&-Tricks)

#### Testing

##### Memory Leak Testing

The repository includes comprehensive memory leak regression tests to ensure proper memory management across various lifecycle scenarios. These tests go beyond the existing AddressSanitizer (ASan) integration by providing dedicated leak detection test suites.

**Running with AddressSanitizer:**

The primary memory leak detection mechanism uses AddressSanitizer, which is configured as a build option:

```bash
# Build with AddressSanitizer enabled
cmake -B build -DLLAMA_SANITIZE_ADDRESS=ON -DCMAKE_BUILD_TYPE=Debug
cmake --build build

# Run the memory leak regression tests
cd build
ctest -R test-memory-leaks --output-on-failure

# Or run directly
./bin/test-memory-leaks
```

Other available sanitizers:
- `LLAMA_SANITIZE_THREAD=ON` - Detects data races (note: runs without OpenMP)
- `LLAMA_SANITIZE_UNDEFINED=ON` - Detects undefined behavior
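The same build-option pattern applies to the other sanitizers. A sketch, assuming separate build trees per sanitizer (sanitizers generally should not be combined in one build):

```shell
# ThreadSanitizer build (data races; OpenMP is disabled in this configuration)
cmake -B build-tsan -DLLAMA_SANITIZE_THREAD=ON -DCMAKE_BUILD_TYPE=Debug
cmake --build build-tsan

# UndefinedBehaviorSanitizer build
cmake -B build-ubsan -DLLAMA_SANITIZE_UNDEFINED=ON -DCMAKE_BUILD_TYPE=Debug
cmake --build build-ubsan
```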

**Running with Valgrind:**

Optional Valgrind integration is available for additional leak checking:

```bash
# Build the tests (Valgrind target is automatically configured if valgrind is installed)
cmake -B build
cmake --build build

# Run memory leak tests with Valgrind
cd build
make test-valgrind
```

The Valgrind target runs with a strict set of leak-detection flags:
- `--leak-check=full` - Detailed leak information
- `--show-leak-kinds=all` - Reports all leak types
- `--track-origins=yes` - Tracks origin of uninitialized values
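Equivalently, the test binary can be run under Valgrind by hand with the same flags (binary path is relative to the build's test directory and may differ in your tree):

```shell
valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes \
         --error-exitcode=1 ./test-memory-leaks
```

The `--error-exitcode=1` flag makes Valgrind's exit status reflect detected errors, so the invocation can gate CI jobs.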

**Test Coverage:**

The `test-memory-leaks.cpp` suite includes 10 tests covering:

1. **Backend initialization cycles** - Repeated `llama_backend_init()` / `llama_backend_free()` cycles
2. **Model load/unload cycles** - Repeated model loading and cleanup (10 iterations)
3. **Context lifecycle** - Context creation and destruction patterns (10 iterations)
4. **Multiple contexts per model** - Creating multiple contexts from the same model (5 contexts)
5. **Sampler lifecycle** - Sampler creation, chain operations, and cleanup
6. **Batch operations** - Batch allocation and deallocation patterns
7. **KV cache clearing** - Memory clearing operations on contexts
8. **Threaded contexts** - Concurrent model usage with multiple threads
9. **Model load cancellation** - Cleanup when canceling model loading mid-process
10. **Error condition cleanup** - Proper cleanup when operations fail (e.g., invalid model path)

All tests follow proper cleanup order: sampler → context → model → backend.
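The cleanup order above can be sketched against the llama.cpp C API. This is a minimal illustration, not the test code itself; the model path is a placeholder, and the function names assume the current public `llama.h`:

```cpp
#include "llama.h"

int main() {
    llama_backend_init();

    // hypothetical path; the real tests read it from LLAMACPP_TEST_MODELFILE
    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_model_load_from_file("/path/to/model.gguf", mparams);
    if (model == NULL) {
        llama_backend_free();
        return 1;
    }

    llama_context_params cparams = llama_context_default_params();
    llama_context * ctx = llama_init_from_model(model, cparams);

    llama_sampler * smpl = llama_sampler_chain_init(llama_sampler_chain_default_params());
    llama_sampler_chain_add(smpl, llama_sampler_init_greedy());

    // ... exercise the model ...

    // teardown in the order the tests enforce: sampler -> context -> model -> backend
    llama_sampler_free(smpl);
    llama_free(ctx);
    llama_model_free(model);
    llama_backend_free();
    return 0;
}
```

Freeing in the reverse order of creation ensures no object outlives a resource it depends on, which is exactly what the leak tests exercise across repeated cycles.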

**Environment Variables:**

- `LLAMACPP_TEST_MODELFILE` - Path to test model file (required for running tests)
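For example, pointing the suite at a local GGUF file (the path shown is a placeholder):

```shell
# Run through ctest
LLAMACPP_TEST_MODELFILE=/path/to/test-model.gguf ctest -R test-memory-leaks --output-on-failure

# Or invoke the binary directly
LLAMACPP_TEST_MODELFILE=/path/to/test-model.gguf ./bin/test-memory-leaks
```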

**Continuous Integration:**

The GitHub Actions CI automatically runs all tests with all three sanitizers (ADDRESS, THREAD, UNDEFINED) on every pull request, so memory issues are caught before they are merged.

**Known Issues:**

- `test-opt.cpp` is currently disabled with `LLAMA_SANITIZE_ADDRESS` due to a known memory leak in `ggml_opt_alloc()` called within a loop (see `tests/test-opt.cpp:300`)

#### Seminal papers and background on the models

If your issue is with model generation quality, then please at least scan the following links and papers to understand the limitations of LLaMA models. This is especially important when choosing an appropriate model size and appreciating both the significant and subtle differences between LLaMA models and ChatGPT:
21 changes: 21 additions & 0 deletions tests/CMakeLists.txt
@@ -201,6 +201,7 @@ llama_build_and_test(test-backend-ops.cpp)

llama_build_and_test(test-model-load-cancel.cpp LABEL "model")
llama_build_and_test(test-autorelease.cpp LABEL "model")
llama_build_and_test(test-memory-leaks.cpp LABEL "model")

if (NOT GGML_BACKEND_DL)
# these tests use the backends directly and cannot be built with dynamic loading
@@ -219,3 +220,23 @@ target_link_libraries(${LLAMA_TEST_NAME} PRIVATE mtmd)
get_filename_component(TEST_TARGET test-c.c NAME_WE)
add_executable(${TEST_TARGET} test-c.c)
target_link_libraries(${TEST_TARGET} PRIVATE llama)

# Optional Valgrind target for memory leak checking
find_program(VALGRIND_EXECUTABLE valgrind)
if(VALGRIND_EXECUTABLE)
    add_custom_target(test-valgrind
        COMMAND ${VALGRIND_EXECUTABLE}
            --leak-check=full
            --show-leak-kinds=all
            --track-origins=yes
            --error-exitcode=1
            ${CMAKE_CURRENT_BINARY_DIR}/test-memory-leaks
        DEPENDS test-memory-leaks
        COMMENT "Running memory leak tests with Valgrind"
        WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
    )
    message(STATUS "Valgrind found: ${VALGRIND_EXECUTABLE}")
    message(STATUS "Run 'make test-valgrind' to check for memory leaks with Valgrind")
else()
    message(STATUS "Valgrind not found - install it for additional leak checking")
endif()