Add systematic error testing framework (AT-101) #15

devin-ai-integration · 2025-09-29T20:19:22Z

Make sure to read the contributing guidelines before submitting a PR

Summary

This PR addresses JIRA ticket AT-101 by expanding test coverage for error handling and edge cases. It creates a systematic error testing framework with three main components:

Memory exhaustion testing (test-memory-exhaustion.cpp) - Tests controlled OOM scenarios including small/medium/large allocations, buffer overflow, and recovery mechanisms
Invalid input validation (test-invalid-inputs.cpp) - Tests malformed tensor shapes, type mismatches, dimension limits, and parameter validation
Backend operations error scenarios - Extends existing test-backend-ops.cpp with error injection capabilities

Key Changes

New Test Files (377 lines total)

tests/test-memory-exhaustion.cpp - 6 systematic memory pressure scenarios with controlled failure injection
tests/test-invalid-inputs.cpp - 6 input validation scenarios testing tensor creation edge cases

Error Injection Infrastructure

ggml/src/ggml-alloc.c - Adds environment variable-based error injection:
- GGML_ALLOC_FAIL_THRESHOLD - Fail allocations >= specified byte size
- GGML_ALLOC_FAIL_COUNT - Fail after specified number of allocations
- Non-intrusive design that only activates when env vars are set

Extended Backend Testing

tests/test-backend-ops.cpp - Adds GGML_TEST_ERRORS environment variable to enable error injection test cases
tests/CMakeLists.txt - Registers new tests using existing llama_build_and_test pattern

Testing Results

All 38 existing tests continue to pass (0 regressions)
New tests pass all 12 combined scenarios (6 memory + 6 input validation)
Error injection properly integrates with existing tensor operations

Important Review Areas

🔴 Critical: The error injection mechanism modifies production code (ggml-alloc.c). While gated by environment variables, reviewers should evaluate whether this approach is acceptable or if a different testing strategy should be used.

🟡 Thread Safety: The error injection uses a static counter variable that could have race conditions in multi-threaded scenarios.

🟡 Test Isolation: Tests rely on global environment variables - verify that parallel test execution won't cause interference.

🟢 Integration: CMake integration follows existing patterns and all tests are properly registered.

Human Review Checklist

Evaluate if error injection in production code (ggml-alloc.c) is acceptable architecture
Review thread safety of static alloc_count variable in ggml_alloc_should_fail()
Verify test scenarios actually test meaningful error conditions vs superficial checks
Confirm environment variable cleanup prevents test interference
Check if new test files follow existing code style and patterns

Link to Devin run: https://app.devin.ai/sessions/55de95f979c14d71aa4b6b3125a2ccf1
Requested by: Alex Peng (@alexpeng-cognition)

JIRA: AT-101

- Create test-memory-exhaustion.cpp for controlled OOM scenarios * Tests small/medium/large allocation failures * Tests many allocations and buffer overflow scenarios * Includes recovery testing to verify normal operation * All 6 test scenarios passing - Create test-invalid-inputs.cpp for malformed tensor validation * Tests zero dimensions, shape mismatches, type incompatibility * Tests dimension limits and parameter validation * All 6 test scenarios passing - Extend test-backend-ops.cpp with error injection test cases * Adds GGML_TEST_ERRORS environment variable control * Enables testing allocation failures during tensor operations * Integrates seamlessly with existing test infrastructure - Add environment variable-based error injection in ggml-alloc.c * GGML_ALLOC_FAIL_THRESHOLD: Fail allocations >= specified size * GGML_ALLOC_FAIL_COUNT: Fail after specified number of allocations * Non-intrusive design that doesn't affect production code paths * Proper error logging for debugging - Update CMakeLists.txt with new test targets * Uses existing llama_build_and_test pattern * Tests run as part of main test suite * No regressions introduced (38/38 tests passing) Addresses JIRA ticket AT-101 for systematic error scenario testing. Tests follow handcrafted error pattern from test-gguf.cpp with clear scenario naming and expected outcomes. Co-Authored-By: Alex Peng <[email protected]>

devin-ai-integration · 2025-09-29T20:19:24Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

- Add portable setenv/unsetenv macros for Windows (_putenv_s) - Replace all setenv/unsetenv calls with portable versions - Fix trailing whitespace issues for editorconfig compliance This fixes Windows and macOS build failures caused by POSIX-only functions. Co-Authored-By: Alex Peng <[email protected]>

github-actions bot added testing ggml labels Sep 29, 2025

jakexcosme mentioned this pull request Oct 22, 2025

Eval bug: HIP gfx908 (MI100) cublass error when prompt is too long. #154

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add systematic error testing framework (AT-101) #15

Add systematic error testing framework (AT-101) #15

Uh oh!

devin-ai-integration bot commented Sep 29, 2025

Uh oh!

devin-ai-integration bot commented Sep 29, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants

Add systematic error testing framework (AT-101) #15

Are you sure you want to change the base?

Add systematic error testing framework (AT-101) #15

Uh oh!

Conversation

devin-ai-integration bot commented Sep 29, 2025

Summary

Key Changes

New Test Files (377 lines total)

Error Injection Infrastructure

Extended Backend Testing

Testing Results

Important Review Areas

Human Review Checklist

Uh oh!

devin-ai-integration bot commented Sep 29, 2025

🤖 Devin AI Engineer

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants