@MengAiDev

This pull request addresses a critical null pointer bug on the Windows platform within the ggml-sycl module. It also introduces improvements to memory operations for better stability and performance.

Key Changes:

  1. Null Pointer Check for Tensor Data:

    • Added a null pointer check for tensor data on the Windows platform, aligning it with similar checks on Linux. This prevents potential crashes caused by uninitialized or invalid tensor pointers.
    • Modified the memcpy operation logic to include this safety mechanism.
  2. Improved Buffer Clear Function:

    • Implemented a proper memset operation in the buffer clear function to ensure memory is correctly cleared before reuse, avoiding undefined behavior.
  3. New Helper Function: get_tensor:

    • Introduced a new function ggml_backend_sycl_buffer_get_tensor with an additional null pointer check to handle edge cases where tensor data might be null during retrieval.
  4. Enhanced Error Handling and Logging:

    • Improved error messages by including detailed file and line information when exceptions are caught.
    • Ensured consistent exit handling across SYCL operations for easier debugging.
  5. Code Refactoring:

    • Streamlined exception handling blocks to reduce redundancy and improve readability.
    • Updated logging mechanisms to provide more context during runtime operations, aiding in troubleshooting.

These changes collectively enhance the robustness of the ggml-sycl implementation, particularly on the Windows platform, while also improving overall code maintainability and debugging capabilities.
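As an illustration of the kind of guard described in items 1 and 3, here is a minimal host-only sketch. It does not use SYCL or the real ggml types: `demo_tensor` and `demo_get_tensor` are hypothetical names invented for this example, and `std::memcpy` stands in for the device-to-host copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a tensor; the real ggml_tensor has many more fields.
struct demo_tensor {
    void *      data;    // may be nullptr if no buffer was ever attached
    std::size_t nbytes;  // size of the data region in bytes
};

// Copies `size` bytes starting at `offset` from the tensor into `dst`.
// Returns false instead of touching a null pointer or an out-of-range region.
bool demo_get_tensor(const demo_tensor & t, void * dst,
                     std::size_t offset, std::size_t size) {
    if (t.data == nullptr || dst == nullptr) {
        return false;  // nothing valid to read from or write to
    }
    if (offset > t.nbytes || size > t.nbytes - offset) {
        return false;  // requested range does not fit inside the tensor
    }
    std::memcpy(dst, static_cast<const char *>(t.data) + offset, size);
    return true;
}
```

The sketch returns a status rather than aborting so that the caller decides how to react; whether a check of this kind belongs in the backend at all is a separate design question.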

MengAiDev added 2 commits June 20, 2025 13:00
- Replace generic error message with more specific information
- Use %zu format specifier for size_t type
- Add device ID to the error message
- Simplify memory allocation code
…perations

- Add null pointer check for tensor data on Windows platform
- Implement proper memset operation in buffer clear function
- Add get_tensor function and include null pointer check
- Improve error handling and logging in SYCL operations
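
The first commit's points about `%zu` and the device ID can be sketched as follows. This is an assumed shape only: `format_alloc_error` and the message wording are hypothetical and are not taken from the PR's diff. `%zu` is the standard `printf` conversion for `size_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: builds an allocation-failure message that names both
// the requested size (printed with %zu, the conversion for size_t) and the device.
std::string format_alloc_error(std::size_t size, int device) {
    char buf[128];
    std::snprintf(buf, sizeof(buf),
                  "failed to allocate %zu bytes on device %d", size, device);
    return std::string(buf);
}
```

Using `%zu` avoids the mismatch that occurs when a `size_t` is passed to `%d` or `%lu` on platforms where those types differ in width, which includes 64-bit Windows.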
@github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language) labels Jun 20, 2025
GGML_SYCL_DEBUG("[SYCL] call %s", __func__);
}

static void ggml_backend_sycl_buffer_get_tensor(ggml_backend_buffer_t buffer,
Collaborator

This function already exists

Collaborator

Author has "AI dev" in their name. :)

Collaborator

Yeah I'm just curious to see if I would get an answer :D

Comment on lines +437 to +440
<< ", line:" << __LINE__ << std::endl;
std::exit(1);
}

Collaborator

This doesn't make sense and won't compile

Comment on lines -1241 to +1248
- SYCL_CHECK(
-     CHECK_TRY_ERROR(ptr = (void *)sycl::malloc_device(
-         look_ahead_size, *qptr)));
+ void * ptr = sycl::malloc_device(look_ahead_size, *qptr);
Collaborator

What is the reason to remove these checks?
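
For context on what a wrapper of this kind buys, here is a rough host-only sketch of the pattern, not the actual `SYCL_CHECK` / `CHECK_TRY_ERROR` macros. `demo_try` and `demo_alloc_or_throw` are hypothetical names, and `std::malloc` stands in for `sycl::malloc_device`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <exception>
#include <new>
#include <stdexcept>

// Hypothetical wrapper: runs `f`, and if it throws, logs the message together
// with the call site and reports failure instead of letting the exception escape.
template <typename F>
bool demo_try(F && f, const char * file, int line) {
    try {
        f();
        return true;
    } catch (const std::exception & e) {
        std::fprintf(stderr, "%s\nException caught at file:%s, line:%d\n",
                     e.what(), file, line);
        return false;
    }
}

// Hypothetical allocator that signals failure by throwing, so that the
// wrapper above has something to catch.
void * demo_alloc_or_throw(std::size_t size) {
    if (size == 0) {
        throw std::invalid_argument("zero-sized allocation");
    }
    void * p = std::malloc(size);
    if (p == nullptr) {
        throw std::bad_alloc();
    }
    return p;
}
```

A call such as `demo_try([&] { ptr = demo_alloc_or_throw(n); }, __FILE__, __LINE__)` reports a failed allocation at the call site. Without the wrapper, an exception thrown by the allocator propagates up to whatever handler, if any, the caller has.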

Comment on lines +403 to +405
if (tensor->data == nullptr) {
    GGML_ABORT("Error: Tensor data pointer is null.\n");
}
Collaborator

The backends assume tensor->data is valid at this point so this does not seem needed.

@MengAiDev closed this Jun 26, 2025
3 participants