Fix segmentation fault in NLLLoss kernel #2111

yucai-intel · 2025-09-26T01:01:15Z

Fixes the following issues, found by test/test_nn.py::TestNNDeviceTypeXPU::test_nll_loss_large_tensor_reduction_mean_xpu and test_nll_loss_large_tensor_reduction_sum_xpu:

Segmentation faults caused by pointer type conversion errors that produce invalid memory addresses.
Kernel call errors caused by incorrect conditional checks.

yucai-intel · 2025-09-26T01:02:30Z

Linked issue: #2008

yucai-intel · 2025-09-26T01:04:33Z

Copilot

Pull Request Overview

This PR fixes segmentation faults and kernel call errors in the NLLLoss kernel implementation for XPU devices. The changes refactor the kernel functors to use safer memory access patterns and more consistent parameter ordering.

Key changes include:

Complete rewrite of kernel functors with improved memory safety and bounds checking
Simplified function signatures with reordered parameters for better consistency
Addition of proper index validation and overflow protection

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File	Description
src/ATen/native/xpu/sycl/LossNLLKernel.h	Updated function signatures to reorder parameters and use consistent naming
src/ATen/native/xpu/sycl/LossNLLKernel.cpp	Major refactor of kernel implementations with improved memory safety and bounds checking
src/ATen/native/xpu/sycl/KernelUtils.h	Added utility constants and functions for kernel execution
src/ATen/native/xpu/LossNLL.cpp	Updated function calls to match new kernel signatures


Copilot · 2025-09-30T01:38:46Z

src/ATen/native/xpu/sycl/LossNLLKernel.cpp

+      const index_t t = target[i];
+      if (t != ignore_index) {
+        CHECK_INDEX_IN_CLASS(t, n_classes);
+        const bwd_index_t index = static_cast<bwd_index_t>(i) * ndim + t;


The index calculation static_cast<bwd_index_t>(i) * ndim + t could potentially overflow for large tensors. Consider adding overflow checks or using safer arithmetic operations.
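
One way to act on this comment is to validate the flattened index on the host before choosing an index type for the kernel. The following is a minimal sketch, not code from this PR: `checked_flat_index` and `flat_index_fits` are hypothetical helpers, and `std::optional` is assumed to be used on the host side only. It computes `row * ndim + t` in 64-bit arithmetic and reports overflow instead of wrapping.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical host-side helper (not part of the PR): returns row * ndim + t,
// or std::nullopt if the result would not fit in int64_t.
inline std::optional<int64_t> checked_flat_index(
    int64_t row, int64_t ndim, int64_t t) {
  assert(row >= 0 && ndim > 0 && t >= 0);
  constexpr int64_t kMax = std::numeric_limits<int64_t>::max();
  // row * ndim + t <= kMax  <=>  row <= (kMax - t) / ndim  (integer division)
  if (row > (kMax - t) / ndim) {
    return std::nullopt;
  }
  return row * ndim + t;
}

// Hypothetical helper: true if the flattened index is representable in
// index_t, e.g. to decide between a 32-bit and a 64-bit kernel variant.
template <typename index_t>
bool flat_index_fits(int64_t row, int64_t ndim, int64_t t) {
  const auto idx = checked_flat_index(row, ndim, t);
  return idx.has_value() &&
      *idx <= static_cast<int64_t>(std::numeric_limits<index_t>::max());
}
```

With a check of this kind, a caller could select a wider index type for large tensors before launching the kernel rather than testing for overflow inside it.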

Copilot · 2025-09-30T01:38:46Z

src/ATen/native/xpu/sycl/LossNLLKernel.cpp

+                int64_t local_size =
+                    syclMaxWorkGroupSize<NllLossForwardReduce2DKernel>();


The local_size is calculated but then overridden by nthreads on lines 513-514. This could lead to inefficient kernel launches if nthreads doesn't match optimal work group sizes.
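
A possible way to reconcile the two values is to derive a single work-group size from both the device limit and the requested thread count. The sketch below is an assumption-laden illustration, not the PR's code: `pick_local_size` is a hypothetical helper, the device limit is passed in as a plain integer, and rounding down to a power of two is an illustrative policy rather than a requirement stated in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the PR): returns the largest power of two
// that does not exceed min(device_max, requested), and never less than 1.
inline int64_t pick_local_size(int64_t device_max, int64_t requested) {
  assert(device_max >= 1);
  const int64_t cap = std::max<int64_t>(1, std::min(device_max, requested));
  int64_t size = 1;
  while (size * 2 <= cap) {
    size *= 2;
  }
  return size;
}
```

Here `device_max` would come from a query such as the `syclMaxWorkGroupSize` call quoted above, and `requested` from the thread count the kernel wants to use.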

Copilot · 2025-09-30T01:38:47Z

src/ATen/native/xpu/sycl/LossNLLKernel.cpp

+                    nll_loss_threads(input.size(0)),
+                    nll_loss_threads(input.size(0)),


The function nll_loss_threads() is called twice with the same argument. Consider storing the result in a variable to avoid redundant computation.
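
The suggestion amounts to computing the thread count once and reusing it for both range arguments. A minimal sketch follows, under stated assumptions: `threads_stub` is a placeholder standing in for `nll_loss_threads` (whose real definition is not shown in this thread), its clamping rule is invented for illustration, and `LaunchRange` and `make_range` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Counts calls to the stub so the single evaluation can be observed.
static int g_thread_calls = 0;

// Placeholder for nll_loss_threads; the clamp to 1024 is an assumption
// made only so the example has something to compute.
inline int64_t threads_stub(int64_t batch) {
  ++g_thread_calls;
  return batch < 1024 ? batch : 1024;
}

// Hypothetical pair of range arguments passed to a kernel launch.
struct LaunchRange {
  int64_t global;
  int64_t local;
};

inline LaunchRange make_range(int64_t batch) {
  assert(batch >= 0);
  // Evaluate once, then reuse for both arguments.
  const int64_t nthreads = threads_stub(batch);
  return LaunchRange{nthreads, nthreads};
}
```

Besides avoiding the repeated call, naming the value makes it clear that both arguments are meant to be identical.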

yucai-intel · 2025-10-15T06:45:15Z

Pref

yucai-intel added 4 commits September 26, 2025 08:52

Update LossNLLKernel.cpp

72ec47a

Update LossNLLKernel.h

235618f

Update LossNLL.cpp

cdb78be

Update KernelUtils.h

f87f1e5

yucai-intel mentioned this pull request Sep 26, 2025

reduction got accuracy issue on large tensor cases #2008

Open

CuiYifeng requested a review from Copilot September 30, 2025 01:37

Copilot AI reviewed Sep 30, 2025

yucai-intel added 2 commits October 7, 2025 18:42

Update LossNLLKernel.cpp

5b4ce0e

Update KernelUtils.h

9006cfe

yucai-intel and others added 2 commits October 15, 2025 14:50

Update LossNLLKernel.cpp

adc4152

Merge branch 'main' into yucai/nll/fix

299e8b9
