Skip to content

Conversation

tiger100256-hu
Copy link
Contributor

Details:

  • when there are conflict indices, the last should be valid, but in the CPU plugin, the ScatterNDUpadte operation is done in multi thread by tbb, so it can't be sure the operation is done according to the indices order

Tickets:

@github-actions github-actions bot added the category: CPU OpenVINO CPU plugin label Aug 8, 2025
Signed-off-by: HU Yuan2 <[email protected]>
@tiger100256-hu tiger100256-hu force-pushed the huyuan/fix_scatter_nd_update branch 2 times, most recently from 582bdec to e6e9fb9 Compare August 8, 2025 07:26
@tiger100256-hu tiger100256-hu marked this pull request as ready for review August 11, 2025 04:53
@tiger100256-hu tiger100256-hu requested review from a team as code owners August 11, 2025 04:53
@yuxu42 yuxu42 requested review from Copilot and liubo-intel August 15, 2025 01:05
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a thread safety issue in the ScatterNDUpdate operation in the Intel CPU plugin where conflicting indices could produce incorrect results due to multi-threaded execution. The fix ensures that when multiple indices point to the same location, only the last update operation is applied, maintaining the expected behavior.

  • Introduces mutex-based synchronization to handle conflicting indices in multi-threaded execution
  • Adds tracking of the last valid update index to ensure correct ordering
  • Includes a new test case with conflicting indices to verify the fix

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
scatter_update.cpp Implements mutex-based synchronization and index tracking to handle conflicting indices correctly
scatter_ND_update.cpp Adds test case with conflicting indices to validate the fix

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.

@@ -1049,6 +1050,10 @@ void ScatterUpdate::scatterNDUpdate(const MemoryPtr& mem_data,
std::vector<size_t> srcBlockND = getBlockND(srcDataDim);

size_t k = indicesDim[indicesRank - 1];
// mutexVec and updatedIdxVec are used for idxValue conflicts,
// only the last idxValues of update vec is valid when there are idxValue conflicts,
std::vector<std::mutex> mutexVec(srcBlockND[0] / srcBlockND[k]);
Copy link
Preview

Copilot AI Aug 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a vector of mutexes equal to the number of slices may cause performance degradation and excessive memory usage for large tensors. Consider using a smaller pool of mutexes with a hash-based mapping to reduce overhead.

Suggested change
std::vector<std::mutex> mutexVec(srcBlockND[0] / srcBlockND[k]);
constexpr size_t NUM_MUTEXES = 1024; // or 256, tune as needed
std::vector<std::mutex> mutexVec(NUM_MUTEXES);

Copilot uses AI. Check for mistakes.

cpu_memcpy(dstData + dstOffset, update + updateOffset, sizeToUpdate);
updatedIdxVec[sliceIdx] = tupleIdx;
}
mutexVec[sliceIdx].unlock();
Copy link
Preview

Copilot AI Aug 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using manual mutex lock/unlock is error-prone and can lead to deadlocks if an exception occurs between lock and unlock. Use std::lock_guard or std::unique_lock for RAII-based exception-safe locking.

Suggested change
mutexVec[sliceIdx].unlock();
std::lock_guard<std::mutex> lock(mutexVec[sliceIdx]);
if (static_cast<int64_t>(tupleIdx) > updatedIdxVec[sliceIdx]) {
cpu_memcpy(dstData + dstOffset, update + updateOffset, sizeToUpdate);
updatedIdxVec[sliceIdx] = tupleIdx;
}

Copilot uses AI. Check for mistakes.

cpu_memcpy(dstData + dstOffset, update + updateOffset, sizeToUpdate);
updatedIdxVec[sliceIdx] = tupleIdx;
}
mutexVec[sliceIdx].unlock();
Copy link
Preview

Copilot AI Aug 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Manual unlock should be replaced with RAII-based locking (std::lock_guard) to ensure the mutex is properly unlocked even if an exception occurs.

Suggested change
mutexVec[sliceIdx].unlock();
{
std::lock_guard<std::mutex> lock(mutexVec[sliceIdx]);
if (static_cast<int64_t>(tupleIdx) > updatedIdxVec[sliceIdx]) {
cpu_memcpy(dstData + dstOffset, update + updateOffset, sizeToUpdate);
updatedIdxVec[sliceIdx] = tupleIdx;
}
}

Copilot uses AI. Check for mistakes.

@liubo-intel
Copy link
Contributor

liubo-intel commented Aug 15, 2025

Hi, @tiger100256-hu : thanks for root cause this puzzling CI testcase issue. and the method implemented in this pr looks like could solve this kind of CI failure phenomenon.
But from the spec of this 'ScatterNDUpdate' operator, it seems this kind of output various is expected. so maybe we just need to change the failed CI testcase to use Non-conflicting Indices?(and from my understanding, add mutex in the operator implement may affect its inference performance)

@tiger100256-hu
Copy link
Contributor Author

@liubo-intel thanks very much, I forget to see this spec. from the spec, openvino ScatterNDUpdate don't support conflict indices so, the convert maybe is not right here

, I will check the paddle frontend code and find a way to fix this issue.

@tiger100256-hu
Copy link
Contributor Author

close this pr, it 's not a correct way to fix this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: CPU OpenVINO CPU plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants