danhoeflinger
Contributor

@danhoeflinger danhoeflinger commented Aug 7, 2025

This PR switches from forwarding the policy after capturing it by value to capturing it by reference and passing it through unchanged.

Forwarding doesn't make sense here as the type may no longer match.

branched off from
https://github.com/uxlfoundation/oneDPL/pull/2369/files#r2254402955

Even capturing by reference (while still forwarding) would result in a use-after-move bug.

Signed-off-by: Dan Hoeflinger <[email protected]>
@danhoeflinger danhoeflinger changed the title from "fix bug with forwarding captured by value policy" to "Fix forwarded value captured policy" Aug 7, 2025
Signed-off-by: Dan Hoeflinger <[email protected]>
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes a critical bug in parallel set union operations where execution policies were being incorrectly forwarded within lambda captures. The fix prevents use-after-move errors by removing the forwarding of captured execution policies that would otherwise be moved multiple times.

  • Removed std::forward<_ExecutionPolicy>(__exec) calls from lambda captures in parallel set union operations
  • Changed all lambda captures to use the execution policy by value without forwarding
  • Applied the fix consistently across multiple conditional branches in the set union algorithm

@SergeyKopienko
Contributor

SergeyKopienko commented Aug 8, 2025

@danhoeflinger I think the code was indeed not quite correct before your changes.

Let's look at this code, for example:

template <class _IsVector, class _ExecutionPolicy, class _RandomAccessIterator1, class _RandomAccessIterator2,
          class _OutputIterator, class _Compare, class _SetUnionOp>
_OutputIterator
__parallel_set_union_op(__parallel_tag<_IsVector> __tag, _ExecutionPolicy&& __exec, _RandomAccessIterator1 __first1,
                        _RandomAccessIterator1 __last1, _RandomAccessIterator2 __first2, _RandomAccessIterator2 __last2,
                        _OutputIterator __result, _Compare __comp, _SetUnionOp __set_union_op)
{
    //...

    if (__left_bound_seq_1 == __last1)
    {
        //{1} < {2}: seq2 is wholly greater than seq1, so, do parallel copying seq1 and seq2
        __par_backend::__parallel_invoke(
            __backend_tag{}, ::std::forward<_ExecutionPolicy>(__exec),
            [=] {
                __internal::__pattern_walk2_brick(__tag, ::std::forward<_ExecutionPolicy>(__exec), __first1, __last1,
                                                  __result, __copy_range);
            },
            [=] {
                __internal::__pattern_walk2_brick(__tag, ::std::forward<_ExecutionPolicy>(__exec), __first2, __last2,
                                                  __result + __n1, __copy_range);
            });
        return __result + __n1 + __n2;
    }

    //...
}

As we can see, we call __par_backend::__parallel_invoke and pass several arguments to it.
One of them is ::std::forward<_ExecutionPolicy>(__exec): we forward the execution policy into this call.

But we also use __exec inside the other arguments:

            [=] {
                __internal::__pattern_walk2_brick(__tag, ::std::forward<_ExecutionPolicy>(__exec), __first1, __last1,
                                                  __result, __copy_range);
            },

This means that the state of __exec is used more than once in the arguments of a single __parallel_invoke call, so it should not be forwarded (moved) in any of them: the order in which function arguments are evaluated is unspecified.

So I think this PR in its current state doesn't fix this problem either.

I propose rewriting it as follows:

    if (__left_bound_seq_1 == __last1)
    {
        //{1} < {2}: seq2 is wholly greater than seq1, so, do parallel copying seq1 and seq2
        __par_backend::__parallel_invoke(
            __backend_tag{},
            __exec, // The real error was here, where we had std::forward<_ExecutionPolicy>(__exec)
            [=] {
                __internal::__pattern_walk2_brick(
                    __tag,
                    // As far as I understand, this std::forward isn't a real problem here, at least while we have [=]
                    std::forward<_ExecutionPolicy>(__exec),
                    __first1, __last1, __result, __copy_range);
            },
            [=] {
                __internal::__pattern_walk2_brick(
                    __tag,
                    // As far as I understand, this std::forward isn't a real problem here, at least while we have [=]
                    std::forward<_ExecutionPolicy>(__exec),
                    __first2, __last2, __result + __n1, __copy_range);
            });
        return __result + __n1 + __n2;
    }

Contributor

@SergeyKopienko SergeyKopienko left a comment

I think additional changes are required...
UPD: this PR probably isn't needed at all.

@akukanov
Contributor

akukanov commented Aug 8, 2025

Folks, where have you found "multiple forwards" in the original code?
We remember that std::forward is just a type cast, do not we? No matter in which order the function call parameters are evaluated, both lambdas get copies of the original policy object and then forward these copies to the invoked bricks, while the original object is forwarded to __parallel_invoke. What do you think is problematic here?
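
For readers following along, here is a minimal sketch of the point above. It assumes a hypothetical Policy type that counts copies and moves, plus hypothetical helpers invoke_both and caller standing in for __parallel_invoke and its caller; none of these names come from oneDPL. Each [=] closure holds its own copy of the policy, and std::forward on the first argument only casts, so the original object is untouched regardless of the order in which the arguments are evaluated.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for an execution policy; it counts copies and moves.
struct Policy
{
    int copies = 0;
    int moves = 0;
    Policy() = default;
    Policy(const Policy& other) : copies(other.copies + 1), moves(other.moves) {}
    Policy(Policy&& other) noexcept : copies(other.copies), moves(other.moves + 1) {}
};

// Hypothetical stand-in for __parallel_invoke: it receives the policy by
// forwarding reference and runs both callables. It never constructs a new
// Policy from `exec`, so nothing is actually moved.
template <class Exec, class F1, class F2>
int invoke_both(Exec&& exec, F1 f1, F2 f2)
{
    (void)exec;
    return f1() + f2();
}

template <class Exec>
int caller(Exec&& exec)
{
    // [=] copies `exec` into each closure when the lambda expression is
    // evaluated. std::forward on the first argument is only a cast and does
    // not modify `exec`.
    return invoke_both(std::forward<Exec>(exec),
                       [=] { return exec.copies; },
                       [=] { return exec.copies; });
}
```

In this sketch each closure reports exactly one copy, and an lvalue policy passed to caller is left with zero copies and zero moves recorded on itself.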

@SergeyKopienko
Contributor

Folks, where have you found "multiple forwards" in the original code? We remember that std::forward is just a type cast, do not we? No matter in which order the function call parameters are evaluated, both lambdas get copies of the original policy object and then forward these copies to the invoked bricks, while the original object is forwarded to __parallel_invoke. What do you think is problematic here?

Agreed!

@danhoeflinger
Contributor Author

danhoeflinger commented Aug 8, 2025

Folks, where have you found "multiple forwards" in the original code? We remember that std::forward is just a type cast, do not we? No matter in which order the function call parameters are evaluated, both lambdas get copies of the original policy object and then forward these copies to the invoked bricks, while the original object is forwarded to __parallel_invoke. What do you think is problematic here?

I understand that forward is merely a type cast. I don't think I did a good job of stating the problem.

https://godbolt.org/z/75ejWxY6b

The real issue here (as I understand it) is that when capturing by value into the lambda, the type of __exec in the lambda becomes const _ExecutionPolicy, even if the incoming "universal" type was a non-const rvalue _ExecutionPolicy&&. This means that within the lambda we are using std::forward<_ExecutionPolicy>(__exec), which will try to create an rvalue ref out of a const lvalue ref. The mismatch of constness in the forward causes a build issue (as can be seen in the godbolt), and the mismatch in value category is not good either. In my opinion, forwarding here really doesn't make sense.
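
A small sketch of that const mismatch, separate from the godbolt link above. Policy, is_const_arg, can_forward and the two probe functions are hypothetical names introduced only for illustration, and the sketch assumes C++17. It checks two things: the by-value captured copy is seen as const inside a non-mutable lambda, and std::forward<T> does not accept a const lvalue unless T itself is const-qualified.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Policy {}; // hypothetical stand-in for an execution policy

// Overload pair used to observe whether an lvalue is const.
constexpr bool is_const_arg(Policy&) { return false; }
constexpr bool is_const_arg(const Policy&) { return true; }

// The call operator of a non-mutable lambda is const, so the captured copy
// is accessed as a const lvalue.
template <class Exec>
bool captured_copy_is_const(Exec&& exec)
{
    auto probe = [=] { return is_const_arg(exec); };
    return probe();
}

// With `mutable`, the captured copy is accessed as a non-const lvalue.
template <class Exec>
bool mutable_copy_is_const(Exec&& exec)
{
    auto probe = [=]() mutable { return is_const_arg(exec); };
    return probe();
}

// Detects whether std::forward<T>(u) compiles for an expression of type U.
template <class T, class U, class = void>
struct can_forward : std::false_type {};

template <class T, class U>
struct can_forward<T, U, std::void_t<decltype(std::forward<T>(std::declval<U>()))>>
    : std::true_type {};

// Policy deduced from a const lvalue: forwarding the const copy compiles.
static_assert(can_forward<const Policy&, const Policy&>::value);
// Policy deduced from a non-const rvalue or lvalue: forwarding would drop const.
static_assert(!can_forward<Policy, const Policy&>::value);
static_assert(!can_forward<Policy&, const Policy&>::value);
// A non-const lvalue (reference capture or mutable lambda) forwards fine.
static_assert(can_forward<Policy, Policy&>::value);
```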

What I was trying to mention with the "multiple forwards" comment is what is shown in the last option in the godbolt. If we were to try to "fix" this by capturing by reference, the two usages in the lambdas would fail with a use-after-move issue.
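
A sketch of that use-after-move scenario, under stated assumptions: Policy here is a hypothetical type whose moved-from state is observable, and brick is a hypothetical callee that takes the policy by value so that an rvalue argument really is moved from. The actual oneDPL bricks may never consume the policy, so this only illustrates the hazard, not a confirmed failure in the library.

```cpp
#include <cassert>
#include <utility>

// Hypothetical policy-like type that records whether it has been moved from.
struct Policy
{
    bool valid = true;
    Policy() = default;
    Policy(const Policy&) = default;
    Policy(Policy&& other) noexcept : valid(other.valid) { other.valid = false; }
};

// Hypothetical brick that takes ownership of the policy and reports whether
// the object it received was still valid.
bool brick(Policy p) { return p.valid; }

template <class Exec>
int count_valid_calls(Exec&& exec)
{
    // Both lambdas refer to the same object and both forward it.
    auto first = [&] { return brick(std::forward<Exec>(exec)); };
    auto second = [&] { return brick(std::forward<Exec>(exec)); };
    return int(first()) + int(second());
}
```

When called with an rvalue, the first lambda moves from the policy and the second receives a moved-from object, so only one call sees a valid policy. With an lvalue, std::forward yields an lvalue, brick copies, and both calls see a valid policy.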

After this exploration, I think the original version in this PR is probably not as good as using std::move within the lambdas, so I will adjust it to do that.

If I'm missing something here with my explanation and godbolt, please let me know.

edit: I should mention that the branch name is misleading and was not a good choice, as the bug is not a "use after move" but rather a poor usage of std::forward.

Signed-off-by: Dan Hoeflinger <[email protected]>
@SergeyKopienko
Contributor

SergeyKopienko commented Aug 8, 2025

Folks, where have you found "multiple forwards" in the original code? We remember that std::forward is just a type cast, do not we? No matter in which order the function call parameters are evaluated, both lambdas get copies of the original policy object and then forward these copies to the invoked bricks, while the original object is forwarded to __parallel_invoke. What do you think is problematic here?

So the problem really does exist.
And we reproduced it:

@SergeyKopienko SergeyKopienko dismissed their stale review August 8, 2025 15:18

Some fixes are really needed.

@SergeyKopienko SergeyKopienko self-requested a review August 8, 2025 15:18
@danhoeflinger
Contributor Author

As mentioned in an offline discussion with @SergeyKopienko, this did not appear as a build error in our existing test suite because our "canned" host policies such as seq or par_unseq are global constexpr instances. This means they are passed around as const lvalue references, so the forward does not drop const-ness and there is no build error.

If someone copies these policies to a non-const instance before calling an API, we would see the issue (before the fixes in this PR).
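
To illustrate the deduction difference described above, here is a minimal sketch. Policy and canned are hypothetical stand-ins for a host policy type and a global constexpr policy object, and deduced_as_const_lvalue_ref is an illustrative helper; C++17 is assumed. A constexpr global deduces the forwarding-reference parameter as const Policy&, while a non-const copy deduces Policy&.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Policy {};                 // hypothetical stand-in for a host policy type
inline constexpr Policy canned{}; // mimics a global constexpr policy object

// Reports whether the forwarding-reference parameter was deduced as const Policy&.
template <class Exec>
constexpr bool deduced_as_const_lvalue_ref(Exec&&)
{
    return std::is_same_v<Exec, const Policy&>;
}

// The constexpr global is a const lvalue, so Exec is const Policy&.
static_assert(deduced_as_const_lvalue_ref(canned));
// A temporary deduces Exec as plain Policy.
static_assert(!deduced_as_const_lvalue_ref(Policy{}));

// A non-const copy is a non-const lvalue, so Exec is Policy&.
inline bool copy_deduces_const()
{
    Policy copy = canned;
    return deduced_as_const_lvalue_ref(copy);
}
```

Only in the first case does the policy type already carry const, which is why forwarding a const captured copy compiled in the existing tests.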

@danhoeflinger
Contributor Author

Updated after offline discussion. Changed from moving after value capture to passing the policy directly after capturing it by reference.
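
The shape of that approach can be sketched roughly as follows. This is not the merged code: Policy, brick, run_both and run are hypothetical names, and the sketch leaves out the backend call and its policy argument entirely. It only shows lambdas that capture the policy by reference and pass it on as a plain lvalue, with no std::forward or std::move inside the lambda bodies.

```cpp
#include <cassert>
#include <utility>

struct Policy { int id = 0; }; // hypothetical stand-in for an execution policy

// Hypothetical brick that accepts the policy by forwarding reference.
template <class Exec>
int brick(Exec&& exec, int x)
{
    return exec.id + x;
}

// Hypothetical helper that runs both callables (sequentially in this sketch).
template <class F1, class F2>
int run_both(F1&& f1, F2&& f2)
{
    return f1() + f2();
}

template <class Exec>
int run(Exec&& exec)
{
    // Capture by reference and pass `exec` as an lvalue: no copy is made,
    // const-ness is preserved as deduced, and nothing is moved from.
    return run_both([&] { return brick(exec, 1); },
                    [&] { return brick(exec, 2); });
}
```

Because the lambdas never cast the policy to an rvalue, both calls observe the same unmodified object whether run receives an lvalue, a const lvalue, or a temporary.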

@SergeyKopienko SergeyKopienko self-requested a review August 13, 2025 13:50
Contributor

@SergeyKopienko SergeyKopienko left a comment

LGTM

@danhoeflinger danhoeflinger merged commit d0fe826 into main Aug 13, 2025
19 checks passed
@danhoeflinger danhoeflinger deleted the dev/dhoeflin/use_after_move_fix branch August 13, 2025 13:55
3 participants