Intro
The current pybind11 functional type_caster supports invoking callbacks passed from Python asynchronously, i.e. from multiple C++ threads. It does so by holding the GIL while the functor executes, as shown in the following essential code from type_caster::load(), which initializes the value member:
...
		value = [func](Args... args) -> Return {
			gil_scoped_acquire acq;
			object retval(func(std::forward<Args>(args)...));
			/* Visual studio 2015 parser issue: need parentheses around this expression */
			return (retval.template cast<Return>());
		};
Notice the statement gil_scoped_acquire acq;, which acquires and releases the GIL in RAII fashion.
Problem
The problem with the code above is that value (and the captured Python functor func) is destroyed after the GIL has been released. If func is, for example, a Python lambda that captures some variables, those variables are freed (their reference counts decremented) while the GIL is no longer held.
Note that both invoking and destroying the functor can happen in a worker C++ thread, which leads to UB (immediate termination in my experience).
The problem does not arise if func is a plain function or a stateless lambda.
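The destruction order described above can be reproduced without Python at all. The following is a minimal sketch that uses hypothetical stand-ins (mock_gil_acquire, mock_py_callable, a global flag and a log), none of which are pybind11 names. It assumes only that the guard inside the lambda body is released when the call returns, while the captured callable lives until the std::function itself is destroyed.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only (not pybind11 API):
// a flag instead of the real GIL, and a log of destruction events.
static bool g_gil_held = false;
static std::vector<std::string> g_log;

// Mimics gil_scoped_acquire: "takes" the GIL on construction,
// "drops" it on destruction.
struct mock_gil_acquire {
    mock_gil_acquire() { g_gil_held = true; }
    ~mock_gil_acquire() { g_gil_held = false; }
};

// Mimics a Python callable with captured state. The shared state is
// freed when the last copy goes away, like a refcount reaching zero,
// and records whether the mock GIL was held at that moment.
struct mock_py_callable {
    struct state {
        ~state() {
            g_log.push_back(g_gil_held ? "freed with GIL"
                                       : "freed WITHOUT GIL");
        }
    };
    std::shared_ptr<state> st = std::make_shared<state>();
    int operator()(int x) const { return x + 1; }
};

// Same shape as the lambda assigned to value in type_caster::load().
int call_then_destroy_original() {
    g_log.clear();
    int result = 0;
    {
        mock_py_callable func;
        std::function<int(int)> value = [func](int x) -> int {
            mock_gil_acquire acq;  // held only for the duration of the call
            return func(x);
        };
        result = value(41);
    }  // value and func are destroyed here, after acq is long gone
    return result;
}
```

Calling call_then_destroy_original() from main() returns the callback's result as expected, but g_log then records that the captured state was freed while the mock GIL was not held.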
Solution
I've made a workaround to this issue by replacing the code above with the following:
...
		// dynamically allocated lambda that actually invokes passed functor
		auto f = new auto([func](Args... args) -> Return {
			object retval(func(std::forward<Args>(args)...));
			/* Visual studio 2015 parser issue: need parentheses around this expression */
			return (retval.template cast<Return>());
		});
		if(!f) return false;
		// ensure GIL is released AFTER functor destructor is called
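		value = [f](Args... args) -> Return {
			gil_scoped_acquire acq;
			// owner is declared after acq, so it deletes *f (and the captured func)
			// before acq releases the GIL; needs <memory> and <type_traits>
			std::unique_ptr<typename std::remove_pointer<decltype(f)>::type> owner(f);
			// return the callback's result to the caller
			return (*f)(std::forward<Args>(args)...);
		};
Basically, it keeps the GIL held until the functor has finished and been completely destroyed. This approach cures the problem, with one caveat: the inner lambda is deleted during the first call, so value must be invoked at most once (and the allocation leaks if value is never invoked).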
The downside, however, is that the inner lambda (the one that actually invokes func) is dynamically allocated. With C++17 the lambda is constructed in place, with no copy or move involved, but the proposal may still be sub-optimal.
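One possible way to avoid the extra allocation is to wrap the captured callable in a small handle whose destructor (and copy constructor) acquires the GIL itself. The sketch below shows the idea with mock types only: gil_safe_handle, mock_gil_acquire and mock_py_callable are hypothetical names for illustration, not pybind11 classes, and the mock GIL is a simple nesting counter that assumes re-entrant acquisition.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only (not pybind11 API).
static int g_gil_depth = 0;  // > 0 means the mock GIL is held
static std::vector<std::string> g_log;

// Mimics a re-entrant gil_scoped_acquire with a nesting counter.
struct mock_gil_acquire {
    mock_gil_acquire() { ++g_gil_depth; }
    ~mock_gil_acquire() { --g_gil_depth; }
};

// Mimics a Python callable whose captured state is freed when the
// last copy goes away; logs whether the mock GIL was held then.
struct mock_py_callable {
    struct state {
        ~state() {
            g_log.push_back(g_gil_depth > 0 ? "freed with GIL"
                                            : "freed WITHOUT GIL");
        }
    };
    std::shared_ptr<state> st = std::make_shared<state>();
    int operator()(int x) const { return x + 1; }
};

// Hypothetical wrapper: every operation that touches the callable's
// reference count happens while the mock GIL is held.
struct gil_safe_handle {
    mock_py_callable f;

    explicit gil_safe_handle(mock_py_callable f_) : f(std::move(f_)) {}

    gil_safe_handle(const gil_safe_handle &other)
        : f(copy_with_gil(other.f)) {}

    ~gil_safe_handle() {
        mock_gil_acquire acq;
        // dying is destroyed before acq, so the reference is dropped
        // while the GIL is still held; the member f is left empty.
        mock_py_callable dying(std::move(f));
    }

private:
    static mock_py_callable copy_with_gil(const mock_py_callable &src) {
        mock_gil_acquire acq;
        return src;
    }
};

int call_then_destroy_with_handle() {
    g_log.clear();
    int result = 0;
    {
        gil_safe_handle h{mock_py_callable{}};
        std::function<int(int)> value = [h](int x) -> int {
            mock_gil_acquire acq;
            return h.f(x);
        };
        result = value(41);  // value can be called any number of times
    }  // each handle copy takes the mock GIL in its own destructor
    return result;
}
```

Under these assumptions the callable can be invoked repeatedly, nothing is heap-allocated by the wrapper itself, and g_log shows the captured state being freed with the mock GIL held. Whether the same pattern fits the real type_caster is for the maintainers to judge.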
So I'm asking the core devs to look into this issue, because it severely limits how Python callbacks can be used.