Skip to content

Conversation

david-cortes-intel
Copy link
Contributor

Description

This PR adds the same fix as in #3293 for the prediction function that outputs log probabilities, so that sklearnex's predict_log_proba doesn't end up returning infinites.


PR should start as a draft, then move to ready for review state after CI is passed and all applicable checkboxes are closed.
This approach ensures that reviewers don't spend extra time asking for regular requirements.

You can remove a checkbox as not applicable only if it doesn't relate to this PR in any way.
For example, PR with docs update doesn't require checkboxes for performance while PR with any change in actual code should have checkboxes and justify how this code change is expected to affect performance (or justification should be self-evident).

Checklist to comply with before moving PR from draft:

PR completeness and readability

  • I have reviewed my changes thoroughly before submitting this pull request.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have added a respective label(s) to PR if I have a permission for that.
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.

@david-cortes-intel
Copy link
Contributor Author

CI job for sklearnex requires this PR to be merged in order to pass:
uxlfoundation/scikit-learn-intelex#2653

@Alexsandruss
Copy link
Contributor

/intelci: run

1 similar comment
@Alexsandruss
Copy link
Contributor

/intelci: run

Copy link
Contributor

@Alexsandruss Alexsandruss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Segfault appears in the Pre-Commit:

Fatal Python error: Segmentation fault
__release_lnx/daal4py-3.12/daal4py/mb/logistic_regression_builders.py", line 169 in predict_log_proba
sources/tests/test_model_builders.py::test_logreg_builder[2-True-False] 

@david-cortes-intel
Copy link
Contributor Author

/intelci: run

@david-cortes-intel
Copy link
Contributor Author

I am unable to reproduce any segfault locally. Tried also running with asan and valgrind under different compilers, and didn't see any reported memory issues.

@david-cortes-intel
Copy link
Contributor Author

I was able to reproduce the issue on the same machine where the CI error was reported. It appears to be an issue of not checking for null pointers or error codes somwhere, most likely on the daal4py side:

==3694450== Invalid read of size 8
==3694450==    at 0x42035D4: UnknownInlinedFun (pycore_pystate.h:133)
==3694450==    by 0x42035D4: UnknownInlinedFun (obmalloc.c:866)
==3694450==    by 0x42035D4: UnknownInlinedFun (obmalloc.c:1850)
==3694450==    by 0x42035D4: PyObject_Free (obmalloc.c:830)
==3694450==    by 0x3EF89483: Py_DECREF (object.h:705)
==3694450==    by 0x3EF89483: Py_XDECREF (object.h:798)
==3694450==    by 0x3EF89483: ~NpyNumericTable (npy4daal.h:372)
==3694450==    by 0x3EF89483: NpyNumericTable<NpyNonContigHandler>::~NpyNumericTable() (npy4daal.h:372)
==3694450==    by 0x3F5EBEBF: _remove (daal_shared_ptr.h:352)
==3694450==    by 0x3F5EBEBF: operator= (daal_shared_ptr.h:366)
==3694450==    by 0x3F5EBEBF: daal::algorithms::interface1::Argument::set(unsigned long, daal::services::interface1::SharedPtr<daal::data_management::interface1::SerializationIface> const&) (algorithm_base_impl.cpp:73)
==3694450==    by 0x3F5F7BB5: daal::algorithms::classifier::prediction::interface1::Input::set(daal::algorithms::classifier::prediction::NumericTableInputId, daal::services::interface1::SharedPtr<daal::data_management::interface1::NumericTable> const&) (classifier_predict.cpp:92)
==3694450==    by 0x3ECDF60E: logistic_regression_prediction_manager<double, (daal::algorithms::logistic_regression::prediction::Method)0>::batch(bool) (daal4py_cpp.h:5609)
==3694450==    by 0x3ECDF51A: logistic_regression_prediction_manager<double, (daal::algorithms::logistic_regression::prediction::Method)0>::compute(data_or_file const&, daal::services::interface1::SharedPtr<daal::algorithms::logistic_regression::interface1::Model>*, bool) (daal4py_cpp.h:5630)
==3694450==    by 0x3EE7AA4C: __pyx_pf_7daal4py_8_daal4py_30logistic_regression_prediction_4_compute (daal4py_cy.cpp:205893)
==3694450==    by 0x3EE7AA4C: __pyx_pw_7daal4py_8_daal4py_30logistic_regression_prediction_5_compute(_object*, _object* const*, long, _object*) (daal4py_cy.cpp:205711)
==3694450==    by 0x4217165: UnknownInlinedFun (pycore_call.h:92)
==3694450==    by 0x4217165: PyObject_VectorcallMethod (call.c:887)
==3694450==    by 0x3EE7B178: __pyx_pf_7daal4py_8_daal4py_30logistic_regression_prediction_6compute (daal4py_cy.cpp:206102)
==3694450==    by 0x3EE7B178: __pyx_pw_7daal4py_8_daal4py_30logistic_regression_prediction_7compute(_object*, _object* const*, long, _object*) (daal4py_cy.cpp:206051)
==3694450==    by 0x4233CEE: UnknownInlinedFun (pycore_call.h:92)
==3694450==    by 0x4233CEE: PyObject_Vectorcall (call.c:325)
==3694450==    by 0x421CDA8: _PyEval_EvalFrameDefault (bytecodes.c:2715)
==3694450==    by 0x4214BCA: _PyObject_FastCallDictTstate (call.c:144)
==3694450==  Address 0x10 is not stack'd, malloc'd or (recently) free'd
==3694450== 
Fatal Python error: Segmentation fault

Doesn't appear to be related to the changes here though. Will investigate further.

@david-cortes-intel
Copy link
Contributor Author

Will need to merge this PR before in order for the CI jobs to pass:
uxlfoundation/scikit-learn-intelex#2665

@david-cortes-intel
Copy link
Contributor Author

/intelci: run

@david-cortes-intel david-cortes-intel marked this pull request as ready for review August 19, 2025 10:42
@david-cortes-intel
Copy link
Contributor Author

/intelci: run

@david-cortes-intel david-cortes-intel merged commit bd4dce3 into uxlfoundation:main Aug 20, 2025
22 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants