FIX: Prevent log probabilities from turning to infinite #3315

david-cortes-intel · 2025-08-05T13:22:54Z

Description

This PR applies the same fix as in #3293 to the prediction function that outputs log probabilities, so that sklearnex's predict_log_proba doesn't end up returning infinite values.
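
For illustration only, here is a minimal sketch of one common way to keep log probabilities finite: clip the predicted probability away from 0 and 1 before taking the logarithm. This is an assumption about the general technique, not the code from #3293 or this PR; the names `sigmoid` and `clipped_log_proba` and the clipping bound `EPS` are hypothetical.

```python
import math
import sys

# Hypothetical clipping bound chosen for illustration; the actual fix may use a different value.
EPS = sys.float_info.epsilon


def sigmoid(z):
    """Logistic function evaluated without overflowing math.exp for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # underflows to 0.0 for very negative z instead of raising
    return e / (1.0 + e)


def clipped_log_proba(z, eps=EPS):
    """Return (log P(class 0), log P(class 1)) for a binary logistic model.

    Without clipping, p can round to exactly 0.0 or 1.0, and the logarithm
    of zero is infinite (math.log raises, NumPy returns -inf).
    """
    p = sigmoid(z)
    p = min(max(p, eps), 1.0 - eps)  # keep p strictly inside (0, 1)
    return math.log(1.0 - p), math.log(p)


if __name__ == "__main__":
    for z in (-800.0, -40.0, 0.0, 40.0, 800.0):
        print(z, clipped_log_proba(z))
```

With clipping, extreme decision values such as ±800 yield large but finite negative log probabilities, at the cost of a small bias for probabilities that are closer to 0 or 1 than the bound.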

PR should start as a draft, then move to ready for review state after CI is passed and all applicable checkboxes are closed.
This approach ensures that reviewers don't spend extra time asking for regular requirements.

You can remove a checkbox as not applicable only if it doesn't relate to this PR in any way.
For example, PR with docs update doesn't require checkboxes for performance while PR with any change in actual code should have checkboxes and justify how this code change is expected to affect performance (or justification should be self-evident).

Checklist to comply with before moving PR from draft:

PR completeness and readability

I have reviewed my changes thoroughly before submitting this pull request.
Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
I have added a respective label(s) to PR if I have a permission for that.
I have resolved any merge conflicts that might occur with the base branch.

Testing

I have run it locally and tested the changes extensively.
All CI jobs are green or I have provided justification why they aren't.

david-cortes-intel · 2025-08-06T06:52:20Z

CI job for sklearnex requires this PR to be merged in order to pass:
uxlfoundation/scikit-learn-intelex#2653

Alexsandruss · 2025-08-07T09:49:59Z

/intelci: run

Alexsandruss · 2025-08-07T12:43:13Z

/intelci: run

Alexsandruss

Segfault appears in the Pre-Commit:

Fatal Python error: Segmentation fault
__release_lnx/daal4py-3.12/daal4py/mb/logistic_regression_builders.py", line 169 in predict_log_proba
sources/tests/test_model_builders.py::test_logreg_builder[2-True-False]

david-cortes-intel · 2025-08-07T14:12:14Z

/intelci: run

david-cortes-intel · 2025-08-07T15:18:27Z

I am unable to reproduce any segfault locally. I also tried running with ASan and Valgrind under different compilers, and didn't see any reported memory issues.

david-cortes-intel · 2025-08-08T07:06:53Z

I was able to reproduce the issue on the same machine where the CI error was reported. It appears to be an issue of not checking for null pointers or error codes somewhere, most likely on the daal4py side:

==3694450== Invalid read of size 8
==3694450==    at 0x42035D4: UnknownInlinedFun (pycore_pystate.h:133)
==3694450==    by 0x42035D4: UnknownInlinedFun (obmalloc.c:866)
==3694450==    by 0x42035D4: UnknownInlinedFun (obmalloc.c:1850)
==3694450==    by 0x42035D4: PyObject_Free (obmalloc.c:830)
==3694450==    by 0x3EF89483: Py_DECREF (object.h:705)
==3694450==    by 0x3EF89483: Py_XDECREF (object.h:798)
==3694450==    by 0x3EF89483: ~NpyNumericTable (npy4daal.h:372)
==3694450==    by 0x3EF89483: NpyNumericTable<NpyNonContigHandler>::~NpyNumericTable() (npy4daal.h:372)
==3694450==    by 0x3F5EBEBF: _remove (daal_shared_ptr.h:352)
==3694450==    by 0x3F5EBEBF: operator= (daal_shared_ptr.h:366)
==3694450==    by 0x3F5EBEBF: daal::algorithms::interface1::Argument::set(unsigned long, daal::services::interface1::SharedPtr<daal::data_management::interface1::SerializationIface> const&) (algorithm_base_impl.cpp:73)
==3694450==    by 0x3F5F7BB5: daal::algorithms::classifier::prediction::interface1::Input::set(daal::algorithms::classifier::prediction::NumericTableInputId, daal::services::interface1::SharedPtr<daal::data_management::interface1::NumericTable> const&) (classifier_predict.cpp:92)
==3694450==    by 0x3ECDF60E: logistic_regression_prediction_manager<double, (daal::algorithms::logistic_regression::prediction::Method)0>::batch(bool) (daal4py_cpp.h:5609)
==3694450==    by 0x3ECDF51A: logistic_regression_prediction_manager<double, (daal::algorithms::logistic_regression::prediction::Method)0>::compute(data_or_file const&, daal::services::interface1::SharedPtr<daal::algorithms::logistic_regression::interface1::Model>*, bool) (daal4py_cpp.h:5630)
==3694450==    by 0x3EE7AA4C: __pyx_pf_7daal4py_8_daal4py_30logistic_regression_prediction_4_compute (daal4py_cy.cpp:205893)
==3694450==    by 0x3EE7AA4C: __pyx_pw_7daal4py_8_daal4py_30logistic_regression_prediction_5_compute(_object*, _object* const*, long, _object*) (daal4py_cy.cpp:205711)
==3694450==    by 0x4217165: UnknownInlinedFun (pycore_call.h:92)
==3694450==    by 0x4217165: PyObject_VectorcallMethod (call.c:887)
==3694450==    by 0x3EE7B178: __pyx_pf_7daal4py_8_daal4py_30logistic_regression_prediction_6compute (daal4py_cy.cpp:206102)
==3694450==    by 0x3EE7B178: __pyx_pw_7daal4py_8_daal4py_30logistic_regression_prediction_7compute(_object*, _object* const*, long, _object*) (daal4py_cy.cpp:206051)
==3694450==    by 0x4233CEE: UnknownInlinedFun (pycore_call.h:92)
==3694450==    by 0x4233CEE: PyObject_Vectorcall (call.c:325)
==3694450==    by 0x421CDA8: _PyEval_EvalFrameDefault (bytecodes.c:2715)
==3694450==    by 0x4214BCA: _PyObject_FastCallDictTstate (call.c:144)
==3694450==  Address 0x10 is not stack'd, malloc'd or (recently) free'd
==3694450== 
Fatal Python error: Segmentation fault

Doesn't appear to be related to the changes here though. Will investigate further.

david-cortes-intel · 2025-08-14T13:37:40Z

The following PR needs to be merged first in order for the CI jobs here to pass:
uxlfoundation/scikit-learn-intelex#2665

david-cortes-intel · 2025-08-19T10:42:46Z

/intelci: run

david-cortes-intel · 2025-08-19T13:52:07Z

/intelci: run

prevent logloss from turning to inf in predictions

1c95f80

david-cortes-intel requested a review from avolkov-intel August 5, 2025 13:22

david-cortes-intel added the bug label Aug 5, 2025

david-cortes-intel requested review from Alexsandruss, Alexandr-Solovev, KateBlueSky, icfaust and ethanglaser as code owners August 5, 2025 13:22

This was referenced Aug 5, 2025

FIX: Prevent log probabilities produced by array API from turning to infinite uxlfoundation/scikit-learn-intelex#2651

Merged

MAINT: Relax test conditions for log-probabilities in model builders uxlfoundation/scikit-learn-intelex#2653

Merged

Alexsandruss requested changes Aug 7, 2025

david-cortes-intel marked this pull request as draft August 8, 2025 07:24

david-cortes-intel mentioned this pull request Aug 14, 2025

FIX: Avoid re-usage of non-reusable daal4py objects uxlfoundation/scikit-learn-intelex#2665

Merged

8 tasks

Merge branch 'main' into fix_inf_logp

2a926d2

david-cortes-intel marked this pull request as ready for review August 19, 2025 10:42

Alexsandruss approved these changes Aug 19, 2025

david-cortes-intel merged commit bd4dce3 into uxlfoundation:main Aug 20, 2025
22 of 23 checks passed
