
Conversation

Contributor

@Alexandr-Solovev Alexandr-Solovev commented Mar 5, 2025

Changes Summary

  1. Added noise_variance computation
    Implemented calculation of the noise variance as part of the PCA result options. This enables estimating the variance left unexplained by the retained components (a minimal sketch of the idea follows this list).

  2. Added syevr-based eigen decomposition function
    Introduced a new function that uses LAPACK's syevr routine to improve the performance of eigenvector and eigenvalue computations for symmetric matrices.
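
For context, a minimal standalone sketch of the noise-variance idea, assuming the probabilistic-PCA convention (mean of the discarded eigenvalues). All names here are hypothetical; the actual implementation in this PR differs:

#include <cstddef>

// Hypothetical sketch, not oneDAL's API: noise variance as the mean of the
// eigenvalues discarded after keeping the first n_components.
double compute_noise_variance(const double* eigenvalues, // sorted descending
                              std::size_t n_features,
                              std::size_t n_components) {
    if (n_components >= n_features) return 0.0; // nothing was discarded
    double sum = 0.0;
    for (std::size_t i = n_components; i < n_features; ++i)
        sum += eigenvalues[i];
    return sum / static_cast<double>(n_features - n_components);
}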


The PR should start as a draft, then move to the ready-for-review state after CI has passed and all applicable checkboxes are closed.
This approach ensures that reviewers don't spend extra time asking for regular requirements.

You can remove a checkbox as not applicable only if it doesn't relate to this PR in any way.
For example, a docs-only PR doesn't require the performance checkboxes, while a PR with any change to actual code should keep them and justify how the change is expected to affect performance (or the justification should be self-evident).

Checklist to comply with before moving PR from draft:

PR completeness and readability

  • I have reviewed my changes thoroughly before submitting this pull request.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes, or created a separate PR with the update and provided its number in the description, if necessary.
  • The Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have added respective label(s) to the PR if I have permission for that.
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green, or I have provided justification why they aren't.
  • I have extended the testing suite if new functionality was introduced in this PR.

Performance

  • I have measured performance for affected algorithms using scikit-learn_bench and provided at least a summary table with the measured data, if a performance change is expected.
  • I have provided justification why performance has changed or why changes are not expected.
  • I have provided justification why quality metrics have changed or why changes are not expected.
  • I have extended the benchmarking suite and provided a corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

@Alexandr-Solovev Alexandr-Solovev added the dpc++ Issue/PR related to DPC++ functionality label Jun 19, 2025
@Alexandr-Solovev
Contributor Author

/intelci: run

Contributor

@icfaust icfaust left a comment


Hopefully this review is useful. I see some basic for loops that could use pragmas, and some places where the work is done on the CPU where a GPU solution could probably also be added without too much work.

DAAL_INT iu = static_cast<DAAL_INT>(nFeatures);
DAAL_INT m;
DAAL_INT info;
// Could be modified to be a function parameter
Contributor

I'd add a little snippet referring to the application notes for values < 0 (https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/2025-0/syevx.html)
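
For reference, a rough sketch of what such error handling could look like, reusing the snippet's own names and following the LAPACK documentation's meaning of info (info < 0: the (-info)-th argument had an illegal value; info > 0: an internal error occurred):

if (info != 0)
{
    // info < 0 indicates an illegal argument (a programming error on our
    // side); info > 0 indicates a numerical failure inside syevr.
    return services::Status(services::ErrorPCAFailedToComputeCorrelationEigenvalues);
}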

return services::Status(services::ErrorPCAFailedToComputeCorrelationEigenvalues);
}

for (size_t i = 0; i < nComponents; ++i)
Contributor

Any way to add vectorization pragmas here? (thoughts @Vika-F ?)

Contributor

I haven't merged the PR with the pragma changes yet :-\ I need to collect the perf data first.
But I think the inner loop should be easily vectorizable, maybe even without pragmas; still, pragmas can help guide the compiler.

Contributor Author

In my opinion, the vectorization pragmas could be added later.

Contributor

Make/share a ticket for that. I don't think it would be too bad if @Vika-F 's PR is merged soon.
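
As an illustration of the pragma idea discussed above (not this PR's code): guiding the compiler on a simple accumulation loop could look like the sketch below. The DAAL sources provide macros such as PRAGMA_IVDEP / PRAGMA_VECTOR_ALWAYS for the same purpose; a portable equivalent is OpenMP SIMD.

#include <cstddef>

// Sketch: an OpenMP SIMD hint on a reduction loop. The pragma asks the
// compiler to vectorize and to treat `sum` as a reduction variable.
float sum_floats(const float* data, std::size_t n)
{
    float sum = 0.0f;
#pragma omp simd reduction(+ : sum)
    for (std::size_t i = 0; i < n; ++i)
    {
        sum += data[i];
    }
    return sum;
}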

auto eigvals = arr_eigval.get_data();
auto total_variance = arr_vars.get_data();

double noiseVariance = 0.0;
Contributor

pragmas here too?

Contributor Author

I think no

@@ -126,6 +126,21 @@ static result_t call_daal_kernel(const context_cpu& ctx,
result.set_singular_values(homogen_table::wrap(arr_singular_values, 1, component_count));
}

if (desc.get_result_options().test(result_options::noise_variance)) {
Contributor

Is this done twice, once for batch and once for incremental?

Contributor Author

No, in general it should be done once for each method:
batch: cov, svd, precomputed
incremental: svd, cov

std::int64_t rows_count_ = rows_count_global;
auto range = std::min(rows_count_, column_count) - component_count;
auto noise_variance =
compute_noise_variance_on_host(q, eigvals_host_tmp, range, { syevd_event });
Contributor

If I am reading this right, this means compute_noise_variance is only done on the host? If so, please note this in the code and in the description (possibly as a follow-up), or explain why it is done this way.

auto eigvals_ptr = eigenvalues.get_data();

Float sum = 0;
for (std::int64_t i = 0; i < range; ++i) {
Contributor

Is there a reason not to have an equivalent add-reduce call on GPU?

Contributor Author

I guess it's a performance question: usually the range is fairly small, and a reduce (with its parallel_for kernel initialization) takes more time than moving the data to the CPU and computing there.

Contributor

I'd leave a comment behind that design choice then, as that should help guide developers who may see this in the future.
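
For comparison, a device-side sum sketched with SYCL 2020's sycl::reduction (this assumes a SYCL 2020 implementation and USM device-accessible data; the helper name is illustrative):

#include <sycl/sycl.hpp>
#include <cstddef>

// Sketch: sum `n` eigenvalues on the device. As noted above, for the small
// ranges typical here the kernel-launch overhead can outweigh simply copying
// the values to the host and summing there.
float device_sum(sycl::queue& q, const float* eigvals, std::size_t n)
{
    float sum = 0.0f;
    {
        sycl::buffer<float> sum_buf(&sum, 1);
        q.submit([&](sycl::handler& h) {
            auto red = sycl::reduction(sum_buf, h, sycl::plus<float>());
            h.parallel_for(sycl::range<1>(n), red,
                           [=](sycl::id<1> i, auto& acc) { acc += eigvals[i]; });
        });
    } // the buffer's destructor waits and copies the result back into `sum`
    return sum;
}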

Comment on lines 377 to 383
Float sum = 0;
for (std::int64_t i = 0; i < column_count; ++i) {
sum += vars_ptr[i];
}

ONEDAL_ASSERT(sum > 0);
const Float inverse_sum = 1.0 / sum;
Contributor

Is there a reason why this was done on CPU?

copyArray(nFeatures * nComponents, fullEigenvectorsArray, eigenvectorsArray);
copyArray(nComponents, fullEigenvaluesArray, eigenvaluesArray);
// SYEVR branch
// In this case, we compute only nComponents eigenvectors and then sort them in descending order
Contributor

Sorting is mentioned in the comment, but there is no sorting in the code. Is that OK?

Contributor Author

The sorting for syevr is already included in computeEigenvectorsInplaceSyevr, but I can split the functions.

Contributor

In my opinion, it would be better to align the behavior of computeEigenvectorsInplace and computeEigenvectorsInplaceSyevr: either include sorting in both, or exclude it from both.

Otherwise, it is harder to get what's happening in the code.

Contributor

All LAPACK functions for symmetric eigenvalue problems produce the results in sorted order by design.

Contributor Author

I mean the sort direction: descending vs. ascending ordering.

@Alexandr-Solovev
Contributor Author

/intelci: run

@icfaust icfaust requested a review from Copilot June 23, 2025 04:56
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This PR introduces noise variance computation into oneDAL's PCA implementation and adds a new syevr-based eigen decomposition function to enhance performance and improve variance estimation.

  • Added noise variance getter/setter and corresponding computation in both CPU and GPU backends.
  • Updated tests and common option identifiers to support noise variance.
  • Removed legacy sign-flip code where it was replaced with GPU-specific implementations.

Reviewed Changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 1 comment.

Summary per file:

  • examples/oneapi/dpc/source/pca/pca_cor_dense_batch.cpp: Added console output for noise variance in the PCA train result.
  • cpp/oneapi/dal/algo/pca/train_types.{hpp,cpp}: Introduced noise variance getter/setter and implementation functions.
  • cpp/oneapi/dal/algo/pca/common.{hpp,cpp}: Added the noise_variance result option and updated indices consistently.
  • cpp/oneapi/dal/algo/pca/backend/* (multiple files): Integrated noise variance computation and replaced sign_flip usage.
  • cpp/daal/src/algorithms/pca/*: Updated DAAL integration to compute noise variance and added the syevr branch.

@@ -91,6 +91,60 @@ auto syevd_computation(sycl::queue& queue,
return std::make_tuple(eigenvalues, syevd_event);
}

template <typename Float>
auto prepare_eigenvectors_svd(sycl::queue& queue,
Contributor

Please add a description.

// SYEVR branch
// In this case, we compute only nComponents eigenvectors and then sort them in descending order
// inside the 'computeEigenvectorsInplaceSyevr' function
if (nComponents < nFeatures)
Contributor

Could it also limit the components by the number of rows in the data by this point? Is that info available here?

eigenvalues[i] = temp_eigenvalues[idx];
for (size_t j = 0; j < nFeatures; ++j)
{
eigenvectors[j + i * nFeatures] = temp_eigenvectors[j + idx * nFeatures];
Contributor

There's a dedicated LAPACK function to reorder columns:
https://www.netlib.org/lapack/explore-3.2.1-html/dlapmt.f.html

Plus there's C++ std::reverse for vectors:
https://en.cppreference.com/w/cpp/algorithm/reverse.html
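
A rough sketch of the std::reverse approach (the buffer names and row-major layout are assumed for illustration): LAPACK returns eigenvalues in ascending order, so reversing yields the descending order, and the eigenvector rows are swapped to match.

#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch: convert ascending LAPACK output to descending order.
void to_descending(std::vector<double>& eigenvalues,
                   std::vector<double>& eigenvectors, // row-major, one eigenvector per row
                   std::size_t n_features)
{
    std::reverse(eigenvalues.begin(), eigenvalues.end());
    const std::size_t n = eigenvalues.size();
    // Swap whole eigenvector rows so they follow the reversed eigenvalues.
    for (std::size_t i = 0; i < n / 2; ++i)
    {
        std::swap_ranges(eigenvectors.begin() + i * n_features,
                         eigenvectors.begin() + (i + 1) * n_features,
                         eigenvectors.begin() + (n - 1 - i) * n_features);
    }
}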

for (size_t i = 0; i < nComponents; ++i)
{
size_t idx = nComponents - 1 - i;
eigenvalues[i] = temp_eigenvalues[idx];
Contributor

Shouldn't this be discarding components with too small eigenvalues?


Float max_val = row[0];
Float abs_max = std::abs(row[0]);
for (std::int64_t j = 1; j < column_count; j++) {
Contributor

Would it perhaps be faster to use idamax from BLAS?
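
For reference, a minimal sketch of the idamax idea, assuming a CBLAS interface is available (the helper name is illustrative):

#include <cblas.h>
#include <cstddef>

// Sketch: index of the element with the largest absolute value in a row.
std::size_t argmax_abs(const double* row, std::size_t n)
{
    return static_cast<std::size_t>(cblas_idamax(static_cast<int>(n), row, /*incx=*/1));
}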

Contributor Author

I will add a TODO for investigation. For now it looks like it would make the PR bigger.

auto explained_variances_ratio_ptr = explained_variances_ratio.get_mutable_data();

Float sum = 0;
for (std::int64_t i = 0; i < column_count; ++i) {
Contributor

Could oneDPL be used for this kind of thing?

Contributor Author

I will add a TODO mark, but it may be done in the next PR.
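
For reference, the oneDPL version of such a sum could look roughly like this (assumes oneDPL is available and the data is USM device-accessible; the helper name is illustrative):

#include <oneapi/dpl/execution>
#include <oneapi/dpl/numeric>
#include <sycl/sycl.hpp>

// Sketch: a device-side sum via oneapi::dpl::reduce.
float dpl_sum(sycl::queue& q, const float* first, const float* last)
{
    auto policy = oneapi::dpl::execution::make_device_policy(q);
    return oneapi::dpl::reduce(policy, first, last, 0.0f);
}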

auto eigvals_ptr = eigenvalues.get_data();
auto singular_values_ptr = singular_values.get_mutable_data();

const Float factor = row_count - 1;
Contributor

Why '-1' here? Wouldn't this make the result not meet the necessary property that $(\mathbf{X} \mathbf{Z})^T (\mathbf{X} \mathbf{Z}) = \mathbf{I}$

auto compute_event = queue.submit([&](sycl::handler& h) {
h.depends_on(deps);
h.parallel_for(sycl::range<1>(component_count), [=](sycl::id<1> i) {
singular_values_ptr[i] = sycl::sqrt(factor * eigvals_ptr[i]);
Contributor

@david-cortes-intel david-cortes-intel Jun 23, 2025


syevr only generates results that are valid up to numerical tolerance. So a very small eigenvalue that should in theory be positive or zero could still end up as a very small negative number.
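
A minimal variant of the kernel above with that guard applied, reusing the snippet's own names (illustrative, not the PR's code):

auto compute_event = queue.submit([&](sycl::handler& h) {
    h.depends_on(deps);
    h.parallel_for(sycl::range<1>(component_count), [=](sycl::id<1> i) {
        // Clamp tiny negative eigenvalues to zero before the square root,
        // since syevr may return slightly negative values for eigenvalues
        // that are zero in exact arithmetic.
        const Float ev = sycl::fmax(eigvals_ptr[i], Float(0));
        singular_values_ptr[i] = sycl::sqrt(factor * ev);
    });
});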

@@ -45,6 +45,8 @@ class PCADenseBase : public Kernel
services::Status computeExplainedVariancesRatio(const data_management::NumericTable & eigenvalues,
const data_management::NumericTable & variances,
data_management::NumericTable & explained_variances_ratio);
services::Status computeNoiseVariances(const data_management::NumericTable & eigenvalues, const data_management::NumericTable & variances,
double & noise_variance);
Contributor Author

@Vika-F does it make sense to name the variable noiseVariance to match the DAAL-like style?

algorithmFPType tmp = eigenvalues[i];
algorithmFPType tmp_rev = eigenvalues[nComponents - 1 - i];

if (tmp < 0) tmp = algorithmFPType(0);
Contributor

When a singular value should in theory be zero when calculated to infinite precision, lapack will very likely output a number (either positive or negative) in the neighborhood of zero. The approach of discarding values below machine epsilon is more sensible given the way lapack works.

Contributor Author

Can you please clarify, in terms of code, what your suggestion is?

Contributor

The suggestion is to compare the (signed) values against a (positive) threshold, e.g.:

if (tmp < thr) {
    // discard
} else {
    // take
}