@JAi-SATHVIK
Contributor

@JAi-SATHVIK JAi-SATHVIK commented Jan 6, 2026

@JAi-SATHVIK JAi-SATHVIK marked this pull request as draft January 6, 2026 02:37
@codecov

codecov bot commented Jan 6, 2026

Codecov Report

❌ Patch coverage is 0% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.60%. Comparing base (d2fdd50) to head (f1d5182).
⚠️ Report is 45 commits behind head on master.

Files with missing lines Patch % Lines
example/intrinsics/example_matmul.f90 0.00% 10 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1086      +/-   ##
==========================================
- Coverage   68.69%   68.60%   -0.09%     
==========================================
  Files         393      394       +1     
  Lines       12720    12730      +10     
  Branches     1376     1376              
==========================================
- Hits         8738     8734       -4     
- Misses       3982     3996      +14     

@jalvesz jalvesz linked an issue Jan 6, 2026 that may be closed by this pull request
@jalvesz
Contributor

jalvesz commented Jan 7, 2026

@JAi-SATHVIK the build failures happen because, in the CMakeLists.txt for the stats module, you need to add something like target_link_libraries(stats PUBLIC blas lapack).

Also, regarding the kinds support in my previous comment, what I meant was: please enable all kinds; stdlib provides the backends. Linking against optimized libraries is optional and would increase performance for single and double precision, but all kinds are supported.

@JAi-SATHVIK
Contributor Author

Thank you, @jalvesz, for the clarification! I've made the following updates:

CMakeLists.txt: target_link_libraries(stats PUBLIC blas lapack) is already in place in src/stats/CMakeLists.txt, which links the stats module against stdlib's internal BLAS/LAPACK targets.

Documentation: Updated the inline comments in both stdlib_stats_pca.fypp and stdlib_stats.fypp to accurately reflect that all real kinds (sp, dp, xdp, qp) are supported by stdlib's internal BLAS/LAPACK backends.

@jalvesz
Contributor

jalvesz commented Jan 7, 2026

New failures are happening because of the dependency on the sorting module. This exposes an issue with the modularization of the library: the sorting module, being at the root of src, is not visible to the stats module within its own independent folder... we might need to reconsider next steps.

One idea would be to make this library (pca) not a submodule of stats but a module in itself at the root of src, so that it can easily use any other module such as stats, blas/lapack and sorting. Or there might be another approach to think about.

Cc @jvdp1 @perazz

@jvdp1
Member

jvdp1 commented Jan 7, 2026

> New failures are happening because of the dependency on the sorting module. This exposes an issue with the modularization of the library: the sorting module, being at the root of src, is not visible to the stats module within its own independent folder...

In this case, adding ../stdlib_sorting.fypp to the CMakeLists.txt should be enough (similarly to stdlib_string_type.fypp, which is already present in CMakeLists.txt).

> we might need to reconsider next steps.

However, I agree with that. When I was working on #1081, I started to get "faked" circular dependencies.

> One idea would be to make this library (pca) not a submodule of stats but a module in itself at the root of src, so that it can easily use any other module such as stats, blas/lapack and sorting. Or there might be another approach to think about.

If the CMake file is correctly written, a submodule "pca" should not be a problem.
However, in terms of efficiency for the stats module (but also for other modules, like stdlib_linalg), it might be good to use modules instead of submodules. Since fpm compiles only what is needed, if a user only needs mean, then the blas and lapack modules are currently not compiled (if I am correct). However, if the pca procedures are added as submodules of the stats module, then the blas and lapack modules will be compiled even if the user only uses the procedure mean. Similar "issues" can easily happen for stdlib_linalg.

@JAi-SATHVIK
Contributor Author

Thank you @jalvesz @jvdp1 for the insights.

Regarding the immediate build failure, I will try adding ../stdlib_sorting.fypp to the src/stats/CMakeLists.txt as suggested to resolve the visibility issue with the sorting module.

Regarding the structural change: I'm open to moving PCA to a standalone module in src/ if that aligns better with the library's goals for modularity and reducing compilation overhead. Should I proceed with the CMake fix first to verify the current logic, or would you prefer I start refactoring it into its own module now?

@jalvesz
Contributor

jalvesz commented Jan 8, 2026

I suggest going step by step: first try to fix it "as is", then let's continue the discussion on what the best strategy would be.

@JAi-SATHVIK
Contributor Author

Hi, @jvdp1

I've addressed the previous review comments (code cleanup, precision support, etc.), but I'm consistently hitting an Internal Compiler Error (Segmentation Fault) with the Intel ifx compiler in the CI.
The error points to the generated stdlib_stats_pca submodule. I suspect this is triggered by the allocate(var, source=expr) constructs.
Is there a preferred workaround for these types of compiler bugs in stdlib?

I propose refactoring the remaining allocate(..., source=...) calls to use explicit allocation and assignment (e.g., allocate(a(n)); a = expr).
Please let me know if you'd prefer I proceed with this workaround or if there are other potential causes I should investigate.

@jvdp1
Member

jvdp1 commented Jan 17, 2026

> Hi, @jvdp1
>
> I've addressed the previous review comments (code cleanup, precision support, etc.), but I'm consistently hitting an Internal Compiler Error (Segmentation Fault) with the Intel ifx compiler in the CI. The error points to the generated stdlib_stats_pca submodule. I suspect this is triggered by the allocate(var, source=expr) constructs. Is there a preferred workaround for these types of compiler bugs in stdlib?
>
> I propose refactoring the remaining allocate(..., source=...) calls to use explicit allocation and assignment (e.g., allocate(a(n)); a = expr). Please let me know if you'd prefer I proceed with this workaround or if there are other potential causes I should investigate.

I am surprised by this, because allocate(..., source=...) is widely used in other stdlib modules. Currently, I have no idea what the other causes could be. I will investigate the issue a bit.

@jalvesz
Contributor

jalvesz commented Jan 17, 2026

Here is my take on allocate(var, source=...): it works properly when the source term is either a scalar value or a known array. Expressions involving reductions and transformations of array data within this source assignment should not be "one-lined" like that; doing so is difficult to debug and prone to creating temporary arrays.

@jvdp1
Member

jvdp1 commented Jan 17, 2026

> Here is my take on allocate(var, source=...): it works properly when the source term is either a scalar value or a known array.

If I am correct, that is the case in the last commit. However, there are still issues.

> Expressions involving reductions and transformations of array data within this source assignment should not be "one-lined" like that; doing so is difficult to debug and prone to creating temporary arrays.

I agree with you regarding debugging. However, I would hope that the compiler can better optimize this code:

allocate(mean_ , source=mean(array, 1))

compared to (most likely resulting in a temporary array):

allocate(mean_(n))
mean_ = mean(array,1)

Anyway, @JAi-SATHVIK, I suggest using the second version until the code passes successfully with ifx.

call pca_eigh_driver_${k1}$(x_centered, n, p, components, singular_values, err0)
case default
err0 = linalg_state_type("pca", LINALG_ERROR, "Unknown method: "//method_)
Member

@JAi-SATHVIK: Commenting out this line solves the issue with ifx on my computer. I will investigate the reason.

Cc @jalvesz

call pca_eigh_driver_${k1}$(x_centered, n, p, components, singular_values, err0)
case default
err0 = linalg_state_type("pca", LINALG_ERROR, "Unknown method: "//method_)
Member

Suggested change
- err0 = linalg_state_type("pca", LINALG_ERROR, "Unknown method: "//method_)
+ err0 = linalg_state_type("pca", LINALG_ERROR, "Unknown method: ", method_)

Development

Successfully merging this pull request may close these issues.

Implement Principal Component Analysis (PCA) module

5 participants