Polar decomposition QDWH by dsukkari · Pull Request #2 · icl-utk-edu/slate

dsukkari · 2023-01-27T15:42:10Z

This PR adds QDWH, which computes the polar decomposition of a general matrix A = U * H, where U is the orthogonal (unitary) polar factor and H is the Hermitian polar factor.

QDWH uses QR-based and Cholesky-based iterations to compute the orthogonal polar factor U.

For the QR-based iterations, new customized routines geqrf_qdwh_full and unmqr_qdwh_full are included to take advantage of the identity structure of the matrix involved in those iterations.
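For reference, the two iteration forms as they are usually stated in the QDWH literature (Nakatsukasa, Bai, and Gygi) are written out below. This is a summary of the published algorithm under standard notation, not a transcription of the code in this PR; the symbols X_k, a_k, b_k, c_k, Q_1, Q_2, Z_k, and W_k are assumptions of this note. The identity block stacked under the scaled iterate in the QR-based form is presumably the structure that geqrf_qdwh_full and unmqr_qdwh_full exploit.

```latex
% Dynamically weighted Halley iteration, X_0 = A / \alpha with \alpha \gtrsim \|A\|_2
X_{k+1} = X_k \,\bigl(a_k I + b_k X_k^H X_k\bigr)\,\bigl(I + c_k X_k^H X_k\bigr)^{-1}

% QR-based form
\begin{bmatrix} \sqrt{c_k}\, X_k \\ I \end{bmatrix}
  = \begin{bmatrix} Q_1 \\ Q_2 \end{bmatrix} R,
\qquad
X_{k+1} = \frac{b_k}{c_k}\, X_k
  + \frac{1}{\sqrt{c_k}} \Bigl( a_k - \frac{b_k}{c_k} \Bigr) Q_1 Q_2^H

% Cholesky-based form
Z_k = I + c_k X_k^H X_k = W_k^H W_k,
\qquad
X_{k+1} = \frac{b_k}{c_k}\, X_k
  + \Bigl( a_k - \frac{b_k}{c_k} \Bigr) \bigl( X_k W_k^{-1} \bigr) W_k^{-H}

% After convergence X_k \to U, and the Hermitian factor is H = U^H A
```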

A 2-norm estimate (norm2est) of the original matrix is required; norm2est is implemented using power iteration and is called in QDWH.
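As a rough illustration of the idea, here is a minimal serial sketch of a 2-norm estimate by power iteration on A^T A for a small real dense matrix. The `Dense` type, the function name `norm2est_sketch`, the starting vector (the largest row of A), and the stopping tolerance are all assumptions made for this example; the distributed norm2est in this PR may differ in each of these choices.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dense row-major m-by-n matrix; a hypothetical stand-in, not a SLATE type.
struct Dense {
    std::size_t m, n;
    std::vector<double> a;  // entry (i, j) is a[i*n + j]
};

static double nrm2(const std::vector<double>& x) {
    double s = 0.0;
    for (double v : x) s += v * v;
    return std::sqrt(s);
}

// Estimate ||A||_2 by power iteration on A^T A.
double norm2est_sketch(const Dense& A, int max_iter = 100, double tol = 1e-6) {
    assert(A.a.size() == A.m * A.n);

    // Start from the row of A with the largest 2-norm. That vector lies in
    // the row space of A, so A*x cannot vanish unless A is zero.
    std::size_t best = 0;
    double best_sq = -1.0;
    for (std::size_t i = 0; i < A.m; ++i) {
        double sq = 0.0;
        for (std::size_t j = 0; j < A.n; ++j)
            sq += A.a[i*A.n + j] * A.a[i*A.n + j];
        if (sq > best_sq) { best_sq = sq; best = i; }
    }
    std::vector<double> x(A.n, 0.0), y(A.m, 0.0);
    for (std::size_t j = 0; j < A.n; ++j)
        x[j] = A.a[best*A.n + j];

    double normx = nrm2(x);
    if (normx == 0.0) return 0.0;  // A is the zero matrix

    double est = 0.0;
    for (int it = 0; it < max_iter; ++it) {
        for (double& v : x) v /= normx;  // x = x / ||x||

        // y = A x
        for (std::size_t i = 0; i < A.m; ++i) {
            y[i] = 0.0;
            for (std::size_t j = 0; j < A.n; ++j)
                y[i] += A.a[i*A.n + j] * x[j];
        }
        double normy = nrm2(y);
        if (normy == 0.0) break;  // defensive; not expected, see above

        // x = A^T y
        for (double& v : x) v = 0.0;
        for (std::size_t i = 0; i < A.m; ++i)
            for (std::size_t j = 0; j < A.n; ++j)
                x[j] += A.a[i*A.n + j] * y[i];
        normx = nrm2(x);

        // ||A^T A x|| / ||A x|| tends to the largest singular value.
        double est_old = est;
        est = normx / normy;
        if (std::fabs(est - est_old) <= tol * est) break;
    }
    return est;
}
```

For the 2-by-2 matrix diag(3, 1) the estimate settles at 3. QDWH only needs an approximate value to scale the initial iterate, which is why a loose tolerance is used here.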

The following figure presents the performance of SLATE_QDWH on Summit using various numbers of nodes.

mgates3 · 2023-01-27T16:10:09Z

Check for warnings, i.e., add -Werror to CXXFLAGS in make.inc.

mgates3

First pass through. Probably more changes later on.

GNUmakefile

src/geqrf_qdwh_full.cc

src/qdwh.cc

src/unmqr_qdwh_full.cc

test/run_tests.py

src/norm2est.cc

src/qdwh.cc

src/unmqr_qdwh_full.cc

dsukkari · 2023-02-08T01:40:58Z

All tests passed, except for one failure in gels using cholqr.

mgates3 · 2023-11-13T19:17:54Z

test/test_qdwh.cc

+                                      lapack::Gflop<scalar_t>::potrf(m)      +
+                                      blas::Gflop<scalar_t>::trsm(slate::Side::Left, m, n) );
+
+        double gflop_compute_H = blas::Gflop<scalar_t>::her2k(n, m);


Currently, this is really gemm, but eventually it should be herk (i.e., herkx) instead of her2k.

(I'll fix.)

mgates3

Ack! Pending comments from a long time ago. I didn't confirm if these make sense.

mgates3 · 2023-11-13T22:27:03Z

src/qdwh.cc

+            slate_error("Failed to converge.");
+        }
+        itconv++;
+


I see a previous commit about using double to avoid overflow. Can we just use double for everything instead of casting everything to real_t?

We can simplify by using some constants, e.g., const real_t r2 = 2.0. Or in double, just use constants 1.0, 2.0, etc. in the formulas.
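To make the overflow concern concrete, the sketch below evaluates the textbook QDWH quantity dd = cbrt(4 (1 - l^2) / l^4) entirely in one floating-point type. The function name `qdwh_dd_sketch` is hypothetical, and the formula is the one from the QDWH literature rather than a copy of the expression in qdwh.cc. For a small lower bound such as l = 1e-10, the intermediate 4 / l^4 is about 4e40, which exceeds the largest single-precision value, while in double it is unremarkable.

```cpp
#include <cassert>
#include <cmath>

// dd = cbrt( 4 (1 - l^2) / l^4 ), with every intermediate stored in type T.
// Hypothetical helper for illustration only.
template <typename T>
T qdwh_dd_sketch(T l) {
    assert(l > T(0));
    T L2 = l * l;    // l^2
    T L4 = L2 * L2;  // l^4; about 1e-40 for l = 1e-10
    T ratio = T(4) * (T(1) - L2) / L4;  // about 4e40: overflows float
    return std::cbrt(ratio);
}
```

Calling `qdwh_dd_sketch<float>(1e-10f)` yields infinity under IEEE single-precision arithmetic, whereas `qdwh_dd_sketch<double>(1e-10)` returns a finite value near 3.4e13. Using double throughout sidesteps the problem without any per-term casts.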

mgates3 · 2023-11-13T22:28:11Z

src/qdwh.cc

+        printf("\nConverged after %d. Check what is the issue, "
+                   "because QDWH needs <= 6 iterations.\n",
+                   itqr+itpo);
+    }


I moved these outputs to the tester, to maintain xSDK compatibility.

mgates3 · 2023-11-13T22:28:26Z

src/qdwh.cc

+    //her2k(one, A, W10, rzero, H, opts);
+    //auto AL = HermitianMatrix<scalar_t>(
+    //        slate::Uplo::Lower, H );
+    //slate::copy(AL, H, opts);


herkx or gemmtr, not her2k.

mgates3 · 2023-11-13T22:37:49Z

src/qdwh.cc

+        sqd = sqrt( r_one + real_t(dd) );
+        a1  = sqd + sqrt( real_t(8.0) - real_t(4.0) * real_t(dd) +
+              real_t(8.0) * ( real_t(2.0) - L2 ) / ( L2 * sqd ) ) / real_t(2.0);
+        a   = real(a1);


a1 and a are both double, so what does this real( ) call do?
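A sketch of how the parameter update could look with everything kept in double and plain constants follows. The struct `QdwhParams` and the function `qdwh_params_sketch` are hypothetical names; the formulas for b, c, and the updated lower bound are the standard ones from the QDWH literature and are an assumption here, since the hunk above shows only sqd and a. With every quantity real and in double, no real( ) call or real_t cast appears.

```cpp
#include <cassert>
#include <cmath>

// Weights for one QDWH step plus the updated lower bound (hypothetical type).
struct QdwhParams {
    double a, b, c, l_next;
};

// l is the current lower bound on the smallest singular value of the
// scaled iterate, with 0 < l <= 1.
QdwhParams qdwh_params_sketch(double l) {
    assert(l > 0.0 && l <= 1.0);
    const double L2  = l * l;
    const double dd  = std::cbrt( 4.0 * (1.0 - L2) / (L2 * L2) );
    const double sqd = std::sqrt( 1.0 + dd );
    const double a   = sqd + std::sqrt( 8.0 - 4.0 * dd
                           + 8.0 * (2.0 - L2) / (L2 * sqd) ) / 2.0;
    const double b   = (a - 1.0) * (a - 1.0) / 4.0;
    const double c   = a + b - 1.0;
    // The lower bound moves toward 1 as the iteration converges.
    const double l_next = l * (a + b * L2) / (1.0 + c * L2);
    return { a, b, c, l_next };
}
```

At l = 1 the weights reduce to a = 3, b = 1, c = 3, which is the unweighted Halley iteration, and the bound stays at 1.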

mgates3 · 2023-11-13T22:59:45Z

src/qdwh.cc

+    auto R  = TriangularMatrix<scalar_t>(
+            Uplo::Upper, slate::Diag::NonUnit, R1 );
+    normR = norm(slate::Norm::One, R, opts);
+    slate::trcondest(slate::Norm::One, R, &Li, opts);


If trcondest takes Rnorm as does gecondest, then we can actually set Rnorm = 1.0 and avoid computing it entirely, since it just gets cancelled:
smin_est = Rnorm * rcond = Rnorm * 1 / (Rnorm * || R^{-1} ||_1) = 1 / || R^{-1} ||_1
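The cancellation can be checked on a tiny example. The sketch below uses an exact inverse of a 2-by-2 upper triangular matrix in place of a condition estimator, so it only demonstrates the algebra; `Upper2` and the helper functions are hypothetical, and whether trcondest accepts Rnorm is the open question raised in the comment above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// 2-by-2 upper triangular matrix [ r11 r12 ; 0 r22 ] (hypothetical type).
struct Upper2 {
    double r11, r12, r22;
};

// One norm: the largest absolute column sum.
double norm_one(const Upper2& R) {
    return std::max( std::fabs(R.r11), std::fabs(R.r12) + std::fabs(R.r22) );
}

// Exact inverse, standing in for what an estimator approximates.
Upper2 inverse(const Upper2& R) {
    assert(R.r11 != 0.0 && R.r22 != 0.0);
    return { 1.0 / R.r11, -R.r12 / (R.r11 * R.r22), 1.0 / R.r22 };
}

// rcond = 1 / (Rnorm * || R^{-1} ||_1), for whatever Rnorm the caller passes.
double rcond_given(double Rnorm, const Upper2& R) {
    return 1.0 / ( Rnorm * norm_one( inverse(R) ) );
}

// smin_est = Rnorm * rcond; Rnorm cancels, leaving 1 / || R^{-1} ||_1.
double smin_est(double Rnorm, const Upper2& R) {
    return Rnorm * rcond_given(Rnorm, R);
}
```

With R = [4 2; 0 0.5], passing either the true one norm (4) or the constant 1.0 as Rnorm gives the same smin_est of 1/3, so the norm computation can be skipped.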

Dalal Sukkari added 25 commits October 27, 2022 14:56

Two-norm estimation

37a6768

Added the polar decomposition QDWH

9b47c71

Added customized geqrf and unmqr

ea1f5d4

Test qdwh

554769f

call geqrf/unmqr_qdwh_full

3fa51a1

Added qdwh codes to slate.hh and to makefile

ad698cc

fixes on i_end and on releasing tiles

d9abb8f

Set C to zero will make gemmA work

8dcff02

minor cleaning and add todo

f379a01

Allocate W with number of rows is roundup(number of A rows) + n to avoid having a clean up tile in the middle of the W matrix

63df04a

Fixes to have qdwh work for matrix with m > n

0278c69

add itqr and itpo to qdwh interface, and change H matrix to Hermitian

458b8b9

Save the Hermitian polar factor into a general matrix for now. Add opts to slate calls

44e7c65

rename normest and cleaning it

1e5e034

change header of norm2est and call it in qdwh

12c151c

rename normest and minor change

c2e4870

delete

a0aa605

minor cleaning

c8516a9

pass opts to calls

c21a993

Use H instead of new allocation in case of square matrix. Rename B to H

7e9c00e

Added exit with error if #itr > 100 in qdwh and minor cleaning

04f06d9

Used real_t to cast numbers in while-loop and minor cleaning

ce9ef1c

Fixed W2 allocation in norm2est to have it work with grid pxq where p != q. Changed dd computing in qdwh for-now and minor changes

7c68831

minor on printf

ca1db42

merge in the master

351ab6a

dsukkari requested review from ayarkhan, cayrols, dbielich and mgates3 January 27, 2023 15:42

minor

398703d

dsukkari force-pushed the polar branch from d84c0c6 to 398703d January 30, 2023 18:29

Dalal Sukkari added 3 commits January 30, 2023 13:48

remove redefined SLATE_HAVE_SCALAPACK

d546c84

Added qdwh to run_tests. Trying to fix dd computing in qdwh and used trcondest. Reduced the condition number of the tested matrix

6d48ec2

Removed an extra allocation

2acf396

mgates3 requested changes Feb 2, 2023

Dalal Sukkari added 3 commits February 2, 2023 09:53

Changed dd to avoid overflow

516cae6

test ill-cond

22e050a

Fix in makefile, used impl namespace and minor to the docs in geqrf_qdwh_full

6ea76e6

dsukkari force-pushed the polar branch from 3c8f673 to 6ea76e6 February 2, 2023 16:13

Dalal Sukkari added 6 commits February 2, 2023 11:24

minor fixes

a7d423f

--amend

3bcf929

Applied Mark's comments

a09c758

Replaced namespace by impl in unmqr_qdwh_full. Changed n to mn in run_test.py

cdfc10c

qdwh works m >= n

fab06a1

fix flops count

deec809

mgates3 reviewed Feb 3, 2023

src/norm2est.cc (outdated, resolved)

src/qdwh.cc (outdated, resolved)

src/unmqr_qdwh_full.cc (outdated, resolved)

Dalal Sukkari added 6 commits February 6, 2023 09:38

minor

169a63d

add const and delete dead code

fc35842

moving const up and minor changes

2be83cd

test matrices with a smaller cond

23bc075

Computed parameters in double to avoid overflow in complex-float. Minor update on gflops count

c8c88bb

Added geqrf_compute_first_indices to internal_util.hh and removed it from geqrf and geqrf_qdwh_full

531ca62

Dalal Sukkari added 3 commits February 9, 2023 09:58

fix flops count

8ce87db

Delete unneeded matrix allocation

34e0afa

Changed device_malloc and used gemmC in norm2est (for now)

d25e419

mgates3 reviewed Nov 13, 2023

neil-lindquist mentioned this pull request Dec 5, 2023

Remove tile life infrastructure #151

Merged

mgates3 reviewed Feb 4, 2026
