Commit 9ef9bee

Polish the language in docstrings and README (#6)
1 parent 0bc76c6 commit 9ef9bee

2 files changed: +61 -63 lines changed

README.md

Lines changed: 15 additions & 18 deletions
@@ -3,9 +3,9 @@
 [![CI](https://github.com/HolyLab/GsvdInitialization.jl/actions/workflows/CI.yml/badge.svg)](https://github.com/HolyLab/GsvdInitialization.jl/actions/workflows/CI.yml)
 [![codecov](https://codecov.io/gh/HolyLab/GsvdInitialization.jl/graph/badge.svg?token=LxqRCsZIvn)](https://codecov.io/gh/HolyLab/GsvdInitialization.jl)
 
-This package includes the code of the paper 'GSVD-NMF: Recovering Missing Features in
-Non-negative Matrix Factorization`.
-It is used to recover Non-negative matrix factorization(NMF) components from low-dimensional space to higher dimensional space by exploiting the generalized singular value decomposition (GSVD) between existing NMF results and the SVD of X.
+This package implements the technique in the paper 'GSVD-NMF: Recovering Missing Features in
+Non-negative Matrix Factorization`.
+It is used to recover Non-negative matrix factorization(NMF) components from an initial lower-rank factorization by exploiting the generalized singular value decomposition (GSVD) between existing NMF results and the SVD of X.
 This method allows the incremental expansion of the number of components, which can be convenient and effective for interactive analysis of large-scale data.
 
 See also [NMFMerge](https://github.com/HolyLab/NMFMerge.jl) for the converse operation. Together, the two result in a substantial improvement in the quality and consistency of NMF factorization.
@@ -16,9 +16,9 @@ Demo:
 
 To run this demo, NMF.jl and LinearAlgebra.jl are also required.
 
-Install and load packages
+Install and load packages (type `]` at the `julia>` prompt to enter `pkg>` mode):
 ```julia
-julia>] add GsvdInitialization;
+pkg> add GsvdInitialization;
 julia> using GsvdInitialization, NMF, LinearAlgebra;
 ```
 
@@ -46,15 +46,16 @@ The result is given by
 
 <img src="demo/ResultHals.png" alt="Sample Figure" width="400"/>
 
-This factorization is not perfect as two components are same and two features share one component.
-Then, running GSVD-NMF on X (also using NNSVD as initialization).
+This factorization is not perfect as two components are the same and two features share one component.
+Then, running GSVD-NMF on X (also using NNSVD as initialization) and computing the new reconstruction error:
 
 ```julia
 Wgsvd, Hgsvd = gsvdnmf(X, 9=>10; alg = :cd, tol_final = 1e-4, tol_intermediate = 1e-2, maxiter = 10^12);
 julia> sum(abs2, X-Wgsvd*Hgsvd)/sum(abs2, X)
 1.2322603074132593e-10
 ```
-Gsvd-NMF factorizes the gound truth well based on the comparison between relative fitting errors and figures.
+An imperfect factorization from `nnmf` alone was augmented by `gsvdnmf` to a perfect factorization.
+Here are the new components:
 
 <img src="demo/ResultGsvdNMF.png" alt="Sample Figure" width="400"/>
 
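Note on the demo in this hunk: the number it prints is the relative squared reconstruction error, `sum(abs2, X - W*H) / sum(abs2, X)`. The sketch below evaluates the same expression on a small toy matrix, so it runs without the package. The matrices and the helper name `relerr` are assumptions made for illustration; they are not part of GsvdInitialization.jl or its demo data.

```julia
# Hypothetical toy factors (not the README demo data): X has exactly two components.
W = [1.0 0.0; 0.0 1.0; 1.0 1.0]            # 3×2, non-negative
H = [1.0 2.0 0.0 1.0; 0.0 1.0 3.0 1.0]     # 2×4, non-negative
X = W * H

# Relative squared error, the same expression the README evaluates.
relerr(X, W, H) = sum(abs2, X - W * H) / sum(abs2, X)

err_full = relerr(X, W, H)                     # both components kept
err_drop = relerr(X, W[:, 1:1], H[1:1, :])     # one component missing
println((err_full, err_drop))
```

With both components the error is zero; dropping one leaves a clearly non-zero error. That is the situation the README describes before and after augmenting the factorization.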

@@ -73,7 +74,7 @@ Arguments:
 
 ``ncomponents::Pair{Int,Int}``: in the form of ``n1 => n2``, augments from ``n1`` components to ``n2``components, where ``n1`` is the number of components for initial NMF (under-complete NMF), and ``n2`` is the number of components for final NMF.
 
-Alternatively, ``ncomponents`` can be an integer denoting the number of components for final NMF.
+Alternatively, ``ncomponents`` can be an integer denoting the number of components for final NMF.
 In this case, ``gsvdnmf`` defaults to augment components on initial NMF solution by 1.
 
 Keyword arguments:
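Note on the `ncomponents` argument described in this hunk: the two accepted forms (`n1 => n2` or a single integer) map onto two methods, and the integer method forwards to the pair method with `n2 - 1 => n2`. A minimal sketch of that dispatch pattern follows. The function `ranks` is a hypothetical stand-in that only returns the component counts; it performs no factorization and is not part of the package.

```julia
# Hypothetical stand-in for the rank handling in `gsvdnmf`.
function ranks(ncomponents::Pair{Int,Int})
    n1, n2 = ncomponents          # a Pair destructures into its two elements
    return (n1, n2)
end

# Integer form: augment the initial factorization by one component.
ranks(n2::Integer) = ranks(n2 - 1 => n2)

println(ranks(9 => 10))   # explicit n1 => n2
println(ranks(10))        # shorthand for 9 => 10
```

Both calls return the same pair of counts, so `gsvdnmf(X, 10)` can be read as shorthand for `gsvdnmf(X, 9 => 10)`.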
@@ -86,9 +87,9 @@ Other keyword arguments are passed to ``NMF.nnmf``.
 
 -----
 
-W, H = **gsvdnmf**(X::AbstractMatrix, W::AbstractMatrix, H::AbstractMatrix, f;
-n2 = size(first(f), 2),
-tol_nmf=1e-4,
+W, H = **gsvdnmf**(X::AbstractMatrix, W::AbstractMatrix, H::AbstractMatrix, f;
+n2 = size(first(f), 2),
+tol_nmf=1e-4,
 kwargs...)
 
 This funtion augments components for ``W`` and ``H``, and subsequently polishs new ``W`` and ``H`` by NMF.
@@ -105,7 +106,7 @@ Arguments:
 
 ``f``: SVD (or Truncated SVD) of ``X``, ``f`` needs to be explicitly writen in ``Tuple`` form.
 
-Keyword arguments
+Keyword arguments
 
 ``tol_nmf``: the tolerance of NMF polishing step, default: $10^{-4}$
 
@@ -140,9 +141,5 @@ Arguments:
 -----
 
 ## Citation
-The code is welcomed to be used in your publication, please cite:
-
-
-
-
 
+If you find this package useful please cite:

src/GsvdInitialization.jl

Lines changed: 46 additions & 45 deletions
@@ -7,34 +7,32 @@ export gsvdnmf,
 gsvdrecover
 
 """
-W, H = **gsvdnmf**(X::AbstractMatrix, W::AbstractMatrix, H::AbstractMatrix, f;
-n2 = size(first(f), 2),
-tol_nmf=1e-4,
-kwargs...)
+W, H = gsvdnmf(X::AbstractMatrix, W::AbstractMatrix, H::AbstractMatrix, f;
+n2 = size(first(f), 2),
+tol_nmf=1e-4,
+kwargs...)
 
-This funtion augments components for ``W`` and ``H``, and subsequently polishs new ``W`` and ``H`` by NMF.
+Augment `W` and `H` to have `n2` components, subsequently polished by NMF.
 
-Arguments:
+Arguments:
 
-``X``: non-nagetive 2D data matrix
+- `X`: non-negative data matrix
 
-``W``: initialization of initial NMF
+- `W` and `H`: initial NMF factorization
 
-``H``: initialization of initial NMF
+- `n2`: the number of components in augmented factorization
 
-``n2``: the number of components in augmented matrix
+- `f`: SVD (or Truncated SVD) of `X`
 
-``f``: SVD (or Truncated SVD) of ``X``, ``f`` needs to be explicitly writen in ``Tuple`` form.
+Keyword arguments:
 
-Keyword arguments
+- `tol_nmf`: the tolerance of NMF polishing step, default: 1e-4
 
-``tol_nmf``: the tolerance of NMF polishing step, default: 1e-4
-
-Other keyword arguments are passed to ``NMF.nnmf``.
+Other keyword arguments are passed to `NMF.nnmf`.
 """
-function gsvdnmf(X::AbstractMatrix, W::AbstractMatrix, H::AbstractMatrix, f;
-n2 = size(first(f), 2),
-tol_nmf=1e-4,
+function gsvdnmf(X::AbstractMatrix, W::AbstractMatrix, H::AbstractMatrix, f;
+n2 = size(first(f), 2),
+tol_nmf=1e-4,
 kwargs...)
 n1 = size(W, 2)
 kadd = n2 - n1
@@ -52,28 +50,30 @@ end
 gsvdnmf(X::AbstractMatrix, W::AbstractMatrix, H::AbstractMatrix, n2::Int; kwargs...) = gsvdnmf(X, W, H, tsvd(X, n2); kwargs...)
 
 """
-W, H = **gsvdnmf**(X::AbstractMatrix, ncomponents::Pair{Int,Int}; tol_final=1e-4, tol_intermediate=1e-4, kwargs...)
+W, H = gsvdnmf(X::AbstractMatrix, ncomponents::Pair{Int,Int}; tol_final=1e-4, tol_intermediate=1e-4, kwargs...)
 
-This function performs "GSVD-NMF" on 2D data matrix ``X``.
+Perform "GSVD-NMF" on the data matrix `X`.
 
-Arguments:
+Arguments:
 
-``X``: non-nagetive 2D data matrix
+- `X`: non-negative data matrix
 
-``ncomponents::Pair{Int,Int}``: in the form of ``n1 => n2``, augments from ``n1`` components to ``n2``components, where ``n1`` is the number of components for initial NMF (under-complete NMF), and ``n2`` is the number of components for final NMF.
+- `ncomponents`: in the form of `n1 => n2`, augments from `n1` components to `n2`components,
+where `n1` is the number of components for initial NMF (under-complete NMF), and `n2` is the number of
+components for final NMF.
 
-Alternatively, ``ncomponents`` can be an integer denoting the number of components for final NMF.
-In this case, ``gsvdnmf`` defaults to augment components on initial NMF solution by 1.
+Alternatively, `ncomponents` can be an integer denoting the number of components for final NMF.
+In this case, `gsvdnmf` defaults to augment components on initial NMF solution by 1.
 
-Keyword arguments:
+Keyword arguments:
 
-``tol_final``: The tolerence of final NMF, default:``10^{-4}``
+- `tol_final`: The tolerence of final NMF, default:`10^{-4}`
 
-``tol_intermediate``: The tolerence of initial NMF (under-complete NMF), default: tol_final
+- `tol_intermediate`: The tolerence of initial NMF (under-complete NMF), default: tol_final
 
-Other keyword arguments are passed to ``NMF.nnmf``.
+Other keyword arguments are passed to `NMF.nnmf`.
 """
-function gsvdnmf(X::AbstractMatrix, ncomponents::Pair{Int,Int}; tol_final=1e-4, tol_intermediate=1e-4, kwargs...)
+function gsvdnmf(X::AbstractMatrix, ncomponents::Pair{Int,Int}; tol_final=1e-4, tol_intermediate=tol_final, kwargs...)
 n1, n2 = ncomponents
 f = tsvd(X, n2)
 W0, H0 = NMF.nndsvd(X, n1; initdata = (U = f[1], S = f[2], V = f[3]))
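Note on this hunk: one change here goes beyond wording. The default of `tol_intermediate` changes from the literal `1e-4` to `tol_final`, which matches what the docstring already stated. In Julia a keyword default may refer to a keyword declared earlier in the same signature. The sketch below isolates that behaviour; `tolerances` is a hypothetical helper that mirrors only the keyword part of the signature and is not part of the package.

```julia
# Hypothetical helper mirroring only the keyword signature from the diff.
tolerances(; tol_final=1e-4, tol_intermediate=tol_final) = (tol_final, tol_intermediate)

println(tolerances())                                        # both at the default
println(tolerances(tol_final=1e-6))                          # intermediate follows final
println(tolerances(tol_final=1e-6, tol_intermediate=1e-2))   # explicit override
```

Under the previous signature, passing only `tol_final=1e-6` would have left `tol_intermediate` at `1e-4`. After this commit it follows `tol_final` unless set explicitly.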
@@ -84,37 +84,38 @@ end
 gsvdnmf(X::AbstractMatrix, ncomponents_final::Integer; kwargs...) = gsvdnmf(X, ncomponents_final-1 => ncomponents_final; kwargs...)
 
 """
-Wadd, Hadd, S = **gsvdrecover**(X, W0, H0, kadd, f)
+Wadd, Hadd, S = gsvdrecover(X, W0, H0, kadd, f)
 
-This funtion augments components for ``W`` and ``H`` without polishing NMF step.
+Augment components for `W` and `H` without polishing by NMF.
 
-Outputs:
+Outputs:
 
-``Wadd``: augmented NMF solution
+`Wadd`: augmented NMF solution
 
-``Hadd``: augmented NMF solution
+`Hadd`: augmented NMF solution
 
-``S``: related generalized singular value
+`S`: generalized singular values for the `kadd` augmented components
 
-Arguments:
+Arguments:
 
-``X``: non-nagetive 2D data matrix
+`X`: non-nagetive 2D data matrix
 
-``W0``: NMF solution
+`W0`: NMF solution
 
-``H0``: NMF solution
+`H0`: NMF solution
 
-``kadd``: number of new components
+`kadd`: number of new components
 
-``f``: SVD (or Truncated SVD) of ``X``, ``f`` needs to be indexable.
+`f`: SVD (or Truncated SVD) of `X`
 """
 function gsvdrecover(X::AbstractArray, W0::AbstractArray, H0::AbstractArray, kadd::Int, f::Tuple)
 m, n = size(W0)
 kadd <= n || throw(ArgumentError("# of extra columns must less than 1st NMF components"))
 if kadd == 0
 return W0, H0, 0
 else
-U0, S0, V0 = f[1][:,1:n], f[2][1:n], f[3][:,1:n]
+U0, S0, V0 = f
+U0, S0, V0 = U0[:,1:n], S0[1:n], V0[:,1:n]
 Hadd, Λ = init_H(U0, S0, V0, W0, H0, kadd)
 Wadd, a = init_W(X, W0, H0, Hadd)
 Wadd_nn, Hadd_nn = NMF.nndsvd(X, kadd, initdata = (U = Wadd, S = ones(kadd), V = Hadd'))
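Note on the `gsvdrecover` change in this hunk: the SVD tuple `f` is now destructured first and truncated to the leading `n` triplets afterwards, instead of being indexed as `f[1]`, `f[2]`, `f[3]`. The sketch below, using only `LinearAlgebra`, suggests the two forms give the same arrays. The data matrix and the value of `n` are assumptions for illustration; in the source, `n` comes from `size(W0)`.

```julia
using LinearAlgebra

# Hypothetical 4×3 data matrix, used only to obtain an SVD to slice.
X = [1.0 2.0 0.0; 0.0 1.0 3.0; 1.0 3.0 3.0; 2.0 0.0 1.0]
F = svd(X)
f = (F.U, F.S, F.V)          # SVD carried around as a plain Tuple, as in the source

n = 2                        # number of leading triplets to keep (assumed value)

# New form from the diff: destructure, then truncate.
U0, S0, V0 = f
U0, S0, V0 = U0[:, 1:n], S0[1:n], V0[:, 1:n]

# Old form from the diff: index into the tuple directly.
U0_old, S0_old, V0_old = f[1][:, 1:n], f[2][1:n], f[3][:, 1:n]

println((U0, S0, V0) == (U0_old, S0_old, V0_old))
println((size(U0), length(S0), size(V0)))
```

The rewrite reads as a stylistic change: both forms slice the same columns and values.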
@@ -167,11 +168,11 @@ function Wcols_modification(X::AbstractArray{T}, W::AbstractArray{T}, H::Abstrac
 n = size(W, 2)
 a = Array{T}(undef, n)
 B = Array{T}(undef, n, n)
-WW, HH = W'*W, H*H'
+WW, HH = W'*W, H*H'
 WtXHt = W'*X*H'
 a = diag(WtXHt)
 B = WW.*HH
-β = nonneg_lsq(B, a; alg=:fnnls, gram=true)
+β = nonneg_lsq(B, a; alg=:fnnls, gram=true)
 return β[:]
 end
 