paper/paper.md (40 additions, 40 deletions)
@@ -15,7 +15,7 @@ affiliations:
index: 1
- name: Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics Research, Japan
index: 2
-date: 8 July 2025
+date: 3 December 2025
bibliography: paper.bib
---
@@ -27,7 +27,7 @@ Despite its broad applicability, NMF becomes computationally prohibitive for lar
# Statement of need
-NMF is a workhorse algorithm for most data science tasks. However, as the size of the data matrix increases, it often becomes too large to fit into memory. In such cases, an out-of-core (OOC) implementation — where only subsets of data stored on disk are loaded into memory for computation — is desirable. Additionally, representing the data in a sparse matrix format, where only non-zero values and their coordinates are stored, is computationally advantageous. Therefore, a NMF implementation that supports both OOC computation and sparse data handling is highly desirable.
+NMF is a workhorse algorithm for many data science tasks. However, as the size of the data matrix increases, it often becomes too large to fit into memory. In such cases, an out-of-core (OOC) implementation — where only subsets of the data stored on disk are loaded into memory for computation — is desirable. Additionally, representing the data in a sparse matrix format, where only the non-zero values and their coordinates are stored, is computationally advantageous. Therefore, an NMF implementation that supports both OOC computation and sparse data handling is highly desirable (Figure 1).
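As a toy illustration of the OOC idea above (sketched in Python for brevity; the file layout and names are assumptions, and OnlineNMF.jl's actual chunking strategy differs), a matrix stored one row per line on disk can be reduced while holding only a single row in memory at a time:

```python
import os
import tempfile

def ooc_colsums(path):
    """Stream a matrix stored as one comma-separated row per line,
    accumulating column sums with O(ncol) peak memory regardless of nrow."""
    acc = None
    with open(path) as f:
        for line in f:
            row = [float(v) for v in line.split(",")]
            if acc is None:
                acc = [0.0] * len(row)
            # Combine this row into the running statistic, then discard it.
            acc = [a + r for a, r in zip(acc, row)]
    return acc

# Demo on a tiny file standing in for a matrix too large for memory.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as f:
    f.write("1,2,3\n4,5,6\n")
    demo_path = f.name

sums = ooc_colsums(demo_path)  # [5.0, 7.0, 9.0]
os.remove(demo_path)
```

The same streaming pattern extends to the sufficient statistics needed by NMF update rules, which is what makes OOC factorization possible.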
Similar considerations arise in the context of Principal Component Analysis (PCA), for which we independently developed a Julia package, \texttt{OnlinePCA.jl} [@onlinepcajl]. \texttt{OnlineNMF.jl} is a spin-off of \texttt{OnlinePCA.jl} that implements NMF.
@@ -39,21 +39,19 @@ NMF can be easily reproduced on any machine where Julia is pre-installed by usin
## Installation
-First, install \texttt{OnlinePCA.jl} and \texttt{OnlineNMF.jl} from the official Julia package registry or directly from GitHub:
+First, install \texttt{OnlineNMF.jl} from the official Julia package registry or directly from GitHub:
```julia
-# Install OnlinePCA.jl and OnlineNMF.jl from Julia General
-Then, write a synthetic data as a CSV file, convert it to a compressed binary format using Zstandard, and prepare summary statistics for PCA. MM format is also supported for sparse matrices.
+Then, write synthetic data as a CSV file, convert it to a compressed binary format using Zstandard, and prepare summary statistics for PCA. Matrix Market (MM) format is also supported for sparse matrices.
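For readers unfamiliar with it, the MM coordinate format is plain text: a header line, a "rows cols nnz" size line, and one "row col value" triplet per non-zero entry. A minimal pure-Python writer (an illustration of the format only, not OnlineNMF.jl's converter; the Zstandard compression step is omitted here) looks like:

```python
def write_mm(path, nrow, ncol, triplets):
    """Write (i, j, v) triplets (1-based indices) as a MatrixMarket
    coordinate file; only the non-zero entries are stored."""
    with open(path, "w") as f:
        f.write("%%MatrixMarket matrix coordinate real general\n")
        f.write(f"{nrow} {ncol} {len(triplets)}\n")
        for i, j, v in triplets:
            f.write(f"{i} {j} {v}\n")

# A 3x3 matrix with three non-zeros: (1,1)=1.0, (2,3)=2.0, (3,2)=3.0.
write_mm("X.mtx", 3, 3, [(1, 1, 1.0), (2, 3, 2.0), (3, 2, 3.0)])
```

Because the format stores coordinates explicitly, it is language-agnostic: any tool that can parse text can exchange sparse matrices this way.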
-This example demonstrates NMF using the $\alpha$-divergence as the loss function. By setting alpha=2, the objective corresponds to the Pearson divergence. The input data is assumed to be a dense matrix compressed with Zstandard (.zst format).
+This example demonstrates NMF using the $\alpha$-divergence as the loss function (Figure 2). By setting `alpha=2`, the objective corresponds to the Pearson divergence. The input data is assumed to be a dense matrix compressed with Zstandard (.zst format).
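For reference, under one common parameterization (the notation here is generic and assumed; the package's internal form may differ), the $\alpha$-divergence between data $X$ and reconstruction $Y = WH$ is

$$
D_{\alpha}(X \,\|\, Y) = \frac{1}{\alpha(\alpha-1)} \sum_{ij} \left( x_{ij}^{\alpha}\, y_{ij}^{1-\alpha} - \alpha\, x_{ij} + (\alpha-1)\, y_{ij} \right),
$$

and substituting $\alpha=2$ yields $\frac{1}{2}\sum_{ij} (x_{ij}-y_{ij})^2 / y_{ij}$, i.e., the Pearson divergence.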
-This example performs NMF on a sparse matrix using the $\beta$-divergence. The input is a MM formatted sparse matrix file (.mtx.zst). When beta=1, the loss corresponds to the Kullback-Leibler divergence, and sparse-specific optimization is used internally.
+This example performs NMF on a sparse matrix using the $\beta$-divergence (Figure 3). The input is an MM-formatted sparse matrix file (.mtx.zst). When `beta=1`, the loss corresponds to the Kullback-Leibler divergence, and a sparse-specific optimization is used internally.
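One common form of the $\beta$-divergence (again, notation assumed for illustration) is

$$
D_{\beta}(X \,\|\, Y) = \sum_{ij} \frac{x_{ij}^{\beta} + (\beta-1)\, y_{ij}^{\beta} - \beta\, x_{ij}\, y_{ij}^{\beta-1}}{\beta(\beta-1)},
$$

whose limit as $\beta \to 1$ is $\sum_{ij} \left( x_{ij} \log \frac{x_{ij}}{y_{ij}} - x_{ij} + y_{ij} \right)$, the generalized Kullback-Leibler divergence, while $\beta=2$ recovers the squared Euclidean distance $\frac{1}{2}\sum_{ij}(x_{ij}-y_{ij})^2$.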
-There are various implementations of NMF [@nntensor; @sklearn; @nmfk] and some of them are OOC-type or sparse-type [@sklearn; @rcppplanc] but \texttt{OnlineNMF.jl} is the only tool that supports both OOC computation and sparse data formats (e.g., MM, BinCOO).
+There are various implementations of NMF [@nntensor; @sklearn; @nmfk], and some of them support OOC computation or sparse data formats [@sklearn; @rcppplanc]. While \texttt{RcppPlanc}/\texttt{PLANC} supports both OOC computation and R's internal sparse format (dgCMatrix), \texttt{OnlineNMF.jl} is designed to handle language-agnostic sparse formats such as MM and Binary COO (BinCOO), enabling seamless integration with external data pipelines.
| Function Name | Language | OOC | Sparse Format |
|:------ | :----: | :----: | :----: |
@@ -163,5 +162,6 @@ There are various implementations of NMF [@nntensor; @sklearn; @nmfk] and some o