Commit 4dfc6a5

Update README.md
1 parent 9852979 commit 4dfc6a5

1 file changed: 1 addition, 1 deletion

README.md

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ Thanks,
|------|------|--------|
|[A-priori and SON](https://github.com/Cheng-Lin-Li/Spark/tree/master/A-Priori_SON)| Finding Frequent Itemsets: SON Algorithm by A-Priori algorithm in stage 1. The implementation include Savasere, Omiecinski, and Navathe (SON) algorithm as a class and an A-Priori algorithm in python class encapsulates all functions which implement by static functions to support Spark RDD to call. |[Source Code](https://github.com/Cheng-Lin-Li/Spark/blob/master/A-Priori_SON/A-Priori_SON.py)|
|[ALS with UV Decomposition](https://github.com/Cheng-Lin-Li/Spark/tree/master/ALS)|An implementation of UV Decomposition in Alternating Least Squares (ALS) Algorithm by Spark. The task is to modify the parallel implementation of ALS (alternating least squares) algorithm in Spark, so that it takes a utility matrix as the input and process by UV decomposition, and output the root-mean-square deviation (RMSE) into standard output or a file after each iteration. The code for the algorithm is als.py under the <spark-2.1.0 installation directory>/examples/src/main/python.|[Source Code](https://github.com/Cheng-Lin-Li/Spark/blob/master/ALS/ALS.py)|
-|[TF-IDF wiht K-Means](https://github.com/Cheng-Lin-Li/Spark/tree/master/TF-IDF_KMeans)|A similarity algorithm implementation of TF-IDF algorithm with cosin similarity implementation on spark platform as the measure of K-Means. The implementation of k-means is provided by Spark in examples/src/main/python/ml/kmeans_example.py. |[Source Code](https://github.com/Cheng-Lin-Li/Spark/blob/master/TF-IDF_KMeans/kmeans.py)|
+|[TF-IDF with K-Means](https://github.com/Cheng-Lin-Li/Spark/tree/master/TF-IDF_KMeans)|A similarity algorithm implementation of TF-IDF algorithm with cosin similarity implementation on spark platform as the measure of K-Means. The implementation of k-means is provided by Spark in examples/src/main/python/ml/kmeans_example.py. |[Source Code](https://github.com/Cheng-Lin-Li/Spark/blob/master/TF-IDF_KMeans/kmeans.py)|
|[Matrix Multiplication by Two Phases approach](https://github.com/Cheng-Lin-Li/Spark/tree/master/Matrix_Multiplication)|Matrix Multiplication: Two Phases approach to deal with huge matrix multiplication on spark platform|[Source Code](https://github.com/Cheng-Lin-Li/Spark/blob/master/Matrix_Multiplication/TwoPhase_Matrix_Multiplication.py)|
|[Minhash and Locality-Sensitive Hash (LSH)](https://github.com/Cheng-Lin-Li/Spark/tree/master/MinHash_LSH)|An implementation of MinHash and LSH to find similar set/users from their items/movies preference data. The implementation is finding similar sets/users by minhash and LSH in Spark platform to speed up the calculation - calculating the similarity by Jaccard similarity (or Jaccard coefficient). LSH: The implementation of Locality-Sensitive Hash in Spark. Based on Minhash functions to get the signature for each set/users and split these minhash functions by band. Each band will contain R minhash functions results|[Source Code](https://github.com/Cheng-Lin-Li/Spark/blob/master/MinHash_LSH/lshrec.py)|
|[UV decomposition](https://github.com/Cheng-Lin-Li/Spark/tree/master/UV_decomposition)| An implementation of UV decomposition algorithm. The implementation goal is to decompose user/product ratings matrix M into lower-rank matrices U and V such that the difference between M and UV is minimized. Root-mean-square error (RMSE) is adopted to measure the quality of decomposition| [Source Code](https://github.com/Cheng-Lin-Li/Spark/blob/master/UV_decomposition/UV.py)|
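
The MinHash and LSH row above describes computing a signature per set/user and splitting it into bands of R values each. The following is a minimal pure-Python sketch of that banding idea, not the repository's `lshrec.py` and not a Spark job. The MD5-seeded hash family, the function names (`minhash_signature`, `lsh_candidate_pairs`, `jaccard`) and the sample preference data are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def minhash_signature(items, num_hashes):
    """Return one min-hash value per seeded hash function (assumed MD5-based family)."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int(hashlib.md5(f"{seed}:{item}".encode()).hexdigest(), 16)
            for item in items
        ))
    return signature


def lsh_candidate_pairs(signatures, bands, rows):
    """Return user pairs whose signatures agree on all rows of at least one band."""
    candidates = set()
    for b in range(bands):
        buckets = defaultdict(list)
        for user, sig in signatures.items():
            # Users with an identical band slice land in the same bucket.
            buckets[tuple(sig[b * rows:(b + 1) * rows])].append(user)
        for users in buckets.values():
            candidates.update(combinations(sorted(users), 2))
    return candidates


def jaccard(a, b):
    """Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


# Hypothetical user -> movie-id preference data.
prefs = {"u1": {1, 2, 3, 4}, "u2": {1, 2, 3, 4}, "u3": {100, 200}}
bands, rows = 4, 2
signatures = {u: minhash_signature(s, bands * rows) for u, s in prefs.items()}
candidates = lsh_candidate_pairs(signatures, bands, rows)
```

Only the pairs in `candidates` would then need an exact `jaccard` check, which is where the speed-up over comparing every pair comes from.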
