README.md (+1 -1: 1 addition & 1 deletion)
@@ -13,7 +13,7 @@ Thanks,
|------|------|--------|
|[A-priori and SON](https://github.com/Cheng-Lin-Li/Spark/tree/master/A-Priori_SON)| Finding frequent itemsets: the SON algorithm, using the A-Priori algorithm in stage 1. The implementation includes the Savasere, Omiecinski, and Navathe (SON) algorithm as a class, and an A-Priori Python class whose functions are implemented as static functions so that Spark RDD operations can call them. |[Source Code](https://github.com/Cheng-Lin-Li/Spark/blob/master/A-Priori_SON/A-Priori_SON.py)|
|[ALS with UV Decomposition](https://github.com/Cheng-Lin-Li/Spark/tree/master/ALS)|An implementation of UV decomposition with the Alternating Least Squares (ALS) algorithm in Spark. The task is to modify Spark's parallel implementation of ALS so that it takes a utility matrix as input, processes it by UV decomposition, and writes the root-mean-square error (RMSE) to standard output or a file after each iteration. The code for the algorithm is als.py under the <spark-2.1.0 installation directory>/examples/src/main/python.|[Source Code](https://github.com/Cheng-Lin-Li/Spark/blob/master/ALS/ALS.py)|
-|[TF-IDF wiht K-Means](https://github.com/Cheng-Lin-Li/Spark/tree/master/TF-IDF_KMeans)|An implementation of TF-IDF with cosine similarity as the similarity measure for K-Means on the Spark platform. The implementation of k-means is provided by Spark in examples/src/main/python/ml/kmeans_example.py. |[Source Code](https://github.com/Cheng-Lin-Li/Spark/blob/master/TF-IDF_KMeans/kmeans.py)|
+|[TF-IDF with K-Means](https://github.com/Cheng-Lin-Li/Spark/tree/master/TF-IDF_KMeans)|An implementation of TF-IDF with cosine similarity as the similarity measure for K-Means on the Spark platform. The implementation of k-means is provided by Spark in examples/src/main/python/ml/kmeans_example.py. |[Source Code](https://github.com/Cheng-Lin-Li/Spark/blob/master/TF-IDF_KMeans/kmeans.py)|
|[Matrix Multiplication by Two Phases approach](https://github.com/Cheng-Lin-Li/Spark/tree/master/Matrix_Multiplication)|Matrix multiplication: a two-phase approach to multiplying huge matrices on the Spark platform.|[Source Code](https://github.com/Cheng-Lin-Li/Spark/blob/master/Matrix_Multiplication/TwoPhase_Matrix_Multiplication.py)|
|[MinHash and Locality-Sensitive Hashing (LSH)](https://github.com/Cheng-Lin-Li/Spark/tree/master/MinHash_LSH)|An implementation of MinHash and LSH to find similar sets/users from their item/movie preference data. The implementation finds similar sets/users with MinHash and LSH on the Spark platform to speed up the calculation; similarity is measured by Jaccard similarity (the Jaccard coefficient). LSH: an implementation of Locality-Sensitive Hashing in Spark. MinHash functions produce a signature for each set/user, and the signature is split into bands. Each band contains the results of R MinHash functions.|[Source Code](https://github.com/Cheng-Lin-Li/Spark/blob/master/MinHash_LSH/lshrec.py)|
|[UV decomposition](https://github.com/Cheng-Lin-Li/Spark/tree/master/UV_decomposition)| An implementation of the UV decomposition algorithm. The goal is to decompose a user/product ratings matrix M into lower-rank matrices U and V such that the difference between M and UV is minimized. Root-mean-square error (RMSE) is used to measure the quality of the decomposition.|[Source Code](https://github.com/Cheng-Lin-Li/Spark/blob/master/UV_decomposition/UV.py)|
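
The MinHash/LSH row describes splitting each signature into bands of R MinHash results. As a rough illustration of that banding idea only, here is a minimal pure-Python sketch. It is not taken from lshrec.py and does not use Spark; the function names, the seeded BLAKE2b hashing, and the sample data are all assumptions made for this example, and item sets are assumed to be non-empty.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def minhash_signature(items, num_hashes):
    """Return one min-hash value per seeded hash function (hypothetical helper).

    Assumes `items` is a non-empty collection of hashable, printable values.
    """
    signature = []
    for seed in range(num_hashes):
        # Each seed acts as a separate hash function over the items.
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def lsh_candidate_pairs(signatures, bands, rows):
    """Return pairs of keys whose signatures agree on every row of some band."""
    buckets = defaultdict(set)
    for key, sig in signatures.items():
        for b in range(bands):
            band = tuple(sig[b * rows:(b + 1) * rows])
            buckets[(b, band)].add(key)
    pairs = set()
    for members in buckets.values():
        for left, right in combinations(sorted(members), 2):
            pairs.add((left, right))
    return pairs


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Made-up preference data: users u1 and u2 like exactly the same movies.
prefs = {
    "u1": {"m1", "m2", "m3"},
    "u2": {"m1", "m2", "m3"},
    "u3": {"m7", "m8", "m9"},
}
sigs = {user: minhash_signature(movies, 6) for user, movies in prefs.items()}
candidates = lsh_candidate_pairs(sigs, bands=3, rows=2)
```

Only the candidate pairs then need an exact Jaccard comparison, which is where the speed-up comes from. With more rows per band, fewer dissimilar pairs collide; with more bands, fewer similar pairs are missed.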