An intuitive class that represents a Sparse Matrix by storing non-continuous tuple/data nodes and tuple connections in std::vectors rather than pointers. A SparseMatrix iterator is provided for looping through the tuples.
See below the API usage of the Sparse Matrix.
Built with C++ GCC 15.2.0 under Windows, could be easily altered for Linux, it does not use any windows API call.
#To compile and run main:
cd src
make compile
./a.exe
#include "SparseMatrix/SparseMatrix.h"
...
SparseMatrix A;
int I = 100;
int J = 100;
int K = 100;
int stride = 3;
int aSize = 0;
for (int i = I / 2; i < I; i+=stride) {
for (int j = J / 2; j < J; j++) {
for (int k = K / 2; k < K; k++) {
//if there was a success, insert returns true
assert ( A.insert({i, j, k}, (i + j + k + 1)) );
aSize ++;
}
}
}
//getValue, if the tuple doesn't exists, the method returns 0.0f
assert(A.getValue({1,1,1}) == 0.0f);
assert(A.getValue({I/2,J/2,K/2}) == 151.0f);
//size()
assert(A.size() == aSize);
//erase will return false if the tuple is not present in the Sparse Matrix
assert(A.erase({0,0,0}) == false);
//erase will return true if the tuple is present
assert(A.erase({I/2,J/2,K/2}) == true);
//clear will clean the internal state of the Sparse Matrix
A.clear();
assert ( A.size() == 0);
SparseMatrix B;
//B insertions ...
//unequality operator for the Sparse Matrix objects
assert ( A != B );
//equality operator for the Sparse Matrix objects
assert ( !(A == B) );
//multiplication and assignment
Sparse D = A * B;
#include "SparseMatrix/SparseMatrix.h"
...
A.insert({10, 5, 2}, 4);
A.insert({10, 5, 3}, 5);
A.insert({10, 5, 4}, 6);
A.insert({10, 5, 5}, 7);
A.insert({10, 5, 6}, 8);
A.insert({1, 1, 10}, 2);
A.insert({1, 2, 3}, 3);
A.insert({1, 1, 1}, 1);
A.insert({10, 6, 1}, 9);
A.insert({1, 3, 1}, 10);
SparseMatrixIterator iterator = A.iterator();
for (const SparseMatrixTuple& tuple : iterator) {
tuple.dump();
}
/* The output of the ranged for-loop will be sorted by tuple index as follows:
{1, 1, 1}:= 1
{1, 1, 10}:= 2
{1, 2, 3}:= 3
{1, 3, 1}:= 10
{10, 5, 2}:= 4
{10, 5, 3}:= 5
{10, 5, 4}:= 6
{10, 5, 5}:= 7
{10, 5, 6}:= 8
{10, 6, 1}:= 9
*/
Used a flatten generic tree implementation approach, where the Sparse Matrix uses two Arrays for the information representation; a Children array and a Node array. We used arrays in order to not use pointer logic where an object can be stored anywhere in memory (pointer chasing) while by using arrays we utilize contiguous memory and obtain better cache locality.
Yes, it uses an atomic flag for test and set semantics, locking the data structure allowing insertions/deletions by one thread at a time. In the todos, a lock-free implementation of the Sparse Matrix will be designed (or adjust the current design).
The Sparse Matrix uses two Arrays to maintain its state and values; the Children and the Nodes arrays. We use the nodes to hold values and tuple hop information, and the children to hold the tuple keys and node hop information. Thus, every new tuple (with 3 dimensions) will create maximum 3 new children and 3 new nodes to maintain its information. It is important to mention that only the Children array gets resorted based on the tuple keys that are inserted; if the tuple {5,2,3} is inserted, and then the tuple {5,1,3} is inserted, the tuple key that holds the information for child "1" will be inserted before child that holds information for "2".
The following scenarios will show you the insertion process; where all changes in the existing registrations are showed in "bold".
I. Initially:
| Children | ||
|---|---|---|
| I | Tuple Index | Node Index |
| Nodes | ||||
|---|---|---|---|---|
| I | Child Offset | NumChildren | Leaf | Value |
| 0 | 0 | 0 | 0 |
II. Insert: {1,5,6} = 11
| Children | ||
|---|---|---|
| I | Tuple Index | Node Index |
| 0 | 1 | 1 |
| 1 | 5 | 2 |
| 2 | 6 | 3 |
| Nodes | ||||
|---|---|---|---|---|
| I | Child Offset | NumChildren | Leaf | Value |
| 0 | 0 | 1 | 0 | |
| 1 | 1 | 1 | 0 | |
| 2 | 2 | 1 | 0 | |
| 3 | -1 | 0 | 1 | 11 |
Depicted Tuples: {1,5,6} = 11
III. Insert: {6,5,6} = 13
| Children | |||
|---|---|---|---|
| I | Tuple Index | Node Index | |
| 0 | 1 | 1 | |
| 1 | 6 | 4 | <-INSERT |
| 2 | 5 | 2 | |
| 3 | 6 | 3 | |
| 4 | 5 | 5 | |
| 5 | 6 | 6 |
| Nodes | ||||
|---|---|---|---|---|
| I | Child Offset | NumChildren | Leaf | Value |
| 0 | 0 | 2 | 0 | |
| 1 | 2 | 1 | 0 | |
| 2 | 3 | 1 | 0 | |
| 3 | -1 | 0 | 1 | 11 |
| 4 | 4 | 1 | 0 | |
| 5 | 5 | 1 | 0 | |
| 6 | -1 | 0 | 1 | 13 |
Depicted Tuples: {1,5,6} = 11, {6,5,6} = 13
IV. Insert: {4,5,6} = 133
| Children | |||
|---|---|---|---|
| I | Tuple Index | Node Index | |
| 0 | 1 | 1 | |
| 1 | 4 | 7 | <-INSERT |
| 2 | 6 | 4 | |
| 3 | 5 | 2 | |
| 4 | 6 | 3 | |
| 5 | 5 | 5 | |
| 6 | 6 | 6 | |
| 7 | 5 | 8 | |
| 8 | 6 | 9 |
| Nodes | ||||
|---|---|---|---|---|
| I | Child Offset | NumChildren | Leaf | Value |
| 0 | 0 | 3 | 0 | |
| 1 | 3 | 1 | 0 | |
| 2 | 4 | 1 | 0 | |
| 3 | -1 | 0 | 1 | 11 |
| 4 | 5 | 1 | 0 | |
| 5 | 6 | 1 | 0 | |
| 6 | -1 | 0 | 1 | 13 |
| 7 | 7 | 1 | 0 | |
| 8 | 8 | 1 | 0 | |
| 9 | -1 | 0 | 1 | 133 |
Depicted Tuples: {1,5,6} = 11, {6,5,6} = 13, {4,5,6} = 133
V. Insert: {5,5,6} = 143
| Children | |||
|---|---|---|---|
| I | Tuple Index | Node Index | |
| 0 | 1 | 1 | |
| 1 | 4 | 7 | |
| 2 | 5 | 10 | <-INSERT |
| 3 | 6 | 4 | |
| 4 | 5 | 2 | |
| 5 | 6 | 3 | |
| 6 | 5 | 5 | |
| 7 | 6 | 6 | |
| 8 | 5 | 8 | |
| 9 | 6 | 9 | |
| 10 | 5 | 11 | |
| 11 | 6 | 12 |
| Nodes | ||||
|---|---|---|---|---|
| I | Child Offset | NumChildren | Leaf | Value |
| 0 | 0 | 4 | 0 | |
| 1 | 4 | 1 | 0 | |
| 2 | 5 | 1 | 0 | |
| 3 | -1 | 0 | 1 | 11 |
| 4 | 6 | 1 | 0 | |
| 5 | 7 | 1 | 0 | |
| 6 | -1 | 0 | 1 | 13 |
| 7 | 8 | 1 | 0 | |
| 8 | 9 | 1 | 0 | |
| 9 | -1 | 0 | 1 | 133 |
| 10 | 10 | 1 | 0 | |
| 11 | 11 | 1 | 0 | |
| 12 | -1 | 0 | 1 | 143 |
Depicted Tuples: {1,5,6} = 11, {6,5,6} = 13, {4,5,6} = 133, {5,5,6} = 143
VI. Insert: {0,5,6} = 155
| Children | |||
|---|---|---|---|
| I | Tuple Index | Node Index | |
| 0 | 0 | 13 | <-INSERT |
| 1 | 1 | 1 | |
| 2 | 4 | 7 | |
| 3 | 5 | 10 | |
| 4 | 6 | 4 | |
| 5 | 5 | 2 | |
| 6 | 6 | 3 | |
| 7 | 5 | 5 | |
| 8 | 6 | 6 | |
| 9 | 5 | 8 | |
| 10 | 6 | 9 | |
| 11 | 5 | 11 | |
| 12 | 6 | 12 | |
| 13 | 5 | 14 | |
| 14 | 6 | 15 |
| Nodes | ||||
|---|---|---|---|---|
| I | Child Offset | NumChildren | Leaf | Value |
| 0 | 0 | 5 | 0 | |
| 1 | 5 | 1 | 0 | |
| 2 | 6 | 1 | 0 | |
| 3 | -1 | 0 | 1 | 11 |
| 4 | 7 | 1 | 0 | |
| 5 | 8 | 1 | 0 | |
| 6 | -1 | 0 | 1 | 13 |
| 7 | 9 | 1 | 0 | |
| 8 | 10 | 1 | 0 | |
| 9 | -1 | 0 | 1 | 133 |
| 10 | 11 | 1 | 0 | |
| 11 | 12 | 1 | 0 | |
| 12 | -1 | 0 | 1 | 143 |
| 13 | 13 | 1 | 0 | |
| 14 | 14 | 1 | 0 | |
| 15 | -1 | 0 | 1 | 155 |
Depicted Tuples: {1,5,6} = 11, {6,5,6} = 13, {4,5,6} = 133, {5,5,6} = 143, {0,5,6} = 155
The multiplication operation was implemented in 5 different ways:
- Tuple Iteration
- Late Sequential Tuple Comparison Iteration
- Ranged Threaded Tree Multiplication
- Offset Tree Multiplication
- Blindly Threaded Tree Multiplication
The general idea is that:
- we would like to loop over the indices of the Sparse Matrix that has less size,
- then retieve the tuples for each matrix and compare the tuples.
- if the tuples are equal we perform the multiplication of their values,
- otherwise, we compare and move the iterators accordingly; if Ta was larger than Tb, we move Itb++ otherwise we move Ita++ further, by keeping in mind if the iterators reached at their end.
| Name | Steps per Method |
|---|---|
| Tuple Iteration | 1. Retieve the tuples for each matrix and compare the tuples. |
| 2. if the tuples are equal we perform the multiplication of their values, | |
| 3. otherwise, we compare and move the iterators accordingly; if Ta was larger than Tb, we move Itb++ otherwise | |
| we move Ita++ further, by keeping in mind if the iterators reached at their end. | |
| Late Sequential Tuple Comparison Iteration | 1. Retieve the tuples for each matrix and compare the tuples. |
| 2. First iterate to the next pair tuples, then compare. | |
| Offset Tree Multiplication | 1. Find common indices per tuple position, starting from 0. |
| 2. If there are no common indices, the multiplication result is 0. | |
| 3. Obtain the common indices and continue to the next level (tuple position), then repeat until you find the value (isLeaf == true). | |
| 4. Perform insert on the common tuple and assign the multiplication result. | |
| Blindly Threaded Tree Multiplication | 1.Find common indices per tuple position, starting from 0. |
| 2. If there are no common indices, the multiplication result is 0. | |
| 3. For each common index, start a thread that reduces the common index tree to the multiplication result, by recursively traverses to common indices inside both Trees. | |
| 4. The thread performs an insert operation on the common tuple and assign the multiplication result. | |
| Ranged Threaded Tree Multiplication | 1.Find common indices per tuple position, starting from 0. |
| 2. If there are no common indices, the multiplication result is 0. | |
| 3. For each common index, store the index in an array that has max limit of common indices. | |
| 4. If the array limit is surpassed, then start a thread that reduces the common index tree to the multiplication result, by recursively traverse to common indices inside both Trees. The started thread operates in the array of common indices-offsets. |
The default operator* for the Sparse Matrix is the Ranged Threaded Tree Multiplication algorithm, because it is the fastest.
The remaining multiplication method to be implemented is by using our Sparse Matrices within CUDA ; that was also the reason of the existence of this whole implementation, to measure all implementations against the CUDA C++ one and see who is the winner.