penzn
left a comment
Sorry, came across by accident.
It looks like the SIMD mul technically processes only one element. For matrices, it is usually easier to write a vector dot product first and then build the rest on top of it.
Additionally, floating-point arithmetic would be a more interesting case, since floating-point operations are more expensive.
examples/rust/simd/src/lib.rs
Outdated
    fn mul(a: u64, b: u64) -> u64 {
        let va: v128 = u64x2_splat(a);
        let vb: v128 = u64x2_splat(b);
        let c = u64x2_extract_lane::<1>(i64x2_mul(va, vb));
I think this is technically a scalar multiplication - it fills all lanes with the same value and then extracts just one value out of the result.
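To make the difference concrete, here is a minimal portable sketch. It assumes a plain two-element array as a stand-in for `v128`, so it runs on any target; `U64x2`, `splat`, `mul_lanes`, `splat_mul`, and `pair_mul` are hypothetical names for illustration, not part of the PR or of `core::arch::wasm32`.

```rust
/// Two 64-bit lanes, standing in for a `v128` value (hypothetical helper type).
type U64x2 = [u64; 2];

/// Fill both lanes with the same value, mirroring what `u64x2_splat` does.
fn splat(x: u64) -> U64x2 {
    [x, x]
}

/// Lane-wise wrapping multiply, mirroring what `i64x2_mul` does per lane.
fn mul_lanes(a: U64x2, b: U64x2) -> U64x2 {
    [a[0].wrapping_mul(b[0]), a[1].wrapping_mul(b[1])]
}

/// The pattern from the reviewed snippet: splat both operands, keep one lane.
/// Both lanes hold the same product, so this is a scalar multiply in disguise.
fn splat_mul(a: u64, b: u64) -> u64 {
    mul_lanes(splat(a), splat(b))[1]
}

/// Two independent products from a single lane-wise multiply.
/// This is where the SIMD instruction does useful work in every lane.
fn pair_mul(a: U64x2, b: U64x2) -> U64x2 {
    mul_lanes(a, b)
}
```

For example, `splat_mul(6, 7)` gives the same result as `6 * 7`, while `pair_mul([2, 3], [10, 100])` produces two distinct products from one multiply.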
Thank you so much for your review. I wasn't familiar with SIMD and made a mistake. I would appreciate feedback on the new code.
examples/rust/simd/src/lib.rs
Outdated
    fn dot(a: Vec<u64>, b: Vec<u64>) -> u64 {
        assert!(a.len() == b.len());
        let mut sum: u64 = 0;
        for i in 0..a.len() {
            sum += Self::mul(a[i], b[i]);
        }
        sum
    }
The dot product is the smallest unit of work in matrix multiplication that can be implemented in SIMD. It usually works by taking N elements from the first array and N from the second, multiplying them via SIMD, then adding the N results to an intermediate vector sum (N is the number of lanes). The lanes of the intermediate sum are then added up at the end. For input sizes not divisible by N, the remainder needs to be calculated manually.
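The scheme described above can be sketched roughly as follows. This is a portable illustration, not the PR's code: the lanes are emulated with a two-element array so the snippet runs on any target, `dot_u64` and `LANES` are hypothetical names, and wrapping arithmetic is assumed so that overflow behaves the same as it would in integer SIMD lanes.

```rust
/// Number of 64-bit lanes in a 128-bit vector.
const LANES: usize = 2;

/// Dot product of two equal-length slices, processed LANES elements at a time.
fn dot_u64(a: &[u64], b: &[u64]) -> u64 {
    assert_eq!(a.len(), b.len());

    // Intermediate vector sum: one running total per lane.
    let mut acc = [0u64; LANES];

    let mut chunks_a = a.chunks_exact(LANES);
    let mut chunks_b = b.chunks_exact(LANES);

    // Main loop: each iteration corresponds to one lane-wise multiply
    // followed by one lane-wise add (e.g. `i64x2_mul` and `i64x2_add` on wasm32).
    for (ca, cb) in chunks_a.by_ref().zip(chunks_b.by_ref()) {
        for lane in 0..LANES {
            acc[lane] = acc[lane].wrapping_add(ca[lane].wrapping_mul(cb[lane]));
        }
    }

    // Horizontal reduction: add the lanes of the intermediate sum together.
    let mut sum = acc.iter().fold(0u64, |s, &x| s.wrapping_add(x));

    // Scalar tail: elements left over when the length is not divisible by LANES.
    for (&x, &y) in chunks_a.remainder().iter().zip(chunks_b.remainder()) {
        sum = sum.wrapping_add(x.wrapping_mul(y));
    }
    sum
}
```

Keeping a per-lane accumulator and reducing it once at the end avoids a horizontal add inside the loop, which is the main reason the intermediate sum is kept as a vector.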
Updated. Please let me know if there is anything I could do better. I assume the floating-point implementation should be similar (please let me know if it isn't), so I will update the floating-point examples once this is OK :)
    u64x2-scalar-mul: func(a: u64, b: list<u64>) -> list<u64>
    u64x2-dot: func(a: list<u64>, b: list<u64>) -> u64
    u64x2-inner: func(a: list<u64>, b: list<u64>) -> list<u64>
    u64x2-mat-mul: func(a: list<list<u64>>, b: list<list<u64>>) -> list<list<u64>>
Rather than using list, you should use SingleStore-compatible packed 64-bit vectors:
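The exact layout is not spelled out in this comment, so the following is only one possible reading: a packed vector as a byte buffer holding each `u64` as 8 little-endian bytes. Both the layout and the helper names `pack_u64` and `unpack_u64` are assumptions for illustration, not something taken from the PR.

```rust
/// Pack a slice of u64 values into a contiguous byte buffer.
/// Assumption: each value is stored as 8 little-endian bytes.
fn pack_u64(values: &[u64]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Unpack a byte buffer produced by `pack_u64`.
/// Returns `None` if the length is not a multiple of 8 bytes.
fn unpack_u64(bytes: &[u8]) -> Option<Vec<u64>> {
    if bytes.len() % 8 != 0 {
        return None;
    }
    let values = bytes
        .chunks_exact(8)
        .map(|chunk| {
            // Copy the 8-byte chunk into a fixed-size array for conversion.
            let mut buf = [0u8; 8];
            buf.copy_from_slice(chunk);
            u64::from_le_bytes(buf)
        })
        .collect();
    Some(values)
}
```

Under this reading, the interface functions would exchange byte buffers instead of `list<u64>`, and the SIMD code would unpack them (or load lanes directly from the bytes) before multiplying.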
    use core::arch::wasm32::*;

    impl simd::Simd for Simd {
        fn u64x2_scalar_mul(a: u64, b: Vec<u64>) -> Vec<u64> {
Please add docstrings to each function explaining its purpose.
Implements #11