Commit f826913
Implement AVX2 distance computations (#261)
## Scoring functions

This PR contains implementations of distance computations: L2 (sum of squares), inner product, and cosine. For each type of distance computation, naive, 4x unrolled, AVX, and BLAS implementations are provided. In addition, for each of the preceding, versions that compute the distance over a specified view of the two vectors are also provided.

The PR implements AVX versions of the L2 distance functions and inner product distance functions, for all four combinations of distance between `float` and `uint8_t` vectors. (Note the `uint8_t` requirement can be relaxed to any type that can be converted to a `float`.)

In anticipation of parameterization of query functions by distance functions, the different families of computations are separated into their own header files, all contained in `detail/scoring`. Function objects are defined in `scoring.h` to dispatch to the actual distance function implementations. For architectures that support it, AVX2 is the default.

### Performance

The AVX implementations of L2 distance computations are the default and should result in a substantial performance boost over the existing implementations. Measurements I have made show an 8X to 10X speedup of L2 distance computation simulating "sift small" -- i.e., computing the pair-wise distances between a set of 10,000 vectors and a set of 100 vectors. For machines that do not have AVX2, a simple unrolled version is used instead. (These are dispatched through the `l2_distance` function.)

### Naive implementations

The "naive" implementations are just simple loops over two vectors, e.g.,

```c++
template <feature_vector V, feature_vector W>
  requires std::same_as<typename V::value_type, float> &&
      std::same_as<typename W::value_type, float>
inline float naive_sum_of_squares(const V& a, const W& b) {
  size_t size_a = size(a);
  float sum = 0.0;
  for (size_t i = 0; i < size_a; ++i) {
    float diff = a[i] - b[i];
    sum += diff * diff;
  }
  return sum;
}
```

All of the distance functions are templated on the types of the vectors they accept, and the vectors are required to meet the requirements of `feature_vector`. Because of the need to cast non-`float` elements, there are four concept-based overloads for each function, depending on the `value_type` of each vector. The overloads are for `float`-`float`, `float`-`uint8_t`, `uint8_t`-`float`, and `uint8_t`-`uint8_t`. The overload for `float`-`uint8_t`:

```c++
template <feature_vector V, feature_vector W>
  requires std::same_as<typename V::value_type, float> &&
      std::same_as<typename W::value_type, uint8_t>
inline float naive_sum_of_squares(const V& a, const W& b) {
  size_t size_a = size(a);
  float sum = 0.0;
  for (size_t i = 0; i < size_a; ++i) {
    float diff = a[i] - (float)b[i];
    sum += diff * diff;
  }
  return sum;
}
```
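For illustration, overload resolution then picks the right variant purely from the element types. A minimal usage sketch, assuming for this example that `std::vector` models `feature_vector` (the actual concept lives in the library's headers):

```c++
#include <cstdint>
#include <vector>

std::vector<float>   x{1.0f, 2.0f, 3.0f};
std::vector<float>   y{4.0f, 5.0f, 6.0f};
std::vector<uint8_t> z{4, 5, 6};

// Each call selects a different concept-based overload:
auto d0 = naive_sum_of_squares(x, y);  // float-float overload:   27.0f
auto d1 = naive_sum_of_squares(x, z);  // float-uint8_t overload: 27.0f
auto d2 = naive_sum_of_squares(z, x);  // uint8_t-float overload: 27.0f
```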
### Unrolled

There are unrolled versions of the distance functions, which use a very basic 4x unrolling to provide a moderate performance optimization.

```c++
template <feature_vector V, feature_vector W>
  requires std::same_as<typename V::value_type, float> &&
      std::same_as<typename W::value_type, float>
inline float unroll4_sum_of_squares(const V& a, const W& b) {
  size_t size_a = size(a);
  size_t stop = 4 * (size_a / 4);
  float sum = 0.0;
  for (size_t i = 0; i < stop; i += 4) {
    float diff0 = a[i + 0] - b[i + 0];
    float diff1 = a[i + 1] - b[i + 1];
    float diff2 = a[i + 2] - b[i + 2];
    float diff3 = a[i + 3] - b[i + 3];
    sum += diff0 * diff0 + diff1 * diff1 + diff2 * diff2 + diff3 * diff3;
  }
  // Clean up
  for (size_t i = stop; i < size_a; ++i) {
    float diff0 = a[i + 0] - b[i + 0];
    sum += diff0 * diff0;
  }
  return sum;
}
```

### Distance over a view

Overloads of the distance functions are also provided to compute distance over just a (contiguous) portion of two vectors.

```c++
template <feature_vector V, feature_vector W>
  requires std::same_as<typename V::value_type, float> &&
      std::same_as<typename W::value_type, float>
inline float naive_sum_of_squares(
    const V& a, const W& b, size_t start, size_t stop) {
  float sum = 0.0;
  for (size_t i = start; i < stop; ++i) {
    float diff = a[i] - b[i];
    sum += diff * diff;
  }
  return sum;
}
```

### AVX

The non-view L2 distance functions have been implemented with AVX2 instructions, which can provide a substantial performance improvement (8X to 10X) over plain C++. Using intrinsics, the basic body of the distance function is quite straightforward:

```c++
for (size_t i = start; i < stop; i += 8) {
  // Load 8 floats
  __m256 vec_a = _mm256_loadu_ps(a_ptr + i + 0);
  __m256 vec_b = _mm256_loadu_ps(b_ptr + i + 0);

  // Compute the difference
  __m256 diff = _mm256_sub_ps(vec_a, vec_b);

  // Square and accumulate
  vec_sum = _mm256_fmadd_ps(diff, diff, vec_sum);
}
```

This loads 8 `float`s from each of vectors `a` and `b` into 256-bit registers and computes the squared differences of 8 pairs of floats in (SIMD) parallel. The 8 partial sums in `vec_sum` then need to be reduced to a single `float`:

```c++
// 8 to 4
__m128 lo = _mm256_castps256_ps128(vec_sum);
__m128 hi = _mm256_extractf128_ps(vec_sum, 1);
__m128 combined = _mm_add_ps(lo, hi);

// 4 to 2
combined = _mm_hadd_ps(combined, combined);

// 2 to 1
combined = _mm_hadd_ps(combined, combined);
float sum = _mm_cvtss_f32(combined);
```

### Function objects

To enable different distance functions to be passed as parameters to queries, the file `scoring.h` defines function objects that wrap the raw functions:

```c++
namespace _l2_distance {
struct sum_of_squares_distance {
  template <feature_vector V, feature_vector U>
  constexpr inline float operator()(const V& a, const U& b) const {
    return avx2_sum_of_squares(a, b);
  }
};
}  // namespace _l2_distance

using sum_of_squares_distance = _l2_distance::sum_of_squares_distance;
inline constexpr auto l2_distance = _l2_distance::sum_of_squares_distance{};
```

With these definitions, users can compute L2 distance by calling `l2_distance()`:

```c++
auto dist = l2_distance(x, y);
```

Users can also pass the type `sum_of_squares_distance` wherever a distance function is required as a template argument, for instance.

### Implementation status

| Metric        | Naive | 4x unrolled | AVX | BLAS |
|---------------|-------|-------------|-----|------|
| L2            | Y     | Y           | Y   | N    |
| Dot           | Y     | Y           | Y   | N    |
| Cosine        | N     | N           | N   | N    |
| L2 w/view     | Y     | Y           | N   | N    |
| Dot w/view    | N     | N           | N   | N    |
| Cosine w/view | N     | N           | N   | N    |

NOTE: Cosine is just dot using normalized vectors. One approach to computing cosine similarity is to first normalize the vectors, rather than normalizing them on the fly.
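Along the lines of that note, a hypothetical naive cosine implementation (a sketch only, not part of this PR) could normalize on the fly:

```c++
#include <cmath>

// Hypothetical sketch, not part of this PR: cosine distance as an
// inner product over vectors normalized on the fly.
template <feature_vector V, feature_vector W>
  requires std::same_as<typename V::value_type, float> &&
      std::same_as<typename W::value_type, float>
inline float naive_cosine(const V& a, const W& b) {
  size_t size_a = size(a);
  float dot = 0.0f;
  float norm2_a = 0.0f;
  float norm2_b = 0.0f;
  for (size_t i = 0; i < size_a; ++i) {
    dot += a[i] * b[i];
    norm2_a += a[i] * a[i];
    norm2_b += b[i] * b[i];
  }
  // Cosine distance = 1 - cosine similarity
  // (an all-zero vector would need a guard against division by zero)
  return 1.0f - dot / std::sqrt(norm2_a * norm2_b);
}
```

Pre-normalizing the dataset vectors once, as the note suggests, would reduce this to the existing inner-product code at query time.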
1 parent 7b2b3b7 commit f826913

28 files changed (+2818, −299 lines)

src/include/cpos.h

Lines changed: 1 addition & 0 deletions
```diff
@@ -35,6 +35,7 @@
 #define TILEDB_CPOS_H
 
 #include <concepts>
+#include <vector>
 
 template <class T>
 concept semi_integral = std::integral<T> && !std::same_as<T, bool>;
```

src/include/detail/flat/qv.h

Lines changed: 10 additions & 10 deletions
```diff
@@ -95,7 +95,7 @@ template <feature_vector_array DB, query_vector_array Q>
     Vector<score_type> scores(size_db);
 
     for (size_t i = 0; i < size_db; ++i) {
-      scores[i] = L2(q_vec, db[i]);
+      scores[i] = l2_distance(q_vec, db[i]);
     }
     get_top_k_from_scores(scores, top_k[j], k_nn);
   });
@@ -309,10 +309,10 @@ auto qv_query_heap_tiled(
           std::min<size_t>(num_vectors(db), 2 * (num_vectors(db) / 2));
 
       for (size_t kp = 0; kp < kstop; kp += 2) {
-        auto score_00 = L2(q_vec_0, db[kp + 0]);
-        auto score_01 = L2(q_vec_0, db[kp + 1]);
-        auto score_10 = L2(q_vec_1, db[kp + 0]);
-        auto score_11 = L2(q_vec_1, db[kp + 1]);
+        auto score_00 = l2_distance(q_vec_0, db[kp + 0]);
+        auto score_01 = l2_distance(q_vec_0, db[kp + 1]);
+        auto score_10 = l2_distance(q_vec_1, db[kp + 0]);
+        auto score_11 = l2_distance(q_vec_1, db[kp + 1]);
 
         if constexpr (std::is_same_v<T, with_ids>) {
           min_scores[j0].insert(score_00, ids[kp + 0]);
@@ -333,8 +333,8 @@ auto qv_query_heap_tiled(
        * Cleanup the last iteration(s) of k
        */
       for (size_t kp = kstop; kp < num_vectors(db); ++kp) {
-        auto score_00 = L2(q_vec_0, db[kp + 0]);
-        auto score_10 = L2(q_vec_1, db[kp + 0]);
+        auto score_00 = l2_distance(q_vec_0, db[kp + 0]);
+        auto score_10 = l2_distance(q_vec_1, db[kp + 0]);
 
         if constexpr (std::is_same_v<T, with_ids>) {
           min_scores[j0].insert(score_00, ids[kp + 0]);
@@ -359,8 +359,8 @@ auto qv_query_heap_tiled(
       auto kstop =
           std::min<size_t>(num_vectors(db), 2 * (num_vectors(db) / 2));
       for (size_t kp = 0; kp < kstop; kp += 2) {
-        auto score_00 = L2(q_vec_0, db[kp + 0]);
-        auto score_01 = L2(q_vec_0, db[kp + 1]);
+        auto score_00 = l2_distance(q_vec_0, db[kp + 0]);
+        auto score_01 = l2_distance(q_vec_0, db[kp + 1]);
 
         if constexpr (std::is_same_v<T, with_ids>) {
           min_scores[j0].insert(score_00, ids[kp + 0]);
@@ -374,7 +374,7 @@ auto qv_query_heap_tiled(
         }
       }
       for (size_t kp = kstop; kp < num_vectors(db); ++kp) {
-        auto score_00 = L2(q_vec_0, db[kp + 0]);
+        auto score_00 = l2_distance(q_vec_0, db[kp + 0]);
         if constexpr (std::is_same_v<T, with_ids>) {
           min_scores[j0].insert(score_00, ids[kp + 0]);
         } else if constexpr (std::is_same_v<T, without_ids>) {
```

src/include/detail/flat/vq.h

Lines changed: 3 additions & 3 deletions
```diff
@@ -88,7 +88,7 @@ auto vq_query_heap(
       [&, size_q](auto&& db_vec, auto&& n = 0, auto&& i = 0) {
         size_t index = i + col_offset(db);
         for (size_t j = 0; j < size_q; ++j) {
-          auto score = L2(q[j], db_vec);
+          auto score = l2_distance(q[j], db_vec);
           if constexpr (std::is_same_v<T, with_ids>) {
             scores[n][j].insert(score, ids[index]);
           } else if constexpr (std::is_same_v<T, without_ids>) {
@@ -177,7 +177,7 @@ auto vq_query_heap_tiled(
         std::remove_cvref_t<decltype(i)> index = 0;
         index = i + col_offset(db);
         for (size_t j = 0; j < size_q; ++j) {
-          auto score = L2(q[j], db_vec);
+          auto score = l2_distance(q[j], db_vec);
           if constexpr (std::is_same_v<T, with_ids>) {
             scores[n][j].insert(score, ids[index]);
           } else if constexpr (std::is_same_v<T, without_ids>) {
@@ -247,7 +247,7 @@ auto vq_query_heap_2(
         index = i + col_offset(db);
 
         for (size_t j = 0; j < size_q; ++j) {
-          auto score = L2(q[j], db_vec);
+          auto score = l2_distance(q[j], db_vec);
           if constexpr (std::is_same_v<T, with_ids>) {
             scores[n][j].insert(score, ids[index]);
           } else if constexpr (std::is_same_v<T, without_ids>) {
```

src/include/detail/graph/diskann.h

Lines changed: 1 addition & 1 deletion
```diff
@@ -138,7 +138,7 @@ auto read_diskann_mem_index_with_scores(
     for (size_t i = 0; i < num_neighbors; ++i) {
       uint32_t id;
       binary_file.read((char*)&id, 4);
-      g.add_edge(node, id, sum_of_squares(x[node], x[id]));
+      g.add_edge(node, id, l2_distance(x[node], x[id]));
     }
     // @todo ??? Is this right ???
     binary_file.seekg(max_degree - num_neighbors, std::ios_base::cur);
```

src/include/detail/ivf/qv.h

Lines changed: 48 additions & 30 deletions
```diff
@@ -136,7 +136,7 @@ auto qv_query_heap_infinite_ram(
         size_t stop = indices[top_centroids(p, j) + 1];
 
         for (size_t i = start; i < stop; ++i) {
-          auto score = L2(q_vec /*q[j]*/, partitioned_vectors[i]);
+          auto score = l2_distance(q_vec /*q[j]*/, partitioned_vectors[i]);
           min_scores[j].insert(score, partitioned_ids[i]);
         }
       }
@@ -229,7 +229,7 @@ auto nuv_query_heap_infinite_ram(
         // for (size_t k = start; k < stop; ++k) {
         //   auto kp = k - partitioned_vectors.col_offset();
         for (size_t kp = start; kp < stop; ++kp) {
-          auto score = L2(q_vec, partitioned_vectors[kp]);
+          auto score = l2_distance(q_vec, partitioned_vectors[kp]);
 
           // @todo any performance with apparent extra indirection?
           // (Compiler should do the right thing, but...)
@@ -337,10 +337,14 @@ auto nuv_query_heap_infinite_ram_reg_blocked(
         auto q_vec_1 = query[j1];
 
         for (size_t kp = start; kp < kstop; kp += 2) {
-          auto score_00 = L2(q_vec_0, partitioned_vectors[kp + 0]);
-          auto score_01 = L2(q_vec_0, partitioned_vectors[kp + 1]);
-          auto score_10 = L2(q_vec_1, partitioned_vectors[kp + 0]);
-          auto score_11 = L2(q_vec_1, partitioned_vectors[kp + 1]);
+          auto score_00 =
+              l2_distance(q_vec_0, partitioned_vectors[kp + 0]);
+          auto score_01 =
+              l2_distance(q_vec_0, partitioned_vectors[kp + 1]);
+          auto score_10 =
+              l2_distance(q_vec_1, partitioned_vectors[kp + 0]);
+          auto score_11 =
+              l2_distance(q_vec_1, partitioned_vectors[kp + 1]);
 
           min_scores[n][j0].insert(score_00, partitioned_ids[kp + 0]);
           min_scores[n][j0].insert(score_01, partitioned_ids[kp + 1]);
@@ -352,8 +356,10 @@ auto nuv_query_heap_infinite_ram_reg_blocked(
          * Cleanup the last iteration(s) of k
          */
         for (size_t kp = kstop; kp < stop; ++kp) {
-          auto score_00 = L2(q_vec_0, partitioned_vectors[kp + 0]);
-          auto score_10 = L2(q_vec_1, partitioned_vectors[kp + 0]);
+          auto score_00 =
+              l2_distance(q_vec_0, partitioned_vectors[kp + 0]);
+          auto score_10 =
+              l2_distance(q_vec_1, partitioned_vectors[kp + 0]);
           min_scores[n][j0].insert(score_00, partitioned_ids[kp + 0]);
           min_scores[n][j1].insert(score_10, partitioned_ids[kp + 0]);
         }
@@ -367,14 +373,17 @@ auto nuv_query_heap_infinite_ram_reg_blocked(
         auto q_vec_0 = query[j0];
 
         for (size_t kp = start; kp < kstop; kp += 2) {
-          auto score_00 = L2(q_vec_0, partitioned_vectors[kp + 0]);
-          auto score_01 = L2(q_vec_0, partitioned_vectors[kp + 1]);
+          auto score_00 =
+              l2_distance(q_vec_0, partitioned_vectors[kp + 0]);
+          auto score_01 =
+              l2_distance(q_vec_0, partitioned_vectors[kp + 1]);
 
           min_scores[n][j0].insert(score_00, partitioned_ids[kp + 0]);
           min_scores[n][j0].insert(score_01, partitioned_ids[kp + 1]);
         }
         for (size_t kp = kstop; kp < stop; ++kp) {
-          auto score_00 = L2(q_vec_0, partitioned_vectors[kp + 0]);
+          auto score_00 =
+              l2_distance(q_vec_0, partitioned_vectors[kp + 0]);
           min_scores[n][j0].insert(score_00, partitioned_ids[kp + 0]);
         }
       }
@@ -509,7 +518,7 @@ auto nuv_query_heap_finite_ram(
          * Apply the query to the partition.
          */
         for (size_t kp = start; kp < stop; ++kp) {
-          auto score = L2(q_vec, partitioned_vectors[kp]);
+          auto score = l2_distance(q_vec, partitioned_vectors[kp]);
 
           // @todo any performance with apparent extra indirection?
           min_scores[n][j].insert(
@@ -644,10 +653,14 @@ auto nuv_query_heap_finite_ram_reg_blocked(
         auto q_vec_1 = query[j1];
 
         for (size_t kp = start; kp < kstop; kp += 2) {
-          auto score_00 = L2(q_vec_0, partitioned_vectors[kp + 0]);
-          auto score_01 = L2(q_vec_0, partitioned_vectors[kp + 1]);
-          auto score_10 = L2(q_vec_1, partitioned_vectors[kp + 0]);
-          auto score_11 = L2(q_vec_1, partitioned_vectors[kp + 1]);
+          auto score_00 =
+              l2_distance(q_vec_0, partitioned_vectors[kp + 0]);
+          auto score_01 =
+              l2_distance(q_vec_0, partitioned_vectors[kp + 1]);
+          auto score_10 =
+              l2_distance(q_vec_1, partitioned_vectors[kp + 0]);
+          auto score_11 =
+              l2_distance(q_vec_1, partitioned_vectors[kp + 1]);
 
           min_scores[n][j0].insert(
               score_00, partitioned_vectors.ids()[kp + 0]);
@@ -663,8 +676,10 @@ auto nuv_query_heap_finite_ram_reg_blocked(
          * Cleanup the last iteration(s) of k
          */
         for (size_t kp = kstop; kp < stop; ++kp) {
-          auto score_00 = L2(q_vec_0, partitioned_vectors[kp + 0]);
-          auto score_10 = L2(q_vec_1, partitioned_vectors[kp + 0]);
+          auto score_00 =
+              l2_distance(q_vec_0, partitioned_vectors[kp + 0]);
+          auto score_10 =
+              l2_distance(q_vec_1, partitioned_vectors[kp + 0]);
           min_scores[n][j0].insert(
               score_00, partitioned_vectors.ids()[kp + 0]);
           min_scores[n][j1].insert(
@@ -680,16 +695,19 @@ auto nuv_query_heap_finite_ram_reg_blocked(
         auto q_vec_0 = query[j0];
 
         for (size_t kp = start; kp < kstop; kp += 2) {
-          auto score_00 = L2(q_vec_0, partitioned_vectors[kp + 0]);
-          auto score_01 = L2(q_vec_0, partitioned_vectors[kp + 1]);
+          auto score_00 =
+              l2_distance(q_vec_0, partitioned_vectors[kp + 0]);
+          auto score_01 =
+              l2_distance(q_vec_0, partitioned_vectors[kp + 1]);
 
           min_scores[n][j0].insert(
               score_00, partitioned_vectors.ids()[kp + 0]);
           min_scores[n][j0].insert(
               score_01, partitioned_vectors.ids()[kp + 1]);
         }
         for (size_t kp = kstop; kp < stop; ++kp) {
-          auto score_00 = L2(q_vec_0, partitioned_vectors[kp + 0]);
+          auto score_00 =
+              l2_distance(q_vec_0, partitioned_vectors[kp + 0]);
           min_scores[n][j0].insert(
               score_00, partitioned_vectors.ids()[kp + 0]);
         }
@@ -880,10 +898,10 @@ auto apply_query(
       auto q_vec_1 = query[j1];
 
       for (size_t kp = start; kp < kstop; kp += 2) {
-        auto score_00 = L2(q_vec_0, partitioned_vectors[kp + 0]);
-        auto score_01 = L2(q_vec_0, partitioned_vectors[kp + 1]);
-        auto score_10 = L2(q_vec_1, partitioned_vectors[kp + 0]);
-        auto score_11 = L2(q_vec_1, partitioned_vectors[kp + 1]);
+        auto score_00 = l2_distance(q_vec_0, partitioned_vectors[kp + 0]);
+        auto score_01 = l2_distance(q_vec_0, partitioned_vectors[kp + 1]);
+        auto score_10 = l2_distance(q_vec_1, partitioned_vectors[kp + 0]);
+        auto score_11 = l2_distance(q_vec_1, partitioned_vectors[kp + 1]);
 
         min_scores[j0].insert(score_00, partitioned_ids[kp + 0]);
         min_scores[j0].insert(score_01, partitioned_ids[kp + 1]);
@@ -895,8 +913,8 @@ auto apply_query(
        * Cleanup the last iteration(s) of k
        */
       for (size_t kp = kstop; kp < stop; ++kp) {
-        auto score_00 = L2(q_vec_0, partitioned_vectors[kp + 0]);
-        auto score_10 = L2(q_vec_1, partitioned_vectors[kp + 0]);
+        auto score_00 = l2_distance(q_vec_0, partitioned_vectors[kp + 0]);
+        auto score_10 = l2_distance(q_vec_1, partitioned_vectors[kp + 0]);
         min_scores[j0].insert(score_00, partitioned_ids[kp + 0]);
         min_scores[j1].insert(score_10, partitioned_ids[kp + 0]);
       }
@@ -910,14 +928,14 @@ auto apply_query(
      auto q_vec_0 = query[j0];
 
      for (size_t kp = start; kp < kstop; kp += 2) {
-       auto score_00 = L2(q_vec_0, partitioned_vectors[kp + 0]);
-       auto score_01 = L2(q_vec_0, partitioned_vectors[kp + 1]);
+       auto score_00 = l2_distance(q_vec_0, partitioned_vectors[kp + 0]);
+       auto score_01 = l2_distance(q_vec_0, partitioned_vectors[kp + 1]);
 
        min_scores[j0].insert(score_00, partitioned_ids[kp + 0]);
        min_scores[j0].insert(score_01, partitioned_ids[kp + 1]);
      }
      for (size_t kp = kstop; kp < stop; ++kp) {
-       auto score_00 = L2(q_vec_0, partitioned_vectors[kp + 0]);
+       auto score_00 = l2_distance(q_vec_0, partitioned_vectors[kp + 0]);
        min_scores[j0].insert(score_00, partitioned_ids[kp + 0]);
      }
    }
```

src/include/detail/linalg/matrix_with_ids.h

Lines changed: 6 additions & 6 deletions
```diff
@@ -74,17 +74,17 @@ class MatrixWithIds : public Matrix<T, LayoutPolicy, I> {
   virtual ~MatrixWithIds() = default;
 
   MatrixWithIds(
-      Base::size_type nrows,
-      Base::size_type ncols,
+      typename Base::size_type nrows,
+      typename Base::size_type ncols,
       LayoutPolicy policy = LayoutPolicy()) noexcept
     requires(std::is_same_v<LayoutPolicy, stdx::layout_right>)
       : Base(nrows, ncols, policy)
       , ids_(this->num_rows_) {
   }
 
   MatrixWithIds(
-      Base::size_type nrows,
-      Base::size_type ncols,
+      typename Base::size_type nrows,
+      typename Base::size_type ncols,
       LayoutPolicy policy = LayoutPolicy()) noexcept
     requires(std::is_same_v<LayoutPolicy, stdx::layout_left>)
       : Base(nrows, ncols, policy)
@@ -94,8 +94,8 @@ class MatrixWithIds : public Matrix<T, LayoutPolicy, I> {
   MatrixWithIds(
       std::unique_ptr<T[]>&& storage,
       std::vector<IdsType>&& ids,
-      Base::size_type nrows,
-      Base::size_type ncols,
+      typename Base::size_type nrows,
+      typename Base::size_type ncols,
       LayoutPolicy policy = LayoutPolicy()) noexcept
       : Base(std::move(storage), nrows, ncols, policy)
       , ids_{std::move(ids)} {
```
