Draft
Changes from 52 commits
59 commits
a187efb
Domain option version of get_dimension_index
rroelke Jun 5, 2025
cb22d35
Dimension::cell_size
rroelke Jun 5, 2025
760f643
Add tile global/order min/max, bump format version, not tested
rroelke Jun 5, 2025
fb387bd
Stubs for C API retrieving tile global order bounds
rroelke Jul 21, 2025
d3d9936
Fixed-length simple test passes
rroelke Aug 5, 2025
273a0bb
Fix test with serialization on
rroelke Aug 5, 2025
6ad6ac9
Fix set_num_tiles index for bounds buffers
rroelke Aug 6, 2025
fa5b8ee
Add tiledb::Array::context
rroelke Aug 6, 2025
01ec15b
Refactor some functions into array_templates.h
rroelke Aug 6, 2025
bfbec1f
Refactor Fragment1D, Fragment2D into common base class
rroelke Aug 7, 2025
a018738
Fragment metadata global order bounds example passes
rroelke Aug 7, 2025
7a1e6e6
Add unordered 1D test
rroelke Aug 7, 2025
b639572
Rapidcheck runs, prints minimum
rroelke Aug 8, 2025
f642e6e
Merge remote-tracking branch 'origin/main' into rr/core-321-fragment-…
rroelke Aug 8, 2025
9c900b3
Add RestProfile::file_exists to avoid throwing exception for failing …
rroelke Aug 8, 2025
99e04db
Fix offset by tile_index_base_
rroelke Aug 11, 2025
83e2b36
Rapidcheck passes for fixed 1D no dups
rroelke Aug 11, 2025
4947b3a
Turn on dups
rroelke Aug 11, 2025
f9b93fb
Fix Fragment2d::d2
rroelke Aug 14, 2025
b997f22
Test Fragment metadata global order bounds: 2D fixed unordered section
rroelke Aug 14, 2025
52672dc
Test Fragment metadata global order bounds: 2D fixed global order sec…
rroelke Sep 15, 2025
72cdd82
Add 1D var test with trivial input
rroelke Sep 15, 2025
23e9420
Trivial 1D var test passes
rroelke Sep 16, 2025
49cd23e
Minimum write GENERATE
rroelke Sep 16, 2025
5dbee50
Fix global_cell_cmp_std_tuple for variable-length dimension
rroelke Sep 17, 2025
23c397f
Fix set_tile_global_order_bounds_var indexing
rroelke Sep 17, 2025
0e25643
Nontrival 1D var test
rroelke Sep 17, 2025
b7ab7f1
Fragment::extend
rroelke Sep 18, 2025
a0dedb9
Fix prepare_bound_buffers for empty qb
rroelke Sep 18, 2025
eed749a
Fragment metadata global order bounds: 1D fixed consolidation non-ove…
rroelke Sep 18, 2025
38b981b
Add interleaving 1d fixed consolidation test
rroelke Sep 18, 2025
f1f64a5
Additional consolidation tests, 1d fixed rapidcheck and 1d var
rroelke Sep 19, 2025
d7cc660
Domain::cell_order_cmp overload consistency
rroelke Sep 19, 2025
8805da0
Add 1D var rapidcheck test, does not pass
rroelke Sep 19, 2025
9dad9a6
Fix set_tile_global_order_bounds_var tile
rroelke Sep 19, 2025
11d2601
Shrinking section
rroelke Sep 19, 2025
654c679
rapidcheck show functions
rroelke Sep 19, 2025
af2dc5a
1D var consolidation rapidcheck
rroelke Sep 19, 2025
bb4ebaa
make_fragment_3d
rroelke Sep 22, 2025
6cb0b1d
3D vcf rapidcheck
rroelke Sep 22, 2025
722f3e0
Merge remote-tracking branch 'origin/main' into rr/core-321-fragment-…
rroelke Sep 22, 2025
44469f8
Add C++ API and use it in tests
rroelke Sep 23, 2025
d176eb6
Add 3D vcf consolidation rapidcheck test
rroelke Sep 23, 2025
e76b0ce
Update format spec
rroelke Sep 23, 2025
029592c
Only write global order min/maxes for sparse array
rroelke Sep 23, 2025
36d624c
Allow dimension_sizes arg to be nullptr
rroelke Sep 23, 2025
369e02a
Recapture output for 'dump with string dimension' test
rroelke Sep 23, 2025
045beb4
Fix interleaving triples output
rroelke Sep 23, 2025
0e4d960
Fix cpp context lifetime issue
rroelke Sep 23, 2025
4d08b0c
constexpr UntypedDatumView methods/constructors
rroelke Sep 23, 2025
5886af9
Fix cppapi impl dimension_sizes
rroelke Sep 23, 2025
804051e
std::min explicit template to make mac compiler happy
rroelke Sep 23, 2025
3c316af
Fix format spec typos and version number
rroelke Oct 13, 2025
0434459
Separate method for new versioning
rroelke Oct 13, 2025
47df461
Merge remote-tracking branch 'origin/main' into rr/core-321-fragment-…
rroelke Oct 13, 2025
12fec8b
Merge remote-tracking branch 'origin/main' into rr/core-321-fragment-…
rroelke Nov 5, 2025
f187ce3
Merge remote-tracking branch 'origin/main' into rr/core-321-fragment-…
rroelke Nov 5, 2025
5c0b0af
Merge remote-tracking branch 'origin/main' into rr/core-321-fragment-…
rroelke Nov 6, 2025
677fa62
Restore Dimension<STRING_ASCII> specialization
rroelke Nov 6, 2025
12 changes: 12 additions & 0 deletions format_spec/fragment.md
@@ -82,6 +82,12 @@ The fragment metadata file has the following on-disk format:
| Tile maxes for attribute/dimension 1 | [Tile Mins/Maxes](#tile-mins-maxes) | _New in version 11_ The serialized maxes for attribute/dimension 1 |
| … | … | … |
| Variable maxes for attribute/dimension N | [Tile Mins/Maxes](#tile-mins-maxes) | _New in version 11_ The serialized maxes for attribute/dimension N |
| Tile global order min coordinates for dimension 1 | [Tile Mins/Maxes](#tile-mins-maxes) | _New in version 23_ For sparse arrays, the serialized value of dimension 1 of the global order minimum coordinate in each tile. |
| … | … | … |
| Variable global order min coordinates for dimension N | [Tile Mins/Maxes](#tile-mins-maxes) | _New in version 23_ For sparse arrays, the serialized value of dimension N of the global order minimum coordinate in each tile. |
| Tile global order max coordinates for dimension 1 | [Tile Mins/Maxes](#tile-mins-maxes) | _New in version 23_ For sparse arrays, the serialized value of dimension 1 of the global order maximum coordinate in each tile. |
| … | … | … |
| Variable global order max coordinates for dimension N | [Tile Mins/Maxes](#tile-mins-maxes) | _New in version 23_ For sparse arrays, the serialized value of dimension N of the global order maximum coordinate in each tile. |
| Tile sums for attribute/dimension 1 | [Tile Sums](#tile-sums) | _New in version 11_ The serialized sums for attribute/dimension 1 |
| … | … | … |
| Variable sums for attribute/dimension N | [Tile Sums](#tile-sums) | _New in version 11_ The serialized sums for attribute/dimension N |
@@ -276,6 +282,12 @@ The footer is a simple blob \(i.e., _not a generic tile_\) with the following in
| Tile maxes offset for attribute/dimension 1 | `uint64_t` | The offset to the generic tile storing the tile maxes for attribute/dimension 1. |
| … | … | … |
| Tile maxes offset for attribute/dimension N | `uint64_t` | The offset to the generic tile storing the tile maxes for attribute/dimension N |
| Tile global order min coordinates offset for dimension 1 | `uint64_t` | _New in version 23_ For sparse arrays, the offset to the generic tile storing the tile global order mins for dimension 1 |
| … | … | … |
| Tile global order min coordinates offset for dimension N | `uint64_t` | _New in version 23_ For sparse arrays, the offset to the generic tile storing the tile global order mins for dimension N |
| Tile global order max coordinates offset for dimension 1 | `uint64_t` | _New in version 23_ For sparse arrays, the offset to the generic tile storing the tile global order maxes for dimension 1 |
| … | … | … |
| Tile global order max coordinates offset for dimension N | `uint64_t` | _New in version 23_ For sparse arrays, the offset to the generic tile storing the tile global order maxes for dimension N |
Member

  • I would prefer phrasing it like "The offset to the […] for dimension N. This field exists only on sparse arrays."
  • We have yet to decide whether this is how we want to store the tiles, or whether to use the extensible footer design described in CORE-345. If we do the latter, and don't want to change this PR's implementation, one proposed solution was to do it in a subsequent PR that will be merged before the next release.
    • The fact that these fields are only written in the footer for sparse arrays is another reason to pursue the extensible footer design.
      • If we stick with this design, a better and simpler idea would be to always write the offsets, and write zeroes for dense arrays.

Member Author

> We have yet to decide whether this is how we want to store the tiles, or use the extensible footer design described in CORE-345

Each time this has been brought up in the meeting the overall consensus has been (1) yes, we do want to implement a more extensible footer design; (2) no, we do not want to hold up existing work so that we can do that. It is settled.

> The fact that these fields are only written in the footer for sparse arrays, is another reason to pursue the extensible footer design

I agree with this; it is rather clunky. However, see above.

> If we stick with this design, a better and simpler idea would be to always write the offsets, and write zeroes for dense arrays

It removes a few ifs. Theoretically it might also be easier for some other developer who wanted to read the storage format. I don't really have a preference.
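
To make the two footer layouts discussed in this thread concrete, here is a minimal sketch. It is not TileDB code: `BoundsOffsets`, `footer_fields_conditional`, and `footer_fields_always` are hypothetical names, and the footer is modeled as a flat list of `uint64_t` offsets purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the new per-dimension footer offsets.
struct BoundsOffsets {
  std::vector<uint64_t> mins;   // one offset per dimension
  std::vector<uint64_t> maxes;  // one offset per dimension
};

// Layout as described in this PR: the fields exist only for sparse arrays,
// so the footer length depends on the array type.
std::vector<uint64_t> footer_fields_conditional(
    bool sparse, const BoundsOffsets& o) {
  std::vector<uint64_t> out;
  if (sparse) {
    out.insert(out.end(), o.mins.begin(), o.mins.end());
    out.insert(out.end(), o.maxes.begin(), o.maxes.end());
  }
  return out;
}

// Layout suggested in review: always emit 2 * dim_num offsets and leave
// them zero for dense arrays, so a reader can parse a fixed-size block.
std::vector<uint64_t> footer_fields_always(
    bool sparse, std::size_t dim_num, const BoundsOffsets& o) {
  assert(!sparse || (o.mins.size() == dim_num && o.maxes.size() == dim_num));
  std::vector<uint64_t> out(2 * dim_num, 0);
  if (sparse) {
    std::copy(o.mins.begin(), o.mins.end(), out.begin());
    std::copy(
        o.maxes.begin(),
        o.maxes.end(),
        out.begin() + static_cast<std::ptrdiff_t>(dim_num));
  }
  return out;
}
```

In the second layout the field count is the same for dense and sparse arrays, which is what would let a reader skip the array-type check when parsing the footer.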

| Tile sums offset for attribute/dimension 1 | `uint64_t` | The offset to the generic tile storing the tile sums for attribute/dimension 1. |
| … | … | … |
| Tile sums offset for attribute/dimension N | `uint64_t` | The offset to the generic tile storing the tile sums for attribute/dimension N |
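
The new fields above store, per tile, an actual coordinate (the first and last cell of the tile in global order). That is different from the per-dimension tile mins/maxes and from the MBR corners. The following is a minimal sketch of the distinction, assuming a 2D sparse array with a single space tile so that global order reduces to plain row-major comparison. `Coord`, `TileBounds`, and `tile_bounds` are hypothetical names used for illustration only and are not part of TileDB.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 2D coordinate. std::array compares lexicographically, which
// stands in for row-major global order within one space tile (an assumption).
using Coord = std::array<uint64_t, 2>;

struct TileBounds {
  Coord global_min;  // first cell of the tile in global order
  Coord global_max;  // last cell of the tile in global order
  Coord mbr_min;     // per-dimension minimum (MBR lower corner)
  Coord mbr_max;     // per-dimension maximum (MBR upper corner)
};

// Sorts the cells, splits them into tiles of `capacity` cells, and records
// both kinds of bounds for each tile.
std::vector<TileBounds> tile_bounds(
    std::vector<Coord> cells, std::size_t capacity) {
  assert(capacity > 0);
  std::sort(cells.begin(), cells.end());
  std::vector<TileBounds> out;
  for (std::size_t start = 0; start < cells.size(); start += capacity) {
    const std::size_t end = std::min(start + capacity, cells.size());
    TileBounds b{cells[start], cells[end - 1], cells[start], cells[start]};
    for (std::size_t i = start; i < end; ++i) {
      for (std::size_t d = 0; d < 2; ++d) {
        b.mbr_min[d] = std::min(b.mbr_min[d], cells[i][d]);
        b.mbr_max[d] = std::max(b.mbr_max[d], cells[i][d]);
      }
    }
    out.push_back(b);
  }
  return out;
}
```

With the hypothetical cells (3,4), (1,8), (1,2), (7,7), (2,5), (7,1) and a capacity of 3, the second tile holds (3,4), (7,1), (7,7). Its global order minimum is (3,4), while its MBR lower corner is (3,1), a point that is not a cell in the tile.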
1 change: 1 addition & 0 deletions test/CMakeLists.txt
@@ -110,6 +110,7 @@ set(TILEDB_UNIT_TEST_SOURCES
src/unit-enumerations.cc
src/unit-enum-helpers.cc
src/unit-filter-buffer.cc
src/unit-fragment-info-global-order-bounds.cc
src/unit-global-order.cc
src/unit-ordered-dim-label-reader.cc
src/unit-tile-metadata.cc
127 changes: 125 additions & 2 deletions test/src/unit-capi-fragment_info.cc
@@ -30,6 +30,7 @@
* Tests the C API functions for manipulating fragment information.
*/

#include "test/support/src/error_helpers.h"
#include "test/support/src/helpers.h"
#include "test/support/src/serialization_wrappers.h"
#include "tiledb/sm/c_api/tiledb.h"
@@ -553,6 +554,22 @@ TEST_CASE(
ctx, fragment_info, 1, 0, "d", &mbr[0]);
CHECK(rc == TILEDB_ERR);

// Get global order lower bound - should fail since it's a dense array
{
void* dimensions[] = {&mbr[0], &mbr[1]};
rc = tiledb_fragment_info_get_global_order_lower_bound(
ctx, fragment_info, 0, 0, nullptr, &dimensions[0]);
CHECK(rc == TILEDB_ERR);
}

// Get global order upper bound - should fail since it's a dense array
{
void* dimensions[] = {&mbr[0], &mbr[1]};
rc = tiledb_fragment_info_get_global_order_upper_bound(
ctx, fragment_info, 0, 0, nullptr, &dimensions[0]);
CHECK(rc == TILEDB_ERR);
}

// Get version
uint32_t version;
rc = tiledb_fragment_info_get_version(ctx, fragment_info, 0, &version);
@@ -703,13 +720,14 @@ TEST_CASE("C API: Test MBR fragment info", "[capi][fragment_info][mbr]") {
// Load fragment info
rc = tiledb_fragment_info_load(ctx, fragment_info);
CHECK(rc == TILEDB_OK);
- tiledb_config_free(&cfg);

tiledb_fragment_info_t* deserialized_fragment_info = nullptr;
if (serialized_load) {
rc = tiledb_fragment_info_alloc(
ctx, array_name.c_str(), &deserialized_fragment_info);
CHECK(rc == TILEDB_OK);
rc = tiledb_fragment_info_set_config(ctx, deserialized_fragment_info, cfg);
CHECK(rc == TILEDB_OK);
tiledb_fragment_info_serialize(
ctx,
array_name.c_str(),
@@ -720,6 +738,8 @@ TEST_CASE("C API: Test MBR fragment info", "[capi][fragment_info][mbr]") {
fragment_info = deserialized_fragment_info;
}

tiledb_config_free(&cfg);

// Get fragment num
uint32_t fragment_num;
rc = tiledb_fragment_info_get_fragment_num(ctx, fragment_info, &fragment_num);
@@ -731,6 +751,7 @@ TEST_CASE("C API: Test MBR fragment info", "[capi][fragment_info][mbr]") {
rc = tiledb_fragment_info_get_mbr_num(ctx, fragment_info, 0, &mbr_num);
CHECK(rc == TILEDB_OK);
CHECK(mbr_num == 1);

rc = tiledb_fragment_info_get_mbr_num(ctx, fragment_info, 1, &mbr_num);
CHECK(rc == TILEDB_OK);
CHECK(mbr_num == 2);
@@ -753,6 +774,108 @@ TEST_CASE("C API: Test MBR fragment info", "[capi][fragment_info][mbr]") {
CHECK(rc == TILEDB_OK);
CHECK(mbr == std::vector<uint64_t>{7, 8});

// Get global order lower bounds
{
std::vector<uint64_t> lower_bound(2);
void* dimensions[] = {&lower_bound[0], &lower_bound[1]};

// first fragment - one tile
rc = tiledb_fragment_info_get_global_order_lower_bound(
ctx, fragment_info, 0, 0, nullptr, &dimensions[0]);
CHECK(error_if_any(ctx, rc) == std::nullopt);
if (rc == TILEDB_OK) {
CHECK(lower_bound == std::vector<uint64_t>{1, 1});
}
rc = tiledb_fragment_info_get_global_order_lower_bound(
ctx, fragment_info, 0, 1, nullptr, &dimensions[0]);
CHECK(rc == TILEDB_ERR);

// second fragment - two tiles
rc = tiledb_fragment_info_get_global_order_lower_bound(
ctx, fragment_info, 1, 0, nullptr, &dimensions[0]);
CHECK(error_if_any(ctx, rc) == std::nullopt);
if (rc == TILEDB_OK) {
CHECK(lower_bound == std::vector<uint64_t>{1, 1});
}
rc = tiledb_fragment_info_get_global_order_lower_bound(
ctx, fragment_info, 1, 1, nullptr, &dimensions[0]);
CHECK(error_if_any(ctx, rc) == std::nullopt);
if (rc == TILEDB_OK) {
CHECK(lower_bound == std::vector<uint64_t>{7, 7});
}
rc = tiledb_fragment_info_get_global_order_lower_bound(
ctx, fragment_info, 1, 2, nullptr, &dimensions[0]);
CHECK(rc == TILEDB_ERR);

// third fragment - two tiles
rc = tiledb_fragment_info_get_global_order_lower_bound(
ctx, fragment_info, 2, 0, nullptr, &dimensions[0]);
CHECK(error_if_any(ctx, rc) == std::nullopt);
if (rc == TILEDB_OK) {
CHECK(lower_bound == std::vector<uint64_t>{1, 1});
}
rc = tiledb_fragment_info_get_global_order_lower_bound(
ctx, fragment_info, 2, 1, nullptr, &dimensions[0]);
CHECK(error_if_any(ctx, rc) == std::nullopt);
if (rc == TILEDB_OK) {
CHECK(lower_bound == std::vector<uint64_t>{1, 8});
}
rc = tiledb_fragment_info_get_global_order_lower_bound(
ctx, fragment_info, 2, 2, nullptr, &dimensions[0]);
CHECK(rc == TILEDB_ERR);
}

// Get global order upper bounds
{
std::vector<uint64_t> upper_bound(2);
void* dimensions[] = {&upper_bound[0], &upper_bound[1]};

// first fragment - one tile
rc = tiledb_fragment_info_get_global_order_upper_bound(
ctx, fragment_info, 0, 0, nullptr, &dimensions[0]);
CHECK(error_if_any(ctx, rc) == std::nullopt);
if (rc == TILEDB_OK) {
CHECK(upper_bound == std::vector<uint64_t>{2, 2});
}
rc = tiledb_fragment_info_get_global_order_upper_bound(
ctx, fragment_info, 0, 1, nullptr, &dimensions[0]);
CHECK(rc == TILEDB_ERR);

// second fragment - two tiles
rc = tiledb_fragment_info_get_global_order_upper_bound(
ctx, fragment_info, 1, 0, nullptr, &dimensions[0]);
CHECK(error_if_any(ctx, rc) == std::nullopt);
if (rc == TILEDB_OK) {
CHECK(upper_bound == std::vector<uint64_t>{2, 2});
}
rc = tiledb_fragment_info_get_global_order_upper_bound(
ctx, fragment_info, 1, 1, nullptr, &dimensions[0]);
CHECK(error_if_any(ctx, rc) == std::nullopt);
if (rc == TILEDB_OK) {
CHECK(upper_bound == std::vector<uint64_t>{8, 8});
}
rc = tiledb_fragment_info_get_global_order_upper_bound(
ctx, fragment_info, 1, 2, nullptr, &dimensions[0]);
CHECK(rc == TILEDB_ERR);

// third fragment - two tiles
rc = tiledb_fragment_info_get_global_order_upper_bound(
ctx, fragment_info, 2, 0, nullptr, &dimensions[0]);
CHECK(error_if_any(ctx, rc) == std::nullopt);
if (rc == TILEDB_OK) {
CHECK(upper_bound == std::vector<uint64_t>{2, 2});
}
rc = tiledb_fragment_info_get_global_order_upper_bound(
ctx, fragment_info, 2, 1, nullptr, &dimensions[0]);
CHECK(error_if_any(ctx, rc) == std::nullopt);
if (rc == TILEDB_OK) {
CHECK(upper_bound == std::vector<uint64_t>{7, 7});
}
rc = tiledb_fragment_info_get_global_order_upper_bound(
ctx, fragment_info, 2, 2, nullptr, &dimensions[0]);
CHECK(rc == TILEDB_ERR);
}

// Clean up
tiledb_fragment_info_free(&fragment_info);
remove_dir(array_name, ctx, vfs);
@@ -1842,7 +1965,7 @@ TEST_CASE(
"- Unconsolidated metadata num: 1\n" + "- To vacuum num: 0\n" +
"- Fragment #1:\n" + " > URI: " + written_frag_uri + "\n" +
" > Schema name: " + schema_name + "\n" + " > Type: sparse\n" +
- " > Non-empty domain: [a, ddd]\n" + " > Size: 3439\n" +
+ " > Non-empty domain: [a, ddd]\n" + " > Size: 3674\n" +
" > Cell num: 4\n" + " > Timestamp range: [1, 1]\n" +
" > Format version: " + ver + "\n" +
" > Has consolidated metadata: no\n";