[#254]:svarga:refactor, remove compiler_meta_t and field_descriptor_t from type engine by steven-varga · Pull Request #255 · vargalabs/h5cpp

steven-varga · 2026-05-19T21:17:56Z

Removes compiler_meta_t<T>, field_descriptor_t<owner,field>,
is_reflected_compound_t<T>, and metadata_version from H5Tmeta.hpp.

These were plugin-output scaffolding with no independent value: the only
consumer was the h5cpp-compiler plugin (no released users), C++26 reflection
bypasses them entirely in favour of a direct storage_traits_impl_t
partial specialisation, and the old dt_t / h5::create<T>() path is
cleaner for the manual-registration case that will always need an escape hatch.

Zero tests exercised these types. All dependent specialisations
(storage_traits_impl_t, is_transport_contiguous_impl_t,
access_traits_t for is_reflected_compound_t<T>) are removed with them.
The three !is_reflected_compound_t<T>::value no-op guards on the
contiguous-sequence access_traits_t specialisations are also dropped.

What survives unchanged: storage_traits_impl_t backbone, is_transport_contiguous_t,
access_traits_t, resolved_type_t fallback to dt_t<T>.

The C++26 auto-reflection path lands directly into storage_traits_impl_t
in a later PR with no dependency on the removed types.

The GitHub releases download counter is low-signal for a header-only library — most consumers vendor h5cpp via git submodule or copy the header tree directly, so the counter measures only the subset who pull .deb / .rpm / .pkg / .exe assets from Releases. Recent release-asset issues (v1.12.4 needed manual upload after the Package workflow failed on macOS and Windows) make any short-term number further unreliable. Removed: [![Downloads](https://img.shields.io/github/downloads/vargalabs/h5cpp/total)](https://github.com/vargalabs/h5cpp/releases) The remaining badges (CI, ASan, UBSan, codecov, MIT, DOI, GitHub release, Documentation) carry meaningful signals and stay.

…filename CMake project() was hardcoded to VERSION 1.10.4.6, so CPack-produced artifacts on the v1.12.4 release carried the stale 1.10.4 string in their filenames despite the actual release tag. The v1.12.4 release required manual rename + upload of the Linux artifacts as a result. This commit: 1. Derives PROJECT_VERSION from the latest git tag matching v[0-9]* before the project() call, so CPack picks it up correctly: find_package(Git QUIET) execute_process(COMMAND git describe --tags --abbrev=0 ...) if(H5CPP_GIT_TAG MATCHES "^v([0-9]+(\\.[0-9]+)*)$") set(H5CPP_PROJECT_VERSION "${CMAKE_MATCH_1}") endif() project(libh5cpp-dev VERSION ${H5CPP_PROJECT_VERSION} ...) Falls back to "1.12.0" placeholder for tarball builds without git history, and for branches where the latest reachable tag doesn't match the strict vX.Y.Z[.W] pattern (e.g. staging sees v1.10.4-6 which is filtered out by the regex). 2. Decouples H5CPP_HDF5_FLOOR from PROJECT_VERSION. The previous check compared HDF5_VERSION against PROJECT_VERSION, which coupled package versioning to HDF5 minimum requirements purely by coincidence — 1.10.4.6 happened to be both the h5cpp release tag AND the HDF5 minimum. After #234 raised the floor to 1.12 and PROJECT_VERSION now tracks the git tag, the comparison is no longer meaningful. Pinned to "1.12.0" explicitly. 3. Sets CPACK_PACKAGE_FILE_NAME to include CMAKE_SYSTEM_NAME and CMAKE_SYSTEM_PROCESSOR (lowercased), so amd64 / arm64 / x86_64 / aarch64 / darwin-arm64 / windows-amd64 produce distinct filenames when collected into the Publish job's flat dist/ directory: h5cpp-dev-v1.12.5-linux-x86_64.{deb,rpm} h5cpp-dev-v1.12.5-linux-aarch64.{deb,rpm} h5cpp-dev-v1.12.5-darwin-arm64.pkg h5cpp-dev-v1.12.5-windows-amd64.exe Without this override, both amd64 and arm64 Linux builds emit identical "h5cpp-dev-1.10.4-Linux.{deb,rpm}" names and overwrite each other on the Release page.

…rom-source Two platform fixes in the same workflow file, both required for the v1.12.5 Package run to produce assets across the full matrix. macOS (#245): The Configure step set HDF5_ROOT from brew --prefix but FindHDF5 on macos-15 / Apple Silicon needed CMAKE_PREFIX_PATH as well to locate the C library, producing: Could NOT find HDF5 (missing: HDF5_LIBRARIES HDF5_INCLUDE_DIRS C) ci.yml's macOS path sets both; package.yml only set one. Symmetrised. Windows (#246): choco install hdf5 returns "package not found" — there is no hdf5 package in the Chocolatey community repo. The #236 fix that introduced this command was based on an incorrect assumption. Replaced with the build-from-source pattern already validated by ci.yml's Windows path: - Build zlib 1.3.1 from source (~30s, not cached) - Restore HDF5 1.12.2 cache if present - Build HDF5 1.12.2 from source if cache-miss (~3-5min, cached) - Save HDF5 cache on miss Configure now consumes both prefixes through CMAKE_PREFIX_PATH. The first run after this change will be slow (~5min for HDF5 build); subsequent runs reuse the cache.

…e-cleanup' into staging

Phase 1.1 of the FAPL multithreading workplan (tasks/h5cpp-fapl-multi threading-workplan.md §3). Lifecycle infrastructure only — the real pool internals (queues, compress dispatch) come in Phase 1.2, and consumer-site wiring (pt_t, h5::write, h5::read) in Phase 1.3. What this commit lands: h5cpp/H5worker_pool.hpp Minimal worker_pool_t. Owns N std::jthreads sized by the constructor argument (0 → hardware_concurrency()). Phase 1.1 workers park on stop_token; Phase 1.2 will replace the placeholder loop with the compress dispatch loop reading from bounded MPMC queues. Non-copyable, non-movable (jthread members + future atomic in_flight). h5cpp/H5Pfapl_threads.hpp The hidden-pointer pattern, shared-ownership variant. Stores a pointer to a heap-allocated worker_pool_slot_t in the FAPL skip list via H5Pinsert2. The slot owns a shared_ptr<worker_pool_t>; the copy callback allocates a new slot whose shared_ptr aliases the same pool — every FAPL copy increments the refcount instead of spawning a fresh pool. Close callback drops one slot; pool dies when the last slot is freed. Defines h5::threads as a user-facing FAPL property: h5::fd_t fd = h5::create("data.h5", H5F_ACC_TRUNC, h5::threads{8}); h5::fd_t fd = h5::create("data.h5", H5F_ACC_TRUNC, h5::threads{}); h5::impl::resolve_worker_pool(fapl_id) is the consumer-site helper that Phase 1.3 will use in pt_t / h5::write / h5::read to look up the pool from the file's FAPL. h5cpp/core Adds H5worker_pool.hpp and H5Pfapl_threads.hpp to the include chain right after H5Pdapl.hpp. Tests (test/H5Pall.cpp, [#250] cases): - Regression scaffold: install the property WITHOUT the copy callback (the broken state) and assert the tracking close callback detects the double-free. Guards against silent test breakage where the real assertion becomes a no-op. Same idea as #244's scaffolding. - Single FAPL clean lifecycle: alloc=del, no leak. - H5Pcopy preserves shared pool ownership: pool pointer is IDENTICAL across two FAPLs that share the property; closing one FAPL doesn't affect the other; refcounts balance. - h5::threads tag installs the property: exercises the real fapl_threads_set callback (not the tracking shim) and verifies h5::impl::resolve_worker_pool returns the pool with the requested worker count. - h5::threads{} (no count) uses hardware_concurrency. Validated locally via a standalone TU compiled against the new headers with HDF5 1.10.9; the slot pattern only requires H5Pinsert2 (HDF5 ≥ 1.8) so it works regardless of the v1.12 floor. CI matrix on this PR will run the ctest cases against HDF5 1.12.2. [#250]:svarga:refactor, consolidate worker pool + FAPL property into H5Pthreads.hpp Renames + merges the two Phase 1.1 headers (h5cpp/H5worker_pool.hpp and h5cpp/H5Pfapl_threads.hpp) into a single h5cpp/H5Pthreads.hpp. Mirrors the H5Pdapl.hpp precedent of having the property type, lifecycle callbacks, and any helper data structures co-located in one header per property family. Contents unchanged — same worker_pool_t, same worker_pool_slot_t, same fapl_pool_copy_cb / fapl_pool_close_cb / fapl_threads_set, same resolve_worker_pool, same h5::threads tag. Only the file layout changes; no API change for users or for the test suite. Updates the include chain in h5cpp/core accordingly. [#250]:svarga:fix, lower H5CPP_HDF5_FLOOR from 1.12.0 to 1.10.4 — CI matrix coverage #247 introduced H5CPP_HDF5_FLOOR = "1.12.0" alongside decoupling the floor from PROJECT_VERSION. Choice of 1.12.0 was too aggressive — the CI matrix runs Ubuntu 22.04 with system HDF5 1.10.7 (floor coverage restored by #235), and every Linux matrix entry that consumes the system package fails at configure with: -- !!! H5CPP requires HDF5 v1.12.0 or greater (found 1.10.7) !!! This change lowers the floor to 1.10.4, matching the prior implicit floor that the stale project(VERSION 1.10.4.6) line encoded before #247 removed it. Result: Ubuntu 22.04's 1.10.7 system package satisfies the check; all matrix entries that previously passed continue to pass. When the project eventually drops 1.10.x coverage, bump this constant deliberately AND remove the 22.04 matrix entry in the same commit, so the floor and the matrix stay in sync. Validated locally with HDF5 1.10.9: cmake configure succeeds, full ctest sweep (50/50) green including the new [#250] FAPL slot-lifecycle regression cases. [#250]:svarga:fix, H5Pthreads.hpp use stoppable_thread_t shim for C++17 compatibility Examples target compiles with C++17 (matching the project's documented minimum standard), but H5Pthreads.hpp used std::jthread / std::stop_token unconditionally — both C++20 features. Every example TU including h5cpp/core failed: H5Pthreads.hpp:91: error: 'std::jthread' is only available from C++20 onwards H5Pthreads.hpp:85: error: 'std::stop_token' has not been declared H5Zpipeline_threaded.hpp already addresses this — it uses h5::detail::stoppable_thread_t and h5::detail::stop_token_t from detail/stoppable_thread.hpp, which are aliases for std::jthread / std::stop_token under C++20 and hand-rolled shims with the same API under C++17. This commit switches H5Pthreads.hpp to the same shim, mirroring the H5Zpipeline_threaded.hpp pattern. No API change for users; the h5::threads tag, worker_pool_t, and slot lifecycle are unaffected. Local ctest sweep: 50/50 green with system HDF5 1.10.9. [#250]:svarga:feature, worker_pool_t generic submit() + wait_idle() API Phase 1.2 of the FAPL multithreading workplan. Replaces the Phase 1.1 placeholder worker loop with a real submit-based pool. Design: - Workers pull type-erased tasks (std::function<void()>) off a single queue protected by std::mutex + std::queue + h5::detail::doorbell_t. Same primitives as H5Zpipeline_threaded.hpp's C++17 fallback path, so the pool works uniformly on C++17 (via stoppable_thread_t shim) and C++20 (via std::jthread + atomic::wait under the doorbell). - submit<F>(callable) returns std::future<invoke_result_t<F>>. Internally uses std::packaged_task wrapped in a shared_ptr so the move-only task can sit inside a copyable std::function in the queue. - wait_idle() blocks until in_flight reaches 0. - dtor calls wait_idle() before letting stoppable_thread_t destructors request stop + join. No work is lost on shutdown. - Pool is HDF5-agnostic. Phase 1.3 will wrap HDF5-specific compress logic in closures and submit() them. Tests (test/H5Pall.cpp, [#250 1.2] cases): - single submit + future return value - many submits resolve in parallel (256 squared-int sums) - multi-thread submission is safe (4 producers × 100 tasks) - wait_idle blocks until completion (50 tasks) - exception in task captured in future - dtor drains pending work cleanly (no work lost) Local sweep: 50/50 ctest green including the new 6 [#250 1.2] cases. [#250]:svarga:refactor, simplify worker_pool_t worker_loop double-check The original worker_loop used a double-check pattern: 1. lock m_, see empty, read bell_.load() → last 2. unlock m_ 3. re-lock m_, check tasks_ again (for race window) 4. if still empty, wait(last) Steps 3-4's re-check is redundant. submit() rings the bell ONLY after releasing the lock that the worker is contending for. The doorbell sequence increments before notify_one(). So between the worker's single lock-release in step 2 and its wait(last) call, any submit() that pushed a task will have rung the bell — bell_.wait(last) sees a sequence > last and returns immediately. No missed wakeup possible. This refactor collapses the pattern to a single critical section that either pops or snapshots the doorbell, then waits outside the lock if no task was popped. Same correctness, easier for sanitizers and future readers to verify. Local ctest still 50/50. [#250]:svarga:fix, coverage CI add -fprofile-update=atomic for thread-safe gcov The coverage job (ubuntu-24.04 / gcc-14) failed at the "Capture coverage" step with: geninfo: ERROR: Unexpected negative count '-3' for thirdparty/libdeflate/v1.25.0/lib/deflate_compress.c:2217. Perhaps you need to compile with '-fprofile-update=atomic Phase 1.2's worker_pool_t spawns std::jthread workers that execute user tasks (test cases submit 256+ tasks across 4-8 workers). Existing tests ALSO exercise multi-threaded libdeflate via #241's threaded_pipeline_t. The combined concurrency saturates gcov's non-atomic counter updates, producing the negative-count corruption above. Fix: add -fprofile-update=atomic to the coverage build's C and C++ flags. This makes gcov instrumentation increment counters atomically. Marginal runtime cost for the coverage build only; production builds are unaffected. This is a coverage-infrastructure fix, not Phase 1.2 logic — but it sits on the Phase I branch because Phase 1.2's threading is what surfaced it. [#250]:svarga:fix, Phase 1.2 tests wait_idle before in_flight() check Coverage CI (gcc-14 / ubuntu-24.04, Debug -O0 + --coverage instrumentation) failed in test-h5pall: test/H5Pall.cpp:387: ERROR: CHECK( pool.in_flight() == 0 ) is NOT correct! values: CHECK( 1 == 0 ) Race in the test, not in the pool: - submit() increments in_flight_ before queueing the task. - worker_loop runs (*task)(), which calls packaged_task::set_value internally — that's when future::wait()/get() unblocks. - worker_loop then decrements in_flight_ AFTER task() returns. There's a small window between "future is ready" and "in_flight is decremented" — workers can be in the gap. Optimized builds usually close it within nanoseconds; coverage's -O0 + atomic gcov makes it wide enough to observe. Fix: add pool.wait_idle() before each in_flight() == 0 assertion. wait_idle() spins until in_flight reaches 0, eliminating the race. The pool's semantics are unchanged. in_flight() remains an approximate introspection metric that can transiently exceed the count of pending futures. Tests that need a precise zero should call wait_idle() first. Local sweep: 50/50 ctest green; the regression test for [#250 1.2] now holds under Debug + --coverage flags. [#250]:svarga:feature, Phase 1.3.1 h5::backpressure FAPL property Adds a separate FAPL property h5::backpressure{N} that bounds the in-flight chunk deque for any pt_t / h5::write / h5::read that uses the file's worker pool. Composes with h5::threads via operator|: h5::fd_t fd = h5::create("data.h5", H5F_ACC_TRUNC, h5::threads{8} | h5::backpressure{32}); Storage: - New H5CPP_FAPL_BACKPRESSURE property, stores a plain unsigned via H5Pinsert2 with null lifecycle callbacks (POD value, memcpy semantics are correct for HDF5's internal copy + close). - fapl_backpressure_set is idempotent (skips if already installed), matching fapl_threads_set behavior. - resolve_backpressure(fapl_id, worker_count) returns: * user-set value when h5::backpressure{N} was applied, * else H5CPP_FAPL_BACKPRESSURE_DEFAULT_FACTOR × worker_count (default factor = 8, override via compile-time macro). h5::backpressure{N} without h5::threads{N} is silently a no-op at the consumer level — without a pool, there's no queue to bound. Resolver still returns the user value if the property is set; pt_t / h5::write ignore it when no pool is present. Tests (test/H5Pall.cpp, [#250 1.3] cases): - h5::backpressure tag installs the property via property chain - Default = 8 × worker_count when unset - Property survives H5Pcopy (POD memcpy semantics) - backpressure-only FAPL: resolver returns user value; no pool is installed (verified separately) Local: 50/50 ctest green. Phase 1.3.2 (next): wire pt_t to read both pool + backpressure cap from the FAPL and enforce the in-flight deque bound. [#250]:svarga:feature, Phase 1.3.2 pt_t resolves FAPL pool + backpressure cap at init Wires pt_t into the FAPL worker-pool plumbing without changing dispatch behavior yet. After this commit: - pt_t has two new private members: std::shared_ptr<impl::worker_pool_t> pool_; unsigned backpressure_cap_; - pt_t::init() calls H5Fget_access_plist on the file id (already held for re-opening the dataset with zero-chunk-cache DAPL), then h5::impl::resolve_worker_pool(fapl) + h5::impl::resolve_backpressure to populate the members. fapl is closed before the existing H5Fclose(fid). - pt_t move-assignment moves pool_ and copies/resets backpressure_cap_. - pt_t copy-constructor (deep copy from another pt_t's ds) leaves both members at their default state — copy of pt_t was always documented as the synchronous-pipeline path, that behavior is unchanged. Dispatch (1.3.2 step 2, next commit) will: - Branch on pool_ being non-null at write_chunk call sites. - Submit compress closures to pool_, push the returned futures into an in-flight deque, drain in order, block on full deque using backpressure_cap_. This commit keeps the existing visit_pipeline() dispatch in place so the change is observable only by the new tests below. All 50 pre- existing ctest cases continue to pass bytewise-identically. Tests (test/H5Dappend.cpp, [#250 1.3.2] cases): - pt_t constructed against a file whose FAPL has h5::threads{4} | h5::backpressure{16}: the FAPL resolves to a 4-worker pool with cap=16 (verified via the same FAPL's resolve_worker_pool and resolve_backpressure helpers). - pt_t with no FAPL pool falls back cleanly (default FAPL): construct, append 32 ints, flush, read back identical bytes. [#250]:svarga:feature, Phase 1.3.2 step 2 — pt_t pool dispatch + async-pipelined drain When pt_t resolves a worker_pool_t from the file's FAPL, every chunk commit now flows through write_chunk_via_pool instead of the variant pipeline. This is Option B from the workplan §3 design notes: async-pipelined producer with FIFO drain on the calling thread. Mechanism: - write_chunk_via_pool snapshots the filter chain config (filter[], flags[], cd_size[], cd_values[], tail) from pt_t's basic_pipeline member into a POD filter_chain_t for cheap by-value capture. - The raw chunk bytes are copied into a unique_ptr<byte[]> the worker will own. Per-chunk allocation; thread-local scratch in the pool is a follow-on optimization if benchmarks justify it. - Closure submitted to pool: applies the filter chain ping-pong style (same logic as basic_pipeline_t::write_chunk_impl), copies final bytes into a fresh owned aligned buffer, returns pool_result_t{data, nbytes, mask, offset}. - The returned future is pushed onto in_flight_ (deque preserves submission order). - drain_in_flight(false) drains all already-completed futures off the front, calling H5Dwrite_chunk on the calling thread for each. HDF5 stays single-threaded. - When in_flight_.size() reaches backpressure_cap_, write_chunk_via_pool blocks on the front future via drain_in_flight(true). Producer memory is bounded by cap × chunk_bytes per pt_t. All six visit_pipeline → write_chunk call sites in this file now route through dispatch_chunk(), a tiny private method that picks pool path when pool_ is non-null and falls back to visit_pipeline otherwise. Synchronous behavior is bytewise-unchanged when no FAPL pool is configured. flush() now drains in_flight_ to completion before returning. Tests (test/H5Dappend.cpp, [#250 1.3.2] cases): - gzip round-trip equivalence: same data written through synchronous, pool-default-cap, and pool-explicit-cap paths produces identical readback in all three cases. - tight back-pressure (cap=2) round-trip: forces the producer-blocking branch every other chunk; data integrity preserved. Local: 50/50 ctest green. What this commit does NOT do (deferred): - h5::write and h5::read pool integration (Phase 1.3.3 — same approach but via h5::get_access_plist(ds) at the call site). - Removal of #241's pt_t(ds, h5::filter::threads{N}) constructor (Phase 1.4 — once we're sure the FAPL path covers everything). - Thread-local scratch buffers in the pool (optimization, post-Phase I). [#250]:svarga:refactor, Phase 1.3.3 part 1 — pool_pipeline_t as third variant alternative Implements workplan Option b: the FAPL worker pool becomes a third CRTP descendant of pipeline_t<Derived>, slotting into pt_t's pipeline variant alongside basic_pipeline_t and threaded_pipeline_t. New header h5cpp/H5Zpipeline_pool.hpp: - pool_pipeline_t owns std::shared_ptr<worker_pool_t> + back-pressure cap + std::deque<std::future<result_t>> in-flight queue. - write_chunk_impl snapshots filter chain config into a POD captured by value, copies raw chunk into a unique_ptr<byte[]> owned by the worker, submits the compress closure to the pool, pushes the future onto in_flight_, drains opportunistically, blocks on the front future when cap is reached. - read_chunk_impl stays synchronous (identical to basic). Parallel decompression is Phase 1.5+ work — requires read-ahead that cooperates with pipeline_t<>::split_to_chunk_read. - public drain() method walks in_flight_ to completion, called by pt_t::flush so users get "data on disk after flush returns". - Destructor drains any remaining in-flight in submission order. pt_t refactor: - pt_pipeline_t variant now has three alternatives: basic, threaded (legacy #241), pool. - init() resolves the FAPL pool; if present, emplaces a pool_pipeline_t alternative; otherwise keeps the default basic_pipeline_t. - All six write_chunk dispatch sites revert to plain visit_pipeline + p.write_chunk — uniform across the three alternatives via the CRTP base. - dispatch_chunk / write_chunk_via_pool / drain_in_flight / pool_ / backpressure_cap_ / in_flight_ all REMOVED from pt_t (the logic now lives in pool_pipeline_t). - flush() invokes drain() on the pool alternative via std::visit type-checked branch. Why this shape (Option b from workplan §3 design notes): - h5::write and h5::read benefit transparently in the next sub-phase via the existing pipeline_t<>::write/read → split_to_chunk_write → write_chunk_impl machinery. Just construct a pool_pipeline_t in those entry points when the file's FAPL has a pool. - The in-flight deque and drain logic live in one place, not duplicated across consumers. - Phase 1.4 (remove #241's threaded_pipeline_t alternative) becomes a single-commit cleanup: the variant goes from three to two alternatives. Local: 50/50 ctest green. Existing Phase 1.3.2 round-trip tests continue to pass (the variant routes them through pool_pipeline_t now instead of pt_t's removed in-flight machinery). What this commit does NOT do (Phase 1.3.3 part 2, next commit): - h5::write / h5::read FAPL pool integration. - Will construct a local pool_pipeline_t in those entry points, call set_cache + pipeline.write/read, let RAII drain on exit. [#250]:svarga:feature, Phase 1.3.3 part 2 — h5::write/read consult FAPL pool When the file's FAPL has h5::threads{N} installed AND the dataset's DAPL has h5::high_throughput AND the dataset is chunked, h5::write and h5::read now construct a local pool_pipeline_t and route the operation through it. Otherwise the existing paths run unchanged: - DAPL high_throughput + no FAPL pool → existing DAPL-stored basic_pipeline_t pointer (synchronous filter chain). - No DAPL high_throughput → standard H5Dwrite / H5Dread. Both branches sit inside the existing use_pipeline gate (which also checks layout == H5D_CHUNKED, the #242 SegFault guard). The FAPL pool is the new inner check, taking priority over the DAPL pipeline pointer when present. Mechanism (h5::write side): hid_t fid = H5Iget_file_id(static_cast<hid_t>(ds)); hid_t fapl = H5Fget_access_plist(fid); auto pool = h5::impl::resolve_worker_pool(fapl); if (pool) { const unsigned cap = h5::impl::resolve_backpressure( fapl, pool->worker_count()); h5::impl::pool_pipeline_t pipe(std::move(pool), cap); // set_cache populates the filter chain from the DCPL. h5::dcpl_t dcpl{H5Dget_create_plist(static_cast<hid_t>(ds))}; hid_t type_id = H5Dget_type(static_cast<hid_t>(ds)); size_t elem_sz = H5Tget_size(type_id); H5Tclose(type_id); pipe.set_cache(dcpl, elem_sz); pipe.write(ds, offset, stride, block, count, dxpl, ptr); // RAII: pipe destructor drains in_flight before pool refcount drop. } else { // ... existing DAPL pipeline path ... } h5::read uses the same shape; pool_pipeline_t::read_chunk_impl is currently synchronous (identical to basic_pipeline_t::read_chunk_impl) so the FAPL-pool read branch is semantically equivalent to the DAPL read path today. The structure is in place for the Phase 1.5+ read-ahead optimization to land without touching call sites. Tests (test/H5Pall.cpp, [#250 1.3.3] cases): - h5::write + h5::read round-trip through FAPL pool with h5::threads{4} | h5::backpressure{16} + DAPL high_throughput + gzip-compressed dataset. Reader uses default FAPL — verifies data is durably on disk regardless of reader-side FAPL choice. - h5::write without high_throughput on the DAPL bypasses the pool even when the FAPL has one installed. Confirms per-dataset opt-in still gates the pool path. Local: 50/50 ctest green. What this commit does NOT do (Phase 1.4, next commit): - Remove pt_t(ds, h5::filter::threads{N}) constructor from #241. - The threaded_pipeline_t variant alternative goes away with it. [#250]:svarga:refactor, Phase 1.4 — retire #241 h5::filter::threads + threaded_pipeline_t Migration option B (chosen earlier in #250): the FAPL-scoped pool from Phases 1.1–1.3 fully subsumes the per-pt_t worker pool added in #241. Two parallel paths invite contention bugs and confuse the surface; one codepath survives. Removed: - h5cpp/H5Zpipeline_threaded.hpp (deleted). This carried threaded_pipeline_t — both the C++20 (jthread + lock-free queue) and C++17 (stoppable_thread + std::mutex + std::queue) variants — plus the h5::filter::threads tag. - h5::pt_t( const h5::ds_t&, h5::filter::threads ). The remaining pt_t( const h5::ds_t& ) ctor now resolves the file's FAPL and swaps to pool_pipeline_t when h5::threads{N} is installed. - impl::pt_pipeline_t third alternative (unique_ptr<threaded_pipeline_t>). The variant is now { basic_pipeline_t, pool_pipeline_t } — both unique_ptr-wrapped for move-assignment. - core: #include "H5Zpipeline_threaded.hpp" line removed. Stale "threaded_pipeline_t is defined in ..." comment in H5Zpipeline.hpp also removed. Tests: - test/H5Zpipeline.cpp: dropped the three direct threaded_pipeline_t round-trip cases (no filter / gzip / shuffle+gzip). Pool round-trip is covered in test/H5Pall.cpp via the [#250 1.3.3] cases. - test/H5Dappend.cpp: dropped the four [#241] cases (tag construction, no-filter round-trip, basic-vs-threaded bytewise equivalence, default hw_concurrency worker count). Replaced with a one-line comment pointing to the FAPL coverage in test/H5Pall.cpp. User-visible API change (intentional, since v1.12.4 of #241 was the only release that shipped it): // before (#241): h5::pt_t pt(ds, h5::filter::threads{4}); // now (#250): h5::fd_t fd = h5::create(path, H5F_ACC_TRUNC, h5::threads{4} | h5::backpressure{16}); // ... dataset created on fd inherits the pool ... h5::pt_t pt(ds); // resolves fd's pool automatically Verified locally: 50/50 ctest green. CI will validate across the asan / ubsan / coverage / macos / windows matrix on push. [#250]:svarga:test, Phase 1.5 — TSAN job + [#250 1.5] FAPL pool race coverage CI: add a `tsan / ubuntu-24.04 / clang-20` job mirroring the existing asan / ubsan jobs. It configures clang-20 with `-fsanitize=thread -fno-omit-frame-pointer -O1 -g`, builds the whole tree, and runs the full ctest suite under `TSAN_OPTIONS=halt_on_error=1:second_deadlock_stack=1`. Failing races now break CI loud the same way ASan/UBSan errors do. Badge generator's `needs:` list extended to include the new job so the TSan status SVG renders alongside the others. Tests: four new cases tagged `[#250 1.5]` in test/H5Pall.cpp, sized to exercise the FAPL pool path the way race-prone bugs would surface: - "two FAPLs with independent pools — bytewise equivalence to basic": same payload written via basic pipeline + 4-thread pool + 2-thread pool; all three readbacks must equal the input. Catches any state leakage between pool instances or between the pool and basic paths. - "interleaved writes through two FAPLs do not cross-contaminate": two open fds with independent pools alive at the same time, distinct payloads written sequentially (HDF5 itself isn't thread- safe in default builds, so concurrent H5 C-API calls are out of scope; the pool only parallelizes compression). Chunks routed through one pool must not appear in the other file. - "pt_t destructor drains pool before fd close — data on disk": no explicit h5::flush; lets ~pt_t fire the pool drain. Reads back from a fresh open verify the destructor honored the "data on disk after scope exit" contract. - "fd close after FAPL pool flush releases workers": opens / writes / closes the file four times back-to-back; each iteration drops its own shared_ptr<worker_pool_t> via the FAPL slot's close-cb. Catches use-after-free in the slot lifecycle. Local validation (Ubuntu 22.04 host, g++ 11 with `-fsanitize=thread`): cmake -B build-tsan -DCMAKE_BUILD_TYPE=Debug \ -DCMAKE_CXX_FLAGS="-fsanitize=thread -fno-omit-frame-pointer -O1 -g" \ -DCMAKE_EXE_LINKER_FLAGS="-fsanitize=thread" \ -DH5CPP_BUILD_TESTS=ON cmake --build build-tsan -j TSAN_OPTIONS=halt_on_error=1 ./build-tsan/test-h5pall TSAN_OPTIONS=halt_on_error=1 ./build-tsan/test-h5dappend TSAN_OPTIONS=halt_on_error=1 ./build-tsan/test-h5zpipeline → h5pall 30/30, h5dappend 18/18, h5zpipeline 10/10. No data races. CI clang-20 will validate the same on the matrix. [#250]:svarga:docs, Phase 1.6 — README + pipeline rework report updated for FAPL pool - README v1.12 highlights: replace "Threaded I/O pipeline" bullet with the user-visible FAPL pool API (`h5::create(..., h5::threads{N} | h5::backpressure{M})`) and bump the sanitizer line from "ASan + UBSan" to "ASan + UBSan + TSan" now that the TSan job is wired into CI. - docs/filtering-pipeline-rework-report.md: - Update the Threading row of the gap table to mark Phase I delivered as `pool_pipeline_t` in #250. - Append a "Status — Phase I (#250, FAPL worker pool)" section describing the surface, the mechanism (shared_ptr slot in FAPL, pool_pipeline_t in h5::write / h5::read / pt_t), bounded back-pressure, and the removal of the #241 per-pt_t `h5::filter::threads{N}` constructor. Phase II tracking is deferred — that lives in tasks/h5cpp-fapl-multithreading-workplan.md. [#250]:svarga:fix, Windows MSVC h5pall — distinct paths per iteration of fd close test The Phase 1.5 case "fd close after FAPL pool flush releases workers" re-created the same file four times in a loop. On Windows / HDF5 1.12.2 the C runtime briefly holds the previous handle past the HDF5-level close, so the second `H5Fcreate` raced the deletion of the first file and threw "couldn't create file" at H5Fcreate.hpp:58. Replaces the in-place re-create loop with a list of four distinct paths. The slot-lifecycle behaviour being asserted — FAPL property close-cb drops the pool ref when the last fd dies — still fires once per iteration; only the filesystem reuse is gone. Linux/macOS handle in-place re-creation fine (POSIX delete is atomic), so the test ran green on every other CI matrix entry plus local h5pall (30/30). This is a Windows-only fragility fix, not a correctness change in the FAPL pool itself.

… from type engine

codecov · 2026-05-19T21:36:41Z

Codecov Report

❌ Patch coverage is 45.02165% with 127 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
h5cpp/H5Zpipeline_pool.hpp	0.00%	97 Missing ⚠️
h5cpp/H5Dread.hpp	0.00%	17 Missing ⚠️
h5cpp/H5Dwrite.hpp	47.05%	9 Missing ⚠️
h5cpp/H5Dappend.hpp	85.18%	4 Missing ⚠️

📢 Thoughts on this report? Let us know!

steven-varga added 7 commits May 18, 2026 01:51

Merge remote-tracking branch 'origin/245-fix-v1125-packaging-and-badg…

1663abb

…e-cleanup' into staging

Merge branch '250-feature-fapl-worker-pool' into staging

5f2ed82

[#254]:svarga:refactor, remove compiler_meta_t and field_descriptor_t…

292b21d

… from type engine

steven-varga force-pushed the 254-refactor-remove-compiler-meta branch from c2ad70e to 292b21d Compare May 19, 2026 21:31

This was referenced May 19, 2026

refactor, access_traits_t and is_transport_contiguous for plain aggregates #256

Open

[#256]:svarga:refactor, access_traits_t and is_transport_contiguous for plain aggregates #257

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[#254]:svarga:refactor, remove compiler_meta_t and field_descriptor_t from type engine#255

[#254]:svarga:refactor, remove compiler_meta_t and field_descriptor_t from type engine#255
steven-varga wants to merge 7 commits into
releasefrom
254-refactor-remove-compiler-meta

steven-varga commented May 19, 2026

Uh oh!

codecov Bot commented May 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

steven-varga commented May 19, 2026

Uh oh!

codecov Bot commented May 19, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant