Commit ff52a8d

Random-access for variable-based encoding (i.e. use SetStepSelection for ADIOS2 steps) (#1706)
* Add preparseSnapshots
* Serial ADIOS2 implementation for readAttributeAllsteps
* Feature-complete, untested
* Fixes
* wip: parallel impl. for readAttributeAllsteps
  huh this works??
* Simplify, cleanup
* wip testing
* CI fixes
* Fix test
* Add MPI_CHECK
* Stricter warning about group-based encoding in BP5
* variableBased as default in ADIOS2
* Add error check for non-homogeneous datasets
* Reset steps before reading the rankTable
* Use writeIterations() in MPI Benchmark
* Use group tables by default
* Add iterate_nonstreaming_series test to parallel tests
  This covers the use case that the snapshot attribute has more than just one entry.
* Remove std::cout calls
* wip: use variable-based encoding in tests
* CI fixes
* Fix Windows testing
* Cleanup
* Warn unimplemented modifiable attributes for now
* Fix test???
* Add this to variableBasedSeries test
* Remove debugging line
* Use own flag for the warning
  Previously implementation did not work: ornladios/ADIOS2#4466
* Documentation
* Skip inhomogeneous datasets instead of outright failing
* add test for default iteration encoding
* Move test to its own file
* Don't distinguish ADIOS2 v2.9 any more
* Some more documentation
1 parent a4d66cc commit ff52a8d

32 files changed, +2045 -379 lines changed

CMakeLists.txt

Lines changed: 9 additions & 0 deletions
@@ -787,6 +787,15 @@ if(openPMD_BUILD_TESTING)
                 test/Files_SerialIO/close_and_reopen_test.cpp
                 test/Files_SerialIO/filebased_write_test.cpp
             )
+        elseif(${test_name} STREQUAL "ParallelIO" AND openPMD_HAVE_MPI)
+            list(APPEND ${out_list}
+                test/Files_ParallelIO/read_variablebased_randomaccess.cpp
+                test/Files_ParallelIO/iterate_nonstreaming_series.cpp
+            )
+        elseif(${test_name} STREQUAL "Core")
+            list(APPEND ${out_list}
+                test/Files_Core/automatic_variable_encoding.cpp
+            )
         endif()
     endmacro()

docs/source/usage/concepts.rst

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -48,7 +48,7 @@ openPMD-api implements various file-formats (backends) and encoding strategies f
4848
**Iteration encoding:** The openPMD-api can encode iterations in different ways.
4949
The method ``Series::setIterationEncoding()`` (C++) or ``Series.set_iteration_encoding()`` (Python) may be used in writing for selecting one of the following encodings explicitly:
5050

51-
* **group-based iteration encoding:** This encoding is the default.
51+
* **group-based iteration encoding:** This encoding is the default for HDF5 and JSON. In ADIOS2, variable-based encoding is preferred when possible due to better performance characteristics, see below.
5252
It creates a separate group in the hierarchy of the openPMD standard for each iteration.
5353
As an example, all data pertaining to iteration 0 may be found in group ``/data/0``, for iteration 100 in ``/data/100``.
5454
* **file-based iteration encoding:** A unique file on the filesystem is created for each iteration.
@@ -57,6 +57,7 @@ The method ``Series::setIterationEncoding()`` (C++) or ``Series.set_iteration_en
5757
A padding may be specified by ``"series_%06T.json"`` to create files ``series_000000.json``, ``series_000100.json`` and ``series_000200.json``.
5858
The inner group layout of each file is identical to that of the group-based encoding.
5959
* **variable-based iteration encoding:** This experimental encoding uses a feature of some backends (i.e., ADIOS2) to maintain datasets and attributes in several versions (i.e., iterations are stored inside *variables*).
60+
When creating an ADIOS2 Series with steps (e.g. via ``series.writeIterations()`` / ``series.write_iterations()``), this encoding will be picked as a default instead of group-based encoding due to bad performance characteristics of group-based encoding in ADIOS2.
6061
No iteration-specific groups are created and the corresponding layer is dropped from the openPMD hierarchy.
6162
In backends that do not support this feature, a series created with this encoding can only contain one iteration.
6263

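The padding rule quoted in the file-based encoding paragraph can be illustrated with a small standalone sketch. The helper `expandIterationPattern` below is hypothetical and is not part of openPMD-api; it only mimics how a pattern such as ``series_%06T.json`` would map an iteration index to a file name, assuming the padding spec consists of decimal digits only.

```cpp
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (NOT openPMD-api code): expands "%T" or "%0<N>T" in a
// file name pattern to the iteration index, zero-padded to N digits.
inline std::string
expandIterationPattern(std::string const &pattern, std::uint64_t iteration)
{
    auto const start = pattern.find('%');
    if (start == std::string::npos)
    {
        return pattern; // no expansion marker present
    }
    auto const end = pattern.find('T', start);
    if (end == std::string::npos)
    {
        return pattern; // '%' without a closing 'T'
    }

    // Characters between '%' and 'T' form the padding spec, e.g. "06".
    std::string const spec = pattern.substr(start + 1, end - start - 1);
    int width = 0;
    for (char c : spec)
    {
        if (c < '0' || c > '9')
        {
            return pattern; // unknown spec: leave the pattern untouched
        }
        width = width * 10 + (c - '0');
    }

    std::ostringstream oss;
    oss << std::setw(width) << std::setfill('0') << iteration;
    return pattern.substr(0, start) + oss.str() + pattern.substr(end + 1);
}
```

With this sketch, iterations 0, 100 and 200 under the pattern ``series_%06T.json`` yield the three file names listed in the documentation text above.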
docs/source/usage/workflow.rst

Lines changed: 8 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -76,10 +76,14 @@ The openPMD-api distinguishes between a number of different access modes:
7676
3. In streaming backends, random-access is not possible.
7777
When using such a backend, the access mode will be coerced automatically to *linear read mode*.
7878
Use of Series::readIterations() is mandatory for access.
79-
4. Reading a variable-based Series is only fully supported with *linear access mode*.
80-
If using *random-access read mode*, the dataset will be considered to only have one single step.
81-
If the dataset only has one single step, this is guaranteed to work as expected.
82-
Otherwise, it is undefined which step's data is returned.
79+
4. *Random-access read mode* for a variable-based Series is currently experimental.
80+
There is currently only very restricted support for metadata definitions that change across steps:
81+
82+
1. Modifiable attributes (except ``/data/snapshot``) can currently not be read. Attributes such as ``/data/time`` that naturally change their value across Iterations will hence not be really well-usable; the last Iteration's value will currently leak into all other Iterations.
83+
2. There is no support for datasets that do not exist in all Iterations. The internal Iteration layouts should be homogeneous.
84+
If you need this feature, please contact the openPMD-api developers; implementing this is currently not a priority.
85+
Datasets that do not exist in all steps will be skipped at read time (with an error).
86+
3. Datasets with changing extents are supported.
8387

8488
* **Read/Write mode**: Creates a new Series if not existing, otherwise opens an existing Series for reading and writing.
8589
New datasets and iterations will be inserted as needed.

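The homogeneity restriction in item 2 of the new documentation text can be pictured as a set intersection over steps. The following is only a sketch under assumptions: `datasetsPresentInAllSteps` and the `NameSet` alias are hypothetical names, and the actual check in this commit lives in the ADIOS2 backend and is not shown on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

using NameSet = std::set<std::string>;

// Hypothetical helper (NOT openPMD-api code): given the dataset names seen in
// each step, return only those names that occur in every step. Anything
// outside this result would be a candidate for skipping at read time.
inline NameSet datasetsPresentInAllSteps(std::vector<NameSet> const &perStep)
{
    if (perStep.empty())
    {
        return {};
    }
    NameSet common = perStep.front();
    for (std::size_t i = 1; i < perStep.size(); ++i)
    {
        NameSet next;
        // std::set iterates in sorted order, as set_intersection requires.
        std::set_intersection(
            common.begin(),
            common.end(),
            perStep[i].begin(),
            perStep[i].end(),
            std::inserter(next, next.begin()));
        common = std::move(next);
    }
    return common;
}
```

In this toy model, a dataset that is missing from even a single step drops out of the result, which corresponds to the "internal Iteration layouts should be homogeneous" requirement.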
examples/5_write_parallel.py

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -56,8 +56,7 @@
5656
# in streaming setups, e.g. an iteration cannot be opened again once
5757
# it has been closed.
5858
# `Series.iterations` can be directly accessed in random-access workflows.
59-
series.iterations[1].open()
60-
mymesh = series.iterations[1]. \
59+
mymesh = series.write_iterations()[1]. \
6160
meshes["mymesh"]
6261

6362
# example 1D domain decomposition in first index
@@ -92,7 +91,7 @@
9291
# The iteration can be closed in order to help free up resources.
9392
# The iteration's content will be flushed automatically.
9493
# An iteration once closed cannot (yet) be reopened.
95-
series.iterations[1].close()
94+
series.write_iterations()[1].close()
9695

9796
if 0 == comm.rank:
9897
print("Dataset content has been fully written to disk")

include/openPMD/Datatype.hpp

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -513,7 +513,7 @@ inline bool isFloatingPoint(Datatype d)
513513
* @param d Datatype to test
514514
* @return true if complex floating point, otherwise false
515515
*/
516-
inline bool isComplexFloatingPoint(Datatype d)
516+
constexpr inline bool isComplexFloatingPoint(Datatype d)
517517
{
518518
using DT = Datatype;
519519

@@ -554,7 +554,7 @@ inline bool isFloatingPoint()
554554
* @return true if complex floating point, otherwise false
555555
*/
556556
template <typename T>
557-
inline bool isComplexFloatingPoint()
557+
constexpr inline bool isComplexFloatingPoint()
558558
{
559559
Datatype dtype = determineDatatype<T>();
560560

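Marking these predicates `constexpr` allows them to be evaluated at compile time, for example inside `static_assert` or `if constexpr`. The sketch below uses a toy enum and toy functions (all `Toy...`/`toy...` names are hypothetical stand-ins, not the openPMD-api definitions) and assumes C++17. Whether the real template overload is usable in constant expressions also depends on `determineDatatype<T>()`, which this hunk does not show.

```cpp
#include <cassert>
#include <complex>
#include <type_traits>

// Toy stand-in for openPMD's Datatype enum (hypothetical, heavily reduced).
enum class ToyDatatype
{
    FLOAT,
    DOUBLE,
    CFLOAT,
    CDOUBLE,
    UNDEFINED
};

// Toy stand-in for determineDatatype<T>(), assumed to be constexpr here.
template <typename T>
constexpr ToyDatatype toyDetermineDatatype()
{
    if constexpr (std::is_same_v<T, float>)
        return ToyDatatype::FLOAT;
    else if constexpr (std::is_same_v<T, double>)
        return ToyDatatype::DOUBLE;
    else if constexpr (std::is_same_v<T, std::complex<float>>)
        return ToyDatatype::CFLOAT;
    else if constexpr (std::is_same_v<T, std::complex<double>>)
        return ToyDatatype::CDOUBLE;
    else
        return ToyDatatype::UNDEFINED;
}

// Runtime-or-compile-time predicate on the enum value.
constexpr inline bool toyIsComplexFloatingPoint(ToyDatatype d)
{
    switch (d)
    {
    case ToyDatatype::CFLOAT:
    case ToyDatatype::CDOUBLE:
        return true;
    default:
        return false;
    }
}

// Type-based overload, mirroring the shape of the templated function above.
template <typename T>
constexpr inline bool toyIsComplexFloatingPoint()
{
    return toyIsComplexFloatingPoint(toyDetermineDatatype<T>());
}

// Only possible because the predicates are constexpr:
static_assert(toyIsComplexFloatingPoint<std::complex<double>>());
static_assert(!toyIsComplexFloatingPoint<double>());
```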
include/openPMD/IO/ADIOS/ADIOS2Auxiliary.hpp

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -97,6 +97,8 @@ namespace adios_defaults
9797
constexpr const_str str_usesteps = "usesteps";
9898
constexpr const_str str_flushtarget = "preferred_flush_target";
9999
constexpr const_str str_usesstepsAttribute = "__openPMD_internal/useSteps";
100+
constexpr const_str str_useModifiableAttributes =
101+
"__openPMD_internal/useModifiableAttributes";
100102
constexpr const_str str_adios2Schema =
101103
"__openPMD_internal/openPMD2_adios2_schema";
102104
constexpr const_str str_isBoolean = "__is_boolean__";

include/openPMD/IO/ADIOS/ADIOS2File.hpp

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -79,7 +79,8 @@ struct DatasetReader
7979
BufferedGet &bp,
8080
adios2::IO &IO,
8181
adios2::Engine &engine,
82-
std::string const &fileName);
82+
std::string const &fileName,
83+
std::optional<size_t> stepSelection);
8384

8485
static constexpr char const *errorMsg = "ADIOS2: readDataset()";
8586
};
@@ -412,6 +413,8 @@ class ADIOS2File
412413
StreamStatus streamStatus = StreamStatus::OutsideOfStep;
413414

414415
size_t currentStep();
416+
void setStepSelection(std::optional<size_t>);
417+
[[nodiscard]] std::optional<size_t> stepSelection() const;
415418

416419
private:
417420
ADIOS2IOHandlerImpl *m_impl;
@@ -420,8 +423,11 @@ class ADIOS2File
420423
/*
421424
* Not all engines support the CurrentStep() call, so we have to
422425
* implement this manually.
426+
* Note: We don't use a std::optional<size_t> here since the currentStep
427+
* is always being counted.
423428
*/
424429
size_t m_currentStep = 0;
430+
bool useStepSelection = false;
425431

426432
/*
427433
* ADIOS2 does not give direct access to its internal attribute and

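The header only declares `setStepSelection()` and `stepSelection()`; their definitions are not part of this excerpt. One plausible reading of the comment (a step counter that is always maintained, plus a separate `useStepSelection` flag) is sketched below. `ToyStepState` is a hypothetical class, and the behavior on clearing the selection is an assumption, not the commit's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical model (NOT the ADIOS2File implementation): the step counter is
// a plain size_t that always holds a value, while a separate flag records
// whether that value should be applied as an explicit step selection.
class ToyStepState
{
public:
    void setStepSelection(std::optional<std::size_t> selection)
    {
        if (selection.has_value())
        {
            m_currentStep = *selection;
            useStepSelection = true;
        }
        else
        {
            // Assumption: clearing the selection keeps the counter as it is.
            useStepSelection = false;
        }
    }

    [[nodiscard]] std::optional<std::size_t> stepSelection() const
    {
        if (useStepSelection)
        {
            return m_currentStep;
        }
        return std::nullopt;
    }

    std::size_t currentStep() const
    {
        return m_currentStep;
    }

private:
    std::size_t m_currentStep = 0;
    bool useStepSelection = false;
};
```

Callers that receive an engaged optional from `stepSelection()` could then forward it to a reader, which is the role the new `std::optional<size_t> stepSelection` parameter of `DatasetReader` appears to play.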
include/openPMD/IO/ADIOS/ADIOS2IOHandler.hpp

Lines changed: 19 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,7 @@
2323
#include "openPMD/Error.hpp"
2424
#include "openPMD/IO/ADIOS/ADIOS2Auxiliary.hpp"
2525
#include "openPMD/IO/ADIOS/ADIOS2FilePosition.hpp"
26+
#include "openPMD/IO/ADIOS/macros.hpp"
2627
#include "openPMD/IO/AbstractIOHandler.hpp"
2728
#include "openPMD/IO/AbstractIOHandlerImpl.hpp"
2829
#include "openPMD/IO/AbstractIOHandlerImplCommon.hpp"
@@ -190,6 +191,9 @@ class ADIOS2IOHandlerImpl
190191

191192
void readAttribute(Writable *, Parameter<Operation::READ_ATT> &) override;
192193

194+
void readAttributeAllsteps(
195+
Writable *, Parameter<Operation::READ_ATT_ALLSTEPS> &) override;
196+
193197
void listPaths(Writable *, Parameter<Operation::LIST_PATHS> &) override;
194198

195199
void
@@ -431,7 +435,8 @@ class ADIOS2IOHandlerImpl
431435
Offset const &offset,
432436
Extent const &extent,
433437
adios2::IO &IO,
434-
std::string const &varName)
438+
std::string const &varName,
439+
std::optional<size_t> stepSelection)
435440
{
436441
{
437442
auto requiredType = adios2::GetType<T>();
@@ -458,6 +463,10 @@ class ADIOS2IOHandlerImpl
458463
throw std::runtime_error(
459464
"[ADIOS2] Internal error: Failed opening ADIOS2 variable.");
460465
}
466+
if (stepSelection.has_value())
467+
{
468+
var.SetStepSelection({*stepSelection, 1});
469+
}
461470
// TODO leave this check to ADIOS?
462471
adios2::Dims shape = var.Shape();
463472
auto actualDim = shape.size();
@@ -533,11 +542,8 @@ namespace detail
533542
struct AttributeReader
534543
{
535544
template <typename T>
536-
static Datatype call(
537-
ADIOS2IOHandlerImpl &,
538-
adios2::IO &IO,
539-
std::string name,
540-
Attribute::resource &resource);
545+
static Datatype
546+
call(adios2::IO &IO, std::string name, Attribute::resource &resource);
541547

542548
template <int n, typename... Params>
543549
static Datatype call(Params &&...);
@@ -562,7 +568,8 @@ namespace detail
562568
ADIOS2IOHandlerImpl *impl,
563569
InvalidatableFile const &,
564570
std::string const &varName,
565-
Parameter<Operation::OPEN_DATASET> &parameters);
571+
Parameter<Operation::OPEN_DATASET> &parameters,
572+
std::optional<size_t> stepSelection);
566573

567574
static constexpr char const *errorMsg = "ADIOS2: openDataset()";
568575
};
@@ -854,6 +861,11 @@ class ADIOS2IOHandler : public AbstractIOHandler
854861
return "ADIOS2";
855862
}
856863

864+
bool fullSupportForVariableBasedEncoding() const override
865+
{
866+
return true;
867+
}
868+
857869
std::future<void> flush(internal::ParsedFlushParams &) override;
858870
}; // ADIOS2IOHandler
859871
} // namespace openPMD

include/openPMD/IO/AbstractIOHandler.hpp

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -250,6 +250,7 @@ class AbstractIOHandler
250250

251251
/** The currently used backend */
252252
virtual std::string backendName() const = 0;
253+
virtual bool fullSupportForVariableBasedEncoding() const;
253254

254255
std::string directory;
255256
/*

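`fullSupportForVariableBasedEncoding()` is declared here as a non-pure virtual function, so the base class must define a default somewhere outside this header. The sketch below shows the pattern with toy classes; that the default returns `false` is an assumption, since the definition is not part of this excerpt, and every `Toy...` name is hypothetical.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for AbstractIOHandler (hypothetical, heavily reduced).
struct ToyIOHandler
{
    virtual ~ToyIOHandler() = default;

    virtual std::string backendName() const = 0;

    // ASSUMPTION: backends do not fully support variable-based encoding
    // unless they say so explicitly. The real default is not shown here.
    virtual bool fullSupportForVariableBasedEncoding() const
    {
        return false;
    }
};

// Mirrors the ADIOS2IOHandler override shown in this commit.
struct ToyADIOS2Handler : ToyIOHandler
{
    std::string backendName() const override
    {
        return "ADIOS2";
    }

    bool fullSupportForVariableBasedEncoding() const override
    {
        return true;
    }
};

// A backend that simply inherits the assumed default.
struct ToyJSONHandler : ToyIOHandler
{
    std::string backendName() const override
    {
        return "JSON";
    }
};
```

Frontend code can then query the capability through a base-class reference without knowing which backend is active.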
include/openPMD/IO/AbstractIOHandlerImpl.hpp

Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -72,6 +72,13 @@ class AbstractIOHandlerImpl
7272
* IO actions up to the point of closing a step must be performed now.
7373
*
7474
* The advance mode is determined by parameters.mode.
75+
* parameters.mode has type std::variant<AdvanceMode, StepSelection>:
76+
*
77+
* 1. AdvanceMode is for processing steps sequentially. In this case, a step
78+
* is either begun or closed.
79+
* 2. StepSelection is for random-accessing steps. A target step number is
80+
* specified.
81+
*
7582
* The return status code shall be stored as parameters.status.
7683
*/
7784
virtual void advance(Writable *, Parameter<Operation::ADVANCE> &parameters)
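The `std::variant<AdvanceMode, StepSelection>` described in the comment above lends itself to dispatch via `std::visit`. The following is a minimal sketch with toy types: `ToyAdvanceMode`, `ToyStepSelection` and `describeAdvance` are hypothetical, and the real `StepSelection` type may carry different members than the single `step` field assumed here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <variant>

// Toy stand-ins for the two alternatives named in the comment (hypothetical).
enum class ToyAdvanceMode
{
    BeginStep,
    EndStep
};

struct ToyStepSelection
{
    std::size_t step; // assumed: the target step number
};

using ToyMode = std::variant<ToyAdvanceMode, ToyStepSelection>;

// Dispatches on the active alternative: sequential processing either begins
// or closes a step, random access jumps to a given step number.
inline std::string describeAdvance(ToyMode const &mode)
{
    return std::visit(
        [](auto const &m) -> std::string {
            using M = std::decay_t<decltype(m)>;
            if constexpr (std::is_same_v<M, ToyAdvanceMode>)
            {
                return m == ToyAdvanceMode::BeginStep ? "begin next step"
                                                      : "close current step";
            }
            else
            {
                return "jump to step " + std::to_string(m.step);
            }
        },
        mode);
}
```

A backend's `advance()` could branch in the same way, handling the sequential and the random-access case in one entry point.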
@@ -360,6 +367,24 @@ class AbstractIOHandlerImpl
      */
     virtual void
     readAttribute(Writable *, Parameter<Operation::READ_ATT> &) = 0;
+    /** Collective task to read modifiable attributes over steps.
+     *
+     * Has a default implementation for backends that do not support steps;
+     * here, the task is relayed to normal READ_ATT.
+     * This task is key for implementing the preparsing logic needed in
+     * random-access read mode for variable-encoded ADIOS2 files.
+     * adios2::Mode::ReadRandomAccess does not support modifiable attributes,
+     * so this task will instead quickly open the file's metadata in
+     * adios2::Mode::Read, go through all its steps and register the attribute
+     * values. Expensive and collective operation, run only once at startup.
+     * Absolutely necessary for reading /data/snapshot.
+     * Necessary (but not yet used) for having correct values in attributes
+     * such as /data/time.
+     * In future: Let this task preparse the entirety of all modifiable
+     * attributes.
+     */
+    virtual void readAttributeAllsteps(
+        Writable *, Parameter<Operation::READ_ATT_ALLSTEPS> &);
     /** List all paths/sub-groups inside a group, non-recursively.
      *
      * The operation should fail if the Writable was not marked written.

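The doc comment for `readAttributeAllsteps` describes two behaviors: a default that relays to the ordinary attribute read, and a step-aware variant that walks all steps and records the attribute value in each. The toy model below sketches that split. All names (`ToyBackend`, `ToySteplessBackend`, `ToySteppedBackend`) are hypothetical, the attribute type is fixed to `std::uint64_t` for brevity, and nothing here reflects the actual ADIOS2 calls or the MPI-collective aspects of the real task.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using ToyAttributes = std::map<std::string, std::uint64_t>;

struct ToyBackend
{
    virtual ~ToyBackend() = default;

    virtual std::uint64_t readAttribute(std::string const &name) = 0;

    // Default for backends without steps: relay to the normal read and
    // report a single value.
    virtual std::vector<std::uint64_t>
    readAttributeAllsteps(std::string const &name)
    {
        return {readAttribute(name)};
    }
};

struct ToySteplessBackend : ToyBackend
{
    ToyAttributes attributes;

    std::uint64_t readAttribute(std::string const &name) override
    {
        return attributes.at(name);
    }
};

struct ToySteppedBackend : ToyBackend
{
    std::vector<ToyAttributes> steps; // one attribute table per step

    // Models the limitation noted in the documentation: a plain read only
    // sees the last step's value.
    std::uint64_t readAttribute(std::string const &name) override
    {
        return steps.back().at(name);
    }

    // Step-aware override: visit every step once and collect the values.
    std::vector<std::uint64_t>
    readAttributeAllsteps(std::string const &name) override
    {
        std::vector<std::uint64_t> result;
        for (auto const &step : steps)
        {
            auto it = step.find(name);
            if (it != step.end())
            {
                result.push_back(it->second);
            }
        }
        return result;
    }
};
```

In this model, reading ``/data/snapshot`` across all steps yields one entry per step, which is the kind of information a preparse pass would need to map iterations to steps.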