Skip to content

Commit 377be8f

Browse files
committed
fix(joss_paper/plaid_architecture.md): remove HTML line break tags and minor text adjustments
1 parent 10a4c20 commit 377be8f

File tree

1 file changed

+1
-4
lines changed

1 file changed

+1
-4
lines changed

docs/joss_paper/paper.md

Lines changed: 1 addition & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -40,7 +40,7 @@ By promoting a common standard, PLAID makes physics data interoperable across pr
4040

4141
# Functionality
4242

43-
* **Data Model and Formats:** A PLAID dataset is organized within a root folder (or archive), distinctly separating simulation data from machine learning task definitions, as illustrated in \autoref{fig:plaid_dataset_architecture}. The `dataset/` directory contains numbered sample subfolders (`sample_000...`), each holding one or more `.cgns` files under `meshes/` and a `scalars.csv`. The `dataset/infos.yaml` file contains human-readable descriptions and metadata. A `problem_definition/` folder includes `problem_infos.yaml` (specifying the ML task inputs/outputs) and `split.csv` (defining train/test splits). This design supports time evolution and multi-block/multi-geometry problems out of the box.
43+
* **Data Model and Formats:** A PLAID dataset is organized within a root folder (or archive), distinctly separating simulation data from machine learning task definitions, as illustrated in \autoref{fig:plaid_dataset_architecture}. The `dataset/` directory contains numbered sample subfolders (`sample_000...`), each holding one or more `.cgns` files under `meshes/` and a `scalars.csv`. The `dataset/infos.yaml` file contains human-readable descriptions and metadata. The `problem_definition/` folder provides machine learning context. It includes `problem_infos.yaml` (specifying the ML task inputs/outputs) and `split.csv` (defining train/test splits). This design supports time evolution and multi-block/multi-geometry problems out of the box.
4444

4545
![Overview of the PLAID dataset architecture.\label{fig:plaid_dataset_architecture}](plaid_architecture.png){ width=80% }
4646

@@ -50,9 +50,6 @@ By promoting a common standard, PLAID makes physics data interoperable across pr
5050

5151
* **Utilities:** PLAID includes helper modules for common tasks in data science workflows. The `plaid.utils.split` module provides a `split_dataset` function to partition data into training/validation/testing subsets according to user-defined ratios. The `plaid.utils.interpolation` module implements piecewise linear interpolation routines (and fast vectorized search) to resample time series fields or align datasets with different timesteps. The `plaid.utils.stats` module offers an `OnlineStatistics` class to compute running statistics (min, mean, variance, etc.) on arrays, which can be used to analyze dataset distributions. Moreover, a “Hugging Face bridge” (`plaid.bridges.huggingface_bridge`) enables converting PLAID datasets to/from Hugging Face Dataset objects.
5252

53-
<br>
54-
<br>
55-
5653
# Usage and Applications
5754

5855
PLAID is designed for AI/ML researchers and practitioners working with simulation data. Various datasets, including 2D/3D fluid and structural simulations, are provided in PLAID format in [Hugging Face](https://huggingface.co/PLAID-datasets) and [Zenodo](https://zenodo.org/communities/plaid_datasets). Interactive benchmarks are hosted in a [Hugging Face community](https://huggingface.co/PLAIDcompetitions) on these datasets, providing detailed instructions and PLAID commands for data retrieval and manipulation (see [@casenave2025physics]). These datasets are also used in recent publications to illustrate the performance of the proposed scientific ML methods. In [@casenave2024mmgp; @kabalan2025elasticity; @kabalan2025ommgp], Gaussian-process regression methods with mesh morphing are applied to these datasets. In [@perez2024gaussian; @perez2024learning] the datasets are leveraged in graph-kernel regression methods applied to fluid/solid mechanics.

0 commit comments

Comments
 (0)