fix(joss_paper/plaid_architecture.md): remove HTML line break tags and minor text adjustments

TopAgrume · TopAgrume · commit 377be8fbd90b · 2025-06-07T21:56:05.000+02:00
diff --git a/docs/joss_paper/paper.md b/docs/joss_paper/paper.md
@@ -40,7 +40,7 @@ By promoting a common standard, PLAID makes physics data interoperable across pr
 
 # Functionality
 
-* **Data Model and Formats:** A PLAID dataset is organized within a root folder (or archive), distinctly separating simulation data from machine learning task definitions, as illustrated in \autoref{fig:plaid_dataset_architecture}. The `dataset/` directory contains numbered sample subfolders (`sample_000...`), each holding one or more `.cgns` files under `meshes/` and a `scalars.csv`. The `dataset/infos.yaml` file contains human-readable descriptions and metadata.  A `problem_definition/` folder includes `problem_infos.yaml` (specifying the ML task inputs/outputs) and `split.csv` (defining train/test splits).  This design supports time evolution and multi-block/multi-geometry problems out of the box.
+* **Data Model and Formats:** A PLAID dataset is organized within a root folder (or archive), distinctly separating simulation data from machine learning task definitions, as illustrated in \autoref{fig:plaid_dataset_architecture}. The `dataset/` directory contains numbered sample subfolders (`sample_000...`), each holding one or more `.cgns` files under `meshes/` and a `scalars.csv`. The `dataset/infos.yaml` file contains human-readable descriptions and metadata.  The `problem_definition/` folder provides machine learning context. It includes `problem_infos.yaml` (specifying the ML task inputs/outputs) and `split.csv` (defining train/test splits).  This design supports time evolution and multi-block/multi-geometry problems out of the box.
 
 ![Overview of the PLAID dataset architecture.\label{fig:plaid_dataset_architecture}](plaid_architecture.png){ width=80% }
 
@@ -50,9 +50,6 @@ By promoting a common standard, PLAID makes physics data interoperable across pr
 
 * **Utilities:** PLAID includes helper modules for common tasks in data science workflows. The `plaid.utils.split` module provides a `split_dataset` function to partition data into training/validation/testing subsets according to user-defined ratios. The `plaid.utils.interpolation` module implements piecewise linear interpolation routines (and fast vectorized search) to resample time series fields or align datasets with different timesteps. The `plaid.utils.stats` module offers an `OnlineStatistics` class to compute running statistics (min, mean, variance, etc.) on arrays, which can be used to analyze dataset distributions. Moreover, a “Hugging Face bridge” (`plaid.bridges.huggingface_bridge`) enables converting PLAID datasets to/from Hugging Face Dataset objects.
 
-<br>
-<br>
-
 # Usage and Applications
 
 PLAID is designed for AI/ML researchers and practitioners working with simulation data. Various datasets, including 2D/3D fluid and structural simulations, are provided in PLAID format in [Hugging Face](https://huggingface.co/PLAID-datasets) and [Zenodo](https://zenodo.org/communities/plaid_datasets). Interactive benchmarks are hosted in a [Hugging Face community](https://huggingface.co/PLAIDcompetitions) on these datasets, providing detailed instructions and PLAID commands for data retrieval and manipulation (see [@casenave2025physics]). These datasets are also used in recent publications to illustrate the performance of the proposed scientific ML methods. In [@casenave2024mmgp; @kabalan2025elasticity; @kabalan2025ommgp],  Gaussian-process regression methods with mesh morphing are applied to these datasets. In [@perez2024gaussian; @perez2024learning] the datasets are leveraged in graph-kernel regression methods applied to fluid/solid mechanics.