【Hackathon 8th No.13】DoMINO paper reproduction #1093

Open · wants to merge 11 commits into develop

1 change: 1 addition & 0 deletions README.md
@@ -98,6 +98,7 @@ PaddleScience is a scientific computing suite developed on the deep learning framework PaddlePaddle
| Thermal simulation | [1D heat exchanger thermal simulation](https://paddlescience-docs.readthedocs.io/zh-cn/latest/zh/examples/heat_exchanger) | Physics-driven | PI-DeepONet | Unsupervised learning | - | - |
| Thermal simulation | [2D thermal simulation](https://paddlescience-docs.readthedocs.io/zh-cn/latest/zh/examples/heat_pinn) | Physics-driven | PINN | Unsupervised learning | - | [Paper](https://arxiv.org/abs/1711.10561)|
| Thermal simulation | [2D chip thermal simulation](https://paddlescience-docs.readthedocs.io/zh-cn/latest/zh/examples/chip_heat) | Physics-driven | PI-DeepONet | Unsupervised learning | - | [Paper](https://doi.org/10.1063/5.0194245)|
| External aerodynamics | [DoMINO](https://paddlescience-docs.readthedocs.io/zh-cn/latest/zh/examples/domino) | Data-driven | DoMINO | Supervised learning | [Data](https://caemldatasets.org/drivaerml/) | [Paper](https://arxiv.org/abs/2501.13350)|

<br>
<p align="center"><b>Materials science (AI for Material)</b></p>
1 change: 1 addition & 0 deletions docs/index.md
@@ -133,6 +133,7 @@
| Thermal simulation | [1D heat exchanger thermal simulation](./zh/examples/heat_exchanger.md) | Physics-driven | PI-DeepONet | Unsupervised learning | - | - |
| Thermal simulation | [2D thermal simulation](./zh/examples/heat_pinn.md) | Physics-driven | PINN | Unsupervised learning | - | [Paper](https://arxiv.org/abs/1711.10561)|
| Thermal simulation | [2D chip thermal simulation](./zh/examples/chip_heat.md) | Physics-driven | PI-DeepONet | Unsupervised learning | - | [Paper](https://doi.org/10.1063/5.0194245)|
| External aerodynamics | [DoMINO](./zh/examples/domino.md) | Data-driven | DoMINO | Supervised learning | [Data](https://caemldatasets.org/drivaerml/) | [Paper](https://arxiv.org/abs/2501.13350)|

<br>
<p align="center"><b>Materials science (AI for Material)</b></p>
89 changes: 89 additions & 0 deletions docs/zh/examples/domino.md
@@ -0,0 +1,89 @@
# DoMINO

=== "模型训练命令"

``` sh
cd examples/domino

# 1. Download the DrivAerML dataset using the provided download_aws_dataset.sh script or from the Hugging Face repo (https://huggingface.co/datasets/neashton/drivaerml).
sh download_aws_dataset.sh

# 2. Specify the configuration settings in `examples/domino/conf/config.yaml`.

# 3. Run process_data.py to process the VTP/VTU files and save them as .npy
#    for faster loading in the DoMINO datapipe (set the data_processor keys in
#    the config file first). The datapipe computes the Signed Distance Field
#    (SDF) and nearest-neighbor interpolations on the fly during training;
#    cache_data.py saves these outputs as .npy files in a preprocessing step,
#    which should be used when the STL surface meshes exceed roughly 30 million
#    cells. Split the final processed dataset into two directories, one for
#    training and one for validation, and specify them in conf/config.yaml.
python3 process_data.py
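
# (Optional) Run cache_data.py to save the DoMINO datapipe outputs (SDF and
# nearest-neighbor interpolations) as .npy files instead of computing them
# on the fly; recommended when the surface meshes exceed ~30 million cells.
python3 cache_data.py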

# 4. Run training.
python3 train.py
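
# The settings are managed by Hydra (see the hydra section of conf/config.yaml),
# so they can also be overridden on the command line. Illustrative example,
# assuming train.py loads the config through Hydra:
# python3 train.py train.epochs=100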
```

=== "模型评估命令"

暂无

=== "模型导出命令"

暂无

=== "模型推理命令"

``` sh
cd examples/domino
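
# Before running, set eval.test_path, eval.save_path, and eval.checkpoint_name
# in conf/config.yaml (see the eval section of the config).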
python3 test.py
```

## 1. Background

External aerodynamics involves solving the Navier-Stokes equations at high Reynolds numbers, where traditional CFD methods are computationally expensive. Neural operators improve efficiency through end-to-end mappings, but they struggle to model multi-scale coupling and to remain stable over long prediction horizons. DoMINO (Decomposable Multi-scale Iterative Neural Operator) proposes a decomposable multi-scale architecture that decouples features hierarchically, applies iterative residual correction, and encodes parameters independently, markedly improving the accuracy and generalization of cross-scale flow modeling. Experiments show it is 2-3 orders of magnitude faster than CFD and improves separated-flow prediction accuracy by roughly 40% over models such as FNO, providing an efficient solution for engineering problems such as aircraft design.

## 2. Model principles

Contributor: The documentation can be revised with reference to https://paddlescience-docs.readthedocs.io/zh-cn/latest/zh/examples/drivaernetplusplus/; the main conclusions of the paper need to be reflected.

Contributor Author: Done.

DoMINO (Decomposable Multi-scale Iterative Neural Operator) is a novel machine-learning architecture designed to address the challenges of surrogate modeling for large-scale engineering simulations. It is a point-cloud-based model that uses local geometric information to predict flow fields at discrete points.

The main principles of the DoMINO model are as follows (a minimal sketch of the aggregation step follows the list):

- Global geometry representation:
    - The model first takes the 3D surface mesh of the geometry as input.
    - A tightly fitting surface bounding box and a bounding box representing the computational domain are constructed around the geometry.
    - Features of the geometry point cloud (such as spatial coordinates) are projected onto an N-dimensional structured grid (of resolution $m \times m \times m \times f$) over the surface bounding box using learnable point-convolution kernels.
    - The point-convolution kernels are implemented with a custom ball-query layer accelerated by NVIDIA Warp.
    - Geometry features are propagated into the computational-domain bounding box in two ways: (1) a separate set of multi-scale point-convolution kernels projects the geometry information onto the computational-domain grid; (2) a CNN block with convolution, pooling, and unpooling layers propagates the features $G_s$ on the surface bounding-box grid to the computational-domain bounding-box grid $G_c$. The CNN block is evaluated iteratively.
    - The $m \times m \times m \times f$ features computed on the computational-domain grid represent a global encoding of the geometry point cloud. In addition, the signed distance field (SDF) and its gradient components are computed and appended to the learned features to provide extra information about the geometry's topology.

- Local geometry representation:
    - The local geometry representation depends on the physical location in the computational domain at which the solution field is evaluated.
    - Before computing the local geometry representation, a batch of discrete points is sampled in the computational domain.
    - For each sampled point in the batch, a subregion of size $l \times l \times l$ is defined around it and a local geometry encoding is computed.
    - The local encoding is essentially a subset of the global encoding, dependent on the point's position in the computational domain, and is computed via point convolution.
    - The extracted local features are further transformed by a fully connected neural network.
    - This local geometry representation is used to evaluate the solution field at the sampled points with an aggregation network.

- Aggregation network:
    - The local geometry representation captures learned features of the geometry and solution in the vicinity of a sampled point and the neighbors in its computational stencil.
    - Each point in the computational stencil is represented by its physical coordinates in the computational domain, the SDF at those coordinates, the normal vector from the domain centroid, and the surface normal (if the point lies on the surface).
    - These input features are passed through a fully connected neural network (the basis-function network) to compute a latent vector representing each point in the stencil.
    - Each latent vector is concatenated with the local geometry encoding and passed through another set of fully connected layers to predict the solution vector at each stencil point.
    - The solution vectors are averaged with an inverse-distance weighting scheme to predict the final solution vector at the sampled point.
    - A separate instance of the aggregation network is used for each solution variable, while the global geometry-encoding network is shared among them.

Through this decomposable, multi-scale, iterative approach, DoMINO handles large-scale simulation data effectively, captures both long-range and short-range interactions, and provides scalable, accurate, and generalizable surrogate models without sacrificing accuracy.
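
To make the aggregation step concrete, here is a minimal PaddlePaddle sketch of how per-stencil-point latent vectors from a basis-function network can be concatenated with a local geometry encoding and combined by inverse-distance weighting. The class name `AggregationSketch`, the layer sizes, and the tensor shapes are illustrative assumptions for this document, not the code in `examples/domino`.

``` py
import paddle
import paddle.nn as nn


class AggregationSketch(nn.Layer):
    """Hypothetical sketch of DoMINO's aggregation step, not the repository code."""

    def __init__(self, point_feat_dim=8, local_enc_dim=512, hidden=512, out_dim=1):
        super().__init__()
        # Basis-function network: per-stencil-point features -> latent vector.
        self.basis_fn = nn.Sequential(
            nn.Linear(point_feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden)
        )
        # Head: latent vector + local geometry encoding -> solution at each point.
        self.head = nn.Sequential(
            nn.Linear(hidden + local_enc_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim)
        )

    def forward(self, stencil_feats, stencil_dists, local_enc):
        # stencil_feats: [B, K, point_feat_dim] coordinates, SDF, normals, ...
        # stencil_dists: [B, K] distance of each stencil point to the sampled point
        # local_enc:     [B, local_enc_dim] local geometry encoding
        latent = self.basis_fn(stencil_feats)                     # [B, K, hidden]
        local = local_enc.unsqueeze(1).expand([-1, latent.shape[1], -1])
        sol = self.head(paddle.concat([latent, local], axis=-1))  # [B, K, out_dim]
        # Inverse-distance weighting over the K stencil points.
        w = 1.0 / (stencil_dists + 1e-6)
        w = w / w.sum(axis=1, keepdim=True)
        return (sol * w.unsqueeze(-1)).sum(axis=1)                # [B, out_dim]


# Example with B=4 sampled points and K=7 stencil neighbors.
net = AggregationSketch()
out = net(
    paddle.randn([4, 7, 8]),    # per-point input features
    paddle.rand([4, 7]) + 0.1,  # distances, kept positive
    paddle.randn([4, 512]),     # local geometry encodings
)
print(out.shape)  # [4, 1]
```

In the full model, a separate instance of such a network is used per solution variable, with the global geometry encoder shared across them, as described above.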

## 3. Full code

``` py linenums="1" title="examples/domino/train.py"
--8<--
examples/domino/train.py
--8<--
```

``` py linenums="1" title="examples/domino/test.py"
--8<--
examples/domino/test.py
--8<--
```

## 4. Results

## 5. References

- [DoMINO: A Decomposable Multi-scale Iterative Neural Operator for Modeling Large Scale Engineering Simulations](https://arxiv.org/abs/2501.13350)
124 changes: 124 additions & 0 deletions examples/domino/conf/config.yaml
@@ -0,0 +1,124 @@
# SPDX-FileCopyrightText: Copyright (c) 2023 - 2024 NVIDIA CORPORATION & AFFILIATES.
Contributor: The copyright needs to be changed to PaddlePaddle, and a statement added that this is a reproduction of the original paper.

Contributor Author: Done.

# SPDX-FileCopyrightText: All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

project: # Project name
  name: AWS_Dataset

seed: 42
exp_tag: 1 # Experiment tag
# Main output directory.
output: outputs/${project.name}/${exp_tag}

hydra: # Hydra config
  run:
    dir: ${output}
  output_subdir: hydra # Default is .hydra, which prevents files from being uploaded to W&B.

data: # Input directory for training and validation data
  input_dir: outputs/volume_data/
  input_dir_val: outputs/volume_data/
  bounding_box: # Bounding box dimensions for computational domain
    min: [-3.5, -2.25, -0.32]
    max: [8.5, 2.25, 3.00]
  bounding_box_surface: # Bounding box dimensions for car surface
    min: [-1.1, -1.2, -0.32]
    max: [4.5, 1.2, 1.2]

# The directory to search for checkpoints to continue training.
resume_dir: ${output}/models

variables:
  surface:
    solution:
      # The following is for the AWS DrivAer dataset.
      pMeanTrim: scalar
      wallShearStressMeanTrim: vector
  volume:
    solution:
      # The following is for the AWS DrivAer dataset.
      UMeanTrim: vector
      pMeanTrim: scalar
      nutMeanTrim: scalar

model:
  model_type: combined # which model to train: surface, volume, or combined
  loss_function: "mse" # mse or rmse
  interp_res: [64, 32, 24] # resolution of latent space
  use_sdf_in_basis_func: true # use SDF in the basis-function network
  positional_encoding: false # calculate positional encoding?
  volume_points_sample: 1024 # number of points to sample in the volume per epoch
  surface_points_sample: 1024 # number of points to sample on the surface per epoch
  geom_points_sample: 2_000 # number of points to sample on the STL per epoch
  surface_neighbors: true # pre-compute surface neighborhood from input data
  num_surface_neighbors: 7 # how many neighbors?
  use_surface_normals: true # use surface normals and surface areas for surface computation?
  use_only_normals: true # use only surface normals and not surface area
  integral_loss_scaling_factor: 0 # scale integral loss by this factor
  normalization: min_max_scaling # or mean_std_scaling
  encode_parameters: true # encode inlet velocity and air density in the model
  geometry_rep: # Hyperparameters for geometry representation network
    base_filters: 16
    geo_conv:
      base_neurons: 32 # 256 or 64
      base_neurons_out: 1
      radius_short: 0.1
      radius_long: 0.5 # 1.0, 1.5
      hops: 1
    geo_processor:
      base_filters: 8
    geo_processor_sdf:
      base_filters: 8
  nn_basis_functions: # Hyperparameters for basis function network
    base_layer: 512
  aggregation_model: # Hyperparameters for aggregation network
    base_layer: 512
  position_encoder: # Hyperparameters for position encoding network
    base_neurons: 512
  geometry_local: # Hyperparameters for local geometry extraction
    neighbors_in_radius: 64
    radius: 0.05 # 0.2 in expt 7
    base_layer: 512
  parameter_model:
    base_layer: 512
    scaling_params: [30.0, 1.226] # [inlet_velocity, air_density]

train: # Training configurable parameters
  epochs: 50
  checkpoint_interval: 1
  dataloader:
    batch_size: 1
  sampler:
    shuffle: true
    drop_last: false
  checkpoint_dir: outputs/AWS_Dataset/3/models/

val: # Validation configurable parameters
  dataloader:
    batch_size: 1
  sampler:
    shuffle: true
    drop_last: false

eval: # Testing configurable parameters
  test_path: drivaer_data_full
  save_path: outputs/mesh_predictions_surf_final1/
  checkpoint_name: outputs/AWS_Dataset/1/models/DoMINO.0.30.pdparams

data_processor: # Data processor configurable parameters
  kind: drivaer_aws # must be either drivesim or drivaer_aws
  output_dir: data/volume_data/
  input_dir: drivaer_aws/drivaer_data_full/
  num_processors: 12
64 changes: 64 additions & 0 deletions examples/domino/download_aws_dataset.sh
@@ -0,0 +1,64 @@
#!/bin/bash

# This Bash script downloads the AWS DrivAer files from the Amazon S3 bucket to a local directory.
# Only the volume files (.vtu), STL files (.stl), and VTP files (.vtp) are downloaded.
# It uses a function, download_run_files, to check for the existence of three specific files (".vtu", ".stl", ".vtp") in a run directory.
# If a file doesn't exist, it's downloaded from the S3 bucket. If it does exist, the download is skipped.
# The script runs multiple downloads in parallel, both within a single run and across multiple runs.
# It also includes checks to prevent overloading the system by limiting the number of parallel downloads.
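#
# Usage (assumes the AWS CLI is installed and on PATH):
#   sh download_aws_dataset.sh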

# Set the local directory to download the files
LOCAL_DIR="./drivaer_data_full" # <--- This is the directory where the files will be downloaded.

# Set the S3 bucket and prefix
S3_BUCKET="caemldatasets"
S3_PREFIX="drivaer/dataset"

# Create the local directory if it doesn't exist
mkdir -p "$LOCAL_DIR"

# Function to download files for a specific run
download_run_files() {
    local i=$1
    RUN_DIR="run_$i"
    RUN_LOCAL_DIR="$LOCAL_DIR/$RUN_DIR"

    # Create the run directory if it doesn't exist
    mkdir -p "$RUN_LOCAL_DIR"

    # Check if the .vtu file exists before downloading
    if [ ! -f "$RUN_LOCAL_DIR/volume_$i.vtu" ]; then
        aws s3 cp --no-sign-request "s3://$S3_BUCKET/$S3_PREFIX/$RUN_DIR/volume_$i.vtu" "$RUN_LOCAL_DIR/" &
    else
        echo "File volume_$i.vtu already exists, skipping download."
    fi

    # Check if the .stl file exists before downloading
    if [ ! -f "$RUN_LOCAL_DIR/drivaer_$i.stl" ]; then
        aws s3 cp --no-sign-request "s3://$S3_BUCKET/$S3_PREFIX/$RUN_DIR/drivaer_$i.stl" "$RUN_LOCAL_DIR/" &
    else
        echo "File drivaer_$i.stl already exists, skipping download."
    fi

    # Check if the .vtp file exists before downloading
    if [ ! -f "$RUN_LOCAL_DIR/boundary_$i.vtp" ]; then
        aws s3 cp --no-sign-request "s3://$S3_BUCKET/$S3_PREFIX/$RUN_DIR/boundary_$i.vtp" "$RUN_LOCAL_DIR/" &
    else
        echo "File boundary_$i.vtp already exists, skipping download."
    fi

    wait # Ensure all three files for this run are downloaded before moving to the next run
}

# Loop through the run folders and download the files
for i in $(seq 1 500); do
    download_run_files "$i" &

    # Limit the number of parallel jobs to avoid overloading the system
    if (( $(jobs -r | wc -l) >= 8 )); then
        wait -n # Wait for at least one background job to finish before starting a new one
    fi
done

# Wait for all remaining background jobs to finish
wait