Commit 96eb5ba

[Hackathon 8th No.19] Pangu-Weather paper reproduction (#1089)
* Add pangu weather predictor
* resolve reviewer issues
* [WIP] Add convert data script and docs
* update docs
* update docs
* resolve reviewer issues
* fix docs
1 parent 6cc5cd2 commit 96eb5ba

5 files changed: +459 -0 lines changed

docs/zh/examples/pangu_weather.md

Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
# Pangu-Weather

=== "Model training command"

    Not available yet

=== "Model evaluation command"

    Not available yet

=== "Model export command"

    Not available yet

=== "Model inference command"

    ``` sh
    # Download sample input data
    wget -nc https://paddle-org.bj.bcebos.com/paddlescience/models/Pangu/input_surface.npy -P ./data
    wget -nc https://paddle-org.bj.bcebos.com/paddlescience/models/Pangu/input_upper.npy -P ./data

    # Download pretrained model weights
    wget -nc https://paddle-org.bj.bcebos.com/paddlescience/models/Pangu/pangu_weather_1.onnx -P ./inference
    wget -nc https://paddle-org.bj.bcebos.com/paddlescience/models/Pangu/pangu_weather_3.onnx -P ./inference
    wget -nc https://paddle-org.bj.bcebos.com/paddlescience/models/Pangu/pangu_weather_6.onnx -P ./inference
    wget -nc https://paddle-org.bj.bcebos.com/paddlescience/models/Pangu/pangu_weather_24.onnx -P ./inference

    # 1h interval-time model inference
    python predict.py INFER.export_path=inference/pangu_weather_1
    # 3h interval-time model inference
    python predict.py INFER.export_path=inference/pangu_weather_3
    # 6h interval-time model inference
    python predict.py INFER.export_path=inference/pangu_weather_6
    # 24h interval-time model inference
    python predict.py INFER.export_path=inference/pangu_weather_24
    ```

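## 1. Background

Pangu-Weather is the first AI method whose accuracy exceeds that of traditional numerical weather prediction. It provides pretrained models for 1-hour, 3-hour, 6-hour, and 24-hour intervals. Its input data cover 13 pressure levels in the vertical, each with five meteorological variables (temperature, humidity, geopotential, and wind speed in the longitude and latitude directions), plus four surface variables (2 m temperature, 10 m wind speed in the longitude and latitude directions, and mean sea level pressure). For lead times from 1 hour to 7 days, its forecasts are more accurate than the traditional numerical method (the operational IFS of the European Centre for Medium-Range Weather Forecasts).

Pangu-Weather also needs only 1.4 seconds on a single V100 GPU to produce a 24-hour global weather forecast, more than 10,000 times faster than traditional numerical prediction.
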
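The four pretrained intervals can be chained to reach a longer lead time. The Pangu-Weather paper describes a greedy strategy of this kind (hierarchical temporal aggregation): always apply the model with the largest interval that still fits, so that as few model calls as possible are needed. The following is a minimal sketch of that planning step only; it is not part of `predict.py`, and the function name `plan_forecast_steps` is hypothetical.

```python
from typing import List, Sequence


def plan_forecast_steps(
    lead_time_hours: int, intervals: Sequence[int] = (24, 6, 3, 1)
) -> List[int]:
    """Greedily split a lead time into the available model intervals (hours)."""
    if lead_time_hours < 0:
        raise ValueError("lead_time_hours must be non-negative")
    steps: List[int] = []
    remaining = lead_time_hours
    # Try the largest interval first, then fall back to smaller ones.
    for interval in sorted(intervals, reverse=True):
        count, remaining = divmod(remaining, interval)
        steps.extend([interval] * count)
    if remaining:
        raise ValueError(
            f"cannot reach {lead_time_hours} h with intervals {tuple(intervals)}"
        )
    return steps


print(plan_forecast_steps(31))  # [24, 6, 1]
```

Each entry in the returned list corresponds to one run of the matching `pangu_weather_<interval>.onnx` model, with the output of one run used as the input of the next.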
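## 2. Model principle

This section only briefly introduces the principle behind Pangu-Weather. For the detailed theoretical derivation, see [Pangu-Weather: A 3D High-Resolution System for Fast and Accurate Global Weather Forecast](https://arxiv.org/pdf/2211.02556).

The overall structure of the model is shown below:

<figure markdown>
  ![result](https://paddle-org.bj.bcebos.com/paddlescience/docs/pangu-weather/model_architecture.png){ loading=lazy style="margin:0 auto;"}
  <figcaption>Model architecture</figcaption>
</figure>

The main idea is to use a 3D variant of the vision transformer to process complex, non-uniform meteorological variables. Because meteorological data have a very high resolution, the researchers reduced the encoder and decoder of the network to 2 levels (8 blocks) compared with common vision transformer methods, and adopted the sliding-window attention mechanism of the Swin transformer to reduce the computational cost of the network.

The model runs inference with pretrained weights. The inference process is described next.
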
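To see why window attention reduces the computation, it helps to count how many query-key pairs one attention layer has to score. The sketch below is a back-of-the-envelope calculation, not a description of the actual network: the token-grid and window sizes are hypothetical values chosen to keep the arithmetic simple, and `attention_pairs` is an illustrative helper.

```python
from typing import Optional


def attention_pairs(num_tokens: int, window_tokens: Optional[int] = None) -> int:
    """Count the query-key pairs scored by one attention layer.

    Global attention compares every token with every token. Windowed
    attention compares each token only with the tokens in its own window.
    """
    if window_tokens is None:
        return num_tokens * num_tokens
    if num_tokens % window_tokens != 0:
        raise ValueError("num_tokens must be a multiple of window_tokens")
    return num_tokens * window_tokens


# Hypothetical sizes, chosen only to make the arithmetic easy to follow.
num_tokens = 8 * 180 * 360  # depth x height x width of a 3D token grid
window_tokens = 2 * 6 * 12  # tokens inside one 3D window
ratio = attention_pairs(num_tokens) // attention_pairs(num_tokens, window_tokens)
print(ratio)  # 3600
```

With these numbers, global attention would score 3600 times as many pairs as windowed attention, because the ratio is simply the number of tokens divided by the window size.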
## 3. Model construction

In this example, `PanguWeatherPredictor` is implemented for inference with the ONNX model:

``` py linenums="67" title="examples/pangu_weather/predict.py"
--8<--
examples/pangu_weather/predict.py:67:97
--8<--
```

``` yaml linenums="29" title="examples/pangu_weather/conf/pangu_weather.yaml"
--8<--
examples/pangu_weather/conf/pangu_weather.yaml:29:44
--8<--
```

Here, `input_file` and `input_surface_file` are the upper-air and surface meteorological data, respectively, that are fed to the network.

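As a rough sketch of what one inference step involves, the snippet below checks the two input arrays and passes them to an ONNX Runtime-style session. The shapes `(5, 13, 721, 1440)` and `(4, 721, 1440)` follow from the variable counts in section 1 and the 721 x 1440 grid used by `convert_data.py`. The tensor names `input` and `input_surface`, the assumption that the upper-air output is returned first, and the helper names `check_inputs` and `run_one_step` are assumptions for illustration and are not taken from `predict.py`.

```python
import numpy as np

UPPER_SHAPE = (5, 13, 721, 1440)  # variables, pressure levels, latitude, longitude
SURFACE_SHAPE = (4, 721, 1440)  # variables, latitude, longitude


def check_inputs(upper: np.ndarray, surface: np.ndarray) -> None:
    """Raise ValueError if either array does not have the expected shape."""
    if upper.shape != UPPER_SHAPE:
        raise ValueError(
            f"upper-air input has shape {upper.shape}, expected {UPPER_SHAPE}"
        )
    if surface.shape != SURFACE_SHAPE:
        raise ValueError(
            f"surface input has shape {surface.shape}, expected {SURFACE_SHAPE}"
        )


def run_one_step(session, upper: np.ndarray, surface: np.ndarray):
    """Run one forecast step.

    `session` is any object exposing `run(output_names, input_feed)`,
    as `onnxruntime.InferenceSession` does.
    """
    check_inputs(upper, surface)
    feed = {
        "input": np.asarray(upper, dtype=np.float32),  # assumed tensor name
        "input_surface": np.asarray(surface, dtype=np.float32),  # assumed tensor name
    }
    # Assumed output order: upper-air fields first, then surface fields.
    output_upper, output_surface = session.run(None, feed)
    return output_upper, output_surface
```

With ONNX Runtime installed, a session for the 24-hour model could be created with `onnxruntime.InferenceSession("inference/pangu_weather_24.onnx", providers=["CPUExecutionProvider"])` and passed as `session`, with the arrays loaded from `./data/input_upper.npy` and `./data/input_surface.npy` via `np.load`.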
## 4. Result visualization

First convert the data from npy to NetCDF format, then visualize it with ncvue.

1. Install the dependencies
    ```sh
    pip install cdsapi netCDF4 ncvue
    ```

2. Convert the data with the script
    ```sh
    python convert_data.py
    ```

3. Open the converted NetCDF file with ncvue. For details on ncvue, see the [official ncvue documentation](https://github.com/mcuntz/ncvue).

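A single 2D field can also be pulled straight out of the saved upper-air array for a quick check. The sketch below assumes the variable order used in `convert_data.py` (geopotential, specific humidity, temperature, u wind, v wind). The pressure-level order from 1000 hPa to 50 hPa is an assumption based on the original Pangu-Weather release and is not stated in this example, so verify it against your data. `select_upper_field` is a hypothetical helper.

```python
import numpy as np

# Variable order as indexed in convert_data.py.
UPPER_VARIABLES = (
    "geopotential",
    "specific_humidity",
    "temperature",
    "u_component_of_wind",
    "v_component_of_wind",
)
# Assumed level order in hPa; verify against your data source.
PRESSURE_LEVELS = (1000, 925, 850, 700, 600, 500, 400, 300, 250, 200, 150, 100, 50)


def select_upper_field(upper: np.ndarray, variable: str, level_hpa: int) -> np.ndarray:
    """Return the (latitude, longitude) slice for one variable and level."""
    var_idx = UPPER_VARIABLES.index(variable)  # ValueError if unknown
    level_idx = PRESSURE_LEVELS.index(level_hpa)  # ValueError if unknown
    return upper[var_idx, level_idx]


# Small stand-in array; a real output has shape (5, 13, 721, 1440).
demo = np.arange(5 * 13 * 2 * 3, dtype=np.float32).reshape(5, 13, 2, 3)
t850 = select_upper_field(demo, "temperature", 850)
print(t850.shape)  # (2, 3)
```

For real data, `demo` would be replaced by `np.load("./outputs_pangu_weather/output_upper.npy")`, the file that `convert_data.py` reads from the configured output directory.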
## 5. Complete code

``` py linenums="1" title="examples/pangu_weather/predict.py"
--8<--
examples/pangu_weather/predict.py
--8<--
```

## 6. Results

The figure below shows the model's temperature prediction. Other variables can be viewed with ncvue.

<figure markdown>
  ![result](https://paddle-org.bj.bcebos.com/paddlescience/docs/pangu-weather/temperature.png){ loading=lazy style="margin:0 auto;"}
  <figcaption>Temperature prediction result</figcaption>
</figure>

## 7. References

- [Pangu-Weather: A 3D High-Resolution System for Fast and Accurate Global Weather Forecast](https://arxiv.org/pdf/2211.02556)
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
defaults:
  - ppsci_default
  - INFER: infer_default
  - hydra/job/config/override_dirname/exclude_keys: exclude_keys_default
  - _self_

hydra:
  run:
    # dynamic output directory according to running time and override name
    dir: ./outputs_pangu_weather
  job:
    name: ${mode} # name of logfile
    chdir: false # keep current working directory unchanged
  callbacks:
    init_callback:
      _target_: ppsci.utils.callbacks.InitCallback
  sweep:
    # output directory for multirun
    dir: ${hydra.run.dir}
    subdir: ./

# general settings
mode: infer # running mode: infer
seed: 2023
output_dir: ${hydra:run.dir}
log_freq: 20

# inference settings
INFER:
  pretrained_model_path: null
  export_path: inference/pangu_weather_24
  onnx_path: ${INFER.export_path}.onnx
  device: gpu
  engine: onnx
  precision: fp32
  ir_optim: false
  min_subgraph_size: 30
  gpu_mem: 100
  gpu_id: 0
  max_batch_size: 1
  num_cpu_threads: 10
  batch_size: 1
  input_file: './data/input_upper.npy'
  input_surface_file: './data/input_surface.npy'
Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# ref: https://github.com/HaxyMoly/Pangu-Weather-ReadyToGo/blob/main/forecast_decode_functions.py

import os
from os import path as osp
from typing import Dict

import hydra
import netCDF4 as nc
import numpy as np

from ppsci.utils import logger


def convert_surface_data_to_nc(
    surface_file: str, file_name: str, output_dir: str
) -> None:
    """Convert the surface output (.npy) to a NetCDF file in output_dir."""
    # Load the saved numpy array; the first axis indexes the four variables
    surface_data = np.load(surface_file)
    mean_sea_level_pressure = surface_data[0]
    u_component_of_wind_10m = surface_data[1]
    v_component_of_wind_10m = surface_data[2]
    temperature_2m = surface_data[3]

    with nc.Dataset(
        os.path.join(output_dir, file_name), "w", format="NETCDF4_CLASSIC"
    ) as nc_file:
        # Create dimensions
        nc_file.createDimension("longitude", 1440)
        nc_file.createDimension("latitude", 721)

        # Create variables
        nc_lon = nc_file.createVariable("longitude", np.float32, ("longitude",))
        nc_lat = nc_file.createVariable("latitude", np.float32, ("latitude",))
        nc_msl = nc_file.createVariable(
            "mean_sea_level_pressure", np.float32, ("latitude", "longitude")
        )
        nc_u10 = nc_file.createVariable(
            "u_component_of_wind_10m", np.float32, ("latitude", "longitude")
        )
        nc_v10 = nc_file.createVariable(
            "v_component_of_wind_10m", np.float32, ("latitude", "longitude")
        )
        nc_t2m = nc_file.createVariable(
            "temperature_2m", np.float32, ("latitude", "longitude")
        )

        # Set variable attributes
        nc_lon.units = "degrees_east"
        nc_lat.units = "degrees_north"
        nc_msl.units = "Pa"
        nc_u10.units = "m/s"
        nc_v10.units = "m/s"
        nc_t2m.units = "K"

        # Write data to variables
        nc_lon[:] = np.linspace(0.125, 359.875, 1440)
        nc_lat[:] = np.linspace(90, -90, 721)
        nc_msl[:] = mean_sea_level_pressure
        nc_u10[:] = u_component_of_wind_10m
        nc_v10[:] = v_component_of_wind_10m
        nc_t2m[:] = temperature_2m

    logger.info(
        f"Convert output surface data file {surface_file} as nc format and save to {output_dir}/{file_name}."
    )


def convert_upper_data_to_nc(upper_file: str, file_name: str, output_dir: str) -> None:
    """Convert the upper-air output (.npy) to a NetCDF file in output_dir."""
    # Load the saved numpy array; the first axis indexes the five variables
    upper_data = np.load(upper_file)
    geopotential = upper_data[0]
    specific_humidity = upper_data[1]
    temperature = upper_data[2]
    u_component_of_wind = upper_data[3]
    v_component_of_wind = upper_data[4]

    with nc.Dataset(
        os.path.join(output_dir, file_name), "w", format="NETCDF4_CLASSIC"
    ) as nc_file:
        # Create dimensions
        nc_file.createDimension("longitude", 1440)
        nc_file.createDimension("latitude", 721)
        nc_file.createDimension("level", 13)

        # Create variables
        nc_lon = nc_file.createVariable("longitude", np.float32, ("longitude",))
        nc_lat = nc_file.createVariable("latitude", np.float32, ("latitude",))
        nc_geopotential = nc_file.createVariable(
            "geopotential", np.float32, ("level", "latitude", "longitude")
        )
        nc_specific_humidity = nc_file.createVariable(
            "specific_humidity", np.float32, ("level", "latitude", "longitude")
        )
        nc_temperature = nc_file.createVariable(
            "temperature", np.float32, ("level", "latitude", "longitude")
        )
        nc_u_component_of_wind = nc_file.createVariable(
            "u_component_of_wind", np.float32, ("level", "latitude", "longitude")
        )
        nc_v_component_of_wind = nc_file.createVariable(
            "v_component_of_wind", np.float32, ("level", "latitude", "longitude")
        )

        # Set variable attributes
        nc_lon.units = "degrees_east"
        nc_lat.units = "degrees_north"
        nc_geopotential.units = "m"
        nc_specific_humidity.units = "kg/kg"
        nc_temperature.units = "K"
        nc_u_component_of_wind.units = "m/s"
        nc_v_component_of_wind.units = "m/s"
        # Write data to variables
        nc_lon[:] = np.linspace(0.125, 359.875, 1440)
        nc_lat[:] = np.linspace(90, -90, 721)
        nc_geopotential[:] = geopotential
        nc_specific_humidity[:] = specific_humidity
        nc_temperature[:] = temperature
        nc_u_component_of_wind[:] = u_component_of_wind
        nc_v_component_of_wind[:] = v_component_of_wind

    logger.info(
        f"Convert output upper data file {upper_file} as nc format and save to {output_dir}/{file_name}."
    )


def convert(cfg: Dict):
    """Convert both inference outputs found in cfg.output_dir to NetCDF."""
    output_dir = cfg.output_dir

    convert_surface_data_to_nc(
        osp.join(output_dir, "output_surface.npy"), "output_surface.nc", output_dir
    )
    convert_upper_data_to_nc(
        osp.join(output_dir, "output_upper.npy"), "output_upper.nc", output_dir
    )


@hydra.main(version_base=None, config_path="./conf", config_name="pangu_weather.yaml")
def main(cfg: Dict):
    if cfg.mode == "infer":
        convert(cfg)
    else:
        raise ValueError(f"cfg.mode should be in ['infer'], but got '{cfg.mode}'")


if __name__ == "__main__":
    main()
