Pangu Weather for Global Weather Forecasting
A re-implementation of Pangu-Weather: A 3D High-Resolution Model for Fast and Accurate Global Weather Forecast in PhysicsNeMo.
Pangu-Weather is a transformer-based model that provides global weather forecasts at 0.25° resolution. Its 3D architecture jointly processes surface-level and upper-air variables, along with static geographical information (land-sea mask, topography, and soil type). It generates predictions for multiple atmospheric variables at both the surface and a set of pressure levels.
The model requires a specific set of ERA5 variables organized into three components:
Surface variables (4 channels)
Upper-air variables (5 variables × 13 pressure levels = 65 channels)
Static geographical masks (3 channels)
Download using ERA5 Downloader
First, ensure you have set up your CDS API key as described in the dataset_download README.
Use a configuration file to specify the variables to download (user-defined):
python dataset_download/start_mirror.py --config-name="config_pangu.yaml"
The downloaded data will be organized as follows:
data_dir
├── train/
│   ├── 1980.h5
│   ├── 1981.h5
│   └── ...
├── test/
│   ├── 2017.h5
│   └── ...
├── out_of_sample/
│   └── 2018.h5
└── stats/
    ├── global_means.npy
    └── global_stds.npy
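The files under stats/ suggest per-channel standardization before training. The sketch below shows one way that could look. It assumes the statistics broadcast against a (time, channels, latitude, longitude) array, for example with shape (1, channels, 1, 1); the actual array shapes and the `normalize`/`denormalize` helper names are assumptions for illustration, not taken from the training scripts. Synthetic data stands in for the real files so the example runs on its own.

```python
import numpy as np


def normalize(batch, means, stds, eps=1e-8):
    """Standardize each channel: (x - mean) / std.

    Assumes `batch` is (time, channels, lat, lon) and that `means` and
    `stds` broadcast against it, e.g. shape (1, channels, 1, 1).
    """
    return (batch - means) / (stds + eps)


def denormalize(batch, means, stds, eps=1e-8):
    """Invert `normalize` to recover physical units."""
    return batch * (stds + eps) + means


# With real data, the statistics would be loaded with something like:
#   means = np.load("data_dir/stats/global_means.npy")
#   stds = np.load("data_dir/stats/global_stds.npy")
# Here they are computed from a small synthetic batch instead.
rng = np.random.default_rng(0)
batch = rng.normal(loc=280.0, scale=15.0, size=(2, 69, 8, 16))
means = batch.mean(axis=(0, 2, 3), keepdims=True)  # shape (1, 69, 1, 1)
stds = batch.std(axis=(0, 2, 3), keepdims=True)    # shape (1, 69, 1, 1)

normed = normalize(batch, means, stds)
restored = denormalize(normed, means, stds)
```

Keeping the statistics in a broadcastable shape avoids reshaping on every batch; if the stored arrays turn out to be 1D, a reshape to (1, channels, 1, 1) after loading would give the same behavior.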
Each HDF5 file contains:
Data shape: (time_steps, channels, latitude, longitude)
Latitude: 721 points (-90° to 90°)
Longitude: 1440 points (-180° to 180°)
Channels: Surface (4) + Upper-air (65) variables
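As a rough illustration of this layout, the sketch below splits a (time_steps, channels, latitude, longitude) array into its surface and upper-air parts. It assumes the 4 surface channels come first and the 65 upper-air channels follow, grouped by variable and then by pressure level; that ordering and the `split_channels` helper are assumptions for illustration, so check them against the dataset configuration. A small grid replaces the full 721 × 1440 one to keep the example light.

```python
import numpy as np

N_SURFACE = 4      # surface channels
N_UPPER_VARS = 5   # upper-air variables
N_LEVELS = 13      # pressure levels per upper-air variable


def split_channels(x):
    """Split (time, 69, lat, lon) into surface and upper-air arrays.

    Returns:
        surface: (time, 4, lat, lon)
        upper:   (time, 5, 13, lat, lon)
    Assumes surface channels come first, then upper-air channels grouped
    by variable (all 13 levels of variable 0, then variable 1, ...).
    """
    expected = N_SURFACE + N_UPPER_VARS * N_LEVELS
    if x.ndim != 4 or x.shape[1] != expected:
        raise ValueError(f"expected shape (time, {expected}, lat, lon), got {x.shape}")
    surface = x[:, :N_SURFACE]
    upper = x[:, N_SURFACE:].reshape(
        x.shape[0], N_UPPER_VARS, N_LEVELS, x.shape[2], x.shape[3]
    )
    return surface, upper


# Small stand-in grid; the real files use 721 x 1440.
x = np.arange(2 * 69 * 8 * 16, dtype=np.float32).reshape(2, 69, 8, 16)
surface, upper = split_channels(x)
```

Under the assumed ordering, `upper[:, v, k]` corresponds to flat channel `4 + 13 * v + k` of the original array.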
Required Variables
Surface Variables (4 channels):
2m temperature
10m u-component of wind
10m v-component of wind
Mean sea level pressure
Upper-air Variables (5 variables × 13 pressure levels):
Temperature
U component of wind
V component of wind
Specific humidity
Geopotential
Static Masks (3 channels):
Land-sea mask
Soil type
Topography
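To make the channel arithmetic concrete, the following sketch builds a name-to-index map for the 69 dynamic channels. The short variable names follow common ERA5 conventions, the 13 pressure levels (in hPa) are those reported in the original Pangu-Weather paper, and the ordering (surface first, then upper-air grouped by variable) is an assumption; none of these are confirmed by the download configuration, so treat the map as illustrative.

```python
# Illustrative names only; verify names and ordering against config_pangu.yaml.
SURFACE_VARS = ["t2m", "u10", "v10", "msl"]
UPPER_AIR_VARS = ["t", "u", "v", "q", "z"]
# Pressure levels in hPa (assumed, following the original paper).
PRESSURE_LEVELS = [1000, 925, 850, 700, 600, 500, 400, 300, 250, 200, 150, 100, 50]
STATIC_MASKS = ["land_sea_mask", "soil_type", "topography"]


def build_channel_names():
    """Return channel names: 4 surface, then 5 x 13 upper-air channels."""
    names = list(SURFACE_VARS)
    for var in UPPER_AIR_VARS:
        for level in PRESSURE_LEVELS:
            names.append(f"{var}{level}")
    return names


channel_names = build_channel_names()
channel_index = {name: i for i, name in enumerate(channel_names)}

n_dynamic = len(channel_names)  # surface + upper-air channels
n_static = len(STATIC_MASKS)    # static masks supplied separately

print(f"dynamic channels: {n_dynamic}, static masks: {n_static}")
print(f"index of z500: {channel_index['z500']}")
```

A lookup like `channel_index["z500"]` is handy when computing per-variable metrics on model output without hard-coding channel offsets.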
Install PhysicsNeMo with required extras:
# If installing from the PhysicsNeMo repository
pip install .[launch]
Install additional dependencies:
pip install -r requirements.txt
Install NVIDIA Apex (required for optimizer):
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation \
    --config-settings "--build-option=--cpp_ext" \
    --config-settings "--build-option=--cuda_ext" ./
Two training scripts are provided:
train_pangu_era5.py: Full Pangu-Weather implementation
train_pangu_lite_era5.py: Lightweight version for testing
To train the model on a single GPU:
# Full version
python train_pangu_era5.py
# Lite version
python train_pangu_lite_era5.py
Multi-GPU Training
Data parallelism is supported with multi-GPU runs:
mpirun -np <num_GPUs> python train_pangu_era5.py
If running inside a Docker container, add the --allow-run-as-root flag to the mpirun command.
Monitoring Training Progress
Training progress can be monitored using MLflow:
mlflow ui -p 2458
View progress in a browser at http://127.0.0.1:2458
@article{bi2023pangu,
title={Pangu-Weather: A 3D High-Resolution Model for Fast and Accurate Global Weather Forecast},
author={Bi, Kaifeng and Xie, Lingxi and Zhang, Hengheng and others},
journal={arXiv preprint arXiv:2211.02556},
year={2023}
}