Deep Learning Weather Prediction (DLWP) model for weather forecasting#

This example implements the DLWP cubed-sphere model, which predicts a future state of the atmosphere from a previous atmospheric state. You can infer a 320-member ensemble of six-week forecasts at 1.4° resolution within a couple of minutes, demonstrating the potential of AI for developing near-real-time digital twins for weather prediction.

Problem overview#

The goal is to train an AI model that can emulate the evolution of the atmosphere and predict global weather over a given time span. The Deep Learning Weather Prediction (DLWP) model uses deep CNNs for globally gridded weather prediction. DLWP CNNs directly map the atmospheric state u(t) to its future state u(t+Δt) by learning from historical observations of the weather, with Δt set to 6 hr.
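A longer forecast is obtained by applying the one-step map repeatedly and feeding each prediction back in as the next input. The following is a minimal sketch of that rollout, not the training or inference code of this example: `rollout`, `steps_for_lead_time`, and `damped_persistence` are hypothetical names, the stand-in step function replaces the trained network, and the small grid size is chosen only to keep the example light.

```python
import numpy as np

DT_HOURS = 6  # time step Δt stated in the text


def rollout(step_fn, u0, n_steps):
    """Apply step_fn repeatedly, feeding each prediction back as the next input."""
    states = [u0]
    for _ in range(n_steps):
        states.append(step_fn(states[-1]))
    return np.stack(states)


def steps_for_lead_time(days, dt_hours=DT_HOURS):
    """Number of model steps needed to reach a lead time given in days."""
    return days * 24 // dt_hours


def damped_persistence(u):
    # Hypothetical stand-in for the trained network: damps the state slightly.
    return 0.99 * u


# (channels, faces, height, width); 8x8 faces keep the example small.
u0 = np.ones((7, 6, 8, 8), dtype=np.float32)
n_steps = steps_for_lead_time(days=42)  # six weeks
forecast = rollout(damped_persistence, u0, n_steps)
```

With Δt = 6 hr, a six-week forecast corresponds to 168 steps, so `forecast` holds the initial state plus 168 predicted states along its first axis.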

Model overview and architecture#

DLWP uses convolutional neural networks (CNNs) on a cubed-sphere grid to produce global forecasts. The latest DLWP model uses a U-Net architecture with skip connections to capture multi-scale processes. The model architecture is described in the following papers:

Sub-Seasonal Forecasting With a Large Ensemble of Deep-Learning Weather Prediction Models

Improving Data-Driven Global Weather Prediction Using Deep Convolutional Neural Networks on a Cubed Sphere
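The skip-connection idea mentioned above can be illustrated at the level of array shapes. The sketch below is an assumption-laden toy, not the DLWP network: it has no learned weights, it replaces convolutions with average pooling and nearest-neighbour upsampling, and `downsample`, `upsample`, and `unet_like_pass` are hypothetical names. It only shows how coarse-scale and fine-scale features end up side by side.

```python
import numpy as np


def downsample(x):
    """2x2 average pooling over the last two axes of a (C, H, W) array."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def upsample(x):
    """Nearest-neighbour 2x upsampling over the last two axes."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def unet_like_pass(x):
    """One encoder/decoder level with a skip connection (no learned weights)."""
    fine = x                    # encoder features at full resolution
    coarse = downsample(fine)   # coarser scale covers a larger spatial context
    decoded = upsample(coarse)  # back to full resolution
    # Skip connection: concatenate fine-scale features with the decoded ones.
    return np.concatenate([fine, decoded], axis=0)


# One cubed-sphere face with 7 channels on a 64x64 grid.
face = np.random.default_rng(0).normal(size=(7, 64, 64))
out = unet_like_pass(face)
```

The output has twice as many channels as the input: the first half carries the untouched fine-scale fields and the second half carries their smoothed, coarse-scale counterparts.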

Installation#

Prerequisites#

  1. Install PhysicsNeMo with required extras:

    pip install .[launch]
    
  2. Install additional dependencies:

    pip install -r requirements.txt
    
  3. Install TempestRemap (required for coordinate transformation):

    git clone https://github.com/ClimateGlobalChange/tempestremap
    cd tempestremap
    mkdir build && cd build
    cmake ..
    make
    make install
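After installation, it can help to confirm that the TempestRemap executables are reachable from the shell. The sketch below checks for several command-line tools that TempestRemap builds; which of them the data preparation scripts actually call is an assumption, and `missing_tools` is a hypothetical helper name.

```python
import shutil

# Executables built by TempestRemap. Which ones the DLWP scripts use is assumed.
TEMPEST_TOOLS = [
    "GenerateRLLMesh",
    "GenerateCSMesh",
    "GenerateOverlapMesh",
    "GenerateOfflineMap",
    "ApplyOfflineMap",
]


def missing_tools(tools=TEMPEST_TOOLS):
    """Return the names in tools that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed TempestRemap tools were found.")
```

If any tool is reported missing, the install prefix used by `make install` may need to be added to `PATH`.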
    

Dataset Preparation#

There are two methods to prepare the training data for DLWP:

Option 2: Quick Start with Minimal Dataset#

For testing or development, you can use the simplified data preparation scripts in the data_curation directory:

  1. Download a minimal set of ERA5 variables:

    cd data_curation
    python data_download_simple.py
    
  2. Process the downloaded data:

    python post_processing.py
    

See data_curation/README.md for detailed instructions and parameters.

Data Format#

The final dataset should be organized as follows:

data_dir/
├── train/
│   ├── 1980.h5
│   ├── 1981.h5
│   └── ...
├── test/
│   ├── 2017.h5
│   └── ...
├── out_of_sample/
│   └── 2018.h5
└── stats/
    ├── global_means.npy
    └── global_stds.npy
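A quick way to confirm that a prepared dataset matches this layout is to check for the expected directories and files before starting a training run. This is a minimal sketch using only the standard library; `check_layout` is a hypothetical helper and is not part of the example's scripts.

```python
from pathlib import Path

SPLITS = ("train", "test", "out_of_sample")
REQUIRED_STATS = ("global_means.npy", "global_stds.npy")


def check_layout(data_dir):
    """Return a list of problems; an empty list means the layout matches."""
    root = Path(data_dir)
    problems = []
    for split in SPLITS:
        # Each split directory should hold at least one yearly HDF5 file.
        if not list((root / split).glob("*.h5")):
            problems.append(f"no .h5 files in {split}/")
    for name in REQUIRED_STATS:
        if not (root / "stats" / name).is_file():
            problems.append(f"missing stats/{name}")
    return problems
```

Calling `check_layout("data_dir")` on a complete dataset returns an empty list; otherwise each entry names the missing piece.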

Each HDF5 file contains:

  • Shape: (time_steps, channels, faces, height, width)

  • Faces: 6 (cubed-sphere)

  • Height/Width: 64 (resolution parameter)

  • Channels: 7 (atmospheric variables)
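The array layout and a typical per-channel normalization can be sketched with synthetic data. Everything here is illustrative: the random array stands in for the contents of one HDF5 file, and the shape of the statistics (one mean and one standard deviation per channel, broadcast over the other axes) is an assumption about `global_means.npy` and `global_stds.npy`, not something this document specifies.

```python
import numpy as np

FACES, SIZE, CHANNELS = 6, 64, 7

# Each cubed-sphere face spans 90 degrees, so 64 cells give about 1.4 degrees
# per cell, which appears consistent with the resolution quoted earlier.
cell_degrees = 90.0 / SIZE

# Synthetic stand-in: (time_steps, channels, faces, height, width).
rng = np.random.default_rng(0)
data = rng.normal(loc=280.0, scale=15.0, size=(4, CHANNELS, FACES, SIZE, SIZE))

# Assumed stats layout: one value per channel, broadcastable against data.
means = data.mean(axis=(0, 2, 3, 4), keepdims=True)
stds = data.std(axis=(0, 2, 3, 4), keepdims=True)

# Standardize each channel to zero mean and unit variance.
normalized = (data - means) / stds
```

Here `means` and `stds` have shape `(1, 7, 1, 1, 1)`, so the subtraction and division broadcast over time, face, and both spatial axes.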

Training#

To train the model, run:

python train_dlwp.py

Multi-GPU Training#

For distributed training:

mpirun -np <NUM_GPUS> python train_dlwp.py

Note: Add --allow-run-as-root to the mpirun command if you are running in a container as root.
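When a script is launched this way, each process needs to know its rank and the total number of processes. How train_dlwp.py obtains them is not shown here; the sketch below assumes Open MPI's `mpirun`, which exports the `OMPI_COMM_WORLD_*` environment variables, and `mpi_rank_and_size` is a hypothetical helper name.

```python
import os


def mpi_rank_and_size(env=os.environ):
    """Read rank, world size, and local rank as set by Open MPI's mpirun.

    Falls back to a single-process setup when the variables are absent,
    for example when the script is started with plain `python`.
    """
    rank = int(env.get("OMPI_COMM_WORLD_RANK", 0))
    size = int(env.get("OMPI_COMM_WORLD_SIZE", 1))
    local_rank = int(env.get("OMPI_COMM_WORLD_LOCAL_RANK", 0))
    return rank, size, local_rank


if __name__ == "__main__":
    rank, size, local_rank = mpi_rank_and_size()
    print(f"process {rank} of {size} (local rank {local_rank})")
```

The local rank is commonly used to pick which GPU on a node a process should use; other MPI implementations export different variable names.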

Monitoring Training#

Training progress can be monitored using MLflow:

mlflow ui -p 2458

References#

Sub-Seasonal Forecasting With a Large Ensemble of Deep-Learning Weather Prediction Models

Arbitrary-Order Conservative and Consistent Remapping and a Theory of Linear Maps: Part 1

Arbitrary-Order Conservative and Consistent Remapping and a Theory of Linear Maps: Part 2