Samudra 2: Scaling Ocean Emulators across Resolutions

Yuan Yuan1,*, Jesse Rusak2, Alexander Merose2, Adam Subel1, Pavel Perezhogin1, Alistair Adcroft3, Carlos Fernandez-Granda1, Laure Zanna1
1New York University   2Open Athena   3Princeton University
*Corresponding author

TL;DR. Samudra 2 is a neural ocean emulator that scales to 1°, 1/2°, and 1/4° resolution with stable ~8-year autoregressive rollouts.

Gulf Stream surface kinetic energy across resolutions

Surface kinetic energy in the Gulf Stream region from GFDL OM4 (top) and Samudra 2 (bottom) at 1°, 1/2°, and 1/4° resolution near the end of an 8-year autoregressive rollout. At finer grids, the emulator progressively captures mesoscale eddies, meanders, and filamentary western boundary current structure.

Abstract

Ocean general circulation models are essential to climate science but computationally expensive, severely limiting the ensembles and scenarios that can be explored. Neural emulators promise orders-of-magnitude speedups, yet no ocean emulator to date has combined fine spatial resolution with multi-year autoregressive rollouts. Samudra, the first autoregressive neural ocean emulator to produce multi-decade global rollouts, was restricted to 1° resolution and exhibited two long-horizon failure modes: variance collapse and imprinting artifacts.

Samudra 2 extends that framework with two complementary modifications: a widened ConvNeXt U-Net backbone with a reduced block-internal expansion factor, and a dynamic loss that reweights per-variable MSE contributions inversely by each channel's running prediction error. This amplifies the gradient signal from slow-evolving deep-ocean fields that standard MSE would otherwise neglect.

At 1°, Samudra 2 raises upper-ocean global-mean temperature R² from 0.56 to 0.87 and reduces deep-ocean temperature error by roughly sevenfold compared to the original Samudra. The same architecture scales to 1/2° and 1/4° over approximately 8-year autoregressive rollouts, recovering mesoscale eddies and sharp western boundary currents absent at coarser grids.

Core ideas

  Wider ConvNeXt U-Net

Stage widths increase from [200, 250, 300, 400] to [280, 380, 480, 520] and the block-internal expansion factor is reduced from 4 to 2, shifting capacity toward inter-stage features that higher resolutions need.
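To get a feel for how the two changes trade off, the sketch below counts parameters in a single ConvNeXt-style block at the old and new settings. It assumes the standard ConvNeXt block layout (7x7 depthwise convolution, LayerNorm, two pointwise layers, and a per-channel layer scale). The function name convnext_block_params and the 7x7 kernel are illustrative assumptions, and the actual Samudra 2 block may differ.

```python
def convnext_block_params(width: int, expansion: int, kernel: int = 7) -> int:
    """Parameter count of one ConvNeXt-style block (assumed standard layout)."""
    depthwise = width * kernel * kernel + width   # per-channel conv weights + bias
    layer_norm = 2 * width                        # scale + shift
    hidden = expansion * width                    # block-internal width
    pointwise_in = width * hidden + hidden        # 1x1 expand + bias
    pointwise_out = hidden * width + width        # 1x1 project + bias
    layer_scale = width                           # per-channel gamma
    return depthwise + layer_norm + pointwise_in + pointwise_out + layer_scale


# Stage widths and expansion factors quoted in the text.
OLD_WIDTHS, OLD_EXPANSION = [200, 250, 300, 400], 4
NEW_WIDTHS, NEW_EXPANSION = [280, 380, 480, 520], 2

old_counts = [convnext_block_params(w, OLD_EXPANSION) for w in OLD_WIDTHS]
new_counts = [convnext_block_params(w, NEW_EXPANSION) for w in NEW_WIDTHS]

for w_old, n_old, w_new, n_new in zip(OLD_WIDTHS, old_counts, NEW_WIDTHS, new_counts):
    print(f"width {w_old} (x{OLD_EXPANSION}): {n_old:>9,d}   "
          f"width {w_new} (x{NEW_EXPANSION}): {n_new:>9,d}")
```

Under these assumptions, a first-stage block costs about the same in both configurations (331,600 versus 329,280 parameters), while the features passed between stages are 40% wider.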

  Dynamic variance-weighted loss

Per-channel MSE weights are updated online using an exponential moving average of inverse prediction error, amplifying the gradient signal from slow-evolving deep-ocean fields.
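A minimal sketch of this kind of reweighting is shown below. It is not the paper's implementation: the class name DynamicChannelWeights, the decay value, the epsilon, and the normalization of the weights to mean 1 are all assumptions made for illustration. In a real training loop the weights would typically be treated as constants, with no gradient flowing through them.

```python
class DynamicChannelWeights:
    """Tracks an EMA of inverse per-channel error and exposes loss weights."""

    def __init__(self, decay: float = 0.99, eps: float = 1e-8):
        self.decay = decay          # EMA decay (assumed value)
        self.eps = eps              # guards against division by zero
        self.inv_error_ema = None   # initialised on the first update

    def update(self, per_channel_mse):
        """Fold the latest per-channel MSE values into the running average."""
        inverse = [1.0 / (mse + self.eps) for mse in per_channel_mse]
        if self.inv_error_ema is None:
            self.inv_error_ema = inverse
        else:
            d = self.decay
            self.inv_error_ema = [
                d * old + (1.0 - d) * new
                for old, new in zip(self.inv_error_ema, inverse)
            ]
        return self.weights()

    def weights(self):
        """Weights proportional to the EMA, normalised to mean 1 (assumption)."""
        total = sum(self.inv_error_ema)
        n = len(self.inv_error_ema)
        return [n * v / total for v in self.inv_error_ema]


def weighted_mse(per_channel_mse, weights):
    """Scalar loss: mean over channels of weight * MSE."""
    return sum(w * m for w, m in zip(weights, per_channel_mse)) / len(weights)


# Toy example: a fast surface channel (large error) and a slow deep channel.
tracker = DynamicChannelWeights()
w = tracker.update([1.0, 0.01])
loss = weighted_mse([1.0, 0.01], w)
```

In the toy example the low-error channel receives the larger weight, so both channels contribute equally to the loss instead of the high-error channel dominating.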

  Scaling to higher resolutions

The paper demonstrates multi-year ocean emulation at 1/2° and 1/4° on GFDL OM4 data, where mesoscale eddies and western boundary currents emerge at eddy-permitting resolution.

Method

The emulator is an autoregressive function $g_\theta$ that maps two consecutive ocean states plus atmospheric forcing to the next two states:

$$(\hat{x}_{t+1}, \hat{x}_{t+2}) = g_\theta(x_{t-1}, x_t, f_{t-1}, f_t)$$

The state $x_t$ contains four 3D prognostic variables (potential temperature, salinity, zonal velocity, and meridional velocity), each on 19 depth levels, plus 2D sea surface height, for 4 × 19 + 1 = 77 prognostic channels. Training uses short autoregressive rollouts with K = 4 steps, while evaluation runs freely for about 580 steps, roughly 8 years, from 2014 to 2022 with no ground-truth feedback.
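The two-in, two-out rollout described above can be sketched as a simple loop in which each call's predictions become the next call's inputs. The names rollout, step_fn, and toy_step are hypothetical, the toy step function stands in for the trained network, and the alignment of the forcing sequence to state times is an assumption. The text does not say whether the quoted step counts refer to model calls or to individual states, so the sketch counts model calls.

```python
def num_prognostic_channels(n_3d_vars=4, n_levels=19, n_2d_vars=1):
    """Channels in one state: 3D variables on every level plus 2D fields."""
    return n_3d_vars * n_levels + n_2d_vars


def rollout(step_fn, x_prev, x_curr, forcing, num_calls):
    """Free-running rollout with no ground-truth feedback.

    x_prev and x_curr are the states at times 0 and 1, and forcing[i] is the
    forcing at time i (assumed alignment). Each call to step_fn returns the
    next two states, which are fed back as inputs to the following call.
    """
    predicted = []
    t = 1  # time index of x_curr
    for _ in range(num_calls):
        x_next1, x_next2 = step_fn(x_prev, x_curr, forcing[t - 1], forcing[t])
        predicted.extend([x_next1, x_next2])
        x_prev, x_curr = x_next1, x_next2  # predictions replace inputs
        t += 2
    return predicted


def toy_step(x_prev, x_curr, f_prev, f_curr):
    """Stand-in for the network: the state drifts by the current forcing."""
    x_next1 = x_curr + f_curr
    x_next2 = x_next1 + f_curr
    return x_next1, x_next2


# Two model calls produce four predicted states from scalar toy data.
states = rollout(toy_step, 0.0, 1.0, forcing=[1.0] * 4, num_calls=2)
```

The same loop serves for training and evaluation in this sketch. Only the number of calls changes, and the forcing sequence must hold at least two entries per call.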

Figure 2. Training with short rollouts versus long-horizon free-running inference.

Results

Quantitative summary at 1°

Against the original Samudra at 1° resolution, Samudra 2 improves the main long-horizon diagnostics reported in the paper. It tracks the Niño 3.4 index with higher skill (R² 0.93 versus 0.90, RMSE 0.222 °C versus 0.268 °C) and substantially improves detrended global-mean temperature in the upper ocean.

Upper-ocean R² (0-700 m): 0.56 to 0.87
Error reduction at 700-2000 m: about 10x
Error reduction at 2000-7000 m: about 7x
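For readers who want to reproduce skill scores of this kind on their own time series, the sketch below uses the textbook definitions of RMSE and R² together with a least-squares linear detrend. The function names are illustrative, and the exact detrending and averaging choices behind the numbers quoted here are not specified in this summary, so treat this as an assumption-laden example and not the paper's evaluation code.

```python
import math


def detrend(series):
    """Remove the least-squares linear trend from an evenly sampled series."""
    n = len(series)
    t_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    cov = sum((i - t_mean) * (y - y_mean) for i, y in enumerate(series))
    var = sum((i - t_mean) ** 2 for i in range(n))
    slope = cov / var
    return [y - (y_mean + slope * (i - t_mean)) for i, y in enumerate(series)]


def rmse(truth, pred):
    """Root-mean-square error between two equal-length series."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))


def r2_score(truth, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_truth = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, pred))
    ss_tot = sum((t - mean_truth) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot


# Toy usage: compare detrended reference and emulated series.
reference = detrend([0.0, 1.5, 1.0, 3.5, 3.0])
emulated = detrend([0.2, 1.2, 1.4, 3.1, 3.3])
skill = r2_score(reference, emulated)
error = rmse(reference, emulated)
```

With this definition R² equals 1 for a perfect match and 0 for a prediction no better than the reference mean, which gives a scale for reading the change from 0.56 to 0.87.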

The deepest layers remain the hardest regime, but the paper shows that Samudra 2 sharply reduces imprinting artifacts, where velocity-field patterns leak into temperature and salinity at depth.

Rollout demo

The same architecture, trained independently at each resolution, recovers progressively finer dynamical structure. At 1/4°, mesoscale eddies, meanders, and sharper western boundary currents become visible, and their evolution is easier to appreciate in motion than in a single static snapshot.

The video shows what a static figure cannot: long-horizon behavior that remains organized, energetic, and spatially coherent throughout the rollout.

Long-horizon rollout video for Samudra 2.

Why it matters

A multi-year eddy-permitting ocean rollout that would take millions of core-hours on a traditional OGCM completes on a single GPU with Samudra 2. That is roughly two orders of magnitude of speedup, which changes the workflow from running a few scenarios to running hundreds or thousands of ensemble members for sea-level projections, ocean heat uptake, and climate variability studies such as ENSO.

BibTeX

@article{yuan2026samudra2,
  title   = {Samudra 2: Scaling Ocean Emulators Across Resolutions},
  author  = {Yuan, Yuan and Rusak, Jesse and Merose, Alexander and
             Subel, Adam and Perezhogin, Pavel and Adcroft, Alistair and
             Fernandez-Granda, Carlos and Zanna, Laure},
  year    = {2026}
}

Acknowledgements

We thank NVIDIA for a GPU hardware grant and support, Lambda for hardware support, AWS for infrastructure grants, and NYU IT High Performance Computing for resources and staff expertise.