September 2025

Oceanography | Vol. 38, No. 3

whereas the average spacing between cutting-edge OGCM grid cells is 1 km. In this sense, cutting-edge OGCMs are becoming unconstrained by data because the data are sparse compared to the OGCM degrees of freedom (note that this is not true for the ocean components of cutting-edge IPCC models). The unequal growth of OGCM resolution and data density reflects the so-called maturation of computational oceanography (Haine et al., 2021). Cutting-edge OGCMs are thus becoming more and more valuable as a resource in oceanography.

OGCM SOLUTIONS AND DATA ACCESS

LLC4320

For example, the 2016 black dot in Figure 1 is a model solution called LLC4320 (the name refers to the latitude-longitude-cap horizontal grid with 4320 × 4320 grid cells in each of 13 faces that tile the global ocean; Rocha et al., 2016; Arbic et al., 2018). The LLC4320 simulation provides hourly output for one year in 2011–2012 using the Massachusetts Institute of Technology OGCM code. A few similar solutions exist using other circulation models and different configurations. Collectively, such solutions are called “nature runs” or “digital twins” of the ocean currents (Boyes and Watson, 2022; Chen et al., 2023; NASEM, 2024; Vance et al., 2024). They are useful for many purposes that include understanding ocean dynamics, designing observing systems, and machine learning.
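To give a sense of scale, the horizontal size of the LLC4320 grid follows directly from the numbers in its name. The short Python sketch below multiplies them out; the function and constant names are illustrative only and are not part of any model code.

```python
# Horizontal grid size of an LLC-type grid, using the figures quoted in the text.
CELLS_PER_SIDE = 4320   # grid cells along each side of a face (from the text)
NUM_FACES = 13          # faces that tile the global ocean (from the text)


def horizontal_cells(cells_per_side: int, num_faces: int) -> int:
    """Total horizontal grid cells for equally sized square faces."""
    return num_faces * cells_per_side ** 2


total_cells = horizontal_cells(CELLS_PER_SIDE, NUM_FACES)
print(f"{total_cells:,} horizontal grid cells (about {total_cells:.2e})")
```

This counts horizontal cells only; the number of vertical levels and the number of output variables are not stated in this passage, so they are left out.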

Indeed, the oceanographic community is eagerly adopting these cutting-edge OGCM solutions. To illustrate, the red dots in Figure 2 show the number of papers each year that utilize the LLC4320 solution. As in Figure 1, the y-axis of Figure 2 is logarithmic, and straight lines indicate exponential growth. Thus, Figure 2 shows that the number of LLC4320 papers per year has grown roughly as an exponential with a doubling time of around 3 yr; dozens of papers now employ the LLC4320 simulation per year.
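A doubling time of this kind can be read off a logarithmic plot by fitting a straight line to log2(count) against year: the slope is the number of doublings per year, and its reciprocal is the doubling time. The following is a minimal sketch of that calculation, assuming an ordinary least-squares fit. The counts used here are hypothetical values chosen to double every three years; they are not the Figure 2 data, and this is not necessarily how the article's fits were computed.

```python
import math


def doubling_time(years, counts):
    """Least-squares fit of log2(count) against year; returns 1/slope in years."""
    logs = [math.log2(c) for c in counts]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    sxx = sum((x - mean_x) ** 2 for x in years)
    slope = sxy / sxx          # doublings per year
    return 1.0 / slope


# Hypothetical counts that double exactly every 3 years (NOT the Figure 2 data).
years = [2016, 2019, 2022, 2025]
counts = [2, 4, 8, 16]
print(f"doubling time = {doubling_time(years, counts):.1f} yr")  # prints 3.0 yr
```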

Despite this growing popularity, the data from LLC4320-type cutting-edge simulations are very challenging to use. The main problem is the massive size of the datasets, which means that access to these data is difficult and time-consuming. For LLC4320, the total uncompressed data volume is four petabytes (one petabyte is 10^15 bytes), and it takes many months to obtain accounts on the NASA supercomputers where the LLC4320 simulation was run. Moreover, the datasets are far too massive for individual researchers to download and analyze personal copies.
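A rough calculation illustrates why personal copies are impractical. The sketch below estimates how long it would take to transfer the four-petabyte volume over a network link. The 1 Gbit/s sustained rate is a hypothetical assumption made for illustration; it does not come from the source, and real transfer rates vary widely.

```python
PETABYTE = 10 ** 15                  # bytes, as defined in the text
DATA_VOLUME_BYTES = 4 * PETABYTE     # total uncompressed LLC4320 volume (from the text)


def transfer_days(num_bytes: int, bits_per_second: float) -> float:
    """Days needed to move num_bytes at a sustained rate of bits_per_second."""
    seconds = num_bytes * 8 / bits_per_second
    return seconds / 86_400          # 86,400 seconds per day


ASSUMED_RATE_BPS = 1e9               # hypothetical sustained 1 Gbit/s link (assumption)
days = transfer_days(DATA_VOLUME_BYTES, ASSUMED_RATE_BPS)
print(f"about {days:.0f} days at 1 Gbit/s")  # prints about 370 days
```

Under this assumption the transfer alone would take about a year, before accounting for the storage needed to hold the data.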

POSEIDON PROJECT

Making the LLC4320 (and similar) simulation data easy to use is therefore an important priority. Evidence from a neighboring field in fluid mechanics shows the benefits of opening massive simulation datasets to easy community access. Specifically, the blue dots in Figure 2 show the number of papers each year that utilize the Johns Hopkins Turbulence Database (JHTDB; Li et al., 2008). The JHTDB is an open numerical turbulence laboratory that provides free access to benchmark numerical solutions for various canonical turbulence problems. Figure 2 shows that the number of JHTDB papers per year has also grown exponentially, with a doubling time of 3.0 yr. In total, more than 6 × 10^14 individual model grid cells have been queried using the JHTDB. A recent paper states that

FIGURE 1. Growth over time of the number of horizontal grid cells in global ocean general circulation models (OGCMs, see the black dots), the number of horizontal grid cells in the global coupled climate model from the Intergovernmental Panel on Climate Change (IPCC, see the colored dots), and the number per year of deep (greater than 1,000 m depth) CTD stations. Note that the y-axis is logarithmic and the straight red lines indicate exponential growth (the doubling times, τ2×, are shown). The black dot in 2016 is for the LLC4320 OGCM (see text and Figures 2 and 3). The three-letter abbreviations in color refer to the IPCC assessment reports. Modified from Figure 2 in Haine et al. (2021).

[Figure 1 plot. Axes: year (1980 to 2020); Number (10^3 to 10^9); Horizontal Grid Scale (m) (10^3 to 10^5). Series: OGCM horizontal grid cell #; IPCC model horizontal grid cell # (FAR, SAR, TAR, AR4, AR5, AR6); deep CTD stations per year. Doubling times shown: τ2× = 2.5 yr and τ2× = 3.7 yr.]