for more accurate calculations of particulate organic carbon flux.
Ocean biological parameters are globally undersampled relative to
physics and chemistry and are especially lacking for subsurface and
benthic systems. Biological rate measurements (e.g., grazing, pro-
ductivity, viral lysis, and respiration) important for model accuracy
are relatively sparse due to the time and resources required to obtain
them. Rate measurements are further limited by a lack of standardized methods and by poorly constrained discrepancies between in situ and incubation-based approaches. Co-collection of biological,
chemical, and physical data via the augmentation of existing and
the development of new observing systems is recommended to pro-
vide a more holistic understanding of ocean processes.
Filling observing gaps will require continued progress in devel-
opment and deployment of sensors and platforms that can access
more extreme depths and environments. Sustained investment in
observing infrastructure that transcends disciplines and strategi-
cally combines temporal and spatial (latitude, longitude, depth)
coverage of the ocean is essential to address the challenges
that lie before us. This infrastructure will likely include a combi-
nation of repeat hydrography lines (e.g., the Global Ocean Ship-based Hydrographic Investigations Program [GO-SHIP], the Extended Ellett Line), moored arrays (e.g., the RAPID array), shipboard time-series programs (e.g., Bermuda Atlantic Time-series Study, Hawaii Ocean Time-series, Porcupine Abyssal Plain Sustained Observatory), Long-Term Ecological
Research stations, long-term monitoring stations (e.g., Ocean
Observatories Initiative and NOAA Ocean Acidification Observing
Network moorings), sentinel sites for extreme events, autonomous
platforms (e.g., BGC-Argo floats, gliders, autonomous sur-
face vehicles), platforms of opportunity (e.g., commercial fish-
ing and cargo ships), and airborne and satellite-based measure-
ments, among others. Observing System Simulation Experiments (OSSEs) may be useful for coordinating and optimizing observing system design and for informing the reallocation of resources as scientific grand challenges and priorities change (a toy illustration follows this paragraph). Improved coordination and integration of coastal observing assets are especially critical for
monitoring and addressing ongoing threats to human communities
and the marine ecosystem services on which they rely.
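To make the OSSE concept above concrete, the sketch below samples a synthetic "truth" field with hypothetical observing arrays of two densities and scores each reconstruction against the full field. It is a minimal illustration under invented assumptions (the field, the random station layouts, and the error-free observations are all placeholders), not a description of any actual experiment.

```python
# Toy OSSE (illustrative sketch only): sample a synthetic "nature run" field
# with hypothetical observing arrays of different density and compare how
# well each reconstructs the full field. Layouts and sizes are assumptions.
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)

# Synthetic 2D "truth" field standing in for a nature-run BGC variable
x, y = np.meshgrid(np.linspace(0, 10, 100), np.linspace(0, 10, 100))
truth = np.sin(x) * np.cos(y / 2)

def reconstruction_rmse(n_obs):
    """Sample the truth at n_obs random points, interpolate back, return RMSE."""
    pts = rng.uniform(0, 10, size=(n_obs, 2))
    obs = np.sin(pts[:, 0]) * np.cos(pts[:, 1] / 2)  # error-free obs for simplicity
    recon = griddata(pts, obs, (x, y), method="linear")
    return float(np.sqrt(np.nanmean((recon - truth) ** 2)))

# Compare two candidate observing-system densities
for n in (50, 400):
    print(f"{n:4d} observations -> reconstruction RMSE {reconstruction_rmse(n):.3f}")
```

In a real OSSE, the synthetic observations would be drawn from a full model simulation, degraded with realistic instrument and representation errors, and passed through the same analysis system used with real data, so that candidate designs can be compared before any hardware is deployed.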
Gridded observational BGC data products (e.g., Global Ocean
Data Analysis Project [GLODAP], Surface Ocean CO2 Atlas
[SOCAT], World Ocean Atlas [WOA]) are important tools for sup-
porting ocean research and climate monitoring as well as model
evaluation and development. These products will require contin-
ued advancement of artificial intelligence (AI), machine learning
(ML), and statistical analysis tools to address sampling gaps and to
improve spatial resolution. Additionally, measurements that appear to be highly influential in the current generation of models (ammonium and iron) are not yet available as gridded variables.
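As one illustration of the kind of statistical gap-filling described above, the sketch below trains a random forest on well-sampled, co-located predictors and uses it to estimate a sparsely observed variable where it was never measured. The file names, predictor list, and target variable are hypothetical placeholders, not a recommended recipe.

```python
# Sketch of ML-based gap-filling (illustrative assumptions throughout):
# train a random forest on co-located predictors at sampled points, then
# predict a sparsely observed BGC variable elsewhere.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Hypothetical table: rows where the target (e.g., surface nitrate) was
# measured, alongside well-sampled predictors.
df = pd.read_csv("colocated_obs.csv")  # hypothetical file
predictors = ["sst", "salinity", "chl", "lat", "lon", "month"]
X, y = df[predictors], df["nitrate"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print("held-out R^2:", model.score(X_te, y_te))

# Predict at grid cells where predictors exist but the target was not sampled
grid = pd.read_csv("gridded_predictors.csv")  # hypothetical file
grid["nitrate_filled"] = model.predict(grid[predictors])
```

Held-out skill scores such as the one printed above are also what would allow a gridded product to ship with the quantitative uncertainty estimates argued for later in this section.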
Cloud-based computing environments (e.g., Pangeo) provide
open-source frameworks that streamline access to standardized
data and model outputs, software, and data analysis tools. They
centralize and democratize access and also facilitate collaboration
and model intercomparison. For example, Model Intercomparison
Projects (MIPs) have become effective community exercises for
assessing model performance and system sensitivity to anthro-
pogenic changes. However, more sophisticated approaches are
needed to evaluate why the models differ from observations and
from each other, and further to guide improvements in how fun-
damental processes are represented. Shared computing environments allow users to work collaboratively with the model output produced by MIPs, whose volume can be prohibitive for personal computers.
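A minimal sketch of this cloud-native pattern follows, assuming a hypothetical Zarr store and variable names: the dataset is opened lazily, and only the small reduced result travels back to the client.

```python
# Sketch of cloud-native access to MIP-scale output (Pangeo-style stack).
# The Zarr store URL and the variable/dimension names are placeholders.
import fsspec
import xarray as xr

store = fsspec.get_mapper("gs://example-bucket/mip-output/o2.zarr")  # hypothetical
ds = xr.open_zarr(store, consolidated=True)  # lazy: reads metadata, not the data

# dask evaluates the reduction chunk by chunk, so only the chunks actually
# touched stream over the network; the full archive never lands on a laptop.
o2_series = ds["o2"].mean(dim=("lat", "lon", "lev")).compute()
print(o2_series)
```

Because every participant computes against the same shared copy of the archive, analyses are easier to reproduce and no group needs to maintain a private multi-terabyte mirror.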
Co-development should be built into the design of future projects. Rather than accessing datasets only after the completion of a project, all involved end users must have the opportunity to engage early in the planning stages of a project or process
study to develop a common understanding of data collection prior-
ities, challenges, and opportunities. Models and data assimilation
and analysis tools can inform data collection (e.g., OSSEs), which
can help optimize sampling strategies. Similarly, model-data inte-
gration activities such as data assimilation, which combines model
outputs and observations to improve process understanding, pro-
vide a unique collaboration and capacity-building opportunity
to raise awareness of the challenges associated with finding and
aggregating data from multiple sources. Therefore, model reanal-
ysis products with essential ocean BGC variables (Task Team for
the Integrated Framework for Sustained Ocean Observing, 2012)
should also be prioritized, at least at a regional level.
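To illustrate the core of the data assimilation step referenced above, the sketch below applies the textbook scalar analysis update, in which a model background and an observation are combined with weights set by their error variances; the numbers are invented.

```python
# Minimal scalar data-assimilation update (textbook optimal interpolation /
# Kalman analysis step). All numbers below are invented for illustration.
def analysis(x_b, sigma_b, y, sigma_o):
    """Combine background x_b and observation y, weighted by error variances."""
    k = sigma_b**2 / (sigma_b**2 + sigma_o**2)  # gain in [0, 1]
    x_a = x_b + k * (y - x_b)                   # analysis value
    sigma_a = ((1 - k) * sigma_b**2) ** 0.5     # reduced analysis uncertainty
    return x_a, sigma_a

# e.g., a model backgrounds surface chlorophyll at 0.8 (+/-0.3) mg m-3 while
# a better-constrained sensor reads 1.1 (+/-0.1) mg m-3
x_a, sigma_a = analysis(x_b=0.8, sigma_b=0.3, y=1.1, sigma_o=0.1)
print(f"analysis: {x_a:.2f} +/- {sigma_a:.2f} mg m-3")
```

The same weighting logic, generalized to high-dimensional state vectors and observation operators, is what makes assembling consistent, well-documented observations from many repositories so consequential for reanalysis products.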
HARMONIZING OCEAN DATA MANAGEMENT
AND SYSTEMS
The success of integrated ocean research depends critically on the
ability to harmonize our approach to ocean data management and
data serving systems. Development of systems and processes that
are findable, accessible, interoperable, and reusable (FAIR) is cen-
tral to this effort. This requires comprehensive approaches to data
collection, documentation, and sharing. Standardized reporting
of observed data and metadata greatly enhances interoperabil-
ity and reusability and will require the development and adoption
of community-vetted reporting guidelines. The use of controlled
vocabularies that are machine-readable (e.g., those registered in the Marine Metadata Interoperability [MMI] Ontology Registry and Repository) and
the adoption of standardized units streamline data aggregation
and ingestion into models. Additionally, requiring quantitative
reporting of quality control and uncertainty measures as part of
metadata would allow scientists to judge whether the quality of a dataset is suitable for their applications.
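As an example of what such machine-readable reporting can look like in practice, the sketch below attaches a controlled-vocabulary standard name, units, and a quantitative uncertainty estimate to a small dataset; the CF standard_name convention is real, but the data values, uncertainty figure, and attribute choices here are illustrative assumptions.

```python
# Sketch: attaching controlled-vocabulary metadata and a quantitative
# uncertainty estimate with xarray. Values are fabricated; the attribute
# pattern follows CF conventions, but exact choices are illustrative.
import numpy as np
import xarray as xr

temp = xr.DataArray(
    np.array([18.2, 18.4, 18.1]),
    dims="time",
    attrs={
        "standard_name": "sea_water_temperature",  # CF controlled vocabulary
        "units": "degree_Celsius",
        "uncertainty": 0.05,                       # reported, not left implicit
        "uncertainty_units": "degree_Celsius",
    },
)
ds = xr.Dataset(
    {"sea_water_temperature": temp},
    attrs={"Conventions": "CF-1.8", "source": "hypothetical mooring deployment"},
)
ds.to_netcdf("example_fair.nc")  # a self-describing file another group can reuse
```

A downstream user, or an aggregation service, can then filter on the reported uncertainty rather than guessing at data quality.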
With numerous data repositories that utilize different data and
metadata practices and formats, finding and aggregating data are
challenging. Continued advancement of semantic approaches such as the Resource Description Framework (RDF), which enables a data user to query across databases, and of tools such as ERDDAP, which provides a consistent application programming interface (API) for extracting data in different formats for a range of applications, is strongly recommended to maximize return on investment in data streams and repositories.
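As an illustration of the consistent interface ERDDAP provides, the sketch below requests a subset of a dataset as CSV. The server base URL, dataset ID, and variable names are hypothetical placeholders; the general dataset.filetype?variables&constraints URL grammar is ERDDAP's, and changing the file-type suffix (e.g., to .nc or .json) returns the same subset in a different format.

```python
# Sketch of pulling data through ERDDAP's RESTful interface. The server,
# dataset ID, and variable names are hypothetical placeholders.
import pandas as pd

base = "https://example-erddap.org/erddap/tabledap"  # hypothetical server
dataset = "hypothetical_ctd_profiles"                # hypothetical dataset ID
query = "time,latitude,longitude,temperature&time>=2024-01-01&time<=2024-12-31"

# ERDDAP's CSV responses carry column names in row 0 and units in row 1,
# hence skiprows=[1]. Constraint characters may need percent-encoding in
# stricter clients.
df = pd.read_csv(f"{base}/{dataset}.csv?{query}", skiprows=[1])
print(df.head())
```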
Transparent provenance