Tutorial 2: A Lot of Weather Makes Climate - Exploring the ERA5 Reanalysis#

Week 1, Day 2, Ocean-Atmosphere Reanalysis

Content creators: Momme Hell

Content reviewers: Katrina Dobson, Danika Gupta, Maria Gonzalez, Will Gregory, Nahid Hasan, Sherry Mi, Beatriz Cosenza Muralles, Jenna Pearson, Chi Zhang, Ohad Zivan

Content editors: Jenna Pearson, Chi Zhang, Ohad Zivan

Production editors: Wesley Banfield, Jenna Pearson, Chi Zhang, Ohad Zivan

Our 2023 Sponsors: NASA TOPS and Google DeepMind

Tutorial Objectives#

In the previous tutorial, we learned about ENSO, which is a specific atmosphere-ocean dynamical phenomenon. You will now examine the atmosphere and the ocean systems more generally.

In this tutorial, you will learn to work with reanalysis data. These data combine observations and models of the Earth system, and are a critical tool for weather and climate science. You will first use two methods to access a specific reanalysis dataset (ECMWF’s ERA5): through PO.DAAC and through the Copernicus web API. You will then select and mask a region of interest and investigate how important climate variables change on medium-length timescales (hours to months) within this region.

By the end of this tutorial, you will be able to:

Access and select reanalysis data of climatically important variables.
Plot maps to explore changes on various time scales.
Compute and compare timeseries of different variables from reanalysis data.

Setup#

# !pip install pythia_datasets
# !pip install cartopy
# !pip install geoviews

# imports
from intake import open_catalog
import matplotlib.pyplot as plt
import matplotlib
import xarray as xr
import fsspec
import numpy as np

import boto3
import botocore
import datetime
import os
import pooch
import tempfile
import geoviews as gv
import holoviews
from geoviews import Dataset as gvDataset
import geoviews.feature as gf
from geoviews import Image as gvImage

from cartopy import crs as ccrs
from cartopy import feature as cfeature

# import warnings
# #  Suppress warnings issued by Cartopy when downloading data files
# warnings.filterwarnings('ignore')

Figure Settings#

# @title Figure Settings
import ipywidgets as widgets  # interactive display

%config InlineBackend.figure_format = 'retina'
plt.style.use(
    "https://raw.githubusercontent.com/ClimateMatchAcademy/course-content/main/cma.mplstyle"
)

Helper functions#

# @title Helper functions

def pooch_load(filelocation=None, filename=None, processor=None):
    shared_location = "/home/jovyan/shared/Data/tutorials/W1D2_StateoftheClimateOceanandAtmosphereReanalysis"  # this is different for each day
    user_temp_cache = tempfile.gettempdir()

    if os.path.exists(os.path.join(shared_location, filename)):
        file = os.path.join(shared_location, filename)
    else:
        file = pooch.retrieve(
            filelocation,
            known_hash=None,
            fname=os.path.join(user_temp_cache, filename),
            processor=processor,
        )

    return file

Video 1: ECMWF Reanalysis#

Section 2: Plotting Spatial Maps of Reanalysis Data#

First, let’s plot the region’s surface temperature for the first time step of the reanalysis dataset. To do this, let’s extract the air temperature data from the dataset containing all the variables.

ds_surface_temp_2m = ERA5_allvars.air_temperature_at_2_metres

We will plot this a little differently from how you have previously plotted a map (and from how you will plot in most tutorials) so that we can look at a few time steps interactively later. To do this, we use the package geoviews.

holoviews.extension("bokeh")

dataset_plot = gvDataset(ds_surface_temp_2m.isel(time0=0))  # select the first time step

# create the image
images = dataset_plot.to(
    gvImage, ["longitude", "latitude"], ["air_temperature_at_2_metres"], "hour"
)

# aesthetics, add coastlines etc.
images.opts(
    cmap="coolwarm",
    colorbar=True,
    width=600,
    height=400,
    projection=ccrs.PlateCarree(),
    clabel="2m Air Temperature [K]",
) * gf.coastline

In the above figure, coastlines are shown as black lines. Most of the selected region is land, with some ocean (lower left) and a lake (top middle).

Next, we will examine variability at two different frequencies using interactive plots:

Hourly variability
Daily variability

Note that in the previous tutorial you computed the monthly variability, or climatology, but here you only have one month of data loaded (March 2018). If you are curious about longer timescales, you will revisit them in the next tutorial!

# interactive plot of hourly frequency of surface temperature
# this cell may take a little longer as it contains several maps in a single plotting function
ds_surface_temp_2m_hour = ds_surface_temp_2m.groupby("time0.hour").mean()
dataset_plot = gvDataset(
    ds_surface_temp_2m_hour.isel(hour=slice(0, 12))
)  # only the first 12 time steps, as it is a time consuming task
images = dataset_plot.to(
    gvImage, ["longitude", "latitude"], ["air_temperature_at_2_metres"], "hour"
)
images.opts(
    cmap="coolwarm",
    colorbar=True,
    width=600,
    height=400,
    projection=ccrs.PlateCarree(),
    clabel="2m Air Temperature [K]",
) * gf.coastline

# interactive plot of daily frequency of surface temperature
# this cell may take a little longer as it contains several maps in a single plotting function
ds_surface_temp_2m_day = ds_surface_temp_2m.groupby("time0.day").mean()
dataset_plot = gvDataset(
    ds_surface_temp_2m_day.isel(day=slice(0, 10))
)  # only the first 10 time steps, as it is a time consuming task
images = dataset_plot.to(
    gvImage, ["longitude", "latitude"], ["air_temperature_at_2_metres"], "day"
)
images.opts(
    cmap="coolwarm",
    colorbar=True,
    width=600,
    height=400,
    projection=ccrs.PlateCarree(),
    clabel="2m Air Temperature [K]",
) * gf.coastline
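
The two groupings used in the cells above can be sketched on a small synthetic series. This is only an illustration under stated assumptions: it uses plain numpy rather than xarray, and the temperature values, the 15:00 peak, and the daily drift are all made up. Averaging by hour of day keeps the diurnal cycle and removes the day-to-day drift, while averaging by day does the opposite. For a single month of data, grouping by `time0.day` is equivalent to a per-day mean.

```python
import numpy as np

# Synthetic hourly series for 10 days (all numbers are invented for illustration):
# a daily mean that drifts upward plus a diurnal cycle peaking at 15:00.
n_days = 10
hours = np.arange(n_days * 24)
hour_of_day = hours % 24
day_index = hours // 24
diurnal = 5.0 * np.cos(2 * np.pi * (hour_of_day - 15) / 24)
drift = 0.5 * day_index
temp = 280.0 + drift + diurnal

# Rows are days, columns are hours of the day
temp_by_day = temp.reshape(n_days, 24)

# Analogue of groupby("time0.hour").mean(): average all samples sharing an hour of day
hourly_composite = temp_by_day.mean(axis=0)

# Analogue of groupby("time0.day").mean(): average the 24 samples of each day
daily_mean = temp_by_day.mean(axis=1)

print("warmest hour of the composite day:", hourly_composite.argmax())
print("daily means:", np.round(daily_mean, 2))
```

The hourly composite has 24 values and the daily mean has one value per day, which mirrors the `hour` and `day` sliders in the interactive plots.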

Question 2#

What differences do you notice between the hourly and daily interactive plots, and are there any interesting spatial patterns of these temperature changes?

# to_remove explanation

"""
1. On hourly timescales, the largest changes are over land because it responds faster than the ocean to the diurnal cycle of solar radiation. This is because the ocean has a higher heat capacity than the land surface. On daily timescales, the surface atmospheric temperature shows comparable variations across both the ocean and land.
""";

Section 3: Plotting Timeseries of Reanalysis Data#

Section 3.1: Surface Air Temperature Timeseries#

You have demonstrated that there are a lot of changes in surface temperature within a day and between days. It is crucial to understand this temporal variability in the data when performing climate analysis.

Rather than plotting interactive spatial maps for different timescales, in this last section you will create a timeseries of surface air temperature from the data you have already examined to look at variability on longer than daily timescales. Instead of taking the mean in time to create maps, you will now take the mean in space to create timeseries.

Note that the spatially averaged data will now only have a time coordinate, making it a timeseries (ts).

# find weights (this is a regular grid so we can use cos(lat))
weights = np.cos(np.deg2rad(ds_surface_temp_2m.lat))
weights.name = "weights"

# take the weighted spatial mean since the latitude range of the region of interest is large
ds_surface_temp_2m_ts = ds_surface_temp_2m.weighted(weights).mean(["lon", "lat"])
ds_surface_temp_2m_ts
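
To see why the cos(latitude) weights matter, here is a minimal numpy sketch of the same weighted mean on a tiny made-up grid. The grid, the field values, and the variable names are assumptions for illustration only and do not come from ERA5. On a regular latitude-longitude grid, cells shrink towards the poles in proportion to cos(latitude), so an unweighted mean over-counts high latitudes.

```python
import numpy as np

# Hypothetical regular grid: five latitudes, four longitudes
lat = np.array([0.0, 20.0, 40.0, 60.0, 80.0])
lon = np.array([0.0, 1.0, 2.0, 3.0])

# Made-up field that depends only on latitude: warm at the equator, cold near the pole
field = (300.0 - 0.5 * lat)[:, np.newaxis] * np.ones((lat.size, lon.size))

# Grid-cell area on a regular lat-lon grid is proportional to cos(latitude)
weights = np.cos(np.deg2rad(lat))

# The weights only vary with latitude, so average over longitude first,
# then take the weighted mean of the latitude rows
zonal_mean = field.mean(axis=1)
weighted_mean = np.sum(weights * zonal_mean) / np.sum(weights)
unweighted_mean = field.mean()

print("unweighted mean:", round(float(unweighted_mean), 2))
print("area-weighted mean:", round(float(weighted_mean), 2))
```

Because the cold high-latitude rows represent less area, the weighted mean comes out warmer than the unweighted one in this example.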

# plot the timeseries of surface temperature

fig, ax = plt.subplots(1, 1, figsize=(10, 3))

ax.plot(ds_surface_temp_2m_ts.time0, ds_surface_temp_2m_ts)
ax.set_ylabel("2m Air \nTemperature (K)")
ax.xaxis.set_tick_params(rotation=45)

Questions 3.1#

What is the dominant source of the high frequency (short timescale) variability?
What drives the lower frequency variability?
Would the ENSO variability that you computed in the previous tutorial show up here? Why or why not?

# to_remove explanation
"""
1. The high frequency variability can largely be attributed to the diurnal cycle, or the differences in solar radiation between night and day. This causes large variations in surface temperature, particularly over land and shallow water.
2. The low frequency variability can be attributed to synoptic patterns (e.g., weather) which can move cold or warm air around on timescales of days to weeks.
3. We do not have a long enough time series for ENSO to show up, but ENSO could indirectly affect this timeseries by changing weather patterns on shorter timescales.
""";

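One way to separate the two kinds of variability discussed in the explanation above is a 24-hour running mean. The sketch below is not part of the tutorial's own method: it uses a synthetic series with invented amplitudes and an assumed 5-day "weather" period. Averaging over exactly one day cancels the diurnal cycle and leaves the slower synoptic signal, slightly damped.

```python
import numpy as np

# Synthetic 30-day hourly series (all amplitudes and periods are assumptions)
hours = np.arange(30 * 24)
diurnal = 4.0 * np.sin(2 * np.pi * hours / 24)         # day-night cycle
synoptic = 3.0 * np.sin(2 * np.pi * hours / (5 * 24))  # ~5-day weather cycle
series = 285.0 + diurnal + synoptic

# 24-hour running mean: each output value averages one full day,
# so the diurnal cycle cancels and the slower signal remains
window = 24
kernel = np.ones(window) / window
low_freq = np.convolve(series, kernel, mode="valid")

print("range of full series:", round(float(np.ptp(series)), 2))
print("range after 24 h running mean:", round(float(np.ptp(low_freq)), 2))
```

With `mode="valid"` the smoothed series is 23 samples shorter than the input, because only windows that fit entirely inside the data are kept.
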
Section 3.2: Comparing Timeseries of Multiple Variables#

Below you will calculate the timeseries of the surface air temperature which we just plotted, alongside timeseries of several other ERA5 variables for the same period and region: 10-meter wind speed, atmospheric surface pressure, and sea surface temperature.

ERA5_allvars_ts = ERA5_allvars.weighted(weights).mean(["lon", "lat"])

plot_vars = [
    "air_temperature_at_2_metres",
    "wind_speed",
    "surface_air_pressure",
    "sea_surface_temperature",
]

fig, ax_list = plt.subplots(len(plot_vars), 1, figsize=(10, 13), sharex=True)

for var, ax in zip(plot_vars, ax_list):

    ax.plot(ERA5_allvars_ts.time0, ERA5_allvars_ts[var])
    ax.set_ylabel(
        ERA5_allvars[var].attrs["long_name"] + ": " + ERA5_allvars[var].attrs["units"],
        fontsize=12,
    )
    ax.xaxis.set_tick_params(rotation=45)

Questions 3.2#

Which variable shows variability that is dominated by:

The diurnal cycle?
The synoptic [~5 day] scale?
A mix of these two timescales?
Longer timescales?

# to_remove explanation
"""
1. The 2-meter temperature is dominated by the diurnal cycle.
2. The surface pressure, which is usually associated with storms, is dominated by the synoptic scale.
3. The 10-meter wind speed shows influences from both the diurnal cycle and the synoptic scale.
4. The ocean surface temperature shows some sensitivity to the diurnal cycle but is dominated by longer timescale (>weeks) variations than the atmospheric variables.
""";

Bonus Section 1: Selecting a Different Spatial Region#

Define another spatial region, such as where you live, by selecting a longitude and latitude range of your choosing. To find the longitude and latitude coordinates of your region, you can use Google Earth view and read the position of your cursor in the lower right corner.

Bonus Section 1.1: Note About the Geographic Coordinate System and the Coordinates Used in This Dataset#

A point on Earth is described by latitude-longitude coordinates relative to the zero-meridian line going through Greenwich in London, UK (longitude = 0 degrees) and the zero-latitude line along the equator (latitude = 0 degrees). Points east of Greenwich up to the dateline on the opposite side of the globe are referenced as 0 to +180, and points to the west of Greenwich are 0 to -180. -180 and +180 refer to the same longitude, the so-called dateline in the central Pacific.

However, our data is referenced in a slightly different way where longitude runs from 0 to 360 rather than -180 to +180. Longitude increases as you move east of Greenwich, until you reach Greenwich again (0 or 360 degrees), rather than stopping at the dateline.
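
Converting between the two longitude conventions is a modulo operation. The helpers below are a hedged sketch: `to_0_360`, `to_pm180`, and `lon_mask` are hypothetical names introduced here, not functions from the tutorial or from any library. The mask helper also handles a region that straddles Greenwich, where the western edge has a larger 0 to 360 value than the eastern edge.

```python
import numpy as np


def to_0_360(lon_deg):
    """Map longitudes from the -180..180 convention to 0..360."""
    return np.mod(lon_deg, 360.0)


def to_pm180(lon_deg):
    """Map longitudes from the 0..360 convention to -180..180."""
    return np.mod(np.asarray(lon_deg) + 180.0, 360.0) - 180.0


def lon_mask(lon_grid, west, east):
    """Boolean mask of grid longitudes (0..360) between west and east edges.

    The edges may be given in either convention. If the box crosses the
    Greenwich meridian, the selection wraps around 0/360.
    """
    west, east = to_0_360(west), to_0_360(east)
    if west <= east:
        return (lon_grid >= west) & (lon_grid <= east)
    return (lon_grid >= west) | (lon_grid <= east)


# Hypothetical grid with 10-degree spacing in the 0..360 convention
lon_grid = np.arange(0.0, 360.0, 10.0)

# A box from 10 degrees west to 20 degrees east of Greenwich
mask = lon_mask(lon_grid, -10.0, 20.0)
print(lon_grid[mask])
print(to_0_360(-120.0), to_pm180(240.0))
```

Note that +180 maps to -180 in `to_pm180`; as described above, both values refer to the same meridian.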

Summary#

In this tutorial, you learned how to access and process ERA5 reanalysis data. You are able to select specific regions within the reanalysis dataset and perform operations such as taking spatial and temporal averages.

You also looked at different climate variables to identify the variability present at different timescales.

Resources#

Data for this tutorial can be accessed here.
