Tutorial 7: Other Computational Tools in Xarray#

Week 1, Day 1, Climate System Overview

Content creators: Sloane Garelick, Julia Kent

Content reviewers: Katrina Dobson, Younkap Nina Duplex, Danika Gupta, Maria Gonzalez, Will Gregory, Nahid Hasan, Sherry Mi, Beatriz Cosenza Muralles, Jenna Pearson, Agustina Pesce, Chi Zhang, Ohad Zivan

Content editors: Jenna Pearson, Chi Zhang, Ohad Zivan

Production editors: Wesley Banfield, Jenna Pearson, Chi Zhang, Ohad Zivan

Our 2023 Sponsors: NASA TOPS and Google DeepMind

Pythia credit: Rose, B. E. J., Kent, J., Tyle, K., Clyne, J., Banihirwe, A., Camron, D., May, R., Grover, M., Ford, R. R., Paul, K., Morley, J., Eroglu, O., Kailyn, L., & Zacharias, A. (2023). Pythia Foundations (Version v2023.05.01) https://zenodo.org/record/8065851

Tutorial Objectives#

Thus far, we’ve learned about various climate processes in the videos, and we’ve explored tools in Xarray that are useful for analyzing and interpreting climate data.

In this tutorial you’ll continue using the SST data from CESM2 and practice using some additional computational tools in Xarray to resample your data, which can help with data comparison and analysis. The functions you will use are:

resample: Groupby-like functionality specifically for time dimensions. Can be used for temporal upsampling and downsampling. Additional information about resampling can be found in the Xarray documentation.
rolling: Useful for computing aggregations over moving windows of your dataset, e.g. moving averages. Additional information about rolling-window operations can be found in the Xarray documentation.
coarsen: Generic functionality for downsampling data. Additional information about coarsening can be found in the Xarray documentation.

Setup#

# imports
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr
from pythia_datasets import DATASETS
import pandas as pd

Figure Settings#

# @title Figure Settings
import ipywidgets as widgets  # interactive display

%config InlineBackend.figure_format = 'retina'
plt.style.use(
    "https://raw.githubusercontent.com/ClimateMatchAcademy/course-content/main/cma.mplstyle"
)

Video 1: Carbon Cycle and the Greenhouse Effect#

Section 1: High-level Computation Functionality#

In this tutorial you will learn several methods for changing the resolution of data: resample, rolling, and coarsen. Each is discussed in detail in the sections below.

First, let’s load the same data that we used in the previous tutorial (monthly SST data from CESM2):
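The cell that loads the data is not shown here. A minimal sketch of what it might look like, assuming the file is the Pythia example dataset named `CESM2_sst_data.nc` and that the `pythia_datasets` package is installed (the file name and the helper `load_sst` are assumptions for illustration, not taken from this page):

```python
import xarray as xr


def load_sst(filename="CESM2_sst_data.nc"):
    """Fetch a Pythia example file and open it (the file name is an assumption)."""
    # imported lazily so the sketch can be read without the package installed
    from pythia_datasets import DATASETS

    # DATASETS.fetch downloads the file on first use and returns its local path
    filepath = DATASETS.fetch(filename)
    return xr.open_dataset(filepath)
```

Calling `ds = load_sst()` would then provide the `ds` object that the cells below operate on.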

Section 1.1: Resampling Data#

For upsampling or downsampling temporal resolutions, we can use the resample() method in Xarray. For example, you can use this function to downsample a dataset from hourly to 6-hourly resolution.
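The hourly to 6-hourly case mentioned above can be sketched on a small synthetic series. The data and the names `hourly` and `six_hourly` are hypothetical and only serve to show the call pattern:

```python
import numpy as np
import pandas as pd
import xarray as xr

# hypothetical hourly series: 48 values counting up from 0
time = pd.date_range("2000-01-01", periods=48, freq="h")
hourly = xr.DataArray(np.arange(48.0), coords={"time": time}, dims="time", name="demo")

# downsample by averaging every 6-hour bin (bins are labelled by their start time)
six_hourly = hourly.resample(time="6h").mean()

print(six_hourly.sizes["time"])  # number of 6-hour bins in two days
```

Note that `resample()` on its own only groups the data; an aggregation such as `.mean()` is what produces the downsampled values.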

Our original SST data has monthly resolution. Let’s use resample() to downsample it to annual frequency:

# resample from a monthly to an annual frequency
# "YS" means year start; it is the current spelling of the older alias "AS"
tos_yearly = ds.tos.resample(time="YS")
tos_yearly

DataArrayResample, grouped over '__resample_dim__'
15 groups with labels 2000-01-01, 00:00:00, ..., 201....

# calculate the global mean of the resampled data
annual_mean = tos_yearly.mean()
annual_mean_global = annual_mean.mean(dim=["lat", "lon"])
annual_mean_global.plot()

[Figure: time series of the annual global mean SST]
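The comparison in Section 1.4 also uses a moving average and a coarsened dataset. A minimal sketch of how `rolling` and `coarsen` behave, on a hypothetical synthetic monthly series, with window lengths of 6 and 4 months taken from the legend labels in that section (`center=True` and all variable names here are assumptions):

```python
import numpy as np
import pandas as pd
import xarray as xr

# hypothetical monthly series: 24 values counting up from 0
time = pd.date_range("2000-01-01", periods=24, freq="MS")
monthly = xr.DataArray(np.arange(24.0), coords={"time": time}, dims="time", name="demo")

# 6-month moving average: same length as the input,
# with NaN wherever the window runs past either end of the series
moving_avg = monthly.rolling(time=6, center=True).mean()

# coarsen: average non-overlapping blocks of 4 months
# (by default the series length must be a multiple of the block size)
coarsened = monthly.coarsen(time=4).mean()

print(moving_avg.sizes["time"], coarsened.sizes["time"])
```

The key difference is that `rolling` keeps the original time axis and smooths along it, while `coarsen` returns a shorter series with one value per block.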

Section 1.4: Compare the Resampling Methods#

Now that we’ve tried multiple resampling methods on different temporal resolutions, we can compare the resampled datasets to the original.

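# global mean of the original monthly data, for comparison
original_global = ds.mean(dim=["lat", "lon"])

# plot all four time series on the same axes, labelling each line directly
original_global.tos.plot(size=6, label="original data (monthly)")
coarse_data.tos.plot(label="coarsened (4 months)")
tos_m_avg_global.plot(label="moving average (6 months)")
annual_mean_global.plot(label="annually resampled (12 months)")

# build the legend from the labels given above
plt.legend()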

[Figure: global mean SST time series for the original, coarsened, moving-average, and annually resampled data]

Questions 1.4: Climate Connection#

What type of information can you obtain from each time series?
In what scenarios would you use different temporal resolutions?

# to_remove explanation

"""
1. In general, by examining the data at these different time scales, you can get a more comprehensive understanding of the SST variations and their potential causes.
2. The original monthly data gives you the most granular view of the data, allowing you to see monthly variations in SST. Coarsening the data over 4-month periods reduces the temporal resolution but provides a slightly smoothed series that could help identify patterns or trends over this larger time period. A 6-month moving average could be useful for identifying semi-annual trends and reducing the impact of short-term noise in the data. The annually resampled (12 months) data provides a high-level view of the SST data, emphasizing the annual pattern. This can be useful for identifying long-term trends or changes in the data over the span of years.
""";

Summary#

In this tutorial, we’ve explored Xarray tools that help simplify and interpret climate data. Given the complexity and variability of climate data, tools like resample, rolling, and coarsen make datasets easier to compare and help reveal long-term trends. You’ve also practiced a valuable technique: calculating moving averages.

Resources#

Code and data for this tutorial are based on existing content from Project Pythia.
