
Tutorial 9: Masking with Multiple Conditions#

Week 1, Day 1, Climate System Overview

Content creators: Sloane Garelick, Julia Kent

Content reviewers: Katrina Dobson, Younkap Nina Duplex, Danika Gupta, Maria Gonzalez, Will Gregory, Nahid Hasan, Sherry Mi, Beatriz Cosenza Muralles, Jenna Pearson, Agustina Pesce, Chi Zhang, Ohad Zivan

Content editors: Jenna Pearson, Chi Zhang, Ohad Zivan

Production editors: Wesley Banfield, Jenna Pearson, Chi Zhang, Ohad Zivan

Our 2023 Sponsors: NASA TOPS and Google DeepMind


Pythia credit: Rose, B. E. J., Kent, J., Tyle, K., Clyne, J., Banihirwe, A., Camron, D., May, R., Grover, M., Ford, R. R., Paul, K., Morley, J., Eroglu, O., Kailyn, L., & Zacharias, A. (2023). Pythia Foundations (Version v2023.05.01) https://zenodo.org/record/8065851


Tutorial Objectives#

In the previous tutorial, you masked data using one condition (areas where SST was above 0ºC). You can also mask data using multiple conditions. For example, you can mask data from regions outside a certain spatial area by providing constraints on the latitude and longitude.

In this tutorial, you will practice masking data using multiple conditions in order to interpret SST in the tropical Pacific Ocean in the context of the El Niño Southern Oscillation (ENSO).

Setup#

# imports
import numpy as np
import xarray as xr
from pythia_datasets import DATASETS
import pandas as pd
import matplotlib.pyplot as plt

Figure Settings#

# @title Figure Settings
import ipywidgets as widgets  # interactive display

%config InlineBackend.figure_format = 'retina'
plt.style.use(
    "https://raw.githubusercontent.com/ClimateMatchAcademy/course-content/main/cma.mplstyle"
)

Video 1: Past, Present, and Future Climate#

Tutorial slides#

These are the slides for the videos in all tutorials today

Section 1: Using .where() with multiple conditions#

First, let’s load the same data that we used in the previous tutorial (monthly SST data from CESM2):

filepath = DATASETS.fetch("CESM2_sst_data.nc")
ds = xr.open_dataset(filepath)
ds
/home/wesley/miniconda3/envs/climatematch/lib/python3.10/site-packages/xarray/conventions.py:431: SerializationWarning: variable 'tos' has multiple fill values {1e+20, 1e+20}, decoding all values to NaN.
  new_vars[k] = decode_cf_variable(
<xarray.Dataset>
Dimensions:    (time: 180, d2: 2, lat: 180, lon: 360)
Coordinates:
  * time       (time) object 2000-01-15 12:00:00 ... 2014-12-15 12:00:00
  * lat        (lat) float64 -89.5 -88.5 -87.5 -86.5 ... 86.5 87.5 88.5 89.5
  * lon        (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 356.5 357.5 358.5 359.5
Dimensions without coordinates: d2
Data variables:
    time_bnds  (time, d2) object ...
    lat_bnds   (lat, d2) float64 ...
    lon_bnds   (lon, d2) float64 ...
    tos        (time, lat, lon) float32 ...
Attributes: (12/45)
    Conventions:            CF-1.7 CMIP-6.2
    activity_id:            CMIP
    branch_method:          standard
    branch_time_in_child:   674885.0
    branch_time_in_parent:  219000.0
    case_id:                972
    ...                     ...
    sub_experiment_id:      none
    table_id:               Omon
    tracking_id:            hdl:21.14100/2975ffd3-1d7b-47e3-961a-33f212ea4eb2
    variable_id:            tos
    variant_info:           CMIP6 20th century experiments (1850-2014) with C...
    variant_label:          r11i1p1f1

.where() allows us to mask using multiple conditions. To do this, we need to enclose each conditional expression in parentheses, because & and | bind more tightly than comparison operators such as > and <. To combine conditions, we use the bit-wise and (&) operator and/or the bit-wise or (|) operator, both of which are applied element by element. Let’s use .where() to isolate locations with temperature values greater than 25 ºC and less than 30 ºC:

# take the last time step as our data
sample = ds.tos.isel(time=-1)

# just keep data between 25-30 C
sample.where((sample > 25) & (sample < 30)).plot(size=6)
<matplotlib.collections.QuadMesh at 0x7effaccf8160>
../../../_images/W1D1_Tutorial9_17_1.png
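The same pattern extends to the or (|) operator and to negation with ~. The following is a minimal sketch on a small, made-up array (the values in `toy` are hypothetical and are not taken from the CESM2 data), so it can be run without downloading anything:

```python
import numpy as np
import xarray as xr

# hypothetical toy SST values in degrees C (not taken from CESM2)
toy = xr.DataArray(
    np.array([[-1.0, 10.0, 26.0], [28.0, 31.0, np.nan]]),
    dims=("lat", "lon"),
    coords={"lat": [-0.5, 0.5], "lon": [0.5, 1.5, 2.5]},
)

# AND: keep values strictly between 25 and 30 (both conditions must hold)
warm_band = toy.where((toy > 25) & (toy < 30))

# OR: keep values below 0 or above 30 (either condition may hold)
extremes = toy.where((toy < 0) | (toy > 30))

# NOT: ~ inverts a boolean mask, keeping everything outside the 25-30 band
outside_band = toy.where(~((toy > 25) & (toy < 30)))

# count() reports how many non-NaN cells survive each mask
print(int(warm_band.count()), int(extremes.count()), int(outside_band.count()))
```

Note that each comparison sits in its own pair of parentheses; without them Python would evaluate `25 & toy` first and the expression would fail or give an unintended result.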

Section 2: Using .where() with a Custom Fill Value#

.where() can take a second argument, which, if supplied, defines a fill value for the masked region. Below we fill masked regions with a constant 0:

sample.where((sample > 25) & (sample < 30), 0).plot(size=6)
<matplotlib.collections.QuadMesh at 0x7effaaa4e140>
../../../_images/W1D1_Tutorial9_20_1.png
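One detail worth keeping in mind is that missing values (NaN, for example over land) fail any comparison, so they are replaced by the fill value as well. The sketch below uses a small hypothetical array (`toy` is made up for illustration) to show this, along with one possible way to keep missing cells as NaN by adding `isnull()` to the condition:

```python
import numpy as np
import xarray as xr

# hypothetical toy field; the NaN stands in for a land cell
toy = xr.DataArray(
    np.array([[10.0, 26.0], [28.0, np.nan]]),
    dims=("lat", "lon"),
)
cond = (toy > 25) & (toy < 30)

# the second positional argument of .where() is named `other`
filled_zero = toy.where(cond, other=0)

# NaN > 25 is False, so the NaN cell was filled with 0 above;
# OR-ing with isnull() leaves missing cells untouched instead
keep_missing = toy.where(cond | toy.isnull(), other=0)

print(filled_zero.values)
print(keep_missing.values)
```

Whether land should appear as 0 or stay missing depends on what you plan to do next: a fill value of 0 would bias any average computed afterwards.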

Section 3: Using .where() with Specific Coordinates#

We can use coordinates to apply a mask as well. For example, we can use a mask to assess tropical Pacific SST associated with the El Niño Southern Oscillation (ENSO). As we learned in the video, ENSO is a climate phenomenon that originates in the tropical Pacific Ocean but has global impacts on atmospheric circulation, temperature and precipitation. The two phases of ENSO are El Niño (warmer than average SSTs in the central and eastern tropical Pacific Ocean) and La Niña (cooler than average SSTs in the central and eastern tropical Pacific Ocean). The Niño 3.4 region (5ºS to 5ºN, 170ºW to 120ºW) is an area in the central and eastern Pacific Ocean that is often used for determining the phase of ENSO. Below, we will use the latitude and longitude coordinates to mask everywhere outside of the Niño 3.4 region. Note that longitude in our data is given in degrees East (0 to 360), so the values we input for longitude (190 to 240) will be shifted compared to the figure below.

# input the conditions for the latitude and longitude values we wish to preserve
sample.where(
    (sample.lat < 5) & (sample.lat > -5) & (sample.lon > 190) & (sample.lon < 240)
).plot(size=6)
<matplotlib.collections.QuadMesh at 0x7effaa937a90>
../../../_images/W1D1_Tutorial9_23_1.png
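The longitude values 190 and 240 used above follow from converting the commonly quoted Niño 3.4 bounds (170ºW to 120ºW) to degrees East. The sketch below works through that arithmetic on a hypothetical 1-degree grid (`field` is a placeholder array of ones laid out like the dataset's cell centers, not the real data) and also shows the `drop=True` option of .where(), which trims rows and columns that are entirely masked:

```python
import numpy as np
import xarray as xr

# Nino 3.4 longitude bounds in signed degrees (negative = West)
lon_west_edge, lon_east_edge = -170.0, -120.0

# convert to the 0-360 degrees East convention used by the dataset
lon_min = lon_west_edge % 360
lon_max = lon_east_edge % 360

# hypothetical 1-degree grid with cell centers at x.5, like the tutorial data
lat = np.arange(-89.5, 90.0, 1.0)
lon = np.arange(0.5, 360.0, 1.0)
field = xr.DataArray(
    np.ones((lat.size, lon.size)),
    dims=("lat", "lon"),
    coords={"lat": lat, "lon": lon},
)

# 1-D conditions on lat and lon broadcast to a 2-D (lat, lon) mask
in_box = (
    (field.lat > -5) & (field.lat < 5) & (field.lon > lon_min) & (field.lon < lon_max)
)

masked = field.where(in_box)              # same shape, NaN outside the box
cropped = field.where(in_box, drop=True)  # only the box remains

print(lon_min, lon_max, masked.shape, cropped.shape)
```

Using `drop=True` is optional; it mainly helps when you want a plot or array limited to the region rather than a global map with most cells blank.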

Now let’s look at a time series of the data from this masked region. Rather than working with a single time step, we can apply the same mask to every time step in ds.tos and then take the spatial mean of the masked data to assess changes in Niño 3.4 SST over the full record (2000 to 2014).

nino = ds.tos.where(
    (sample.lat < 5) & (sample.lat > -5) & (sample.lon > 190) & (sample.lon < 240)
)

# average the masked field (nino), not the full ds.tos, so that only
# grid cells inside the Niño 3.4 box contribute to the spatial mean
nino_mean = nino.mean(dim=["lat", "lon"])
nino_mean
<xarray.DataArray 'tos' (time: 180)>
array([...], dtype=float32)
Coordinates:
  * time     (time) object 2000-01-15 12:00:00 ... 2014-12-15 12:00:00
nino_mean.plot()
[<matplotlib.lines.Line2D at 0x7effaa84bf40>]
../../../_images/W1D1_Tutorial9_26_1.png
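One way to sanity-check a masked spatial mean is to compare it with the mean of the same box selected by coordinate slicing with .sel(). The sketch below does this on random, hypothetical data (`toy` is not the CESM2 field). It assumes cell centers at x.5, so the strict inequalities in the mask and the inclusive slices pick out the same grid cells:

```python
import numpy as np
import xarray as xr

# hypothetical (time, lat, lon) data on a 1-degree grid
rng = np.random.default_rng(0)
lat = np.arange(-89.5, 90.0, 1.0)
lon = np.arange(0.5, 360.0, 1.0)
toy = xr.DataArray(
    rng.normal(size=(3, lat.size, lon.size)),
    dims=("time", "lat", "lon"),
    coords={"time": [0, 1, 2], "lat": lat, "lon": lon},
)

# approach 1: mask outside the box, then average (NaNs are skipped)
box = (toy.lat > -5) & (toy.lat < 5) & (toy.lon > 190) & (toy.lon < 240)
mean_from_mask = toy.where(box).mean(dim=["lat", "lon"])

# approach 2: select the box by coordinate slices, then average
mean_from_sel = toy.sel(lat=slice(-5, 5), lon=slice(190, 240)).mean(
    dim=["lat", "lon"]
)

print(np.allclose(mean_from_mask.values, mean_from_sel.values))
```

If the two results disagree, the usual culprit is that the mean was taken over the unmasked array or that the box edges differ between the two approaches.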

Questions 3: Climate Connection#

  1. What patterns (e.g. cycles, trends) do you observe in this SST time series for the Niño 3.4 region?

  2. What do you think might be responsible for the patterns you observe? What about any trends?

  3. Notice that we did not use a weighted mean. Do you think the results would be very different if we did weight the mean?
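Question 3 can be explored numerically. The sketch below is one possible approach, assuming cosine-of-latitude weights as a stand-in for grid-cell area on a regular latitude-longitude grid; the field `toy` is hypothetical random data around 27 ºC, not the CESM2 output:

```python
import numpy as np
import xarray as xr

# hypothetical field covering only the Nino 3.4 box (cell centers at x.5)
lat = np.arange(-4.5, 5.0, 1.0)
lon = np.arange(190.5, 240.0, 1.0)
rng = np.random.default_rng(0)
toy = xr.DataArray(
    27.0 + rng.normal(scale=0.5, size=(lat.size, lon.size)),
    dims=("lat", "lon"),
    coords={"lat": lat, "lon": lon},
)

# assumed weights: cell area on a regular grid scales with cos(latitude)
weights = np.cos(np.deg2rad(toy.lat))
weights.name = "weights"

unweighted = toy.mean(dim=["lat", "lon"])
weighted = toy.weighted(weights).mean(dim=["lat", "lon"])

print(float(weights.min()), float(weights.max()))
print(float(unweighted), float(weighted))
```

Comparing the smallest and largest weight gives a quick sense of how much the cell areas differ across the region before looking at the two means.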

# to_remove explanation

"""
1. The SST time series indicates an increase from 2000 to 2014 for the Niño 3.4 region. On top of this increase we see seasonal and interannual (year-to-year) fluctuations.
2. The seasonal variability could be related to seasonally varying insolation, overlying wind patterns, and precipitation, while the interannual variability could be tied to ENSO events. The long-term trend of increasing SSTs could be attributed to human-induced global warming or to decadal variability, as our time series is quite short.
3. In the equatorial region, the grid cells are very similar in size, so weighting should change the result only slightly. You could compute the weighted average and compare to confirm that this is the case with our model output.

These questions are just to get you thinking. Note that from this time series alone we cannot attribute the observed SST changes to specific processes.
""";

Summary#

  • Similar to NumPy, arithmetic operations are vectorized over a DataArray

  • Xarray provides aggregation methods like sum() and mean(), with the option to specify the dimension(s) over which the operation is applied

  • groupby enables the convenient split-apply-combine workflow

  • The .where() method allows for filtering or replacing of data based on one or more provided conditions

Resources#

Code and data for this tutorial are based on existing content from Project Pythia.