{ "cells": [ { "cell_type": "markdown", "id": "e5bddb1b", "metadata": { "execution": {} }, "source": [ "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ClimateMatchAcademy/course-content/blob/main/tutorials/W1D1_ClimateSystemOverview/W1D1_Tutorial5.ipynb)   \"Open" ] }, { "cell_type": "markdown", "id": "BSjO7xX42sEH", "metadata": { "execution": {} }, "source": [ "# Tutorial 5: Xarray Data Analysis and Climatology\n", "\n", "**Week 1, Day 1, Climate System Overview**\n", "\n", "**Content creators:** Sloane Garelick, Julia Kent\n", "\n", "**Content reviewers:** Katrina Dobson, Younkap Nina Duplex, Danika Gupta, Maria Gonzalez, Will Gregory, Nahid Hasan, Sherry Mi, Beatriz Cosenza Muralles, Jenna Pearson, Agustina Pesce, Chi Zhang, Ohad Zivan\n", "\n", "**Content editors:** Jenna Pearson, Chi Zhang, Ohad Zivan\n", "\n", "**Production editors:** Wesley Banfield, Jenna Pearson, Chi Zhang, Ohad Zivan\n", "\n", "**Our 2023 Sponsors:** NASA TOPS and Google DeepMind" ] }, { "cell_type": "markdown", "id": "e90a481e-8dd8-4d05-a5a1-a612f89cd637", "metadata": { "execution": {} }, "source": [ "## ![project pythia](https://projectpythia.org/_static/images/logos/pythia_logo-blue-rtext.svg)\n", "\n", "Pythia credit: Rose, B. E. J., Kent, J., Tyle, K., Clyne, J., Banihirwe, A., Camron, D., May, R., Grover, M., Ford, R. R., Paul, K., Morley, J., Eroglu, O., Kailyn, L., & Zacharias, A. (2023). Pythia Foundations (Version v2023.05.01) https://zenodo.org/record/8065851\n", "\n", "## ![CMIP.png](https://github.com/ClimateMatchAcademy/course-content/blob/main/tutorials/Art/CMIP.png?raw=true)\n" ] }, { "cell_type": "markdown", "id": "z99xmBTDi3JS", "metadata": { "execution": {} }, "source": [ "# Tutorial Objectives\n", "\n", "Global climate can vary on long timescales, but it's also important to understand seasonal variations. For example, seasonal variations in precipitation associated with the migration of the [Intertropical Convergence Zone (ITCZ)](https://glossary.ametsoc.org/wiki/Intertropical_convergence_zone#:~:text=(Also%20called%20ITCZ%2C%20equatorial%20convergence,and%20Northern%20Hemispheres%2C%20respectively).) and monsoon systems occur in response to seasonal changes in temperature. In this tutorial, we will use data analysis tools in Xarray to explore the seasonal climatology of global temperature. Specifically, in this tutorial, we'll use the `groupby` operation in Xarray, which involves the following steps:\n", "\n", "- **Split**: group data by value (e.g., month).\n", "- **Apply**: compute some function (e.g., aggregate) within the individual groups.\n", "- **Combine**: merge the results of these operations into an output dataset." ] }, { "cell_type": "markdown", "id": "0af7bee1-3de3-453a-8ae8-bcd7910b4266", "metadata": { "execution": {}, "tags": [] }, "source": [ "# Setup\n" ] }, { "cell_type": "code", "execution_count": null, "id": "06073287-7bdb-45b5-9cec-8cdf123adb49", "metadata": { "execution": {}, "executionInfo": { "elapsed": 2358, "status": "ok", "timestamp": 1681572562093, "user": { "displayName": "Sloane Garelick", "userId": "04706287370408131987" }, "user_tz": 240 }, "tags": [] }, "outputs": [], "source": [ "# imports\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import xarray as xr\n", "from pythia_datasets import DATASETS\n", "import pandas as pd\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Figure Settings\n" ] }, { "cell_type": "code", "execution_count": null, "id": "72e04965-e982-444d-b3da-4e1e639c6899", "metadata": { "cellView": "form", "execution": {}, "tags": [ "hide-input" ] }, "outputs": [], "source": [ "# @title Figure Settings\n", "import ipywidgets as widgets # interactive display\n", "\n", "%config InlineBackend.figure_format = 'retina'\n", "plt.style.use(\n", " \"https://raw.githubusercontent.com/ClimateMatchAcademy/course-content/main/cma.mplstyle\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Video 1: Terrestrial Temperature and Rainfall\n" ] }, { "cell_type": "code", "execution_count": null, "id": "21725d4b-ee68-42aa-af76-70392b4ab6ac", "metadata": { "cellView": "form", "execution": {}, "tags": [ "remove-input" ] }, "outputs": [], "source": [ "# @title Video 1: Terrestrial Temperature and Rainfall\n", "\n", "from ipywidgets import widgets\n", "from IPython.display import YouTubeVideo\n", "from IPython.display import IFrame\n", "from IPython.display import display\n", "\n", "\n", "class PlayVideo(IFrame):\n", " def __init__(self, id, source, page=1, width=400, height=300, **kwargs):\n", " self.id = id\n", " if source == 'Bilibili':\n", " src = f'https://player.bilibili.com/player.html?bvid={id}&page={page}'\n", " elif source == 'Osf':\n", " src = f'https://mfr.ca-1.osf.io/render?url=https://osf.io/download/{id}/?direct%26mode=render'\n", " super(PlayVideo, self).__init__(src, width, height, **kwargs)\n", "\n", "\n", "def display_videos(video_ids, W=400, H=300, fs=1):\n", " tab_contents = []\n", " for i, video_id in enumerate(video_ids):\n", " out = widgets.Output()\n", " with out:\n", " if video_ids[i][0] == 'Youtube':\n", " video = YouTubeVideo(id=video_ids[i][1], width=W,\n", " height=H, fs=fs, rel=0)\n", " print(f'Video available at https://youtube.com/watch?v={video.id}')\n", " else:\n", " video = PlayVideo(id=video_ids[i][1], source=video_ids[i][0], width=W,\n", " height=H, fs=fs, autoplay=False)\n", " if video_ids[i][0] == 'Bilibili':\n", " print(f'Video available at https://www.bilibili.com/video/{video.id}')\n", " elif video_ids[i][0] == 'Osf':\n", " print(f'Video available at https://osf.io/{video.id}')\n", " display(video)\n", " tab_contents.append(out)\n", " return tab_contents\n", "\n", "\n", "video_ids = [('Youtube', 'SyvFyT3jVM8'), ('Bilibili', 'BV1ho4y1C7Eo')]\n", "tab_contents = display_videos(video_ids, W=730, H=410)\n", "tabs = widgets.Tab()\n", "tabs.children = tab_contents\n", "for i in range(len(tab_contents)):\n", " tabs.set_title(i, video_ids[i][0])\n", "display(tabs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tutorial slides\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " These are the slides for the videos in all tutorials today\n" ] }, { "cell_type": "code", "execution_count": null, "id": "1bfcf3bd-6805-4cfb-90c4-a9067f6ce91c", "metadata": { "cellView": "form", "execution": {}, "pycharm": { "name": "#%%\n" }, "tags": [ "remove-input" ] }, "outputs": [], "source": [ "# @title Tutorial slides\n", "# @markdown These are the slides for the videos in all tutorials today\n", "from IPython.display import IFrame\n", "link_id = \"9z6km\"" ] }, { "attachments": {}, "cell_type": "markdown", "id": "8704803f-300d-4631-a2fa-f62d18726d1c", "metadata": { "execution": {}, "tags": [] }, "source": [ "# Section 1: GroupBy: Split, Apply, Combine\n", "\n", "Simple aggregations (as we learned in the previous tutorial) can give useful summary of our dataset, but often we would prefer to aggregate conditionally on some coordinate labels or groups. Xarray provides the so-called `groupby` operation which enables the **split-apply-combine** workflow on Xarray DataArrays and Datasets. The split-apply-combine operation is illustrated in this figure from [Project Pythia](https://foundations.projectpythia.org/core/xarray/computation-masking.html):\n", "\n", "\"split-apply-combine\"\n", "\n", "- The **split** step involves breaking up and grouping an xarray Dataset or DataArray depending on the value of the specified group key.\n", "- The **apply** step involves computing some function, usually an aggregate, transformation, or filtering, within the individual groups.\n", "- The **combine** step merges the results of these operations into an output xarray Dataset or DataArray.\n", "\n", "We are going to use `groupby` to remove the seasonal cycle (\"climatology\") from our dataset, which will allow us to better observe long-term trends in the data. See the [xarray `groupby` user guide](https://xarray.pydata.org/en/stable/user-guide/groupby.html) for more examples of what `groupby` can take as an input." ] }, { "cell_type": "markdown", "id": "9719db5b-e645-4815-b8df-d454fa7703e7", "metadata": { "execution": {} }, "source": [ "Let's start by loading the same data that we used in the previous tutorial (monthly SST data from CESM2):" ] }, { "cell_type": "code", "execution_count": null, "id": "7837f8bd-da89-4718-ab02-d5107576d2d6", "metadata": { "execution": {}, "executionInfo": { "elapsed": 388, "status": "ok", "timestamp": 1681573026385, "user": { "displayName": "Sloane Garelick", "userId": "04706287370408131987" }, "user_tz": 240 }, "tags": [] }, "outputs": [], "source": [ "filepath = DATASETS.fetch(\"CESM2_sst_data.nc\")\n", "ds = xr.open_dataset(filepath)\n", "ds" ] }, { "cell_type": "markdown", "id": "713cc8d8-7374-4c5b-be61-aec4b5b0ffe6", "metadata": { "execution": {} }, "source": [ "Then, let's select a gridpoint closest to a specified lat-lon (in this case let's select 50ºN, 310ºE), and plot a time series of SST at that point (recall that we learned this is Tutorial 2). The annual cycle will be quite pronounced. Note that we are using the `nearest` method (see Tutorial 2 for a refresher) to find the points in our datasets closest to the lat-lon values we specify. What this returns may not match these inputs exactly." ] }, { "cell_type": "code", "execution_count": null, "id": "c0348ee8-6e9b-4f50-a844-375ae00d2771", "metadata": { "execution": {}, "executionInfo": { "elapsed": 959, "status": "ok", "timestamp": 1681573031714, "user": { "displayName": "Sloane Garelick", "userId": "04706287370408131987" }, "user_tz": 240 }, "tags": [] }, "outputs": [], "source": [ "ds.tos.sel(\n", " lon=310, lat=50, method=\"nearest\"\n", ").plot() # time range is 2000-01-15 to 2014-12-15" ] }, { "cell_type": "markdown", "id": "e732cd9b", "metadata": { "execution": {} }, "source": [ "This plot is showing changes in monthly SST between 2000-01-15 to 2014-12-15. The annual cycle of SST change is apparent in this figure, but to understand the climatatology of this region, we need to calculate the average SST for each month over this time period. The first step is to split the data into groups based on month." ] }, { "cell_type": "markdown", "id": "d1505625-cbcd-495b-a15f-8824e455415b", "metadata": { "execution": {} }, "source": [ "## Section 1.1: Split\n", "\n", "Let's group data by month, i.e. all Januaries in one group, all Februaries in one group, etc.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "6e4fb25e-165f-4350-a93d-46a344f2d175", "metadata": { "execution": {}, "executionInfo": { "elapsed": 160, "status": "ok", "timestamp": 1681572674597, "user": { "displayName": "Sloane Garelick", "userId": "04706287370408131987" }, "user_tz": 240 }, "tags": [] }, "outputs": [], "source": [ "ds.tos.groupby(ds.time.dt.month)" ] }, { "cell_type": "markdown", "id": "5d176ad8-15f1-4ecc-ab3e-898cef3b4e18", "metadata": { "execution": {} }, "source": [ "
\n", "\n", "In the above code, we are using the `.dt` [`DatetimeAccessor`](https://xarray.pydata.org/en/stable/generated/xarray.core.accessor_dt.DatetimeAccessor.html) to extract specific components of dates/times in our time coordinate dimension. For example, we can extract the year with `ds.time.dt.year`. See also the equivalent [Pandas documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.dt.html).\n", " \n", "
" ] }, { "cell_type": "markdown", "id": "ad273652-178c-4eda-80b6-6d39a11d6f1e", "metadata": { "execution": {} }, "source": [ "Xarray also offers a more concise syntax when the variable you’re grouping on is already present in the dataset. This is identical to `ds.tos.groupby(ds.time.dt.month)`:" ] }, { "cell_type": "code", "execution_count": null, "id": "c6990393-fb5f-4a10-b8e2-fd9c6917d9d2", "metadata": { "execution": {}, "executionInfo": { "elapsed": 215, "status": "ok", "timestamp": 1681572677175, "user": { "displayName": "Sloane Garelick", "userId": "04706287370408131987" }, "user_tz": 240 }, "tags": [] }, "outputs": [], "source": [ "ds.tos.groupby(\"time.month\")" ] }, { "cell_type": "markdown", "id": "6b85dbf7-daf1-4889-8b3b-6991d290969f", "metadata": { "execution": {}, "tags": [] }, "source": [ "## Section 1.2: Apply & Combine\n", "\n", "Now that we have groups defined, it’s time to “apply” a calculation to the group. These calculations can either be:\n", "\n", "- aggregation: reduces the size of the group\n", "- transformation: preserves the group’s full size\n", "\n", "At then end of the apply step, xarray will automatically combine the aggregated/transformed groups back into a single object. \n", "\n", "\n", "\n", "### Section 1.2.1: Compute the Climatology\n", "\n", "\n", "Let's calculate the climatology at every point in the dataset. To do so, we will use aggregation and will calculate the mean SST for each month:\n" ] }, { "cell_type": "code", "execution_count": null, "id": "7e568c2f-7143-4346-85ce-a430db03316e", "metadata": { "execution": {}, "executionInfo": { "elapsed": 2695, "status": "ok", "timestamp": 1681572682188, "user": { "displayName": "Sloane Garelick", "userId": "04706287370408131987" }, "user_tz": 240 }, "tags": [] }, "outputs": [], "source": [ "tos_clim = ds.tos.groupby(\"time.month\").mean()\n", "tos_clim" ] }, { "cell_type": "markdown", "id": "2ef90862-aeb4-45b3-87fb-e9df8f197c81", "metadata": { "execution": {} }, "source": [ "For every spatial coordinate, we now have a monthly mean SST for the time period 2000-01-15 to 2014-12-15.\n", "\n", "We can now plot the climatology at a specific point:" ] }, { "cell_type": "code", "execution_count": null, "id": "f908c377-67fa-449c-b8d1-82ba6a14baff", "metadata": { "execution": {}, "executionInfo": { "elapsed": 363, "status": "ok", "timestamp": 1681572685495, "user": { "displayName": "Sloane Garelick", "userId": "04706287370408131987" }, "user_tz": 240 } }, "outputs": [], "source": [ "tos_clim.sel(lon=310, lat=50, method=\"nearest\").plot()" ] }, { "cell_type": "markdown", "id": "f91dc01d", "metadata": { "execution": {} }, "source": [ "Based on this plot, the climatology of this location is defined by cooler SST from December to April and warmer SST from June to October, with an annual SST range of ~8ºC. " ] }, { "cell_type": "markdown", "id": "3d71dab2-5e20-40f6-af5e-8e446c387803", "metadata": { "execution": {} }, "source": [ "#### Questions 1.2.1: Climate Connection\n", "\n", "1. Considering the latitude and longitude of this data, can you explain why we observe this climatology?\n", "2. How do you think seasonal variations in SST would differ at the equator? What about at the poles? What about at 50ºS?" ] }, { "cell_type": "code", "execution_count": null, "id": "fb839b81-6bc3-4cf8-8d06-748a3770cd22", "metadata": { "execution": {}, "tags": [] }, "outputs": [], "source": [ "# to_remove explanation\n", "\n", "\"\"\"\n", "1. This location's strong seasonality suggests that it experiences significant seasonal changes in incoming solar radiation received at the surface over the course of a year.\n", "2. Seasonal variations in sea surface temperature (SST) would differ at different latitudes. At the equator, the seasonal variations in SST tend to be relatively small. This is because the equatorial region experiences a relatively constant amount of solar radiation throughout the year, resulting in a relatively stable SST. At the poles, however, the seasonal variations in SST are more pronounced. At 50ºS, the seasonal cycle would be opposite that shown here because the northern and southern hemispheres experience their seasons at different times of year (that is northern hemisphere summer is southern hemisphere winter).\n", "\"\"\";" ] }, { "cell_type": "markdown", "id": "d7c540b2-c707-4d51-8cc7-3ea51b8a505f", "metadata": { "execution": {} }, "source": [ "### Section 1.2.2: Spatial Variations" ] }, { "cell_type": "markdown", "id": "0e5dc34b-99bc-494b-9c04-ed8388ab2e6c", "metadata": { "execution": {} }, "source": [ "We can now add a spatial dimension to this plot and look at the zonal mean climatology (the monthly mean SST at different latitudes):" ] }, { "cell_type": "code", "execution_count": null, "id": "22c61c11-2a48-4c6c-8009-6f20e0101237", "metadata": { "execution": {}, "executionInfo": { "elapsed": 584, "status": "ok", "timestamp": 1681572688734, "user": { "displayName": "Sloane Garelick", "userId": "04706287370408131987" }, "user_tz": 240 } }, "outputs": [], "source": [ "tos_clim.mean(dim=\"lon\").transpose().plot.contourf(levels=12, cmap=\"turbo\")" ] }, { "cell_type": "markdown", "id": "3411ebb7-9831-4e52-ab2e-7e4e7a1356ee", "metadata": { "execution": {} }, "source": [ "This gives us helpful information about the mean SST for each month, but it's difficult to asses the range of monthly temperatures throughout the year using this plot.\n", "\n", "To better represent the range of SST, we can calculate and plot the difference between January and July climatologies:" ] }, { "cell_type": "code", "execution_count": null, "id": "19a2808b-81f9-40e5-ab31-d63bfce85eae", "metadata": { "execution": {}, "executionInfo": { "elapsed": 1075, "status": "ok", "timestamp": 1681572692242, "user": { "displayName": "Sloane Garelick", "userId": "04706287370408131987" }, "user_tz": 240 } }, "outputs": [], "source": [ "(tos_clim.sel(month=1) - tos_clim.sel(month=7)).plot(size=6, robust=True)" ] }, { "cell_type": "markdown", "id": "ddf896bb", "metadata": { "execution": {} }, "source": [ "#### Questions 1.2.1: Climate Connection\n", "\n", "1. What patterns do you observe in this map?\n", "2. Why is there such an apparent difference between the Northern and Southern Hemisphere SST changes?\n", "3. How does the migration of the ITCZ relate to seasonal changes in Northern vs. Southern Hemisphere temperatures?" ] }, { "cell_type": "code", "execution_count": null, "id": "1ce210a0-f392-448d-a2bf-cd27ad370d99", "metadata": { "execution": {}, "tags": [] }, "outputs": [], "source": [ "# to_remove explanation\n", "\n", "\"\"\"\n", "1. Generally the difference is negative in the northern hemisphere and positive in the southern hemisphere.\n", "2. The hemispheres have opposing signs, which is because of Earth's tilt and how it affects the timing of the seasons in each hemisphere (i.e. northern hemisphere summer is southern hemisphere winter). In the northern hemisphere the difference is negative because northern hemisphere July is warmer than northern hemisphere January. In the southern hemisphere, the opposite is true.\n", "3. The ITCZ shifts with the seasons. During the time when the Sun is directly overhead at the Northern Tropics (June), the ITCZ moves northwards, and when the Sun is directly overhead at the Southern Tropics (December), the ITCZ shifts southwards. This migration can affect the distribution of rain, leading to wet and dry seasons in many tropical regions.\n", "\"\"\";" ] }, { "cell_type": "markdown", "id": "ee5f2987-d7e3-4b0d-a5ad-abfae5a98072", "metadata": { "execution": {} }, "source": [ "# Summary\n", "\n", "In this tutorial, we focused on exploring seasonal climatology in global temperature data using the split-apply-combine approach in Xarray. By utilizing the split-apply-combine approach, we gained insights into the seasonal climatology of temperature and precipitation data, enabling us to analyze and understand the seasonal variations associated with global climate patterns.\n", "\n", "\n" ] }, { "cell_type": "markdown", "id": "edf206e2-7eac-46b2-9037-dc8ba640a856", "metadata": { "execution": {} }, "source": [ "# Resources" ] }, { "cell_type": "markdown", "id": "1950bbe7-82a6-4b85-aa7f-01c29adfcc40", "metadata": { "execution": {} }, "source": [ "Code and data for this tutorial is based on existing content from [Project Pythia](https://foundations.projectpythia.org/core/xarray/computation-masking.html)." ] } ], "metadata": { "colab": { "collapsed_sections": [], "include_colab_link": true, "name": "W1D1_Tutorial5", "provenance": [], "toc_visible": true }, "kernel": { "display_name": "Python 3", "language": "python", "name": "python3" }, "kernelspec": { "display_name": "climatematch", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.11" } }, "nbformat": 4, "nbformat_minor": 5 }