{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"[Open in Colab](https://colab.research.google.com/github/ClimateMatchAcademy/course-content/blob/main/tutorials/W2D2_GoodResearchPractices/W2D2_Tutorial6.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"# Tutorial 6: Implementing the Analysis\n",
"\n",
"**Good Research Practices**\n",
"\n",
"**Content creators:** Yuxin Zhou, Marguerite Brown, Zane Mitrevica, Natalie Steinemann\n",
"\n",
"**Content reviewers:** Sherry Mi, Maria Gonzalez, Nahid Hasan, Beatriz Cosenza Muralles, Katrina Dobson, Sloane Garelick, Cheng Zhang\n",
"\n",
"**Content editors:** Jenna Pearson, Chi Zhang, Ohad Zivan\n",
"\n",
"**Production editors:** Wesley Banfield, Jenna Pearson, Chi Zhang, Ohad Zivan\n",
"\n",
"**Our 2023 Sponsors:** NASA TOPS and Google DeepMind"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"# Tutorial Objectives\n",
"\n",
"In Tutorials 5-8, you will learn about the research process. This includes how to\n",
"\n",
"5. Draft analyses of data to test a hypothesis\n",
"6. Implement analysis of data\n",
"7. Interpret results in the context of existing knowledge\n",
"8. Communicate your results and conclusions\n",
"\n",
"By the end of these tutorials you will be able to:\n",
"\n",
"* Understand the principles of good research practices\n",
"* View a scientific data set or question through the lens of equity: Who is represented by this data and who is not? Who has access to this information? Who is in a position to use it?\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"# Activity: Implement the Analysis\n",
"\n",
"In this tutorial, you will apply the linear regression model outlined in Step 5 to real-world CO2 and temperature records.\n",
"\n",
"The CO2 and temperature records we will be analyzing are both examples of paleoclimate data (for more information, refer back to Step 3). The CO2 record (Bereiter et al., 2015) was generated by measuring the CO2 concentration in ancient air bubbles trapped inside ice from multiple ice cores retrieved from Antarctica. The temperature record (Shakun et al., 2015) is based on chemical analysis of the shells of planktic foraminifera. The shells were identified and picked from deep-sea sediments, and the record combines multiple sea-surface temperature records from a range of sites globally.\n",
"\n",
"Why are we focusing on these two records specifically? The CO2 record from Antarctic ice cores is the gold standard of air CO2 variability on glacial-interglacial time scales, and it has a temporal resolution unmatched by any other reconstruction method. The temperature record comes from sediment cores all over the global ocean, and is therefore likely representative of global surface ocean temperature variability. Polar air temperature records are also available from ice core studies, but such records may represent an exaggerated view of the global temperature because of polar amplification.\n",
"\n",
"If you would like to learn more, the data sources are listed at the bottom of the page.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {},
"executionInfo": {
"elapsed": 3940,
"status": "ok",
"timestamp": 1682775919083,
"user": {
"displayName": "Sloane Garelick",
"userId": "04706287370408131987"
},
"user_tz": 240
},
"tags": []
},
"outputs": [],
"source": [
"# imports\n",
"\n",
"import matplotlib.pyplot as plt\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"import numpy as np\n",
"from scipy import interpolate\n",
"from scipy import stats\n",
"import os\n",
"import pooch\n",
"import tempfile"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Helper functions\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Helper functions\n",
"\n",
"\n",
"def pooch_load(filelocation=None, filename=None, processor=None):\n",
"    shared_location = \"/home/jovyan/shared/Data/tutorials/W2D1_FutureClimate-IPCCIPhysicalBasis\"  # this is different for each day\n",
"    user_temp_cache = tempfile.gettempdir()\n",
"\n",
"    if os.path.exists(os.path.join(shared_location, filename)):\n",
"        file = os.path.join(shared_location, filename)\n",
"    else:\n",
"        file = pooch.retrieve(\n",
"            filelocation,\n",
"            known_hash=None,\n",
"            fname=os.path.join(user_temp_cache, filename),\n",
"            processor=processor,\n",
"        )\n",
"\n",
"    return file"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {},
"tags": []
},
"outputs": [],
"source": [
"# time series\n",
"# read SST data \"Shakun2015_SST.txt\"\n",
"filename_Shakun2015_SST = \"Shakun2015_SST.txt\"\n",
"url_Shakun2015_SST = \"https://osf.io/kmy5w/download\"\n",
"SST = pd.read_table(pooch_load(url_Shakun2015_SST, filename_Shakun2015_SST))\n",
"SST.set_index(\"Age\", inplace=True)\n",
"SST"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {},
"tags": []
},
"outputs": [],
"source": [
"# read CO2 data \"antarctica2015co2composite_cleaned.txt\"\n",
"filename_antarctica2015co2composite_cleaned = \"antarctica2015co2composite_cleaned.txt\"\n",
"url_antarctica2015co2composite_cleaned = \"https://osf.io/45fev/download\"\n",
"CO2 = pd.read_table(\n",
"    pooch_load(\n",
"        url_antarctica2015co2composite_cleaned,\n",
"        filename_antarctica2015co2composite_cleaned,\n",
"    )\n",
")\n",
"CO2.set_index(\"age_gas_calBP\", inplace=True)\n",
"CO2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {},
"executionInfo": {
"elapsed": 5037,
"status": "ok",
"timestamp": 1682775959771,
"user": {
"displayName": "Sloane Garelick",
"userId": "04706287370408131987"
},
"user_tz": 240
},
"tags": []
},
"outputs": [],
"source": [
"# plot\n",
"# set up two subplots in a grid of 2 rows and 1 column\n",
"# also make sure the two plots share the same x(time) axis\n",
"fig, axes = plt.subplots(2, 1, sharex=True)\n",
"# move the two subplots closer to each other\n",
"fig.subplots_adjust(hspace=-0.5)\n",
"axes[0].plot(SST.index, SST[\"SST stack\"], color=\"C4\")\n",
"axes[1].plot(CO2.index / 1000, CO2[\"co2_ppm\"], color=\"C1\")\n",
"\n",
"# beautification\n",
"# since sharex=True in plt.subplots(), this sets the x axis limit for both panels\n",
"axes[1].set_xlim((0, 805))\n",
"# axis labels\n",
"axes[1].set_xlabel(\"Age (ka BP)\")\n",
"axes[0].set_ylabel(r\"Sea Surface Temperature\" \"\\n\" \"detrended (°C)\", color=\"C4\")\n",
"axes[1].set_ylabel(r\"CO${}_\\mathrm{2}$ (ppm)\", color=\"C1\")\n",
"\n",
"# despine makes the plots look cleaner\n",
"sns.despine(ax=axes[0], top=True, right=False, bottom=True, left=True)\n",
"sns.despine(ax=axes[1], top=True, right=True, bottom=False, left=False)\n",
"# clean up top panel x axis ticks\n",
"axes[0].xaxis.set_ticks_position(\"none\")\n",
"# move top panel ylabel to the right side\n",
"axes[0].yaxis.set_label_position(\"right\")\n",
"# the panels overlap (negative hspace above), so make each background\n",
"# transparent to keep one panel from hiding the other\n",
"for ax in axes:\n",
"    ax.set_zorder(10)\n",
"    ax.set_facecolor(\"none\")\n",
"# color the axis\n",
"axes[0].spines[\"right\"].set_color(\"C4\")\n",
"axes[1].spines[\"left\"].set_color(\"C1\")\n",
"axes[0].tick_params(axis=\"y\", colors=\"C4\")\n",
"axes[1].tick_params(axis=\"y\", colors=\"C1\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"Now that we've taken a look at the two time series, let's make a scatter plot between them and fit a linear regression model through the data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {},
"tags": []
},
"outputs": [],
"source": [
"# in this code block, we will make a scatter plot of CO2 and temperature\n",
"# and fit a linear regression model through the data\n",
"\n",
"\n",
"def age_model_interp(CO2_age, CO2, SST_age):\n",
"    \"\"\"\n",
"    This helper function linearly interpolates the CO2 data, which\n",
"    have a very high temporal resolution, onto the ages of the\n",
"    temperature data, which have a relatively low resolution\n",
"    \"\"\"\n",
"    f = interpolate.interp1d(CO2_age, CO2)\n",
"    CO2_at_SST_age = f(SST_age)\n",
"    return CO2_at_SST_age\n",
"\n",
"\n",
"# interpolate CO2 data to SST age\n",
"CO2_interpolated = age_model_interp(CO2.index / 1000, CO2[\"co2_ppm\"], SST.index)\n",
"\n",
"# plot\n",
"# set up a single panel for the scatter plot\n",
"fig, ax = plt.subplots(1, 1, sharex=True)\n",
"\n",
"ax.scatter(CO2_interpolated, SST[\"SST stack\"], color=\"gray\")\n",
"\n",
"# regression\n",
"X = CO2_interpolated\n",
"y = SST[\"SST stack\"]\n",
"res = stats.linregress(X, y)  # ordinary least squares\n",
"\n",
"x_fit = np.arange(180, 280)\n",
"# evaluate the fitted line over a range of CO2 values\n",
"y_fit = x_fit * res.slope + res.intercept\n",
"ax.plot(x_fit, y_fit, color=\"k\")\n",
"\n",
"# beautification\n",
"# axis labels\n",
"ax.set_xlabel(r\"CO${}_\\mathrm{2}$ (ppm)\")\n",
"ax.set_ylabel(r\"Sea Surface Temperature\" \"\\n\" \"detrended (°C)\")\n",
"print(\n",
"    \"coefficient of determination (r^2): \"\n",
"    + \"{:.2f}\".format(res.rvalue**2)\n",
"    + \" \\nwith a p-value of: \"\n",
"    + \"{:.2e}\".format(res.pvalue)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"## Figure Making Through the Equity Lens\n",
"\n",
"Are the colors in your figure distinguishable for people with color-vision deficiencies?\n",
"\n",
"More readings on this topic:\n",
"\n",
"Color-blindness simulator: https://www.color-blindness.com/coblis-color-blindness-simulator/\n",
"\n",
"Coloring for color blindness: https://davidmathlogic.com/colorblind\n",
"\n",
"Python-specific color palettes that are friendly to those with color-vision deficiency: https://seaborn.pydata.org/tutorial/color_palettes.html"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"# Resources\n",
"\n",
"Data from the following sources are used in this tutorial:\n",
"\n",
"CO2: Bereiter, B., Eggleston, S., Schmitt, J., Nehrbass-Ahles, C., Stocker, T.F., Fischer, H., Kipfstuhl, S., Chappellaz, J., 2015. Revision of the EPICA Dome C CO2 record from 800 to 600 kyr before present. Geophysical Research Letters 42, 542–549. https://doi.org/10.1002/2014GL061957\n",
"\n",
"Temperature: Shakun, J.D., Lea, D.W., Lisiecki, L.E., Raymo, M.E., 2015. An 800-kyr record of global surface ocean δ18O and implications for ice volume-temperature coupling. Earth and Planetary Science Letters 426, 58–68. https://doi.org/10.1016/j.epsl.2015.05.042\n",
"\n",
"\n"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"include_colab_link": true,
"name": "Projects_Tutorial6",
"provenance": [
{
"file_id": "108RRAFBnnKvDTfEDC0Fm5qHZez32HB69",
"timestamp": 1680091091012
},
{
"file_id": "1WfT8oN22xywtecNriLptqi1SuGUSoIlc",
"timestamp": 1680037587733
}
],
"toc_visible": true
},
"kernel": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 4
}