Temperature Records

In this notebook, you will learn how to work with the Global Historical Climatology Network daily (GHCNd) dataset, which contains daily climate observations from stations around the world, including Chicago, Illinois.

Overview

This notebook covers the following topics:

Extract daily minimum and maximum temperature data from the GHCNd dataset
Plot and visualize the dataset, locating the extreme events
Highlight key dates - look at record warmth and cold
Calculate if these record-breaking events are significant, using statistics!
Journal space: Do you remember either of the record-breaking temperature dates?

Prerequisites

Concepts	Importance	Notes
Intro to Cartopy	Necessary
Understanding of NetCDF	Helpful	Familiarity with metadata structure
Project management	Helpful

Imports

import calendar

import numpy as np
import pandas as pd
import holoviews as hv
import hvplot.pandas
from distributed import LocalCluster
import cartopy.crs as ccrs
hv.extension('bokeh')

Extract daily minimum and maximum temperature data from the GHCNd dataset

# GHCNd station identifier; the plot title further down labels this station as Midway Airport
station = 'USC00111577'
ohare = pd.read_parquet(f's3://noaa-ghcn-pds/parquet/by_station/STATION={station}/',
                        storage_options={'anon':True})
ohare

	ID	DATE	DATA_VALUE	M_FLAG	Q_FLAG	S_FLAG	OBS_TIME	ELEMENT
0	USC00111577	19650101	100	None	None	X	None	ACMH
1	USC00111577	19650102	70	None	None	X	None	ACMH
2	USC00111577	19650103	60	None	None	X	None	ACMH
3	USC00111577	19650104	70	None	None	X	None	ACMH
4	USC00111577	19650105	90	None	None	X	None	ACMH
...	...	...	...	...	...	...	...	...
275463	USC00111577	19791201	1	None	None	X	None	WT18
275464	USC00111577	19791207	1	None	None	X	None	WT18
275465	USC00111577	19791208	1	None	None	X	None	WT18
275466	USC00111577	19791216	1	None	None	X	None	WT18
275467	USC00111577	19791225	1	None	None	X	None	WT18

275468 rows × 8 columns
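
The table above is in "long" layout: one row per station, date, and element code. Before subsetting, it can help to check which elements are present and what period each one covers. The following is a minimal sketch on a small hypothetical frame (the `sample` values are made up for illustration and are not real observations); the same calls should work on the full frame loaded above.

```python
import pandas as pd

# Hypothetical rows in the GHCNd long layout (illustrative values only)
sample = pd.DataFrame({
    "ID": ["USC00111577"] * 6,
    "DATE": ["19650101", "19650101", "19650102", "19650102", "19650103", "19650103"],
    "DATA_VALUE": [11, -44, 28, -33, 6, 0],
    "ELEMENT": ["TMAX", "TMIN", "TMAX", "TMIN", "TMAX", "PRCP"],
})

# How many observations exist for each element code?
element_counts = sample["ELEMENT"].value_counts()

# First and last date available for each element
dates = pd.to_datetime(sample["DATE"], format="%Y%m%d")
coverage = dates.groupby(sample["ELEMENT"]).agg(["min", "max"])

print(element_counts)
print(coverage)
```
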

Clean the Data

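# Parse the YYYYMMDD dates into datetimes (an explicit format avoids misparsing if the column is numeric)
ohare['DATE'] = pd.to_datetime(ohare.DATE.astype(str), format='%Y%m%d')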

# Subset for tmax, tmin, and precipitation
tmax = ohare.loc[ohare.ELEMENT == 'TMAX']
tmin = ohare.loc[ohare.ELEMENT == 'TMIN']
precipitation = ohare.loc[ohare.ELEMENT == 'PRCP']

# Rename the columns, set the index to the date
tmax_df = tmax[['DATE', 'DATA_VALUE']].rename(columns={'DATA_VALUE': 'TMAX'}).set_index('DATE')
tmin_df = tmin[['DATE', 'DATA_VALUE']].rename(columns={'DATA_VALUE': 'TMIN'}).set_index('DATE')
precip_df = precipitation[['DATE', 'DATA_VALUE']].rename(columns={'DATA_VALUE': 'PRCP'}).set_index('DATE')

# GHCNd stores temperatures in tenths of a degree Celsius; divide by 10 to get degrees Celsius
tmax_df['TMAX'] = tmax_df.TMAX/10
tmin_df['TMIN'] = tmin_df.TMIN/10

df = pd.DataFrame({'TMAX':tmax_df.TMAX,
                   'TMIN':tmin_df.TMIN})

subset = df.loc[(df.index.month == 7)]
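
To repeat the analysis for other months, the month filter can be wrapped in a small function. This is only a sketch: `subset_month` is a hypothetical helper name, and the demo frame holds made-up values. It also shows one possible use for the `calendar` import, building a plot title from the month number.

```python
import calendar
import pandas as pd

def subset_month(frame, month):
    """Return the rows of a date-indexed frame that fall in the given month (1-12)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return frame.loc[frame.index.month == month]

# Illustrative data spanning the end of June and the start of July
idx = pd.date_range("2000-06-29", periods=5, freq="D")
demo = pd.DataFrame({"TMAX": [28.3, 30.0, 31.1, 29.4, 27.8]}, index=idx)

july = subset_month(demo, 7)
title = f"{calendar.month_name[7]} Maximum Temperature"
print(title, len(july))
```
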

subset.TMAX.hvplot.scatter(xlabel='Year',
                           ylabel='Temperature (degC)',
                           title='July Maximum Temperature at Midway Airport')
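
The overview mentions locating record events and judging how unusual they are. One simple way to approach this, sketched below on synthetic data, is to find the record dates with `idxmax` and `idxmin` and then express each value as a standardized anomaly (z-score). The series here is randomly generated with one planted extreme, so none of the numbers are real observations, and the z-score interpretation assumes the values are roughly normally distributed.

```python
import numpy as np
import pandas as pd

# Synthetic daily July maxima for 30 years (random values, not real observations)
rng = np.random.default_rng(0)
days = pd.DatetimeIndex(
    [pd.Timestamp(year=y, month=7, day=d) for y in range(1980, 2010) for d in range(1, 32)]
)
tmax_july = pd.Series(rng.normal(loc=29.0, scale=3.5, size=len(days)), index=days, name="TMAX")

# Plant one obvious extreme so the example has something to find
tmax_july.loc[pd.Timestamp("1995-07-13")] = 48.0

# Dates of the warmest and coolest July maxima in the record
record_high_date = tmax_july.idxmax()
record_low_date = tmax_july.idxmin()

# Standardized anomaly: number of standard deviations from the July mean
z_scores = (tmax_july - tmax_july.mean()) / tmax_july.std()

print("Record high:", record_high_date.date(), "z =", round(z_scores.max(), 2))
print("Record low:", record_low_date.date(), "z =", round(z_scores.min(), 2))
```

On the real data, the same steps could be applied to `subset.TMAX` (after dropping missing values) in place of the synthetic series.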
