Temperature Records
In this notebook, you will learn how to work with the Global Historical Climatology Network daily (GHCNd) dataset, which holds climate observations from stations around the world, including Chicago, Illinois.
Overview
This notebook covers the following topics:
Extract daily minimum and maximum temperature data from the GHCNd dataset
Plot + visualize the dataset, locating the extreme events
Highlight key dates - look at record warmth and cold
Calculate if these record-breaking events are significant, using statistics!
Journal space: Do you remember either of the record-breaking temperature dates?
Prerequisites
Concepts | Importance | Notes |
---|---|---|
 | Necessary | |
 | Helpful | Familiarity with metadata structure |
Project management | Helpful | |
Imports
import calendar
import numpy as np
import pandas as pd
import holoviews as hv
import hvplot.pandas
from distributed import LocalCluster
import cartopy.crs as ccrs
hv.extension('bokeh')
Extract daily minimum and maximum temperature data from the GHCNd dataset
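# GHCNd station identifier
station = 'USC00111577'
# Read this station's records from NOAA's public S3 bucket (anonymous access, no credentials needed)
ohare = pd.read_parquet(f's3://noaa-ghcn-pds/parquet/by_station/STATION={station}/',
                        storage_options={'anon': True})
ohare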
 | ID | DATE | DATA_VALUE | M_FLAG | Q_FLAG | S_FLAG | OBS_TIME | ELEMENT |
---|---|---|---|---|---|---|---|---|
0 | USC00111577 | 19650101 | 100 | None | None | X | None | ACMH |
1 | USC00111577 | 19650102 | 70 | None | None | X | None | ACMH |
2 | USC00111577 | 19650103 | 60 | None | None | X | None | ACMH |
3 | USC00111577 | 19650104 | 70 | None | None | X | None | ACMH |
4 | USC00111577 | 19650105 | 90 | None | None | X | None | ACMH |
... | ... | ... | ... | ... | ... | ... | ... | ... |
275463 | USC00111577 | 19791201 | 1 | None | None | X | None | WT18 |
275464 | USC00111577 | 19791207 | 1 | None | None | X | None | WT18 |
275465 | USC00111577 | 19791208 | 1 | None | None | X | None | WT18 |
275466 | USC00111577 | 19791216 | 1 | None | None | X | None | WT18 |
275467 | USC00111577 | 19791225 | 1 | None | None | X | None | WT18 |
275468 rows × 8 columns
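The table above is in long format: each row holds one observation for one station, date, and `ELEMENT` code. As a minimal sketch of how such a table can be reshaped so that each element becomes its own column, the example below uses a small hand-made DataFrame with the same column names and `DataFrame.pivot`. The values are illustrative assumptions, not taken from the station record.

```python
import pandas as pd

# Illustrative long-format records (hypothetical values, same column names as above)
records = pd.DataFrame({
    'DATE': ['19650101', '19650101', '19650102', '19650102'],
    'ELEMENT': ['TMAX', 'TMIN', 'TMAX', 'TMIN'],
    'DATA_VALUE': [22, -61, 17, -78],  # tenths of a degree Celsius
})

# Parse the YYYYMMDD strings into timestamps
records['DATE'] = pd.to_datetime(records['DATE'], format='%Y%m%d')

# Reshape: one row per date, one column per element
wide = records.pivot(index='DATE', columns='ELEMENT', values='DATA_VALUE')

# Convert tenths of a degree Celsius to degrees Celsius
wide = wide / 10

print(wide)
```

The notebook itself takes a different route to the same layout, filtering on `ELEMENT` and combining the resulting series; a pivot is simply a compact alternative when several elements are needed at once.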
Clean the Data
# Parse the YYYYMMDD date values into timestamps
ohare['DATE'] = pd.to_datetime(ohare.DATE.astype(str), format='%Y%m%d')
# Subset for tmax, tmin, and precipitation
tmax = ohare.loc[ohare.ELEMENT == 'TMAX']
tmin = ohare.loc[ohare.ELEMENT == 'TMIN']
precipitation = ohare.loc[ohare.ELEMENT == 'PRCP']
# Rename the columns, set the index to the date
tmax_df = tmax[['DATE', 'DATA_VALUE']].rename(columns={'DATA_VALUE': 'TMAX'}).set_index('DATE')
tmin_df = tmin[['DATE', 'DATA_VALUE']].rename(columns={'DATA_VALUE': 'TMIN'}).set_index('DATE')
precip_df = precipitation[['DATE', 'DATA_VALUE']].rename(columns={'DATA_VALUE': 'PRCP'}).set_index('DATE')
# Convert from tenths of a degree Celsius to degrees Celsius
tmax_df['TMAX'] = tmax_df.TMAX/10
tmin_df['TMIN'] = tmin_df.TMIN/10
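# Combine both series into one DataFrame aligned on date
df = pd.DataFrame({'TMAX': tmax_df.TMAX,
                   'TMIN': tmin_df.TMIN})
# Keep only July observations
subset = df.loc[df.index.month == 7]
# Scatter plot of daily July maximum temperatures
subset.TMAX.hvplot.scatter(xlabel='Year',
                           ylabel='Temperature (degC)',
                           title='July Maximum Temperature at Midway Airport')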
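To highlight record warmth and cold and to gauge how unusual they are, one simple approach is to locate the extremes with `idxmax` and `idxmin` and express each as a z-score against the rest of the sample. The sketch below runs on synthetic July temperatures rather than the station data: the random values, the two planted records, and the `z_score` helper are all assumptions made for illustration, and a z-score is only one of several ways to assess significance.

```python
import numpy as np
import pandas as pd

# Synthetic daily July maxima for 1990-2019 (illustrative values, not station data)
rng = np.random.default_rng(42)
all_days = pd.date_range('1990-01-01', '2019-12-31', freq='D')
july_days = all_days[all_days.month == 7]
tmax = pd.Series(rng.normal(loc=29.0, scale=3.5, size=len(july_days)),
                 index=july_days, name='TMAX')

# Plant one clear warm record and one clear cold record so the answers are known
tmax.loc[pd.Timestamp('2012-07-06')] = 45.0
tmax.loc[pd.Timestamp('1996-07-20')] = 10.0


def z_score(series, value):
    """Hypothetical helper: number of standard deviations between value and the series mean."""
    return (value - series.mean()) / series.std()


# Dates of the warmest and coldest July maximum temperatures
warm_date, cold_date = tmax.idxmax(), tmax.idxmin()

# How far each record sits from the July mean, in standard deviations
warm_z = z_score(tmax, tmax.max())
cold_z = z_score(tmax, tmax.min())

print(f'Record warm: {tmax.max():.1f} degC on {warm_date:%Y-%m-%d} (z = {warm_z:.1f})')
print(f'Record cold: {tmax.min():.1f} degC on {cold_date:%Y-%m-%d} (z = {cold_z:.1f})')
```

Because each record is included in the mean and standard deviation it is compared against, the z-scores here are slightly conservative. Real temperature series are also autocorrelated and not necessarily normally distributed, so this is best treated as a first look rather than a formal test.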
Summary
In this notebook, we read daily GHCNd station records from NOAA's public S3 bucket, extracted the maximum and minimum temperatures, converted them to degrees Celsius, and plotted the July maximum temperatures to look for extreme events.
Resources and references
The GHCNd data used in this notebook is read from the public `noaa-ghcn-pds` S3 bucket.