EPA Air Quality Analysis

Analyze air quality summary statistics by year at the US county level



BigQuery EPA Air Quality Analysis

Team members:
Javier Orraca
Yiting Fu
Mark Planas

Project Overview

  • Dataset:
  • Deliverables:
    • Interactive dashboard (built with Dash or Bokeh) of a US map, showing how air quality has changed from 1950 to the present
  • Project:
    • Analyze air quality summary statistics by year at the US county level
    • If time permits, build a neural network in TensorFlow or PyTorch, fed by a pipeline created in Spark / Databricks, and make its results available through an interactive dashboard
  • Research Questions:
    • How has US air quality changed over time at the county level?
    • What factors could have caused these changes? (This may involve researching external data sets to understand why US air quality got better or worse.)
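
The first research question can be sketched in a few lines of plain Python. This is only a minimal illustration, not the project's actual code: it assumes measurements arrive as (state, county, year, value) tuples, and the function names and sample values are made up for the example.

```python
from collections import defaultdict
from statistics import mean


def yearly_county_means(records):
    """Average the measurements for each (state, county, year) group."""
    grouped = defaultdict(list)
    for state, county, year, value in records:
        grouped[(state, county, year)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


def change_over_time(yearly_means):
    """For each county, return the latest yearly mean minus the earliest one."""
    by_county = defaultdict(dict)
    for (state, county, year), avg in yearly_means.items():
        by_county[(state, county)][year] = avg
    changes = {}
    for county_key, per_year in by_county.items():
        first_year, last_year = min(per_year), max(per_year)
        changes[county_key] = per_year[last_year] - per_year[first_year]
    return changes


# Made-up sample values, for illustration only (not real EPA measurements).
sample = [
    ("California", "Los Angeles", 1990, 30.0),
    ("California", "Los Angeles", 1990, 34.0),
    ("California", "Los Angeles", 2010, 14.0),
    ("California", "Los Angeles", 2010, 18.0),
]
means = yearly_county_means(sample)
changes = change_over_time(means)
```

For a pollutant concentration, a negative change means the county's yearly average dropped between the first and last year on record.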

BigQuery Data Accessibility

The dataset can be queried from Kaggle Kernels using the BigQuery Python client library. Note that Kernels only support querying data. Tables are located at bigquery-public-data.epa_historical_air_quality.[TABLENAME].
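
As a rough sketch of what such a query might look like, the snippet below builds a yearly county-level average for one table and runs it with the client library. The table name is passed in as a parameter. The column names (state_name, county_name, date_local, arithmetic_mean) are assumptions about the table schema and should be checked against the table being queried. The helper names are hypothetical. Running the query requires the google-cloud-bigquery package and valid credentials.

```python
DATASET = "bigquery-public-data.epa_historical_air_quality"


def build_yearly_county_query(table_name):
    """Build SQL that averages a measurement by state, county, and year.

    Column names are assumed, not confirmed by the project write-up.
    """
    # Allow only letters, digits, and underscores in the table name.
    if not table_name or not table_name.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table_name!r}")
    return f"""
        SELECT
          state_name,
          county_name,
          EXTRACT(YEAR FROM date_local) AS year,
          AVG(arithmetic_mean) AS mean_value
        FROM `{DATASET}.{table_name}`
        GROUP BY state_name, county_name, year
        ORDER BY state_name, county_name, year
    """


def run_query(sql):
    """Run the query and return the rows as a list of dictionaries."""
    # Imported here so the query builder works without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    return [dict(row.items()) for row in client.query(sql).result()]
```

A call such as run_query(build_yearly_county_query(table_name)), with table_name set to one of the dataset's tables, would return one row per county and year.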

Code

Please see the Jupyter Notebook for details.

Slides

Download PDF

