Predicting Flight Delays Caused by Weather: A Big Data and Spark-Driven Approach

While at Berkeley, I worked with some friends on a problem that has plagued all of us for the past couple of years: flight delays! We've all been there, sitting at the gate, watching the weather app as thunderstorms start to form and the wind picks up. And yet the airline remains vague about whether the flight will be delayed, or whether it will make it out at all!

We analyzed data from the Department of Transportation (DOT) and the National Oceanic and Atmospheric Administration (NOAA) and used data science and machine learning techniques to predict deviation from the expected departure time. Because the problem involved hundreds of gigabytes of data, we used Spark to process it efficiently.
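
To make the prediction target concrete, here is a minimal sketch of how a flight record might be paired with a weather observation and labeled with its departure deviation. It is plain Python rather than Spark so that it stays self-contained, and it is not our actual pipeline: the field names, the sample records, and the two-hour lead time are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def departure_deviation_minutes(scheduled, actual):
    """Signed deviation in minutes: positive is late, negative is early."""
    return (actual - scheduled).total_seconds() / 60.0


def latest_observation_before(observations, airport, cutoff):
    """Most recent observation for `airport` at or before `cutoff`, else None."""
    candidates = [
        o for o in observations
        if o["airport"] == airport and o["time"] <= cutoff
    ]
    return max(candidates, key=lambda o: o["time"]) if candidates else None


def build_rows(flights, weather, lead=timedelta(hours=2)):
    """Pair each flight with weather known `lead` before scheduled departure.

    The lead time is an assumption: it keeps the features limited to
    information that would be available when the prediction is made.
    """
    rows = []
    for f in flights:
        obs = latest_observation_before(weather, f["origin"], f["scheduled"] - lead)
        if obs is None:
            continue  # no usable weather record for this flight
        rows.append({
            "flight": f["flight"],
            "wind_speed": obs["wind_speed"],
            "deviation_min": departure_deviation_minutes(f["scheduled"], f["actual"]),
        })
    return rows


# Hypothetical sample records; the real DOT and NOAA schemas differ.
flights = [
    {"flight": "XX100", "origin": "SFO",
     "scheduled": datetime(2019, 1, 1, 9, 0), "actual": datetime(2019, 1, 1, 9, 42)},
    {"flight": "XX200", "origin": "ORD",
     "scheduled": datetime(2019, 1, 1, 14, 0), "actual": datetime(2019, 1, 1, 13, 55)},
]
weather = [
    {"airport": "SFO", "time": datetime(2019, 1, 1, 6, 0), "wind_speed": 12},
    {"airport": "SFO", "time": datetime(2019, 1, 1, 7, 0), "wind_speed": 31},
    {"airport": "SFO", "time": datetime(2019, 1, 1, 8, 30), "wind_speed": 40},
    {"airport": "ORD", "time": datetime(2019, 1, 1, 11, 0), "wind_speed": 8},
]

rows = build_rows(flights, weather)
```

At the scale of hundreds of gigabytes, the same pairing would be expressed as a distributed join instead of a Python loop, which is where Spark comes in.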

See our presentation deck for more information. Notebook and full report available upon request.
