School of Meteorology (Defense)

Coupling Data Science Techniques and Numerical Weather Prediction Models for High-Impact Weather Prediction

David John Gagne

School of Meteorology

21 July 2016, 9:00 AM

National Weather Center, Room 4140
120 David L. Boren Blvd.
University of Oklahoma
Norman, OK

Meteorologists have access to more model guidance and observations than ever before, but more information does not necessarily lead to better forecasts. New tools are needed to reduce the cognitive load on forecasters and provide them with accurate, reliable consensus guidance. Techniques from the data science community, such as machine learning and image processing, have the potential to summarize and calibrate numerical weather prediction model output and generate deterministic and probabilistic forecasts of high-impact weather. In this dissertation, data science techniques have been developed and applied to the prediction of severe hail and solar irradiance.

Hail forecasts were produced with convection-allowing model output from the Center for Analysis and Prediction of Storms and National Center for Atmospheric Research ensembles and were compared against storm surrogate variables and physics-based diagnostic models of hail size. Initial machine learning hail forecasts reduced size errors but struggled to predict extreme events. When the machine learning model was coupled to the prediction of hail size distributions, with the distribution parameters estimated jointly, the machine learning methods showed skill and reliability in predicting both severe and significant hail.

Machine learning model and data configurations for gridded solar irradiance forecasting were evaluated on two numerical modeling systems. The evaluation determined how machine learning model choice, closeness of fit to training data, training data aggregation, and interpolation method affected forecasts of clearness index at Oklahoma Mesonet sites not included in the training data. The choice of machine learning model, interpolation scheme, and machine learning model complexity had the biggest impacts on performance. Minimizing the forecast error did not guarantee that the predictions captured the distribution of the observations correctly, and more complex model fits helped address that issue. Performance tended to be better at testing sites with sunnier weather and those that were closer to training sites.
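The point that a low forecast error does not guarantee a realistic forecast distribution can be illustrated with a minimal sketch. It assumes the common definition of clearness index as surface irradiance divided by top-of-atmosphere irradiance; the irradiance values, the two toy forecasts, and the function names are hypothetical and are not taken from the dissertation.

```python
from statistics import mean, pstdev


def clearness_index(surface_irradiance, toa_irradiance):
    """Ratio of surface to top-of-atmosphere irradiance (same units for both)."""
    if toa_irradiance <= 0:
        raise ValueError("top-of-atmosphere irradiance must be positive")
    return surface_irradiance / toa_irradiance


def mean_squared_error(forecasts, observations):
    """Average squared difference between paired forecasts and observations."""
    return mean((f - o) ** 2 for f, o in zip(forecasts, observations))


# Hypothetical irradiance values in W m^-2: two cloudy cases, two clear cases.
surface = [150.0, 200.0, 850.0, 900.0]
toa = [1000.0, 1000.0, 1000.0, 1000.0]
observed = [clearness_index(s, t) for s, t in zip(surface, toa)]

# Toy forecast 1: always predict the observed mean (smooth, no variability).
smooth = [mean(observed)] * len(observed)

# Toy forecast 2: the right set of values, but two cases are swapped.
sharp = [observed[0], observed[2], observed[1], observed[3]]

for name, forecast in [("smooth", smooth), ("sharp", sharp)]:
    print(
        name,
        "MSE:", round(mean_squared_error(forecast, observed), 4),
        "spread:", round(pstdev(forecast), 4),
    )
print("observed spread:", round(pstdev(observed), 4))
```

In this toy case the smooth forecast has the lower mean squared error yet no spread at all, while the sharp forecast reproduces the observed spread exactly at the cost of a larger error. This is only an illustration of the trade-off described above, not the evaluation procedure used in the study.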
