April 16, 2021 - 11:00 am
April 16, 2021 - 12:00 pm
Categories: School of Meteorology Colloquium
School of Meteorology Colloquium
The Creation and Analysis of Next-Day Random Forest-Based High-Impact Weather Forecasts
Friday, April 16th
Join Google Meet:
High-impact weather events (including flash floods, tornadoes, damaging winds, and large hail) are difficult to predict, even with high-resolution numerical weather prediction (NWP) models, due to initial condition and model errors. NWP ensembles are designed to account for uncertainties in initial conditions and model physics, but they still contain biases and spatial displacement errors. Moreover, their horizontal grid spacing is not fine enough to explicitly simulate some high-impact hazards (e.g., severe hail and tornadoes). Thus, ensemble-derived hazard probabilities are frequently suboptimal. Machine learning (ML) techniques provide an important way to obtain probabilistic hazard guidance by (non-linearly) relating NWP forecast variables to relevant observed variables. One ML method, the random forest (RF), is well suited for probabilistic hazard prediction owing to its ability to handle biased predictors, resistance to over-fitting, tendency to produce reliable output probabilities, and relative ease of use. This talk explores the development, evaluation, and analysis of an RF-based technique for creating next-day probabilistic precipitation and severe weather forecasts.
In general, RF forecast probabilities are found to be skillful and reliable for precipitation as well as severe weather hazards. RF post-processing is found to be most beneficial for convection-parameterizing ensembles (which have more initial biases than convection-allowing ensembles) and more-common events (e.g., lighter precipitation thresholds and severe wind and hail compared to tornadoes). For precipitation prediction, RF forecasts have smaller spatial biases as well as better resolution and reliability compared to raw or spatially smoothed ensemble forecasts. For severe weather, RF forecasts tend to have verification metrics similar to or better than corresponding Storm Prediction Center (SPC) day-1 human forecasts for most hazards, seasons, and regions. This result is only partly due to the RFs’ ability to generate continuous forecast probabilities.
An analysis of severe-weather-predicting RFs shows that storm predictors are the most important, followed by index and environment predictors, although RFs that use both storm and index predictors are the most skillful. Further, RFs that use ensemble mean predictors are more skillful than those that use individual member predictors, since ensemble mean fields contain less noise. Interpretability techniques based on the Tree Interpreter Python module suggest that RFs learn physically relevant relationships between NWP variables and observed severe weather. These relationships are more nuanced than those currently used for ensemble-based severe weather prediction.
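For readers unfamiliar with this kind of interpretability analysis, the sketch below illustrates the general idea of splitting a forest's forecast probability into a baseline ("bias") plus one contribution per predictor, by following each tree's decision path. It is a toy illustration in plain Python, not the speaker's code and not the API of the Tree Interpreter module. The two hand-built trees, the predictor names (`updraft_helicity`, `cape`), the thresholds, and the node values are all hypothetical.

```python
# Toy sketch of decision-path prediction decomposition for a small forest.
# Everything here (trees, predictor names, thresholds, node values) is
# hypothetical and chosen only to make the arithmetic easy to follow.

def leaf(value):
    """Terminal node; value is the event frequency among training samples in the leaf."""
    return {"value": value}

def split(feature, threshold, value, left, right):
    """Internal node; value is the event frequency among samples reaching this node."""
    return {"feature": feature, "threshold": threshold, "value": value,
            "left": left, "right": right}

def decompose_tree(tree, sample):
    """Return (bias, contributions, prediction) for one tree.

    bias is the root-node value; each step down the decision path changes the
    node value, and that change is credited to the predictor used for the split.
    """
    bias = tree["value"]
    contributions = {}
    node = tree
    while "feature" in node:
        name = node["feature"]
        child = node["left"] if sample[name] <= node["threshold"] else node["right"]
        contributions[name] = contributions.get(name, 0.0) + child["value"] - node["value"]
        node = child
    return bias, contributions, node["value"]

def decompose_forest(trees, sample):
    """Average the per-tree bias, contributions, and predictions over the forest."""
    n = len(trees)
    bias_sum, pred_sum, contrib_sum = 0.0, 0.0, {}
    for tree in trees:
        bias, contributions, prediction = decompose_tree(tree, sample)
        bias_sum += bias
        pred_sum += prediction
        for name, value in contributions.items():
            contrib_sum[name] = contrib_sum.get(name, 0.0) + value
    return bias_sum / n, {k: v / n for k, v in contrib_sum.items()}, pred_sum / n

# Two hypothetical trees relating forecast fields to an observed severe-weather event.
tree_a = split("updraft_helicity", 50.0, 0.10,
               left=split("cape", 1500.0, 0.04, leaf(0.01), leaf(0.08)),
               right=leaf(0.40))
tree_b = split("cape", 2000.0, 0.12,
               left=leaf(0.05),
               right=split("updraft_helicity", 40.0, 0.30, leaf(0.10), leaf(0.50)))
forest = [tree_a, tree_b]

# One hypothetical grid point's forecast predictors.
sample = {"updraft_helicity": 75.0, "cape": 2500.0}

bias, contributions, prediction = decompose_forest(forest, sample)
print("forecast probability:", round(prediction, 3))
print("baseline (bias):", round(bias, 3))
for name, value in sorted(contributions.items()):
    print("contribution of", name + ":", round(value, 3))
```

By construction, the baseline plus the sum of the contributions equals the forest's forecast probability, so each predictor's share of a given forecast can be read off directly.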