Daniel Kubalek


Start

April 26, 2024 - 1:00 pm

End

April 26, 2024 - 3:00 pm

Master’s Thesis Defense

Impacts of Multi-Scale Predictors on Random Forest-Based Probabilistic Forecasts of Severe Weather Hazards

Friday, April 26th, 2024 

NWC 1313 / 1:00 pm 

If unable to attend in person, join via Zoom:

https://oklahoma.zoom.us/j/7757213580?pwd=MWUwMkhlZ1BmRjlXbnVraVFUSjFMZz09 

Abstract:

The use of machine learning (ML) algorithms for post-processing convection-allowing model/ensemble (CAM/CAE) output has been a major area of research aimed at addressing limitations of CAM/CAE forecasts, such as correcting systematic biases, relating observed variables to numerical output, and synthesizing extremely large datasets into probabilistic forecasts. In particular, numerous studies have shown the success of random forests (RFs) in severe weather forecasting applications that use predictors from global-scale and/or CAE output. However, all of these studies share one feature: the predictors used in the RF models are fixed and treated as independent during training. This can leave out important information about the large-scale flow pattern that is necessary for assessing severe weather risk. This paper develops a method for incorporating flow dependency into RF models by directly including CAE-based predictors that are pre-processed at increasing spatial length scales to account for the different scales of motion. The goal is to improve probabilistic forecast skill for a variety of severe weather hazards for next-day, or 24 hr (12-12 UTC), and 4 hr (20-00 UTC) forecasts. To assess the impact of the multiscale predictors on the skill of the RF models, a control (CTLRF) and an experimental (EXPRF) set of RF models were created. The CTLRF models were trained only with predictors pre-processed to 80 kilometers (km), whereas the EXPRF models were trained with the same 80 km predictors plus versions of those predictors smoothed over increasingly larger spatial scales. Both sets of models were verified quantitatively and qualitatively against Storm Prediction Center (SPC) reports.
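As a rough illustration of the multiscale pre-processing described above, the sketch below stacks a base predictor field with progressively smoother copies of itself. The abstract does not specify the smoothing operator or the length scales, so the Gaussian kernel, the sigma values (in grid points), and the helper names `gaussian_kernel`, `smooth_2d`, and `multiscale_predictors` are all assumptions made for illustration only, not the method used in the thesis.

```python
import numpy as np

def gaussian_kernel(sigma_pts):
    """1-D Gaussian weights, truncated at 3 sigma and normalized to sum to 1."""
    radius = max(1, int(np.ceil(3 * sigma_pts)))
    x = np.arange(-radius, radius + 1)
    w = np.exp(-0.5 * (x / sigma_pts) ** 2)
    return w / w.sum()

def smooth_2d(field, sigma_pts):
    """Separable Gaussian smoothing with edge-replicated padding (assumed operator)."""
    w = gaussian_kernel(sigma_pts)
    r = len(w) // 2
    padded = np.pad(field, r, mode="edge")
    # Convolve along rows, then columns; mode="valid" trims the padding again.
    rows = np.apply_along_axis(np.convolve, 1, padded, w, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, w, mode="valid")

def multiscale_predictors(field, sigmas_pts):
    """Stack the base field with copies smoothed at increasing length scales."""
    layers = [field] + [smooth_2d(field, s) for s in sigmas_pts]
    return np.stack(layers, axis=-1)  # shape (ny, nx, 1 + len(sigmas_pts))

# Toy example: a single "storm" grid point on a hypothetical 41 x 41 grid.
field = np.zeros((41, 41))
field[20, 20] = 1.0
stack = multiscale_predictors(field, sigmas_pts=[1.0, 2.0, 4.0])  # hypothetical scales
print(stack.shape)
print(stack.max(axis=(0, 1)))  # peak value drops as the smoothing scale grows
```

Each layer of `stack` could then be supplied to an RF as a separate predictor, so that the model sees the same quantity at several spatial scales for every grid point.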
Results show that the EXPRF models had higher Brier skill scores (BSSs) than the CTLRF models for all sub-significant severe weather hazards in both forecast periods, with significantly higher BSSs when forecasting any severe weather hazard (24 hr and 4 hr), wind (24 hr), hail (24 hr), and significant wind (24 hr). The EXPRF forecasts generally had the best resolution, and for some forecasts the resolution was significantly higher than that of the CTLRF forecasts. However, neither set of models had a significant advantage in reliability over the other, suggesting that further calibration would be needed. Interpretation with tree interpreter (TI) showed that, on average, when locations did not have severe reports, the fixed 80 km predictors contributed slightly to skill (i.e., decreased probabilities) for the 24 hr forecasts, whereas the multiscale predictors dominated contributions to skill in the 4 hr forecast period. Overall, when locations did receive severe weather reports, the multiscale storm-attribute predictors made up a large portion of contributions to forecast skill (though not always the largest), and in general the multiscale predictors were used to further increase forecast skill across most severe weather hazards. Lastly, the multiscale storm-attribute predictors contributed to skill mostly by accounting for location uncertainty that the fixed 80 km storm-attribute predictors could not. Meanwhile, the environment predictors, particularly the convective environment predictors, were more sensitive to smoothing and sometimes did not benefit from losing sharp gradients and local extremes.
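For readers unfamiliar with the verification terms used above, the following sketch computes a Brier skill score relative to the sample climatology, along with the standard reliability, resolution, and uncertainty decomposition of the Brier score. The function names, the ten-bin choice, and the toy forecasts are assumptions for illustration; the abstract does not state the reference forecast or binning used in the thesis.

```python
import numpy as np

def brier_skill_score(prob, obs):
    """BSS = 1 - BS / BS_ref, with sample climatology as the assumed reference."""
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    bs = np.mean((prob - obs) ** 2)
    base_rate = obs.mean()
    bs_ref = np.mean((base_rate - obs) ** 2)  # equals base_rate * (1 - base_rate)
    return 1.0 - bs / bs_ref

def brier_decomposition(prob, obs, n_bins=10):
    """Return (reliability, resolution, uncertainty) from binned forecasts.

    BS = reliability - resolution + uncertainty holds exactly when all
    forecasts within a bin share the same probability.
    """
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    base_rate = obs.mean()
    # Bin index for each forecast; a probability of 1.0 falls in the last bin.
    idx = np.minimum((prob * n_bins).astype(int), n_bins - 1)
    rel = res = 0.0
    for k in np.unique(idx):
        in_bin = idx == k
        weight = in_bin.mean()          # fraction of forecasts in this bin
        p_bar = prob[in_bin].mean()     # mean forecast probability in the bin
        o_bar = obs[in_bin].mean()      # observed event frequency in the bin
        rel += weight * (p_bar - o_bar) ** 2
        res += weight * (o_bar - base_rate) ** 2
    unc = base_rate * (1.0 - base_rate)
    return rel, res, unc

# Toy, made-up forecasts and outcomes (not data from the thesis).
prob = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9]
obs = [0, 0, 0, 1, 1, 1, 1, 0]
bss = brier_skill_score(prob, obs)
rel, res, unc = brier_decomposition(prob, obs)
print(bss, rel, res, unc)
```

In this decomposition a lower reliability term indicates better calibration and a higher resolution term indicates forecasts that better separate events from non-events, so BSS can be written as (resolution - reliability) / uncertainty.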