Description
In the last two decades, several techniques have been developed to predict the occurrence of Solar Proton Events (SPEs), mainly based on the statistical association between the >10 MeV proton flux and precursor parameters. In this framework, the Empirical model for Solar Proton Event Real Time Alert (ESPERTA) predicts SPEs after the occurrence of ≥M2 solar flares, taking as input parameters the heliolongitude of the flare source region, the soft X-ray fluence and the radio fluence at ~1 MHz (Laurenza et al., 2009). In this work we reinterpret the ESPERTA model in the framework of machine learning and apply rare-event corrections (i.e. SMOTE oversampling and a modified class-weighted loss function). We find that applying a cut-off on the heliolongitude of ≥M2 solar flares reduces the False Alarm Rate (FAR) of the model. The cut-off is set to E20°, where the cumulative distribution of ≥M2 SPE-associated flares shows a break, which reflects the poor magnetic connection between the Earth and the Sun for eastern events. The best performance is obtained with the SMOTE algorithm, leading to a Probability Of Detection (POD) of 0.83 and a FAR of 0.39. Nevertheless, despite the implementation of rare-event corrections, we demonstrate that a substantial FAR on the predictions is a natural consequence of the sample base rates. Indeed, from a Bayesian point of view, we find that the FAR explicitly contains the prior knowledge about the class distributions. This is a critical issue for any statistical approach; we therefore discuss the importance of performing the model validation by preserving the class distributions within training and test datasets.
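The dependence of the FAR on the base rates can be illustrated with a minimal sketch. It assumes that the FAR is defined as the fraction of issued alerts that are false, i.e. P(no SPE | alert), which the abstract does not state explicitly. Under that assumption, Bayes' theorem expresses the FAR through the POD, the probability of false detection (POFD) and the prior probability of an SPE. The function name, the POFD value and the base rates below are hypothetical and chosen only for illustration; only the POD of 0.83 is taken from the text.

```python
def false_alarm_ratio(pod, pofd, base_rate):
    """Return P(no event | alert) from Bayes' theorem.

    pod       : P(alert | event), the probability of detection
    pofd      : P(alert | no event), the probability of false detection
    base_rate : P(event), the prior fraction of flares followed by an SPE
    """
    if not 0.0 < base_rate < 1.0:
        raise ValueError("base_rate must lie strictly between 0 and 1")
    true_alerts = pod * base_rate             # P(alert and event)
    false_alerts = pofd * (1.0 - base_rate)   # P(alert and no event)
    total_alerts = true_alerts + false_alerts
    if total_alerts == 0.0:
        raise ValueError("the classifier never issues an alert")
    return false_alerts / total_alerts


if __name__ == "__main__":
    POD = 0.83    # value quoted in the abstract
    POFD = 0.05   # hypothetical value, not taken from the abstract
    # Hypothetical base rates: the rarer the event, the larger the FAR
    # for the same POD and POFD.
    for base_rate in (0.5, 0.2, 0.1, 0.05):
        far = false_alarm_ratio(POD, POFD, base_rate)
        print(f"base rate {base_rate:.2f} -> FAR {far:.2f}")
```

With the POD and POFD held fixed, the FAR grows as the event becomes rarer. For the same reason, a test set whose class proportions differ from those of the full sample would give a FAR that is not representative of operational conditions.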