COLLOQUIUM "Using Hybrid Training and Graph Neural Networks for improving particle physics analysis" by Troels Petersen (Copenhagen NBI)


In particle physics, algorithms are trained on simulated data (Monte Carlo, MC) and then applied to real data. Although every effort is made to ensure identical distributions in data and MC, they are almost surely not the same, leading to suboptimal performance and potential biases when the algorithms are applied to data. However, in many situations it is possible to obtain “approximate labels” in data through Tag&Probe approaches in control channels. Such labels are usually not powerful enough to obtain good training results from data alone, but combining data with MC for simultaneous “hybrid training” allows ML algorithms to learn the general relations mainly from the perfectly labelled MC, while at the same time learning the smaller adaptations needed for optimal performance in data. The approach will be shown through an example with electron energy regression in ATLAS.
An additional problem in applying ML techniques to particle physics data is that the data are often geometrically complex, sparse, and of varying size, and so do not fit a tabular format (i.e. the same number of input variables for each case). While such cases are hard for both likelihood methods and “simple” Machine Learning (ML) algorithms, they fit Graph Neural Networks (GNNs) perfectly, as these are built for such geometric cases. Considering the IceCube experiment at the South Pole, which consists of 5000+ optical modules embedded in a billion tons of Antarctic ice, I will show how the GNN approach elegantly solves both the geometric complication and the non-fixed input size, and how GNNs can be used in many places and further boosted with a transformer architecture.
