30 April 2024 to 3 May 2024
Amsterdam, Hotel CASA
Europe/Amsterdam timezone

Fair Universe HiggsML Uncertainty Challenge

2 May 2024, 16:00
20m
Sorbonne, Hotel CASA

Speaker

David Rousseau (IJCLab-Orsay)

Description

The Fair Universe project is building an AI ecosystem at large compute scale for sharing datasets, training large models, and hosting challenges and benchmarks. The project is also exploiting this ecosystem for an AI challenge series focused on minimizing the effects of systematic uncertainties in High-Energy Physics (HEP), and on predicting accurate confidence intervals.

This talk will describe the challenge platform we have developed, which builds on the open-source benchmark ecosystem Codabench and interfaces it with the NERSC HPC center and its Perlmutter system, with over 7,000 A100 GPUs.
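
To illustrate the shape of the interface, the sketch below shows a minimal Codabench-style scoring entry point: the platform invokes a competition-supplied scoring script on each submission and collects a scores file. The directory layout, file names, and the mu_hat field here are assumptions for illustration, not the actual Fair Universe scoring program.

```python
# Hypothetical sketch of a Codabench-style scoring entry point.
# Directory layout and file names are assumptions for illustration;
# the real Fair Universe scoring program may differ.
import json
import sys
from pathlib import Path

def main(input_dir: str, output_dir: str) -> None:
    # Assumed convention: the submission's predictions and the
    # reference (ground-truth) data are staged under input_dir, and
    # the scoring program writes its scores under output_dir.
    preds = json.loads((Path(input_dir) / "res" / "predictions.json").read_text())
    truth = json.loads((Path(input_dir) / "ref" / "ground_truth.json").read_text())

    # Toy score: absolute error on the estimated signal strength mu.
    score = abs(preds["mu_hat"] - truth["mu_true"])

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    (Path(output_dir) / "scores.json").write_text(json.dumps({"mae_mu": score}))

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])
```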

This presentation will also advertise the first of our Fair Universe public challenges hosted on this platform, the Fair Universe: HiggsML Uncertainty Challenge, which will run over summer 2024.

This challenge will present participants with a much larger training dataset than previous similar competitions, corresponding to a measurement of the H to tau tau cross section at the Large Hadron Collider from the four-vectors of the final state. Participants should design an advanced analysis technique able not just to measure the signal strength but also to provide a confidence interval, whose coverage will be evaluated automatically on pseudo-experiments. The confidence interval should include statistical uncertainty as well as systematic uncertainties (detector calibration, background levels, etc.). We expect that advanced analysis techniques able to control the impact of systematics will perform best, thereby pushing forward uncertainty-aware AI techniques for HEP and beyond.
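
As a minimal sketch of how such a coverage evaluation can work, the toy code below draws pseudo-experiments, asks a stand-in participant method for a confidence interval on the signal strength, and compares the empirical coverage to the nominal level. The data model and the function names (make_pseudo_experiment, participant_interval) are hypothetical; the real challenge uses simulated H to tau tau events with systematic variations.

```python
# Minimal sketch of coverage evaluation with pseudo-experiments.
# The toy data model and function names are hypothetical; the real
# challenge uses simulated H -> tau tau events with systematic shifts.
import numpy as np

rng = np.random.default_rng(0)

def make_pseudo_experiment(mu_true: float, n_events: int = 1000) -> np.ndarray:
    # Toy stand-in for one pseudo-dataset: observations whose mean tracks
    # the signal strength, with a random nuisance (e.g. an uncalibrated
    # scale) shifting the whole sample.
    nuisance = rng.normal(0.0, 0.05)          # systematic shift
    return rng.normal(mu_true + nuisance, 1.0, size=n_events)

def participant_interval(data: np.ndarray, cl: float = 0.68) -> tuple[float, float]:
    # Stand-in for a submitted method: a Gaussian mean estimate with an
    # error bar inflated to cover the assumed systematic.
    mu_hat = data.mean()
    stat = data.std(ddof=1) / np.sqrt(len(data))
    width = np.hypot(stat, 0.05)              # stat (+) syst, in quadrature
    return mu_hat - width, mu_hat + width

n_pseudo, mu_true, covered = 1000, 1.0, 0
for _ in range(n_pseudo):
    lo, hi = participant_interval(make_pseudo_experiment(mu_true))
    covered += lo <= mu_true <= hi

print(f"empirical coverage: {covered / n_pseudo:.3f} (nominal 0.68)")
```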

The Codabench/NERSC platform can also host challenges from other communities, and we intend to make our benchmark designs available as templates so that similar efforts can easily be launched in other domains.

This contribution describes work that pushes the state of the art in three areas: the ML challenge platform; the ML benchmarks and the challenge itself; and the evaluation of uncertainty-aware methods.

For the platform, we describe a system capable of operating at much larger scale than other approaches, including on large datasets and with models trained and evaluated on multiple GPUs in parallel. The platform also provides a leaderboard and an ecosystem for long-lived benchmarks, as well as the ability not only to evaluate different models but also to test models against new datasets.

For the “Fair Universe: HiggsML Uncertainty Challenge” we provide larger datasets with multiple systematic uncertainties applied, and we evaluate the submitted uncertainties as part of the challenge, over multiple pseudo-experiments. As far as we are aware, all of these aspects are novel for HEP ML challenges.

Furthermore, we will present methodological innovations, including novel metrics for evaluating uncertainty-aware methods as well as improvements in the uncertainty-aware methods themselves.
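
As one illustration of what such a metric can look like (an assumed form for illustration only, not the challenge's actual score), the sketch below combines calibration and sharpness: it penalizes both wide intervals and any gap between the empirical and nominal coverage.

```python
# Hypothetical uncertainty-aware metric: not the challenge's actual
# score, just one plausible combination of calibration and sharpness.
import numpy as np

def interval_metric(lows, highs, mu_true, nominal=0.68):
    """Lower is better: mean interval width plus a penalty for the
    gap between empirical and nominal coverage."""
    lows, highs = np.asarray(lows), np.asarray(highs)
    coverage = np.mean((lows <= mu_true) & (mu_true <= highs))
    sharpness = np.mean(highs - lows)
    calibration_penalty = abs(coverage - nominal)
    return sharpness + 10.0 * calibration_penalty  # weight is arbitrary

# Example: intervals from 1000 pseudo-experiments around mu_true = 1.0
lows = np.random.normal(0.94, 0.01, 1000)
highs = lows + 0.12
print(interval_metric(lows, highs, mu_true=1.0))
```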
