The data analysis pipeline in experimental particle physics is a complex sequence of stages, from raw data acquisition to final physics results. Each stage presents challenges that shape the scientific output of every experiment, including the two that are the focus of this thesis: the Jiangmen Underground Neutrino Observatory (JUNO) and the Large Hadron Collider beauty (LHCb) experiment. These challenges can be experiment-specific, field-specific, or universal, such as those related to Data Quality Monitoring (DQM) systems. As particle physics experiments grow in scale and complexity, traditional analysis methods increasingly struggle to extract the full value from the rich datasets the experiments generate. Machine learning (ML) has proven to be a powerful toolset for addressing these diverse challenges, offering novel approaches that can significantly enhance current data analysis pipelines. This thesis comprises several interconnected studies demonstrating how ML techniques can be adopted at the following stages of the analysis pipeline: (i) DQM, with LHCb; and, with JUNO, (ii) detector response understanding and Monte Carlo tuning, (iii) event reconstruction, and (iv) event selection.

First, I introduce DINAMO (Dynamic and Interpretable Anomaly Monitoring), a novel experiment-agnostic framework that automates DQM by constructing evolving histogram templates with built-in uncertainties. The framework comes in a statistical variant and an ML-enhanced variant, the latter using a transformer encoder for improved adaptability. Validation on synthetic datasets demonstrates high accuracy, adaptability, and interpretability for both variants, with the ML variant generally outperforming the statistical one. Preliminary results with real LHCb offline DQM data are also presented.
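To make the idea of an evolving histogram template with a built-in uncertainty concrete, the following is a minimal sketch and not the DINAMO algorithm itself, whose details are not given in this abstract. It assumes an exponentially weighted moving average per bin, updated only with histograms from runs labelled good, and a reduced chi-square as the anomaly score. The class name `EwmaTemplate`, the parameter `alpha`, and the helper `make_hist` are hypothetical.

```python
import numpy as np


class EwmaTemplate:
    """Hypothetical per-bin reference template with an uncertainty estimate."""

    def __init__(self, n_bins, alpha=0.2):
        self.alpha = alpha               # weight given to the newest good run (assumed)
        self.mean = np.zeros(n_bins)     # per-bin template mean
        self.var = np.zeros(n_bins)      # per-bin template variance
        self.n_updates = 0

    def update(self, hist):
        """Fold a normalised histogram from a run labelled good into the template."""
        hist = np.asarray(hist, dtype=float)
        if self.n_updates == 0:
            self.mean = hist.copy()
        else:
            delta = hist - self.mean
            # Standard incremental exponentially weighted mean and variance.
            self.mean = self.mean + self.alpha * delta
            self.var = (1.0 - self.alpha) * (self.var + self.alpha * delta**2)
        self.n_updates += 1

    def score(self, hist, n_entries):
        """Reduced chi-square between a new normalised histogram and the template."""
        hist = np.asarray(hist, dtype=float)
        stat_var = hist * (1.0 - hist) / n_entries   # binomial variance of bin fractions
        chi2 = np.sum((hist - self.mean) ** 2 / (self.var + stat_var + 1e-12))
        return chi2 / hist.size


def make_hist(rng, shift, n_entries=10_000, n_bins=20):
    """Toy normalised histogram of Gaussian samples (illustrative data only)."""
    counts, _ = np.histogram(rng.normal(shift, 1.0, n_entries),
                             bins=n_bins, range=(-4.0, 4.0))
    return counts / counts.sum()


rng = np.random.default_rng(0)
template = EwmaTemplate(n_bins=20)
for _ in range(30):                      # thirty toy runs labelled good
    template.update(make_hist(rng, shift=0.0))

good_score = template.score(make_hist(rng, shift=0.0), n_entries=10_000)
bad_score = template.score(make_hist(rng, shift=1.0), n_entries=10_000)
print(f"good run: {good_score:.2f}, shifted run: {bad_score:.2f}")
```

In this toy setup the shifted run scores far higher than the nominal one, and the per-bin terms of the sum indicate which bins drive the score, which is one simple route to interpretability.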
Second, I discuss NeuroMCT (Neural Monte Carlo Tuning), a framework that employs simulation-based inference to address the intractable likelihood in the parameter tuning of JUNO's energy response model. The method uses neural likelihood estimation, with conditional normalizing flows and a transformer-based regressor, integrated with Bayesian nested sampling. This enables robust parameter inference in a highly correlated parameter space. Through a systematic analysis of the uncertainty estimates across a thousand parameter combinations and varying statistical exposures, I demonstrate that NeuroMCT provides unbiased parameter estimation with uncertainties limited only by data statistics.

Third, I present an aggregated features approach that transforms high-dimensional channel-wise information into a set of carefully engineered, physics-motivated features. With an optimized subset of these features as input, both Boosted Decision Trees (BDT) and a Fully Connected Neural Network (FCNN) achieve an energy resolution better than 3%/√E[MeV], meeting JUNO's requirements for neutrino mass ordering determination. The approach thus achieves dimensionality reduction while maintaining performance comparable to that of complex channel-wise methods. Cross-detector transferability is also demonstrated using JUNO's satellite detector TAO.

Fourth, I describe an interpretable ML approach to event selection in JUNO, using BDT and FCNN. These models improve selection efficiency by around 10% compared to the traditional selection when the full detector volume is used, effectively removing the need for strict fiducial cuts. I demonstrate interpretability analysis with the Shapley additive explanations methodology and model calibration for both models, and uncertainty quantification using Monte Carlo dropout for the FCNN.
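As a worked illustration of the 3%/√E[MeV] resolution figure quoted above, the short sketch below evaluates it at a few energies. It assumes a purely stochastic model, sigma_E/E = a/√E with a = 0.03, and ignores any constant or noise terms; the abstract states only the bound, so this model and the function name `relative_resolution` are assumptions made for illustration.

```python
import math


def relative_resolution(energy_mev, stochastic_term=0.03):
    """Relative resolution sigma_E / E for an assumed a / sqrt(E[MeV]) model."""
    if energy_mev <= 0:
        raise ValueError("energy must be positive")
    return stochastic_term / math.sqrt(energy_mev)


for energy in (1.0, 4.0, 9.0):
    rel = relative_resolution(energy)
    sigma_kev = 1000.0 * rel * energy    # absolute resolution in keV
    print(f"E = {energy:.0f} MeV: sigma/E = {100 * rel:.2f}%, sigma = {sigma_kev:.0f} keV")
```

Under this assumption the relative resolution at the bound is 3% at 1 MeV and 1.5% at 4 MeV, while the absolute resolution grows as √E.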
Finally, I show that the ML selection improves oscillation parameter sensitivity by several percent under nominal accidental rates in JUNO and remains robust against background rates 100 times higher, offering crucial operational margins.

Enhancing Experimental Particle Physics with Machine Learning Techniques: Applications to JUNO and LHCb / Gavrikov, Arsenii. - (2025 Dec 16).

Enhancing Experimental Particle Physics with Machine Learning Techniques: Applications to JUNO and LHCb

Gavrikov, Arsenii
2025

16 Dec 2025
Files in this item:
File: final_thesis_Arsenii_Gavrikov.pdf
Access: open access
Description: Final thesis Arsenii Gavrikov
Type: Doctoral thesis
Size: 28.66 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3572594