Lamarr Training and Multi-Platform Compatibility
Summary
In July 2025, when GCC 13 was introduced as the compiler for Gauss, tests related to Lamarr began failing. The errors stem from math functions used in precompiled C objects whose linking symbols are no longer available in the standard C libraries bundled with GCC 13. This change is likely tied to the evolution of fast-math optimizations and the deprecation of legacy symbol naming in GLIBC.
Specifically, the issue traces back to the removal of math-finite.h from GLIBC around 2019, which dropped the finite-only variants of math functions such as __exp_finite. These variants were historically emitted when compiling with -ffast-math, but were considered non-standard and fragile. GCC 13 enforces stricter linking behavior, exposing legacy dependencies that were previously tolerated.
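As a minimal sketch of the failure mode, consider the snippet below: with pre-removal GLIBC headers and -ffast-math, a plain call to exp() was redirected to __exp_finite, and objects compiled back then still reference that symbol. The file name, build command, and inspection command are illustrative.

```c
/* finite_math_demo.c -- illustrative file name.
 * With old GLIBC headers, building this file as
 *     gcc -O2 -ffast-math -c finite_math_demo.c
 * rewrote the exp() call into the non-standard symbol __exp_finite
 * (visible with `nm -u finite_math_demo.o`). Linking such an object
 * against a C library that no longer exports the *_finite entry
 * points fails with "undefined reference to `__exp_finite'". */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double x = 1.0;
    printf("exp(%f) = %f\n", x, exp(x));
    return 0;
}
```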
Concurrently, the Gauss maintainers began including the ARM architecture in the nightly builds. Since Lamarr models are not yet available for ARM, tests on that platform fail as well.
Design Choice Behind the Errors
Lamarr models are transpiled using scikinC and distributed as shared objects via CVMFS. Treating compiled software as data introduces portability risks, which were mitigated by minimizing dependencies and relying solely on standard C (excluding C++).
The only external dependency is the standard C math library, used for computing neural network activation functions. No mitigation was in place for compiling on architectures beyond x86, and the evolution of GLIBC linking symbols was not anticipated.
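For illustration, the snippet below shows the flavor of this dependency: a hypothetical activation function of the kind a transpiled network relies on, where the only external symbol comes from libm. The names and layout are illustrative, not scikinC's actual output.

```c
#include <math.h>

/* Hypothetical example of the only kind of external dependency a
 * transpiled model carries: activation functions calling into libm.
 * Names and layout are illustrative, not scikinC's actual output. */
static float sigmoid(float x)
{
    return 1.0f / (1.0f + expf(-x));  /* expf is provided by libm */
}

void activate_layer(const float *in, float *out, int n)
{
    for (int i = 0; i < n; ++i)
        out[i] = sigmoid(in[i]);
}
```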
Should We Move Transpiled C Code to Git?
While technically feasible, storing transpiled C code in Git is impractical due to its size (~50 MB per model) and its decoupled lifecycle from the main software. Machine learning models evolve independently and are selected at runtime within Gauss.
However, to improve portability, we should introduce support for multiple architectures, following the platform naming conventions adopted by LHCb. Even if GitLab is not ideal for storing transpiled code, the code should still be versioned, as it is the most architecture-independent artifact produced during model training.
GitHub Actions running on CINECA Leonardo Booster
To leverage GPU time allocated for Lamarr development on the CINECA supercomputer Leonardo, we deployed a CI/CD pipeline using GitHub Actions. This pipeline performs:
- Preprocessing of ROOT files from Bender containing the training dataset
- Training of machine learning models with GPU acceleration
- Transpilation and compilation of models for the Leonardo architecture
- Statistical validation by comparing transpiled output with training data
- Compilation of transpiled models for multiple LHCb platforms from CVMFS (see the loading sketch below):
  - x86_64_v2-centos7-gcc9-opt
  - x86_64_v2-el9-gcc13-opt
  - x86_64_v2-el9-gcc13-dbg
  - armv8.1_a-el9-gcc13-opt
- Storage of training reports, validation results, transpiled C code, and compiled shared objects as GitHub releases
The upload to LHCb data packages (and subsequent publication on CVMFS) is performed manually for safety and security reasons.
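As a rough illustration of how a compiled model can be checked on a given platform, the sketch below loads a shared object, resolves an entry point, and compares one evaluation against a reference value. The path, symbol name, function signature, input, reference values, and tolerance are all assumptions for illustration, not the real Lamarr conventions.

```c
/* build: cc smoke_test.c -ldl -lm -o smoke_test
 * Hypothetical smoke test for a compiled model. */
#include <dlfcn.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed entry-point signature; the real transpiled symbol may differ. */
typedef float *(*model_fn)(float *out, const float *in);

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1]
                     : "./x86_64_v2-el9-gcc13-opt/model.so"; /* illustrative */

    /* 1. Does the shared object load on this platform at all? */
    void *handle = dlopen(path, RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return EXIT_FAILURE;
    }

    /* 2. Is the expected entry point exported? (name is hypothetical) */
    model_fn evaluate = (model_fn)dlsym(handle, "model_evaluate");
    if (!evaluate) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        return EXIT_FAILURE;
    }

    /* 3. Does it reproduce a reference point within tolerance?
     * Input, reference, and tolerance are illustrative placeholders. */
    float in[4]  = {0.1f, 0.2f, 0.3f, 0.4f};
    float out[2] = {0};
    float ref[2] = {0.73f, 0.27f};
    evaluate(out, in);
    for (int i = 0; i < 2; ++i)
        if (fabsf(out[i] - ref[i]) > 1e-3f) {
            fprintf(stderr, "mismatch at %d: %f vs %f\n", i, out[i], ref[i]);
            return EXIT_FAILURE;
        }

    dlclose(handle);
    puts("model loads and matches reference");
    return EXIT_SUCCESS;
}
```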
Models Deployed via This Workflow
- Tracking models: ... on track ...
- PID models: pp-2016-MU-Sim10b-gha-2025-08-28T12h33m00
During workflow development, training time was reduced to accelerate debugging. As a result, model quality may be lower than in previous releases. The models nevertheless remain fully functional and suitable for validation in the LHCb nightlies.
Extended training and renewed focus on statistical performance will follow integration testing.
Integration of the models in Lamarr
The Lamarr branch in the Gauss project is relatively outdated, as recent efforts have focused on integration with Gaussino via SQLamarr. Consequently, structural changes in the models were not reflected in the Lamarr Gaudi Algorithm.
Although the conceptual model remains unchanged, preprocessing and pipelining steps have increasingly shifted into the models themselves, simplifying the interface and making them more self-contained.
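A schematic way to picture this change, with entirely hypothetical names: preprocessing that used to live in the caller is now compiled into the model's single entry point.

```c
#include <stddef.h>

/* Entirely hypothetical sketch of the interface change; none of
 * these names correspond to real Lamarr or scikinC symbols. */

/* Feature scaling that previously had to be performed by the
 * Gaudi algorithm before invoking the network. */
static void scale_features(const float *raw, float *scaled, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        scaled[i] = 2.0f * raw[i] - 1.0f;   /* illustrative scaling */
}

/* Stand-in for the transpiled network itself. */
static void network(const float *scaled, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = scaled[i];
}

/* New-style entry point: preprocessing is baked into the model,
 * so callers pass raw features and the shared object is
 * self-contained. */
void model_evaluate(const float *raw, float *out, size_t n)
{
    float scaled[64];                        /* illustrative capacity */
    size_t m = n < 64 ? n : 64;              /* clamp to the buffer   */
    scale_features(raw, scaled, m);
    network(scaled, out, m);
}
```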
To integrate the new models into Lamarr, updates are required not only in the LamarrData package and configuration, but also in the C++ code of LamarrPropagator and LamarrTraining.
A new branch based on master will be created.
Roll-out checklist
These are the steps we plan to follow to fix the problem in the nightlies:
- Push the new models to a new branch of LamarrData
- Open a Merge Request to update the master branch with the new models
- Merge LamarrData and tag it as v4
- Create a new branch of Gauss named landerli_lamarr_gcc13 with the modifications necessary for package Sim/LbLamarr
- Open a Merge Request
- Include the MR in nightlies of the slot lhcb-sim10-dev
- Check the nightly outcome
- Merge and enjoy