Draft: probNN for Run3 v1.0
Initial version of the probNNs, including definitions, architectures and the tuple definition for training.
-- 30/04 --
Updated version: includes optimized architectures (found with a genetic approach) as a method for reducing the size of the Run2 probNN (50 inputs with 2 hidden layers of 50-60 neurons) down to architectures of around 1k parameters, a reduction of around a factor of 6.
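As a rough cross-check of the quoted reduction, the parameter count of a fully connected network can be computed from its layer sizes. This is only a sketch: the single output neuron is an assumption (the output layer is not specified here), and `mlp_param_count` is a hypothetical helper, not part of this MR.

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Run2-like layout from the description: 50 inputs, two hidden layers of 50-60 neurons.
# The single output neuron is an assumption made for this estimate.
run2_small = mlp_param_count([50, 50, 50, 1])  # 5151 parameters
run2_large = mlp_param_count([50, 60, 60, 1])  # 6781 parameters

print(run2_small, run2_large)
```

Under that assumption the Run2 network has roughly 5k-7k parameters, so architectures of around 1k parameters correspond to a reduction of about a factor of 5 to 7.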
To train them, see pidkachu_tuples_train, which contains the selection and the training basis for the MLP. In the upcoming days, an upgraded version of the training will be pushed, including a learning-rate scheduling method to push the training capabilities further and find better weights. It will also contain an early-stopping method to avoid overfitting, improving the whole training phase and reducing it to fewer than 50 epochs.
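To illustrate the idea, here is a minimal, framework-free sketch of a learning-rate scheduler combined with an early stopper. The actual scheduling method is not decided in this description; the reduce-on-plateau rule, the class names and all default values below are assumptions chosen only for illustration.

```python
class ReduceOnPlateau:
    """Scale the learning rate by `factor` after `patience` epochs without improvement."""

    def __init__(self, lr, factor=0.5, patience=2, min_lr=1e-6):
        self.lr, self.factor, self.patience, self.min_lr = lr, factor, patience, min_lr
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best, self.bad_epochs = val_loss, 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr


class EarlyStopper:
    """Signal a stop once the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=4, min_delta=0.0):
        self.patience, self.min_delta = patience, min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best, self.bad_epochs = val_loss, 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def train_with_schedule(val_losses, lr=1e-2, max_epochs=50):
    """Replay a list of validation losses; a real loop would train one epoch per step."""
    scheduler, stopper = ReduceOnPlateau(lr), EarlyStopper()
    epoch = 0
    for epoch, val_loss in enumerate(val_losses[:max_epochs], start=1):
        scheduler.step(val_loss)           # adapt the learning rate for the next epoch
        if stopper.should_stop(val_loss):  # stop before overfitting sets in
            break
    return epoch, scheduler.lr


# Synthetic validation losses, for illustration only (not real training output).
stopped_epoch, final_lr = train_with_schedule(
    [1.0, 0.8, 0.7, 0.65, 0.66, 0.67, 0.66, 0.68, 0.69])
```

With these synthetic losses the learning rate is halved twice and training stops at epoch 8, four epochs after the last improvement.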
Samples used belong to several decays, using expected-2024 conditions:
- Electrons: B->JpsiK, ee
- Muons: B->JpsiK, mm ++ inclusive Jpsi, mm
- Pions: Ks->pipi (low momentum) ++ Bu->Kpi (high momentum)
- Kaons: D*->D0pi, Kpi (all range) ++ D*->D0pi, KK (to enlarge sample)
- Protons: Lc->pKpi, Lb->ppi
Training was done with balanced samples: around 150k samples were used from all decays except protons (100k) to perform the variable-selection studies and architecture tests, subsampling to keep signal and background as balanced as possible.
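The balancing step can be sketched as a simple random subsampling of the larger class down to the size of the smaller one. This is not the code in pidkachu_tuples_train; `balanced_subsample` and its arguments are hypothetical and only show the idea.

```python
import random

def balanced_subsample(signal, background, n_per_class=None, seed=0):
    """Draw the same number of candidates from each class, without replacement."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    n = min(len(signal), len(background))
    if n_per_class is not None:
        n = min(n, n_per_class)
    return rng.sample(signal, n), rng.sample(background, n)

# Toy example: 150 signal candidates against 100 background candidates.
sig, bkg = balanced_subsample(list(range(150)), list(range(1000, 1100)))
print(len(sig), len(bkg))
```

Both returned lists have the size of the smaller input (here 100), so the classifier sees a 1:1 signal/background ratio.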
A few things to bear in mind:
- Currently only Long tracks are supported; a study for other track types will also be performed.
- Given the momentum distributions of the samples used, there are not enough statistics to push the capabilities of the network to its limits. There is an MC request for this purpose, which will be updated according to the needs, to get a few samples that properly cover the whole momentum range for all particle types with sufficient statistics per momentum bin (in steps of 1 GeV). Other decays that enrich not only the sample size (currently below what is desirable) but also the coverage are more than welcome to be added to the request.
- The network optimization achieved is only partial, mainly due to the consideration above together with the computational requirements of the genetic approach. Once the whole range is covered, an in-depth study to push the architecture's limits will be performed, together with exploring options for applying iterative pruning within SIMD.
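The momentum coverage mentioned above could be checked by counting candidates in 1 GeV bins and flagging the bins that fall below a chosen threshold. A minimal sketch, where the momentum range, the threshold and the name `momentum_coverage` are all placeholders rather than values agreed for the MC request:

```python
from collections import Counter

def momentum_coverage(momenta_gev, p_min=0, p_max=100, min_count=1000):
    """Count candidates per 1 GeV bin and list the bins below `min_count`."""
    counts = Counter(int(p) for p in momenta_gev if p_min <= p < p_max)
    sparse_bins = [b for b in range(p_min, p_max) if counts[b] < min_count]
    return counts, sparse_bins

# Toy momenta in GeV, for illustration only.
counts, sparse_bins = momentum_coverage([2.5, 2.7, 3.1, 5.9],
                                        p_min=2, p_max=6, min_count=2)
print(sparse_bins)  # bins [3, 4) , [4, 5) and [5, 6) are under-populated
```

Running this per particle type would give a direct list of the momentum bins that the MC request still needs to fill.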
Comments, questions or discussions are more than welcome.