
Crash observed in FastJetClusterSequence running REC_v36r10p1 in production

A crash was observed while processing run 298980 on HLT23333, possibly caused by FastJetClusterSequence. It should be reproducible by running on the MDFs under /hlt2/errors/LHCb2/0000298980/, which have been merged into /hlt2/errors/LHCb2/0000298980/errors_298980_merged.mdf

Level 3:

Jun 25, 2024 @ 04:22:40.625	[INFO] Process: 'LHCb2_HLT23333_HLT2_0' (SignalHandler) 03 --> 0x7fff90c6f1fd
Jun 25, 2024 @ 04:22:40.000	/cvmfs/lhcb.cern.ch/lib/lcg/releases/fastjet/3.4.1-5af57/x86_64-el9-gcc13-opt/lib/libfastjet.so.0(_ZN7fastjet15ClusterSequence24_faster_tiled_N2_clusterEv+0x56d)[0x7fff90c6f1fd]

Level 4:

Jun 25, 2024 @ 04:22:40.625	[INFO] Process: 'LHCb2_HLT23333_HLT2_0' (SignalHandler) 04 --> 0x7fff90ddf93d
Jun 25, 2024 @ 04:22:40.000	/cvmfs/lhcb.cern.ch/lib/lhcb/REC/REC_v36r10p1/InstallArea/x86_64_v3-el9-gcc13-opt+g/lib/libJetAccessories.so(_ZN7fastjet15ClusterSequenceC1INS_9PseudoJetEEERKSt6vectorIT_SaIS4_EERKNS_13JetDefinitionERKb+0x46d)[0x7fff90ddf93d]
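For readability, the mangled symbols in the Level 3 and Level 4 frames can be decoded without gdb. The following is a minimal sketch, assuming a GCC or Clang toolchain (Itanium C++ ABI) where `abi::__cxa_demangle` from `<cxxabi.h>` is available; the helper name `demangle` is only illustrative and is not part of REC or fastjet.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>   // std::free
#include <memory>
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI
// symbol name, or the input unchanged if it cannot be demangled.
std::string demangle(const std::string& mangled) {
  int status = 0;
  // __cxa_demangle returns a malloc'ed buffer, so it is released with free.
  std::unique_ptr<char, void (*)(void*)> out(
      abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status),
      std::free);
  return (status == 0 && out) ? std::string(out.get()) : mangled;
}
```

Calling `demangle("_ZN7fastjet15ClusterSequence24_faster_tiled_N2_clusterEv")` gives `fastjet::ClusterSequence::_faster_tiled_N2_cluster()`, i.e. the Level 3 frame is inside the tiled N2 clustering routine of libfastjet, called from the `ClusterSequence` constructor seen in the Level 4 frame.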

Asking gdb:

[cmarinbe@n8190601 ~]$ /group/hlt/hlt2/stack_RTA_2024_06_13_1/MooreOnline/gdb /cvmfs/lhcb.cern.ch/lib/lhcb/REC/REC_v36r10p1/InstallArea/x86_64_v3-el9-gcc13-opt+g/lib/libJetAccessories.so

(gdb) list *(_ZN7fastjet15ClusterSequenceC1INS_9PseudoJetEEERKSt6vectorIT_SaIS4_EERKNS_13JetDefinitionERKb+0x46d)

points to:

0x12393d is in fastjet::ClusterSequence::ClusterSequence<fastjet::PseudoJet>(std::vector<fastjet::PseudoJet, std::allocator<fastjet::PseudoJet> > const&, fastjet::JetDefinition const&, bool const&) (/cvmfs/lhcb.cern.ch/lib/lcg/releases/fastjet/3.4.1-5af57/x86_64-el9-gcc13-opt/include/fastjet/ClusterSequence.hh:1029).
1024	  // transfer the remaining options
1025	  _decant_options_partial();
1026	
1027	  // run the clustering
1028	  _initialise_and_run_no_decant();
1029	}
1030	
1031	
1032	inline const std::vector<PseudoJet> & ClusterSequence::jets () const {
1033	  return _jets;

Full stack trace in https://lblogbook.cern.ch/HLT/1090