Potential bank corruption in the Sprucing of Lead23

In the Sprucing of Lead23 (in production) we are seeing a small number of errors such as the following:

EventSelector                       SUCCESS Reading Event record 90001. Record number within stream 1: 90001
BankCheck    ERROR   Bad magic pattern in Tell1 bank 0x9a8c25c: Size:65512 Type:240:UNKNOWN Source:-4096 Vsn:255 ff
BankCheck    ERROR   Previous (good) bank [0x9a7f35c]: Size:52981 Type:60:DstData Source:512 Vsn: 3 cbcb
LHCb::RawDataCnvSvc                   ERROR Exception:Error decoding raw banks!
BankCheck    ERROR   Bad magic pattern in Tell1 bank 0x9a8c25c: Size:65512 Type:240:UNKNOWN Source:-4096 Vsn:255 ff
BankCheck    ERROR   Previous (good) bank [0x9a7f35c]: Size:52981 Type:60:DstData Source:512 Vsn: 3 cbcb
LHCb::RawDataCnvSvc                   ERROR Exception:Error decoding raw banks!

See more info on the request issue here.

  • This causes the application to fail and the file cannot be processed.
  • This affects less than 0.01% of the input files, but we should understand the cause and also why it was not seen at HLT2. Maybe this points to an HLT2 writing issue?
  • Note that this is passthrough Sprucing, so the banks are just propagated through. The DstData bank, for instance, is not even unpacked.
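
For anyone who wants to inspect a dumped event buffer by hand, the check that fails above can be sketched roughly as follows. This is a minimal illustration and not the BankCheck code itself. It assumes an 8-byte little-endian bank header (magic, size, type, version, source ID), a magic word of 0xCBCB (consistent with the "cbcb" printed for the good bank), a size field that counts header plus payload, and padding of each bank to a 4-byte boundary. The names `parse_header` and `first_bad_bank` are hypothetical.

```python
import struct

# Assumed magic word; matches the "cbcb" printed for the good bank in the log.
MAGIC = 0xCBCB
# Assumed header layout: magic (u16), size (u16), type (u8), version (u8), source ID (u16).
HEADER = struct.Struct("<HHBBH")


def parse_header(buf, offset=0):
    """Decode one 8-byte bank header starting at `offset`."""
    magic, size, bank_type, version, source = HEADER.unpack_from(buf, offset)
    return {
        "magic": magic,
        "size": size,
        "type": bank_type,
        "version": version,
        "source": source,
    }


def first_bad_bank(buf):
    """Walk consecutive banks; return the offset of the first bad header, or None."""
    offset = 0
    while offset + HEADER.size <= len(buf):
        header = parse_header(buf, offset)
        # A wrong magic word or an impossible size means we have lost sync.
        if header["magic"] != MAGIC or header["size"] < HEADER.size:
            return offset
        # Assumption: size includes the header, and banks are padded to 4 bytes.
        offset += (header["size"] + 3) & ~3
    return None
```

If the offset returned for a failing event sits right after a DstData bank, comparing the bytes at that offset between the HLT2 output and the Sprucing input could help tell a writing problem from a reading problem.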

The log book entry gives more details: https://lblogbook.cern.ch/Operations/37672

and this can be tested locally using

lb-run Moore/v54r22p1 lbexec Hlt2Conf.sprucing_settings.Sprucing_PbPb_2023_1_production:pass_ionraw_production Hlt/Hlt2Conf/options/sprucing/lbexec_yamls/pass_spruce_PbPb_ionraw_2023.yaml

after changing the input file in a local copy of pass_spruce_PbPb_ionraw_2023.yaml to one given in the log book report.

As far as I can tell, this is happening only on IONRAW, but it happens for both RICH and RICHless data.
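
To cross-check which streams or files are affected, the job logs could be scanned for the BankCheck error with something like the sketch below. The regular expression is written only against the log lines quoted in this issue, so other message formats are not handled. The function names and the idea of passing logs as a label-to-lines mapping are assumptions made for illustration.

```python
import re

# Pattern written against the "Bad magic pattern" lines quoted in this issue.
BAD_BANK = re.compile(
    r"Bad magic pattern in Tell1 bank (?P<addr>0x[0-9a-fA-F]+): "
    r"Size:(?P<size>\d+) Type:(?P<type>\d+):(?P<name>\w+) "
    r"Source:(?P<source>-?\d+) Vsn:\s*(?P<vsn>\d+) (?P<magic>[0-9a-fA-F]+)"
)


def bad_banks(lines):
    """Yield one dict of header fields per 'Bad magic pattern' line."""
    for line in lines:
        match = BAD_BANK.search(line)
        if match:
            yield match.groupdict()


def summarise(logs):
    """Map each label (e.g. a stream or file name) to its number of bad-bank lines.

    `logs` maps a label to an iterable of log lines; labels with no hits are omitted.
    """
    counts = {}
    for label, lines in logs.items():
        hits = sum(1 for _ in bad_banks(lines))
        if hits:
            counts[label] = hits
    return counts
```

Since the error is printed twice per failing event in the log above, the counts are numbers of log lines, not numbers of corrupted banks.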

cc @sesen, @sstahl, @decianm, @cmarinbe, @erodrigu

Edited by Nicole Skidmore