
Transient XRootD/storage access issue

When running bamboo workflows over many central files (particularly on HTCondor), transient XRootD access failures occur for roughly 1% of files. Related issue: #113

Failures can occur:

  • During TChain initialization (causing the job to crash)
  • During event processing (causing the affected file to be skipped silently, which is a bug)
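Because the failures are transient, one possible workaround is to probe each file with a few retries before building the chain, and to report the files that never open instead of losing them silently. The following is a minimal sketch of that idea, not part of bamboo: `open_with_retry`, `partition_readable` and `root_opener` are hypothetical helper names, and the retry count and delays are arbitrary assumptions. Only `root_opener` needs PyROOT; it relies on `ROOT.TFile.Open` and `TFile::IsZombie`.

```python
import time
from typing import Callable, Iterable, List, Tuple


def root_opener(url: str) -> bool:
    """Return True if ROOT can open `url` (hypothetical helper, needs PyROOT)."""
    import ROOT  # imported lazily so the retry logic itself has no ROOT dependency

    f = ROOT.TFile.Open(url)
    ok = bool(f) and not f.IsZombie()  # a failed open gives a null or zombie file
    if ok:
        f.Close()
    return ok


def open_with_retry(
    url: str,
    opener: Callable[[str], bool],
    retries: int = 3,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try `opener(url)` up to `retries` times with exponential backoff."""
    for attempt in range(retries):
        try:
            if opener(url):
                return True
        except Exception:
            pass  # treat an exception like a failed open and retry
        if attempt < retries - 1:
            sleep(delay * (2 ** attempt))
    return False


def partition_readable(
    urls: Iterable[str], opener: Callable[[str], bool], **kwargs
) -> Tuple[List[str], List[str]]:
    """Split `urls` into (readable, unreadable) after retrying each one."""
    good: List[str] = []
    bad: List[str] = []
    for url in urls:
        (good if open_with_retry(url, opener, **kwargs) else bad).append(url)
    return good, bad
```

With this, `good, bad = partition_readable(file_list, root_opener)` could be run before the chain is created. A job could then either fail loudly when `bad` is non-empty or record those files for resubmission.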

Example error:

INFO:bamboo.workflow:Starting to fill plots (and skims) 
Error in TNetXNGFile::Open: [ERROR] Server responded with an error: [3000] Unable to open - cannot determine the prefix path to use for the given filesystem id /store/mc/Run3Summer22NanoAODv12/GluGlutoHHto2B2WtoLNu2Q_kl-2p45_kt-1p00_c2-0p00_LHEweights_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_v5-v2/2820000/603a1208-5558-42ab-92a1-0712b242473d.root; invalid argument
INFO:bamboo.workflow:Plots finished in 32.53s, max RSS: 1061.02MB. 231 histograms, 0 skims
  • Location: lxplus + HTCondor
  • Affects both local and distributed runs
  • Not specific to particular files/datasets
  • Not commonly reported by Slurm users
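In the example log above, the job still reports "Plots finished" after the open error, so a skipped file can only be spotted by reading the log. As a stopgap, job logs could be scanned for that error line after the fact. This is a rough sketch, assuming the error message has the format shown above and that file paths start with `/store/` and end in `.root`. `failed_files` is a hypothetical helper, not a bamboo function.

```python
import re
from typing import List

# Matches the ROOT error line from the example log, with or without the
# angle brackets ROOT normally prints around the method name.
OPEN_ERROR = re.compile(r"Error in <?TNetXNGFile::Open>?:.*?(/store/\S+?\.root)")


def failed_files(log_text: str) -> List[str]:
    """Return the unique file paths named in TNetXNGFile::Open errors, in order."""
    seen: List[str] = []
    for line in log_text.splitlines():
        match = OPEN_ERROR.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

A non-empty result for a job that otherwise finished normally would indicate that its histograms are missing events and that the job should be rerun.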

Issue reported by @scrossle.

Edited by Khawla Jaffel