Bug: data.py & XML catalogue generated (e.g. for Ganga jobs) do not specify MDF type, IOHelper fails to deduce type
The main output of HLT2 that ran in 2022 is of the MDF type (saved as files with a .raw extension). However, the generated data.py and XML catalogue do not seem to take this into account, so jobs fail with errors reporting that the file could not be opened:
TFile::TFile ERROR file /pnfs/in2p3.fr/data/lhcb/LHCb-Disk/lhcb/buffer/lhcb/data/2022/RAW/PASSTHROUGH/LHCb/COLLISION22/256289/256289_00090013_0088.raw does not exist
IODataManager ERROR Error: connectDataIO> Cannot connect to database: PFN=mdf:root://proxy@ccxrootdlhcb.in2p3.fr//pnfs/in2p3.fr/data/lhcb/LHCb-Disk/lhcb/buffer/lhcb/data/2022/RAW/PASSTHROUGH/LHCb/COLLISION22/256289/256289_00090013_0088.raw FID=d1ea7125-52c6-4d2d-97f1-a5df7f2e24d2
IODataManager ERROR Failed to open dsn:d1ea7125-52c6-4d2d-97f1-a5df7f2e24d2 Federated file could not be resolved from 1 entries.
EventSelector INFO Stream:EventSelector.DataStreamTool_2 Def:DATAFILE='LFN:/lhcb/data/2022/RAW/PASSTHROUGH/LHCb/COLLISION22/256289/256289_00090016_0092.raw' SVC='Gaudi::RootEvtSelector' OPT='READ'
TFile::TFile ERROR file /pnfs/in2p3.fr/data/lhcb/LHCb-Disk/lhcb/buffer/lhcb/data/2022/RAW/PASSTHROUGH/LHCb/COLLISION22/256289/256289_00090016_0092.raw does not exist
IODataManager ERROR Error: connectDataIO> Cannot connect to database: PFN=mdf:root://proxy@ccxrootdlhcb.in2p3.fr//pnfs/in2p3.fr/data/lhcb/LHCb-Disk/lhcb/buffer/lhcb/data/2022/RAW/PASSTHROUGH/LHCb/COLLISION22/256289/256289_00090016_0092.raw FID=ed42bd55-cb56-41dd-bbae-f521f17209f8
(from the example job 730743179, thanks to Peilian Li for spotting this)
To be clear, one can access and download the file just fine with dirac-dms-get-file, and run over the file locally.
Looking closer at the sandbox created, I suspect there is a problem in the preparation of the data.py file by CreateDataFile.py, as locally I could reproduce and fix this by changing the IOHelper() to IOHelper("MDF"):
IOHelper("MDF").inputFiles([
'LFN:/lhcb/data/2022/RAW/PASSTHROUGH/LHCb/COLLISION22/256289/256289_00090013_0088.raw',
'LFN:/lhcb/data/2022/RAW/PASSTHROUGH/LHCb/COLLISION22/256289/256289_00090016_0092.raw',
], clear=True)
I also found a reference to "ROOT_All" in the XML catalogue that was created, and I changed that to "MDF" to be sure, but I don't think it is actually used: I could run with just the change in the IOHelper call.
I suspect that either IOHelper needs a check on whether the given file is "raw"/"RAW" or "mdf"/"MDF", or CreateDataFile should pass the type to IOHelper explicitly.
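As a rough illustration of the second option, here is a minimal sketch of how the data.py text could be generated with an explicit type. It is not the actual CreateDataFile.py code: the helper names guess_persistency and render_input_files and the MDF_EXTENSIONS set are hypothetical, and only the .raw case is confirmed by the failure above (treating .mdf the same way is an assumption).

```python
import posixpath

# Assumption: extensions treated as MDF. Only ".raw" is confirmed by this report.
MDF_EXTENSIONS = {".raw", ".mdf"}


def guess_persistency(lfns):
    """Return "MDF" if every file has an MDF-like extension, else None.

    None means "leave IOHelper() with its default", i.e. the current behaviour.
    """
    extensions = {posixpath.splitext(lfn)[1].lower() for lfn in lfns}
    if extensions and extensions <= MDF_EXTENSIONS:
        return "MDF"
    return None


def render_input_files(lfns):
    """Build the IOHelper(...).inputFiles([...]) text for data.py (hypothetical helper)."""
    persistency = guess_persistency(lfns)
    argument = f'"{persistency}"' if persistency else ""
    lines = [f"IOHelper({argument}).inputFiles(["]
    lines += [f"'{lfn}'," for lfn in lfns]
    lines.append("], clear=True)")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_input_files([
        "LFN:/lhcb/data/2022/RAW/PASSTHROUGH/LHCb/COLLISION22/256289/256289_00090013_0088.raw",
        "LFN:/lhcb/data/2022/RAW/PASSTHROUGH/LHCb/COLLISION22/256289/256289_00090016_0092.raw",
    ]))
```

For the two files above this produces the same IOHelper("MDF") call that worked locally. A list mixing .raw with other extensions falls back to plain IOHelper(), since it is not clear what the right type would be in that case.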