WIP: 21.0 data overlay conditions access updates

Attempting to extract and update folder overrides.

This is an attempt to fix the issue with conditions access in data
overlay. We need to take the run number used for the first event of the
overlay, establish which period it belongs to, and then set the
conditions loading to begin from a good lumiblock in that period. This
is done for every folder being loaded, so that they all share the same
starting point (it should also be very cheap for jobs, as the
corresponding queries will all hit the cache).
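
As a rough illustration of the lookup described above, here is a minimal sketch and not the code in this MR. The period table, its run and lumiblock values, and the names `PERIODS`, `find_period` and `initial_conditions_point` are all hypothetical placeholders; the real period boundaries and good lumiblocks would have to come from the data quality group.

```python
from bisect import bisect_right

# Hypothetical period table, sorted by first run:
# (first_run, last_run, period_name, good_run, good_lumiblock).
# All numbers are placeholders, not real ATLAS runs or lumiblocks.
PERIODS = [
    (1000, 1999, "period_A", 1500, 100),
    (2000, 2999, "period_B", 2500, 50),
]


def find_period(run_number, periods=PERIODS):
    """Return the period tuple containing run_number, or None if there is none."""
    starts = [p[0] for p in periods]
    idx = bisect_right(starts, run_number) - 1
    if idx < 0:
        return None  # run is before the first known period
    period = periods[idx]
    if run_number > period[1]:
        return None  # run is after the end of the closest period
    return period


def initial_conditions_point(run_number, folders, periods=PERIODS):
    """Map every folder to the same (run, lumiblock) starting point.

    The starting point is the known-good run/lumiblock of the period that
    contains the run number of the first overlay event.
    """
    period = find_period(run_number, periods)
    if period is None:
        return {}  # unknown period: leave the folders untouched
    good_run, good_lb = period[3], period[4]
    return {folder: (good_run, good_lb) for folder in folders}
```

Because every folder is given the same starting point, all jobs in a period issue identical first queries, which is why those queries are expected to hit the cache.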

After that, the jobs will load the appropriate conditions for their
precise run, but for many of the conditions folders the IOVs that have
been returned by the conditions service will still be applicable, so we
won't generate an additional query.
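
The reasoning about IOV reuse can be pictured with a toy model. This is only a sketch under stated assumptions: `IOVCache` and `fake_query` are hypothetical names, the folder names and validity ranges are invented, and the real logic lives in the conditions service rather than in anything like this class. IOVs are modelled as half-open `(run, lumiblock)` ranges.

```python
class IOVCache:
    """Toy per-folder cache of the last returned interval of validity."""

    def __init__(self, query):
        # query is a callable (folder, run, lb) -> (since, until, payload),
        # where since and until are (run, lumiblock) tuples and the range
        # is half-open: since <= key < until.
        self._query = query
        self._cache = {}
        self.n_queries = 0

    def get(self, folder, run, lb):
        key = (run, lb)
        cached = self._cache.get(folder)
        if cached is not None:
            since, until, payload = cached
            if since <= key < until:
                return payload  # cached IOV still covers this run: no query
        self.n_queries += 1
        result = self._query(folder, run, lb)
        self._cache[folder] = result
        return result[2]


def fake_query(folder, run, lb):
    """Hypothetical backend: one folder with a long IOV, one that changes per run."""
    if folder == "/LONG":
        return ((0, 0), (5000, 0), "long-payload")
    return ((run, 0), (run + 1, 0), "payload-run-%d" % run)


cache = IOVCache(fake_query)
# First load from the common starting point, then from the job's actual run.
for folder in ("/LONG", "/SHORT"):
    cache.get(folder, 2500, 50)
    cache.get(folder, 2001, 7)
```

In this toy setup the folder with the long IOV is queried once, while the per-run folder needs a second query for the job's actual run.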

This also retains the infrastructure that sets long conditions period
loads, as established by past attempts to improve these problems.

Applying this for 2018 HI and 2016-2018 pp to start. If successful, it
will be trivial to extend with the help of the data quality group.

I'm marking this as WIP for now so that @jchapman, @mduehrss, @tkharlam, @olszewsk, @cyoung and @ahaas can look it over and see what they think. There is also an open question about how to treat two folders: /TILE/OFL02/PULSESHAPE/CIS/PULSE100 and /TILE/OFL02/PULSESHAPE/CIS/PULSE5P2. I've tried running the 2015 overlay nightly test, and with this setup I do see a crash from conditions loading:

EVNTtoHITS 02:17:49 AtlasFieldSvc       ERROR Missing solenoid current in DCS information
EVNTtoHITS 02:17:49 AtlasFieldSvc       ERROR Missing toroid current in DCS information
EVNTtoHITS 02:17:49 IOVSvcTool          ERROR Problems calling MagField::AtlasFieldSvc[0x2ecb4800]+7f44a3ccb400
EVNTtoHITS 02:17:49 IOVSvcTool          ERROR Problems preloading IOVRanges
EVNTtoHITS 02:17:49 IncidentSvc         ERROR Standard std::exception is caught handling incident0x7ffcf0f71b98
EVNTtoHITS 02:17:49 IncidentSvc         ERROR IOVSvcTool::preLoadProxies
EVNTtoHITS 02:17:50 Traceback (most recent call last):
EVNTtoHITS 02:17:50   File "/cvmfs/atlas-nightlies.cern.ch/repo/sw/21.0/2019-10-15T2147/Athena/21.0.103/InstallArea/x86_64-slc6-gcc62-opt/jobOptions/AthenaCommon/runbatch.py", line 18, in <module>
EVNTtoHITS 02:17:50     theApp.run()     # runs until theApp.EvtMax events reached
EVNTtoHITS 02:17:50   File "/cvmfs/atlas-nightlies.cern.ch/repo/sw/21.0/2019-10-15T2147/Athena/21.0.103/InstallArea/x86_64-slc6-gcc62-opt/python/AthenaCommon/AppMgr.py", line 663, in run
EVNTtoHITS 02:17:50     sc = self.getHandle()._evtpro.executeRun( nEvt )
EVNTtoHITS 02:17:50 Exception: StatusCode IEventProcessor::executeRun(int maxevt) =>
EVNTtoHITS 02:17:50     std::exception (C++ exception of type exception)

This smells like an issue with what I'm doing, of course -- suggestions for debugging are quite welcome!

Along the way, I did some cleanup, such as trying to move to logging. I hope this is a little prettier, at least.