Synchronize workflow tests with Athena
I'm trying to synchronize the workflow tests that we run with what's in https://gitlab.cern.ch/atlas/athena/-/blob/main/AtlasTest/CITest/Athena.cmake?ref_type=heads, which in general makes it easier to track what the correct setup would be.
 extends: .run_base
 script:
   - cd run
-  - RunWorkflowTests_Run2.py --CI -s -w FullSim --threads 4 -e '--maxEvents 10' --detailed-comparison
+  - RunWorkflowTests_Run2.py --CI -r -w MCReco --threads 0 -e '--maxEvents 25'
Yes, --threads 0 runs serial Athena. I think we run serial Athena here because it's a Run2 configuration. For Run3 we run only in the MT mode (threads 1 or larger).
I have another question here: I think the original idea here was to run a full simulation job, and this change drops the simulation test and runs digi+reco. I'm totally fine with running digi+reco but why are we dropping the simulation? I remember we had some cases in the past when GeoModel changes broke full simulation tests, so I thought running these tests as part of our pipeline was a good idea.
It's just a decision we need to make. My thinking is that we pick something from the set of CI jobs that the Athena CI runs. If there's a more suitable job like that, that's a better choice.
I just want to avoid having to manually figure out when stuff breaks due to reasons unrelated to GM (which it does frequently), and ideally just copy over the new command that's known to run in the Athena CI.
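The copy-over step described above could be checked mechanically. The following is only a rough sketch: it assumes each RunWorkflowTests command sits on a single line with nothing after it, both in our CI YAML and in the upstream file, which may not match the real layout of Athena.cmake. The function names `extract_commands` and `missing_upstream` are hypothetical helpers, not part of any existing tool.

```python
import re

# Matches a RunWorkflowTests invocation up to the end of its line.
# The script-name pattern is taken from the commands in this merge request;
# the one-command-per-line layout is an assumption.
COMMAND_RE = re.compile(r"RunWorkflowTests_Run\d\.py[^\n]*")


def extract_commands(text):
    """Return the RunWorkflowTests commands found in text, whitespace-normalized."""
    return [" ".join(m.group(0).split()) for m in COMMAND_RE.finditer(text)]


def missing_upstream(local_text, upstream_text):
    """Return local commands that do not appear verbatim in the upstream text."""
    upstream = set(extract_commands(upstream_text))
    return [cmd for cmd in extract_commands(local_text) if cmd not in upstream]
```

Running `missing_upstream` on the contents of our CI YAML and a copy of the upstream file would list the commands that have drifted from what the Athena CI runs, under the layout assumption stated above.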
added 2 commits
 extends: .run_base
 script:
   - cd run
-  - RunWorkflowTests_Run3.py --CI -s -w FullSim --threads 4 -e '--maxEvents 20' --run-only
+  - RunWorkflowTests_Run3.py --CI -r -w MCReco -e '--maxEvents 25 --conditionsTag OFLCOND-MC23-SDR-RUN3-05'
I have a similar question to the one I posted: why are we dropping the simulation test?
Well, here we have another story. This test started to fail recently and we don't know why. That's why I temporarily replaced the detailed comparison mode with run-only. Instead of dropping the simulation test, we should try to understand why it started to fail and fix it.
What do you think? Thanks.
That was exactly the reason I was suggesting this. If we don't have the bandwidth to figure out what the underlying failures of custom job configurations are, it might be safer to pick something from the set of jobs that Athena CI runs anyway, to shield us from unrelated failures, and pick up only on the ones introduced by GM.
I'm fully on board with the idea of sticking to Athena CI jobs. However, there are a couple of full simulation tests - Run2 and Run3 configurations - which run in the AthSimulation CI suite but not in the Athena one. I think having these tests is essential to monitor that GeoModel changes don't break Simulation in Athena/AthSimulation (that has happened to us a few times in the past).
I plan to spend some time understanding why the Run3 configuration works in AthSimulation but breaks in Athena.