
Synchronize workflow tests with Athena

Open Paul Gessinger requested to merge sync-test-jobs into main
2 unresolved threads

I'm trying to synchronize the workflow tests that we run with those defined in https://gitlab.cern.ch/atlas/athena/-/blob/main/AtlasTest/CITest/Athena.cmake?ref_type=heads, which should make it easier to track what the correct setup is.

Thoughts @tsulaia @jojungge @boudreau ?

  extends: .run_base
  script:
    - cd run
-   - RunWorkflowTests_Run2.py --CI -s -w FullSim --threads 4 -e '--maxEvents 10' --detailed-comparison
+   - RunWorkflowTests_Run2.py --CI -r -w MCReco --threads 0 -e '--maxEvents 25'
  • I'd keep at least one thread

  • The Athena CI runs with --threads 0, which I guess means serial Athena?

  • Yes, --threads 0 runs serial Athena. I think we run serial Athena here because it's a Run2 configuration. For Run3 we run only in MT mode (1 or more threads).

    I have another question here: I think the original idea was to run a full simulation job, and this change drops the simulation test and runs digi+reco instead. I'm totally fine with running digi+reco, but why are we dropping the simulation? I remember cases in the past where GeoModel changes broke full simulation tests, so I thought running these tests as part of our pipeline was a good idea.

  • It's just a decision we need to make. My thinking is that we pick something from the set of jobs that the Athena CI runs. If there's a more suitable job in that set, let's pick that one instead.

    I just want to avoid having to manually figure out why stuff breaks for reasons unrelated to GM (which happens frequently), and ideally just copy over the new command that's known to run in the Athena CI.

  • We should focus here more on monitoring simulation and maybe add the reco tests, if they aren't already part of the pipeline ;).

  • Sure, if you're willing to debug any unrelated failure in FullSim, let's do it.

  • Johannes Junggeburth resolved all threads

  • added 2 commits

    • bdf1bdee - 1 commit from branch main
    • 2f5aa280 - (try to) synchronize workflow tests with Athena


  •   extends: .run_base
      script:
        - cd run
    -   - RunWorkflowTests_Run3.py --CI -s -w FullSim --threads 4 -e '--maxEvents 20' --run-only
    +   - RunWorkflowTests_Run3.py --CI -r -w MCReco -e '--maxEvents 25 --conditionsTag OFLCOND-MC23-SDR-RUN3-05'
    • I have a similar question to the one I posted above: why are we dropping the simulation test?

      Well, this one is a different story. This test started to fail recently and we don't know why. That's why I temporarily replaced the detailed comparison mode with run-only. Instead of dropping the simulation test, we should try to understand why it started to fail and fix it.

      What do you think? Thanks.

    • That was exactly the reason I was suggesting this. If we don't have the bandwidth to figure out the underlying causes of failures in custom job configurations, it might be safer to pick something from the set of jobs that the Athena CI runs anyway. That shields us from unrelated failures, so we only pick up the ones introduced by GM.

    • I'm fully on board with the idea of sticking to Athena CI jobs. However, there are a couple of full simulation tests (Run2 and Run3 configurations) which run in the AthSimulation CI suite but not in the Athena one. I think having these tests is essential to monitor that GeoModel changes don't break simulation in Athena/AthSimulation (that has happened to us a few times in the past).

      I plan to spend some time understanding why the Run3 configuration works in AthSimulation but breaks in Athena.

    • Ok, then let's keep both jobs.
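
One way "keeping both jobs" could look for the Run3 configuration is sketched below. This is only an illustration, not the configuration that was merged: the job names `run3-fullsim` and `run3-mcreco` are hypothetical, and splitting the two commands into separate jobs is an assumption. The `.run_base` template, the `cd run` step, and both commands are copied unchanged from the diff above.

```yaml
# Sketch only: job names are hypothetical, commands are taken verbatim from the diff.

# Full simulation test, kept to catch GeoModel changes that break simulation.
# Still in run-only mode, as in the current configuration.
run3-fullsim:
  extends: .run_base
  script:
    - cd run
    - RunWorkflowTests_Run3.py --CI -s -w FullSim --threads 4 -e '--maxEvents 20' --run-only

# MC reco test proposed in this merge request to match the Athena CI.
run3-mcreco:
  extends: .run_base
  script:
    - cd run
    - RunWorkflowTests_Run3.py --CI -r -w MCReco -e '--maxEvents 25 --conditionsTag OFLCOND-MC23-SDR-RUN3-05'
```

Keeping the two commands in separate jobs would let a FullSim failure be investigated without masking the result of the MC reco test.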
