
Fix for race conditions in HTCHandler

This MR proposes a solution for the issues mentioned in #17 (closed). I tested it with a PPS workflow that executes the PPS AlCaReco step on top of RECO files at the AOD data tier. For details on how to replicate the test, please refer to the PPS automation docs.

After the change, calling resubmit right after the submit command no longer causes idle jobs to be resubmitted, and the resubmit command properly submits the failed jobs once more. This MR also enforces max_retries=0 in all .sub files, since automatic retries can create a race condition, as pointed out in the issue comment.
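To illustrate the intended behaviour, here is a minimal sketch of the selection rule described above. It is not the actual HTCHandler code: the function name, the job dictionaries, and the choice to treat held or removed jobs as failed are all assumptions made for this example. Only the numeric JobStatus codes are the standard HTCondor ones.

```python
# Standard HTCondor JobStatus codes.
IDLE, RUNNING, REMOVED, COMPLETED, HELD = 1, 2, 3, 4, 5


def select_jobs_to_resubmit(jobs):
    """Return the ids of jobs that ended unsuccessfully.

    `jobs` is a hypothetical list of dicts with the keys "id", "status"
    and, for completed jobs, "exit_code". Idle and running jobs are
    never selected, so calling this right after a submit returns nothing.
    """
    failed = []
    for job in jobs:
        status = job["status"]
        if status in (IDLE, RUNNING):
            # Still queued or executing: resubmitting would duplicate the job.
            continue
        if status in (REMOVED, HELD):
            # Assumption for this sketch: held or removed jobs count as failed.
            failed.append(job["id"])
        elif status == COMPLETED and job.get("exit_code", 0) != 0:
            # Finished with a non-zero exit code.
            failed.append(job["id"])
    return failed
```

With max_retries=0 in the .sub files, HTCondor itself never reruns a job, so a selection step like this one would be the only place where a retry is decided.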

Please disregard the failed pipeline (the one in the PPS automation-control repository was never properly set up).
