Add a --distributed=finalize mode that checks the outputs and performs merging if necessary

Pieter David requested to merge piedavid/bamboo:distributed_finalize into master

Based on the feedback and script by @fbury .

Since I was editing this part of the code anyway, I couldn't resist a bit of refactoring and simplification: I had left in a set of arguments that was passed down through three levels, and ad-hoc tuple-of-tuple types, which are now either gone or have a name. That ended up being a good part of the code changes, but the interesting changes are:

  • --onlypost checks that the files for all samples are present in <output>/results before doing anything else (otherwise it would fail a bit later, e.g. when getting something from a ROOT file)
  • --distributed=finalize can be run once all the outputs of the jobs that failed are in the right directory (e.g. after a resubmit). It does the merging/moving and then runs the postprocessing. It is written such that it should never overwrite a file: if a file is found in the results directory the sample is assumed to be fine, so that file should be removed to trigger a re-merge. If any files are missing, it lists them and exits before postprocessing.
  • DAS queries are no longer made for samples with eras that are not considered
  • the monitoring printout now also lists the IDs of the jobs that failed, to make it easier to fix the problem and resubmit while the rest are still running (that monitoring and book-keeping is left up to the user)
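
To make the finalize logic in the list above concrete, here is a minimal sketch of the decision it describes: skip samples that already have a file in the results directory, collect samples whose job outputs are all present, and report anything missing. This is not the code from this merge request; the function names, the `job_outputs` mapping and the `<sample>.root` naming are assumptions made for illustration only.

```python
import os


def find_missing_outputs(results_dir, sample_names, suffix=".root"):
    """Return the expected result files that are absent from results_dir.

    Assumes (hypothetically) one file per sample named <sample><suffix>.
    """
    missing = []
    for name in sample_names:
        path = os.path.join(results_dir, name + suffix)
        if not os.path.isfile(path):
            missing.append(path)
    return missing


def plan_finalize(results_dir, job_outputs, suffix=".root"):
    """Decide what a finalize step would do, without touching any file.

    job_outputs is a hypothetical dict: sample name -> list of per-job
    output paths. Returns (to_merge, missing), where to_merge maps a
    sample to (job output paths, final path) and missing lists the job
    outputs that are not on disk.
    """
    to_merge, missing = {}, []
    for sample, outputs in job_outputs.items():
        final = os.path.join(results_dir, sample + suffix)
        if os.path.exists(final):
            # A file in the results directory means the sample is assumed
            # to be fine; it is never overwritten.
            continue
        absent = [p for p in outputs if not os.path.isfile(p)]
        if absent:
            missing.extend(absent)
        else:
            to_merge[sample] = (outputs, final)
    return to_merge, missing
```

A caller would print `missing` and exit before postprocessing if it is non-empty, and otherwise merge or move each entry of `to_merge`. Removing a file from the results directory makes the corresponding sample show up in `to_merge` again, which matches the "remove it to trigger a re-merge" behaviour described above.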

If anyone was using the first argument to postprocess (taskList): this changes it from [ (([inputFiles], outputFile), {optional arguments}) ] to a list of SampleTask instances. It carries the same information as before (a bit more, actually), but if anyone depends on the old format we should check whether the change is worth it and coordinate.
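
For anyone who needs to adapt code that builds or reads the old format, the change in shape can be sketched roughly as follows. The class below is a hypothetical stand-in, not the actual SampleTask from this merge request: its name and its fields (`inputs`, `output`, `kwargs`) are assumptions chosen to mirror the old tuple layout.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SampleTaskSketch:
    """Hypothetical named replacement for (([inputFiles], outputFile), {...})."""
    inputs: List[str]
    output: str
    kwargs: Dict[str, Any] = field(default_factory=dict)


def from_legacy(task_list):
    """Convert the old list of (([inputFiles], outputFile), {optional arguments})."""
    return [
        SampleTaskSketch(inputs=list(ins), output=out, kwargs=dict(kw))
        for (ins, out), kw in task_list
    ]


# Old style: positions carry the meaning.
legacy = [((["a.root", "b.root"], "sampleA.root"), {"era": "2016"})]
# New style: the same information, accessed by name.
tasks = from_legacy(legacy)
```

With a named type, `task.output` replaces `task[0][1]`, and extra per-sample information can be added later without changing how existing code unpacks the tuples.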
