Add --distributed=parallel (RunGraphs)

Pieter David requested to merge piedavid/bamboo:parallel_mode into master

Based on https://gitlab.cern.ch/cp3-cms/bamboo/-/merge_requests/203 (hence the WIP status), which was waiting for !195 (merged).

changes of only this PR

This adds a --distributed=parallel mode, which splits plot definition, graph building, and running the event loop into separate steps, so that ROOT::RDF::RunGraphs can be used to process samples in parallel (if implicit multithreading is used). This is already useful by itself, and is also needed for using distributed RDF, making "incremental runs" convenient, compiling workers of the fully compiled backend in parallel, etc.
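To illustrate the three-step split, here is a minimal pure-Python sketch. It is not bamboo code: `PlotDef`, `LazyGraph`, `define_plots`, `build_graphs` and `run_graphs` are hypothetical names, and the thread pool only stands in for what ROOT::RDF::RunGraphs does with the booked graphs. The point is that nothing runs until every sample's graph has been built, so all event loops can be started together.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PlotDef:
    """Hypothetical plot definition: what to histogram and with which binning."""
    name: str
    expr: Callable[[int], float]  # value to fill for each event
    nbins: int
    lo: float
    hi: float


@dataclass
class LazyGraph:
    """Hypothetical stand-in for a booked (not yet run) graph of one sample."""
    sample: str
    events: List[int]
    plots: List[PlotDef]

    def run(self) -> Dict[str, List[int]]:
        # The "event loop": fill every booked histogram in a single pass setup.
        results = {}
        for plot in self.plots:
            counts = [0] * plot.nbins
            width = (plot.hi - plot.lo) / plot.nbins
            for event in self.events:
                x = plot.expr(event)
                if plot.lo <= x < plot.hi:
                    counts[int((x - plot.lo) / width)] += 1
            results[plot.name] = counts
        return results


def define_plots() -> List[PlotDef]:
    """Step 1: plot definition, independent of any sample."""
    return [PlotDef("x", lambda event: float(event), nbins=4, lo=0.0, hi=8.0)]


def build_graphs(samples: Dict[str, List[int]], plots: List[PlotDef]) -> List[LazyGraph]:
    """Step 2: book one lazy graph per sample; no event loop runs here."""
    return [LazyGraph(name, events, plots) for name, events in samples.items()]


def run_graphs(graphs: List[LazyGraph]) -> Dict[str, Dict[str, List[int]]]:
    """Step 3: run all event loops together (thread pool as a stand-in)."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda graph: graph.run(), graphs))
    return {graph.sample: result for graph, result in zip(graphs, results)}
```

With hypothetical inputs, usage would look like `run_graphs(build_graphs({"A": list(range(8)), "B": [1, 1, 7]}, define_plots()))`. Because steps 1 and 2 have no side effects on the inputs, the same structure would also allow running only a subset of the graphs, which is the kind of flexibility "incremental runs" would need.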

There is one thing that I noticed will also be useful for those: reusing corrections and calculators across samples (they are thread-safe). Currently we make them all specific to a sample in various ways. The tricky cases are those that are not a one-line definition but are configured with function calls, like the jet&met calculators; for the rest it should be a matter of playing with the uname argument comment string.

Edited by Pieter David