Draft: ComponentAccumulator: implement async merges (!66794) · Merge requests · atlas / athena

Frank Winklmeier requested to merge fwinkl/athena:ca_async into main Oct 27, 2023

Proof-of-concept (not proposing to merge this in its current form) to implement asynchronous merging in ComponentAccumulator, e.g.:

acc = ComponentAccumulator()
acc.async_merge(MyCfg, flags)
acc.async_merge(MyOtherCfg, flags)
...
acc.async_wait()

Changed RecoSteering to apply this for the various reco domains. This results in a speedup from 2:00 minutes (91% CPU) to 1:35 minutes (136% CPU) with a process pool of size 4. Unfortunately, this means there is not enough parallelism to fully utilize even two cores.
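To illustrate the submit-then-wait pattern, here is a minimal sketch and not the code from this MR. `ToyAccumulator` is a hypothetical stand-in for ComponentAccumulator, the flags are a plain dict, and a `ThreadPoolExecutor` from the standard library replaces the `multiprocess` process pool so the example stays self-contained. Threads would not give the CPU speedup reported above; they only show the control flow.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyAccumulator:
    """Hypothetical stand-in for ComponentAccumulator (not the Athena class)."""

    def __init__(self, pool_size=4):
        self.components = {}
        self._pool_size = pool_size  # assumption: pool size passed by the caller
        self._pool = None            # created lazily on the first async_merge
        self._pending = []

    def merge(self, other):
        """Synchronous merge: take over the components of another accumulator."""
        self.components.update(other.components)

    def async_merge(self, cfg_func, *args, **kwargs):
        """Schedule cfg_func(*args, **kwargs); its result is merged in async_wait."""
        if self._pool is None:
            self._pool = ThreadPoolExecutor(max_workers=self._pool_size)
        self._pending.append(self._pool.submit(cfg_func, *args, **kwargs))

    def async_wait(self):
        """Block until all scheduled configs finish, merging in submission order."""
        for future in self._pending:
            self.merge(future.result())
        self._pending.clear()
        if self._pool is not None:
            self._pool.shutdown()
            self._pool = None


def MyCfg(flags):
    acc = ToyAccumulator()
    acc.components["AlgA"] = {"OutputLevel": flags["level"]}
    return acc


def MyOtherCfg(flags):
    acc = ToyAccumulator()
    acc.components["AlgB"] = {"OutputLevel": flags["level"]}
    return acc


flags = {"level": 3}  # placeholder for AthConfigFlags
acc = ToyAccumulator(pool_size=4)
acc.async_merge(MyCfg, flags)
acc.async_merge(MyOtherCfg, flags)
acc.async_wait()
print(sorted(acc.components))
```

Merging the results in submission order rather than completion order keeps the final content independent of which worker finishes first.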

Notes:

Requires the multiprocess module (available in LCG), which uses dill for object serialization (the default pickle cannot handle the lambdas in AthConfigFlags)
Apart from differences in the order of some property values the final config pkl is the same
By splitting the configuration into sub-processes we lose potential optimizations from the AccumulatorCache, and cache statistics are not reported back to the main process
Cannot be applied "blindly", e.g. post-processing still needs to run last
Process pool size would need to be made configurable
Log output is messy
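The serialization note above can be checked in isolation. The sketch below uses a hypothetical module-level lambda (`lazy_value`, not taken from AthConfigFlags) to show that the standard `pickle` module refuses it, and, if `dill` happens to be installed, that `dill` can round-trip it.

```python
import pickle

# Hypothetical lambda, similar in spirit to the ones stored in AthConfigFlags.
lazy_value = lambda: 42


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


print("pickle handles lambda:", can_pickle(lazy_value))

# dill is optional here; the check is skipped when it is not available.
try:
    import dill
except ImportError:
    dill = None

if dill is not None:
    restored = dill.loads(dill.dumps(lazy_value))
    print("dill round-trip result:", restored())
```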

To reproduce, one needs to set up the full LCG release:

export LCG_RELEASE_BASE=/cvmfs/sft.cern.ch/lcg/releases
asetup --noLcgReleaseBase Athena,main,latest

Edited Oct 27, 2023 by Frank Winklmeier
