Draft: ComponentAccumulator: implement async merges
Proof-of-concept (not proposing to merge this in its current form) to implement asynchronous merging in ComponentAccumulator, e.g.:
acc = ComponentAccumulator()
acc.async_merge(MyCfg, flags)
acc.async_merge(MyOtherCfg, flags)
...
acc.async_wait()
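For readers unfamiliar with the pattern, here is a minimal sketch of how a submit-then-wait interface like the one above could work. It is not the code from this MR: ToyAccumulator, my_cfg and my_other_cfg are hypothetical stand-ins, and the sketch uses the standard-library concurrent.futures with a thread pool so that it runs anywhere. A process-based executor could be passed in instead.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyAccumulator:
    """Hypothetical stand-in for ComponentAccumulator: holds a list of names."""

    def __init__(self, components=None, executor=None):
        self.components = list(components or [])
        self._executor = executor
        self._pending = []  # futures of configuration calls not yet merged

    def merge(self, other):
        # Synchronous merge: absorb the other accumulator's components.
        self.components.extend(other.components)

    def async_merge(self, cfg_func, *args):
        # Schedule cfg_func(*args) on the executor; nothing is merged yet.
        self._pending.append(self._executor.submit(cfg_func, *args))

    def async_wait(self):
        # Merge in submission order so the final result is deterministic.
        for future in self._pending:
            self.merge(future.result())
        self._pending.clear()


def my_cfg(flags):
    return ToyAccumulator([f"MyAlg/{flags['tag']}"])


def my_other_cfg(flags):
    return ToyAccumulator([f"MyOtherAlg/{flags['tag']}"])


flags = {"tag": "reco"}
with ThreadPoolExecutor(max_workers=4) as pool:
    acc = ToyAccumulator(executor=pool)
    acc.async_merge(my_cfg, flags)
    acc.async_merge(my_other_cfg, flags)
    acc.async_wait()
print(acc.components)
```

Merging in submission order rather than completion order is a design choice of this sketch; it keeps the merged result independent of which worker finishes first.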
Changed RecoSteering to apply this for the various reco domains. This results in a speedup from 2:00 minutes (91% CPU) to 1:35 minutes (136% CPU) with a process pool of size 4. Unfortunately, this means there is not enough parallelism to fully utilize even two cores.
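As a rough back-of-the-envelope check of the numbers above, one can invert Amdahl's law to estimate which fraction of the configuration time actually ran in parallel. This assumes an idealized model with no pool start-up or serialization overhead, so the result is only indicative.

```python
def amdahl_parallel_fraction(speedup, workers):
    # Invert S = 1 / ((1 - p) + p / N) to solve for the parallel fraction p.
    return (1 - 1 / speedup) / (1 - 1 / workers)


t_before = 2 * 60       # 2:00 wall time without async merges, in seconds
t_after = 1 * 60 + 35   # 1:35 wall time with a pool of size 4, in seconds

speedup = t_before / t_after
p = amdahl_parallel_fraction(speedup, workers=4)
max_speedup = 1 / (1 - p)  # limit for an unbounded number of workers

print(f"speedup = {speedup:.2f}")
print(f"implied parallel fraction = {p:.2f}")
print(f"upper bound on speedup = {max_speedup:.2f}")
```

Under these assumptions, only a bit more than a quarter of the work is parallelized, so adding more workers would gain little.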
Notes:
- Requires the multiprocess module (available in LCG), which uses dill for object serialization (the default pickle cannot handle the lambdas in AthConfigFlags)
- Apart from differences in the order of some property values, the final config pkl is the same
- By splitting the configuration into sub-processes we lose potential optimizations from the AccumulatorCache, and cache statistics are not reported back to the main process
- Cannot be applied "blindly", e.g. post-processing still needs to run last
- Process pool size would need to be made configurable
- Log output is messy
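The pickle limitation from the first note can be reproduced with the standard library alone. ToyFlags below is a hypothetical stand-in, not AthConfigFlags; it only mimics a container that stores a lambda. dill is not imported here because it may not be installed.

```python
import pickle


class ToyFlags:
    """Hypothetical flags container whose default value is a lambda."""

    def __init__(self):
        self.n_threads = lambda prev_flags: 4


def can_pickle(obj):
    # Return True if the standard pickle module can serialize obj.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # pickle stores functions by qualified name, and a lambda
        # has no importable name, so serialization fails.
        return False


print("plain dict:", can_pickle({"n_threads": 4}))
print("flags with lambda:", can_pickle(ToyFlags()))
```

Serializers such as dill avoid this by storing the function's code object instead of a reference by name.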
To reproduce, one needs to set up the full LCG release:
export LCG_RELEASE_BASE=/cvmfs/sft.cern.ch/lcg/releases
asetup --noLcgReleaseBase Athena,main,latest
added NewConfig label
Hi Frank, that is very nice.
I remember that in the past the time was completely dominated by file I/O (but that was when we had configurables). Is this still the significant part?
Now the $100 question: is there a reason why we would need to do this with processes and not just with threads?
Edited by Tomasz Bold
There is no I/O bottleneck, we are almost entirely dominated by merge and deepcopy.
We'd have to wait for Python 4 to do this in MT. Currently only one thread can execute Python code at a given time due to the GIL. Exceptions would be calls to C-libraries but that's not the case here.
Edited by Frank Winklmeier
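The GIL effect described above can be observed with a small timing experiment. This is only an illustrative sketch: cpu_bound is a hypothetical pure-Python workload, and the timings depend on the machine and on running a CPython build with the GIL enabled, so no expected numbers are given.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_bound(n):
    # Pure-Python loop: the executing thread holds the GIL while it runs.
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed(func):
    # Run func() and return its result together with the elapsed wall time.
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


N, TASKS = 200_000, 4


def run_serial():
    return [cpu_bound(N) for _ in range(TASKS)]


def run_threaded():
    with ThreadPoolExecutor(max_workers=TASKS) as pool:
        return list(pool.map(cpu_bound, [N] * TASKS))


serial_result, serial_time = timed(run_serial)
threaded_result, threaded_time = timed(run_threaded)
print(f"serial:  {serial_time:.3f} s")
print(f"threads: {threaded_time:.3f} s")
```

With the GIL, the threaded run is expected to take roughly as long as the serial one, which is why a process pool is used for pure-Python work like merge and deepcopy.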
- Resolved by Frank Winklmeier
Following the creation of the 24.0 branch from main, you should now decide whether this MR should target 24.0 or main, according to these guidelines agreed in the Software Weekly meeting: https://indico.cern.ch/event/1382755/attachments/2802320/4889268/BranchingGuidelines24.pdf If you decide that this MR should target 24.0, please re-direct it by editing and changing the target branch in the drop-down menu. If it should stay in main, please indicate this as a reply to this message. Remember that all MRs going into 24.0 will also be swept into main.