Draft: auto flush `key_registry` to git repo
-
make sure new decodings are guaranteed to be flushed to git (unless explicitly opted out):
previously, an explicit
flush_to_git
had to be called to make sure newly generated decoding keys were written to git -- with a warning in the__del__
of the key registry if that was not done. Note that only a warning was produced, as writing to git from__del__
was not possible, as was called when python was shutting down. So instead, this functionality has been moved to an 'exit handler' which is registered with theatexit
module.note: there is a flag in the options to explicitly opt-out of writing to git which is still (of course) respected -- but if doing would lead to a total loss of keys (i.e. when they are also not written to the output manifest) a warning is printed.
Merge request reports
Activity
added RTA label
mentioned in merge request Allen!986 (merged)
added 1 commit
- 6c6f08ee - Implement auto-flushing of keys to git, with explicit opt-out
added bug fix label
Started integration test build. Once done, check the results or the comparison to a reference build.
Throughput Test Moore_hlt2_fastest_reco: 492.8 Events/s -- change of -0.06% vs. reference
Throughput Test Moore_hlt2_pp_thor: 256.5 Events/s -- change of 0.17% vs. reference
Throughput Test Moore_spruce_all_lines: 199.2 Events/s -- change of -1.05% vs. reference
Throughput Test Moore_hlt1_pp_default: 26717.7 Events/s -- change of 0.45% vs. reference
There are still many warnings like this one test_spruce_all_lines_realtime:
124 - P2VRelationUnpacker#1 WARNING Source location /Event/HLT2/CombineParticles/Particles not persisted, skip the relation. 125 - P2VRelationUnpacker#1 WARNING Source location /Event/HLT2/FunctionalParticleMaker/Particles not persisted, skip the relation.
Sorry, there is one additional error (https://lhcb-nightlies.web.cern.ch/nightly/lhcb-master-mr/5658/Moore/x86_64_v2-centos7-gcc11-opt/tests#Hlt2Conf.test_hlt2_with_hlt1_decisions) which is linked to the
atexit
fucntion:Error in atexit._run_exitfuncs: Traceback (most recent call last): File "/cvmfs/lhcb.cern.ch/lib/lcg/releases/Python/3.9.6-b0f98/x86_64-centos7-gcc11-opt/lib/python3.9/subprocess.py", line 951, in __init__ self._execute_child(args, executable, preexec_fn, close_fds, File "/cvmfs/lhcb.cern.ch/lib/lcg/releases/Python/3.9.6-b0f98/x86_64-centos7-gcc11-opt/lib/python3.9/subprocess.py", line 1754, in _execute_child self.pid = _posixsubprocess.fork_exec( OSError: [Errno 12] Cannot allocate memory
hmm.... I suspect that I know what is happening:
atexit
runs after the job has finished processing, so Moore has run, allocated lots of memory, and then subprocess has to fork -- which doubles the (virtual) memory requirement prior to executing a new process. But I really want to run the flushing not that late, just at the end of the configuration stage, prior to starting up Gaudi.... not just because of memory consumption, but also because the event processing may crash, which then would skip the registered at exit functions. And I want to make sure things are flushed prior to processing the event loopSo I need a hook at the point where PyConf is done, and the control is handed back from it... Or I fall back (for now) to the solution in Allen!986 (merged) while investigating a better one.... which implies I take out the atexit handler here, leave the other fix, and re-open Allen!986 (merged) and continue thinking about a more elegant solution.
added ci-test-triggered label
- [2022-09-15 14:09] Validation started with lhcb-master-mr#5658
- Resolved by Gerhard Raven
I don't understand the motivation for this change. Could you elaborate please?
(also, doing things at
atexit
is too much magic)
added 1 commit
- 8d03e7c3 - make manifest creation independent of the state of the local repository
added 1 commit
- 89d0bfe4 - make manifest creation independent of the state of the local repository
@sesen, @gpietrzy : FYI -- this should now (also) fix the problem of only the first Moore job creating a manifest with decoding keys, and subsequent ones deciding not to add them.
Edited by Gerhard Raven@sesen, @gpietrzy: to disentangle MRs, the fix of only the first job adding keys to the manifest has been split into its own MR: !3777 (merged). The other purpose of this MR, namely guaranteeing that the keys were flushed to git in Allen has been superseded by Allen!986 (merged)... (which this MR originally tried to supersede).
So closing this merge request in favor of !3777 (merged) and Allen!986 (merged).
mentioned in issue Moore#478 (closed)