Multi-event scheduler
A multi-event-scheduler
What's that
This is a scheduler that can execute a control and data flow configuration, similar to HLTControlFlowMgr in LHCb.
(see https://iopscience.iop.org/article/10.1088/1742-6596/1525/1/012052/pdf if you are not familiar)
Sharing functionality
Much of the functionality introduced here already exists in Moore and the CPU HLT, specifically the way control and data flow are set up in the configuration: CompositeNodes define control flow constraints, while data flow constraints are defined in the background by requiring that a producer of data runs before its consumer. We have extracted the Moore/PyConf functionality for that (MiniPyConf here), specifically the data_flow and components modules.
Overall, Allen configuration looks similar to Moore configuration with this setup.
How does it work
1. From the CompositeNode tree that defines the application, we first extract execution masks for all algorithms. Consider a simple tree: you would like to run algorithms A, B and C lazily, connected by an OR. The masks that the scheduler extracts from this tree are attached to its leaves: B only has to run if A did not pass, and C only if neither A nor B passed.
2. For algorithms that appear multiple times, we merge the execution lists with an ANY relationship.
3. We simplify the boolean mask expressions using a solver (sympy.simplify).
4. We gather data and control flow dependencies:
   - data dependencies are found by backtracing inputs (PyConf feature)
   - control flow dependencies are extracted by parsing the simplified boolean expression (from step 3) back into control flow trees and extracting the algorithms.
5. We order algorithms:
   - data dependencies serve as hard constraints
   - control flow dependencies serve as soft constraints
   - data and control flow dependencies cannot always be satisfied together. In that case we relax the control flow constraints by loosening the execution mask of one of the algorithms that is insertable according to data dependencies. Generic mask loosening is done by substituting an algorithm in a mask with True and with False, combining both results with an OR, and simplifying again. Example:
     (A & B) -> loosen by B -> (A & True) | (A & False) -> A
6. We receive an ordered collection of algorithms with their respective execution masks.
7. For every unique, nontrivial execution mask we build a combiner algorithm that is inserted in the sequence right before the first algorithm with that execution mask.
8. The Allen executable sequence is generated and compiled.
9. Execution works as follows:
   - Execution is governed by event lists: algorithms execute on every event in the event list that they get as input
   - Algorithms with execution mask True are executed on every event
   - Algorithms that can reduce the event list, like the GEC, export one as output, which is then consumed by algorithms with GEC as execution mask
   - For more complicated masks, like GEC & BLUB, the combiner algorithms take care of event list union (OR) / intersection (AND) / inversion (NOT) before the algorithm actually executes
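The mask extraction, merging, simplification and loosening described above can be sketched with sympy. This is only an illustration, not the code of this MR: the function names lazy_or_masks, merge_masks and loosen are hypothetical, the ANY merge is assumed to be a plain OR of masks, and sympy's simplify_logic is used in place of the sympy.simplify call mentioned above.

```python
from sympy import symbols, true
from sympy.logic.boolalg import And, Not, Or, simplify_logic


def lazy_or_masks(algorithms):
    """Masks for the children of a lazy OR node, in execution order.

    An algorithm only has to run if none of its predecessors passed.
    The first algorithm gets the trivial mask True (And() of nothing).
    """
    masks = {}
    for i, alg in enumerate(algorithms):
        masks[alg] = And(*[Not(prev) for prev in algorithms[:i]])
    return masks


def merge_masks(masks):
    """Merge the masks of an algorithm that appears several times (ANY = OR)."""
    return simplify_logic(Or(*masks))


def loosen(mask, alg):
    """Remove the dependence of `mask` on `alg`.

    `alg` is substituted by True and by False, both results are OR-ed
    and the expression is simplified again.
    """
    return simplify_logic(Or(mask.subs(alg, True), mask.subs(alg, False)))


# Hypothetical algorithms standing in for the A, B, C of the example tree.
A, B, C = symbols("A B C")
masks = lazy_or_masks([A, B, C])
for alg, mask in masks.items():
    print(alg, "runs on", mask)

# The loosening example from the text: (A & B) loosened by B gives A.
print(loosen(A & B, B))
```

Loosening never makes a mask more restrictive, so an algorithm with a loosened mask may run on more events than strictly needed, but never on fewer.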
Algorithm order optimizations
For every control flow tree there are multiple possible orderings that the scheduler might consider, and the throughput depends on the chosen order.
There are two types of order swaps that one might consider. In a lazy control flow node that does not require a specific order, swapping the order of the children yields different execution masks for the affected algorithms. As a simple heuristic, one can assume that more expensive algorithms are better associated with sparse execution masks.
Defining how expensive an algorithm is and how sparse an execution mask is, is a highly non-trivial task. In fact, doing so perfectly requires actually running the application in all possible orders, which is something we would like to avoid. Currently, we hardcode educated guesses for the weight of an algorithm execution. Some weights are taken from a profile run. Other heuristics help in setting weights that result in acceptable sequences, such as spreading data providers out as far as possible to avoid I/O bottlenecks.
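To make the heuristic concrete, here is a minimal sketch of how the expected cost of a lazy OR node depends on the order of its children. It assumes independent pass rates and uses invented weights; expected_cost and best_order are hypothetical helpers and do not reflect how the scheduler assigns or uses weights.

```python
from itertools import permutations


def expected_cost(order, weight, pass_rate):
    """Average cost per event of running `order` as a lazy OR.

    An algorithm only runs on events that no earlier algorithm accepted.
    Pass rates are assumed to be independent (a strong simplification).
    """
    cost = 0.0
    reach = 1.0  # fraction of events that still reach the next algorithm
    for alg in order:
        cost += reach * weight[alg]
        reach *= 1.0 - pass_rate[alg]
    return cost


def best_order(algorithms, weight, pass_rate):
    """Brute-force search; only feasible for a handful of algorithms."""
    return min(
        permutations(algorithms),
        key=lambda order: expected_cost(order, weight, pass_rate),
    )


# Invented numbers, for illustration only.
weight = {"A": 1.0, "B": 10.0, "C": 5.0}
pass_rate = {"A": 0.5, "B": 0.5, "C": 0.5}

order = best_order(["A", "B", "C"], weight, pass_rate)
print(order, expected_cost(order, weight, pass_rate))
```

With these numbers the cheapest order puts the most expensive algorithm last, where its execution mask is sparsest. The brute-force search grows factorially with the number of algorithms, which is one reason an exhaustive approach does not scale.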
In summary, trying to model accurate execution weights and average efficiencies for algorithms in this heterogeneous architecture seems like a bad idea. Instead, it might make more sense to employ optimization algorithms that operate over possible orderings and automatically test each ordering with a quick benchmark. We expect this procedure to take a long time to complete, but we may not have to optimize at such a high level for too many configurations. Ideas include genetic algorithms or simulated annealing. (None of these are implemented yet, but that is outside the scope of this MR anyway.)
Built on top of !393 (merged)
TODO:
- Some more tests for the python core functionality, specifically cftree_ops.py and event_list_utils.py
- Merge minipyconf and pyconf (done in LHCb!2964 (merged))
- Cleanup of the physics configurations (part of this MR)
- Check that every host_datatype is only assigned to host_datatypes, and that every device_datatype is only assigned to device_datatypes.
- Update documentation.
- Once LHCb!2964 (merged) is merged, change GenerateConfiguration.cmake to use HEAD instead of a branch of LHCb.
List of changes
One general remark: this MR changes the code generation steps of Allen, but otherwise it makes minimal changes to the rest of the codebase. That means that, with the exception of the introduction of MASK_ types and the event list intersection, union and inversion, the headers and sources are for the most part not modified.
Here is a list of requirements and changes introduced to Allen as part of this MR:
- Pregenerated sequences are removed. Python 3 and libClang are now requirements.
- git is required in STANDALONE to be able to fetch PyConf from LHCb.
- The option SEQUENCE_GENERATION is therefore gone as well, and so is its CI job.
- The obsolete gaudi configurations and previous configurations are gone.
- The following directory structure has been created: AllenConf contains an "extension" to PyConf to enable Multi Event Scheduling, sequences contains all sequences, sequences/definitions contains definition files used by the sequences, and tests includes MES checks.
- The following configurations exist and can therefore be passed to the cmake SEQUENCE option:
|-- sequences
| |-- forward.py
| |-- hlt1_complex_validation.py
| |-- hlt1_pp_default.py
| |-- hlt1_pp_no_gec.py
| |-- hlt1_pp_no_gec_validation.py
| |-- hlt1_pp_non-restricted_UT.py
| |-- hlt1_pp_scifi_v6.py
| |-- hlt1_pp_scifi_v6_validation.py
| |-- hlt1_pp_validation.py
| |-- muon.py
| |-- pv.py
| |-- velo.py
| `-- veloUT.py
- sequences/definitions files have been refactored heavily. Files now identify with subdetector reconstructions (e.g. velo_reconstruction.py, ut_reconstruction.py), there are different files for lines (e.g. hlt1_technical_lines.py, hlt1_muon_lines.py), validators, persistency, and so on.
- A number of python tests have been added to run as part of the Allen CI to test the MES functionality.
- A complex sequence test that runs various instances of reconstruction algorithms has been added.
- Combiner algorithms event_list_intersection, event_list_union and event_list_inversion have been added.
- All event list arguments have become MASK_INPUT or MASK_OUTPUT. There can be at most a single MASK_INPUT and a single MASK_OUTPUT parameter per algorithm, which are used internally by the MES and don't need to be configured as part of the algorithms.
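What the three combiner algorithms compute can be illustrated with a few lines of Python. This is a sketch of the semantics only: the functions below merely borrow the names of the Allen algorithms, the signatures are assumptions, and event lists are modelled as sorted lists of event indices within a batch. The event numbers are invented.

```python
def event_list_union(a, b):
    """Events selected by a OR b."""
    return sorted(set(a) | set(b))


def event_list_intersection(a, b):
    """Events selected by a AND b."""
    return sorted(set(a) & set(b))


def event_list_inversion(a, number_of_events):
    """Events of the batch that are NOT in a."""
    selected = set(a)
    return [e for e in range(number_of_events) if e not in selected]


# Invented batch of 8 events and two invented decisions.
number_of_events = 8
gec = [0, 1, 2, 5, 7]   # events passing the GEC
blub = [1, 3, 5, 6]     # events passing a second, placeholder selection

gec_and_blub = event_list_intersection(gec, blub)   # mask GEC & BLUB
gec_or_blub = event_list_union(gec, blub)           # mask GEC | BLUB
not_gec = event_list_inversion(gec, number_of_events)  # mask ~GEC

print(gec_and_blub, gec_or_blub, not_gec)
```

An algorithm with execution mask GEC & BLUB would then receive gec_and_blub as its input event list and execute only on those events.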
Should be merged after LHCb!2964 (merged) (and should be tested with it too).