Multi-event scheduler
A multi-event-scheduler
What's that
This is a scheduler that can execute a control flow and data flow configuration, similar to HLTControlFlowMgr in LHCb.
(see https://iopscience.iop.org/article/10.1088/1742-6596/1525/1/012052/pdf if you are not familiar)
Sharing functionality
Much of the functionality introduced here already exists in Moore and the CPU HLT, specifically how control flow and data flow are set up in the configuration: CompositeNodes define control flow constraints, and data flow constraints are defined in the background by requiring that a producer of data runs before its consumer. We have extracted the Moore/PyConf functionality for that (MiniPyConf here), specifically the data_flow and components modules.
Overall, Allen configuration looks similar to Moore configuration with this setup.
How does it work
1. From the CompositeNode tree that defines the application, we first extract execution masks for all algorithms. Consider a simple tree: you would like to run algorithms A, B, C in a lazy fashion, with a connecting OR. The masks that the scheduler extracts from this tree are given in the leaves: B only has to run if A did not pass, and C only if neither A nor B passed.
2. For algorithms that appear multiple times, we merge the execution masks with an ANY relationship.
3. We simplify the boolean mask expressions using a solver (sympy.simplify).
4. We gather data and control flow dependencies:
   - Data dependencies are found by backtracing inputs (a PyConf feature).
   - Control flow dependencies are extracted by parsing the simplified boolean expressions (from 3.) back into control flow trees and extracting the algorithms.
5. We order the algorithms:
   - Data dependencies serve as constraints.
   - Control flow dependencies serve as soft constraints.
   - It can happen that data and control flow dependencies cannot all be satisfied. In that case we loosen the control flow constraints by loosening the execution mask of one of the algorithms that is insertable according to the data dependencies. Generic mask loosening substitutes an algorithm in a mask by True and by False, combines the two results with OR, and simplifies again. Example:
     (A & B) -> loosen by B -> (A & True) | (A & False) -> A
6. We obtain an ordered collection of algorithms with their respective execution masks.
7. For every unique, nontrivial execution mask, we build a combiner algorithm that is inserted in the sequence right before the first algorithm with that execution mask.
8. The Allen executable sequence is generated and compiled.
9. Execution works as follows:
   - Algorithms with execution mask True are executed on every event.
   - Execution is governed by event lists: algorithms execute on every event in the event list that they get as input.
   - Algorithms that can reduce the event list, like the GEC, export one as output, which is then consumed by algorithms with GEC as execution mask.
   - For more complicated masks, like GEC & BLUB, the combiner algorithms take care of event list union (OR) / intersection (AND) / inversion (NOT) before the algorithm actually executes.
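The mask handling in the steps above (extraction for a lazy OR, merging with ANY, and loosening) can be sketched with sympy. This is a minimal sketch under stated assumptions, not the scheduler's code: the helper names `lazy_or_masks`, `merge_masks` and `loosen` are hypothetical, and `simplify_logic` is used as the simplification call, which may differ from how the scheduler invokes sympy.

```python
from sympy import symbols, true
from sympy.logic.boolalg import And, Not, Or, simplify_logic

# Each symbol stands for "this algorithm passed the event".
A, B, C = symbols("A B C")


def lazy_or_masks(children):
    """Masks for the children of a lazy OR node, evaluated left to right.

    A child only has to run if none of the children before it passed.
    (Hypothetical helper, for illustration only.)
    """
    masks = {}
    none_passed_so_far = true
    for child in children:
        masks[child] = none_passed_so_far
        none_passed_so_far = And(none_passed_so_far, Not(child))
    return masks


def merge_masks(masks):
    """Merge the masks of an algorithm that appears several times (ANY)."""
    return simplify_logic(Or(*masks))


def loosen(mask, algorithm):
    """Remove the dependence of `mask` on `algorithm`.

    The algorithm is substituted by True and by False, the two results
    are combined with OR, and the expression is simplified again.
    """
    return simplify_logic(
        Or(mask.subs(algorithm, True), mask.subs(algorithm, False))
    )


masks = lazy_or_masks([A, B, C])
print(masks[A], masks[B], masks[C])   # masks of A, B and C in the lazy OR
print(merge_masks([A, And(A, B)]))    # an algorithm that appears twice
print(loosen(And(A, B), B))           # the loosening example from the text
```

In this sketch the masks are plain sympy boolean expressions, so merging and loosening reduce to building an `Or` and simplifying it.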
Algorithm order optimizations
For every control flow tree there are multiple possible orderings that the scheduler might consider, and the throughput depends on the chosen order.
There are two types of order swaps that one might consider: A lazy control flow node that does not require a specific order is one where swapping orders yields different execution masks for different algorithms. As a simple heuristic, one can assume that more expensive algorithms are better associated with sparse execution masks.
Defining how expensive an algorithm is and how sparse an execution mask is, is a highly non-trivial task. In fact, doing so perfectly requires actually running the application in all possible orders, which is something we would like to avoid. Currently, we hardcode educated guesses for the weight of an algorithm execution. Some weights are taken from a profile run. Some other heuristics help in setting weights that result in acceptable sequences, like the fact that data providers should be spread out as far as possible to avoid creating I/O bottlenecks.
In summary, trying to model accurate execution weights and average efficiencies for algorithms in this heterogeneous architecture seems like a bad idea. Instead, it might make more sense to employ optimization algorithms that operate over possible orderings and test each ordering with a quick benchmark, automatically. We expect this procedure to take a long time to complete, but maybe we don't have to optimize on such a high level for too many configurations. Ideas include genetic algorithms or simulated annealing. (None of these are implemented yet, but that is outside the scope of this MR anyway.)
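To make the idea of searching over orderings concrete, here is a minimal simulated annealing sketch. Nothing like this exists in the MR: the function names, the algorithm names, the weights and the `toy_benchmark` cost are all hypothetical stand-ins, and a real setup would replace `toy_benchmark` with a short throughput measurement of the generated sequence.

```python
import math
import random


def respects(order, data_deps):
    """True if every producer appears before its consumer."""
    position = {alg: i for i, alg in enumerate(order)}
    return all(position[p] < position[c] for p, c in data_deps)


def anneal_order(initial, data_deps, benchmark, steps=500, t0=1.0, seed=0):
    """Search over algorithm orderings with simulated annealing.

    `benchmark(order)` returns a cost, lower is better. Data
    dependencies are treated as hard constraints.
    """
    if not respects(initial, data_deps):
        raise ValueError("initial order violates data dependencies")
    rng = random.Random(seed)
    current, current_cost = list(initial), benchmark(initial)
    best, best_cost = list(current), current_cost
    for step in range(steps):
        temperature = t0 * (1.0 - step / steps) + 1e-9
        # Propose a neighbour by swapping two algorithms.
        i, j = rng.sample(range(len(current)), 2)
        candidate = list(current)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        if not respects(candidate, data_deps):
            continue
        cost = benchmark(candidate)
        # Always accept improvements, sometimes accept regressions.
        accept = cost <= current_cost or rng.random() < math.exp(
            (current_cost - cost) / temperature
        )
        if accept:
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = list(candidate), cost
    return best, best_cost


# Hypothetical algorithms, dependencies and weights, for illustration only.
initial = ["velo_tracking", "pv_finding", "ut_tracking", "muon_id"]
data_deps = [("velo_tracking", "pv_finding"), ("velo_tracking", "ut_tracking")]
weights = {"velo_tracking": 1.0, "pv_finding": 2.0, "ut_tracking": 5.0, "muon_id": 0.5}


def toy_benchmark(order):
    """Stand-in cost: penalise expensive algorithms that run early."""
    return sum(weights[a] * (len(order) - i) for i, a in enumerate(order))


best, best_cost = anneal_order(initial, data_deps, toy_benchmark)
print(best, best_cost)
```

Rejecting candidates that violate data dependencies keeps every benchmarked ordering valid, at the price of some wasted proposals.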
Built on top of !393 (merged)
TODO:
- Some more tests for the python core functionality, specifically cftree_ops.py and event_list_utils.py
- Merge minipyconf and pyconf (done in LHCb!2964 (merged))
- Cleanup of the physics configurations (part of this MR)
- Check that every host_datatype is only assigned to host_datatypes, and that every device_datatype is only assigned to device_datatypes.
- Update documentation.
- Once LHCb!2964 (merged) is merged, change GenerateConfiguration.cmake to use HEAD instead of a branch of LHCb.
List of changes
One general remark: this MR changes the code generation steps of Allen, but otherwise it makes minimal changes to the rest of the codebase. That means that, with the exception of the introduction of MASK_ types and the event list intersection, union and inversion, headers and sources are for the most part not modified.
Here is a list of requirements and changes introduced to Allen as part of this MR:
- Pregenerated sequences are removed. Python 3 and libClang are now requirements.
- git is required in STANDALONE to be able to fetch PyConf from LHCb.
- The option SEQUENCE_GENERATION is therefore gone as well, and so is its CI job.
- The obsolete gaudi configurations and previous configurations are gone.
- The following directory structure has been created: AllenConf contains an "extension" to PyConf to enable Multi Event Scheduling, sequences contains all sequences, sequences/definitions contains definition files used by the sequences, and tests includes MES checks.
- The following configurations exist and can therefore be passed to the cmake SEQUENCE option:
|-- sequences
| |-- forward.py
| |-- hlt1_complex_validation.py
| |-- hlt1_pp_default.py
| |-- hlt1_pp_no_gec.py
| |-- hlt1_pp_no_gec_validation.py
| |-- hlt1_pp_non-restricted_UT.py
| |-- hlt1_pp_scifi_v6.py
| |-- hlt1_pp_scifi_v6_validation.py
| |-- hlt1_pp_validation.py
| |-- muon.py
| |-- pv.py
| |-- velo.py
| `-- veloUT.py
- sequences/definitions files have been refactored heavily. Files now correspond to subdetector reconstructions (e.g. velo_reconstruction.py, ut_reconstruction.py), and there are separate files for lines (e.g. hlt1_technical_lines.py, hlt1_muon_lines.py), validators, persistency, and so on.
- A number of python tests have been added to run as part of the Allen CI to test the MES functionality.
- A complex sequence test that runs various instances of reconstruction algorithms has been added.
- Combiner algorithms event_list_intersection, event_list_union and event_list_inversion have been added.
- All event list arguments have become MASK_INPUT or MASK_OUTPUT. There can be at most a single MASK_INPUT and a single MASK_OUTPUT parameter per algorithm, which are used internally by the MES and don't need to be configured as part of the algorithms.
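As an illustration of what the three combiners compute, here is a small host-side Python model that treats an event list as a sorted list of event indices. It is only a sketch: the function names mirror the combiner algorithms, but the signatures, the list representation and the example event lists are assumptions and do not reflect the Allen implementation.

```python
def event_list_union(a, b):
    """Events present in either list (mask OR)."""
    return sorted(set(a) | set(b))


def event_list_intersection(a, b):
    """Events present in both lists (mask AND)."""
    return sorted(set(a) & set(b))


def event_list_inversion(a, number_of_events):
    """Events of the batch that are not in the list (mask NOT)."""
    present = set(a)
    return [event for event in range(number_of_events) if event not in present]


# Hypothetical batch of 8 events and two event lists exported by
# algorithms called GEC and BLUB.
number_of_events = 8
gec = [0, 1, 3, 4, 6]
blub = [1, 2, 3, 7]

gec_and_blub = event_list_intersection(gec, blub)        # mask GEC & BLUB
gec_or_blub = event_list_union(gec, blub)                # mask GEC | BLUB
not_gec = event_list_inversion(gec, number_of_events)    # mask ~GEC
print(gec_and_blub, gec_or_blub, not_gec)
```

An algorithm with mask GEC & BLUB would then receive `gec_and_blub` as its input event list and run only on those events.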
Should be merged after LHCb!2964 (merged) (and should be tested with it too).
Activity
added 1 commit
- ac498b8b - Generated all pregenerated sequences, fixed warning, fixed includes of event_list_utils.
added 111 commits
- ac498b8b...88841c51 - 45 commits from branch master
- cd5b83fd - Goudi lines.
- a71805ee - Created self-contained line algorithm.
- ca3f4e21 - Created a OneTrackLine extensible struct.
- 542ca64d - Wrote the algorithm TrackMVALineAlgorithm such that the parser picks it up.
- d7e71ac8 - Removed unnecessary explicit initialization of decisions_t.
- 9e6bf76c - Added first two track mva line with new line configuration syntax.
- 001de75a - Allowed to configure kernel call grid dim and block dim.
- 4f9d990d - Removed unnecessary inline keyword.
- c8ca319e - Created gather selections algorithm.
- eaaa2966 - GatherSelections compiles and runs.
- f99dec37 - Updated readme of GatherSelections.cuh.
- 5968fd25 - Moving to using dev_event_lists everywhere.
- 7e4f127c - Updated VELO configuration, refactored selected number of events into number of events.
- f2c5d50a - Modified generation of arguments.py in order to support aggregate algorithm generation.
- 7504514d - Fixed some index access errors.
- 883876e9 - Fixed VELO sequence.
- ebb5cc4a - Updated UT sequence to support dev_event_list_t.
- 437fb51a - Updated PV sequence to using event lists.
- c52fc89b - Updated forward sequence to using dev event lists.
- b09c14e6 - Added dev event lists to muon sequence, vertexing and kalman filters.
- 547f2115 - Added event list to lines.
- 0aa3174a - Proper prefix sum for GatherSelections.
- 5da8a7e4 - Fixed prefix sum in Gather selections and added the postscaler and light tested it.
- 78beb55b - Adapted RateChecker to new data format.
- c76b0af7 - Added ODIN line. Fixed offset population bug in lines. Removed inheritance...
- 030a2703 - Allow to configure several line algorithms in a sequence.
- 00637fbd - Added host memory configuration option.
- a0ff36e6 - Compatibility with MacOS of sequence generation.
- ba310086 - Removed algorithms prepareRawBanks forward, and cleaned up use of all old LineInfo.cuh.
- 32998b30 - Commented out zmq service send of passing_event_list.size.
- db13877e - Fixed ROOT build.
- 1cbc609a - Fixed formatting
- ead8ce66 - Updated SAXPY example.
- 1e0ea1d3 - Fixed formatting
- b8d4ee36 - Fixed TrackMVALine logic.
- 168d7ad5 - Created readme, created SelectionAlgorithm.cuh.
- 4705c70d - Updated CI.
- 57f01278 - Updated selections.md.
- 5a10805c - Added EventLine for ease to add lines. Added VeloMicroBiasLine.
- cc9e57ba - Updated selections readme.
- 50e3e372 - selections.md readme corrected
- 75d405fc - Removed two warnings.
- 678f841d - Fixed HLT1Sequence issue. Updated python3 found executable.
- 5d98b04a - Steps to make multievtscheduler work.
- ada4c2a4 - Working on combiners.
- 80568efc - Added combiner algorithms.
- 70f7c005 - Generated unique sequence. Refactoring and producing Allen sequence format.
- 159499b5 - First version of Allen sequence generator.
- 18c33cff - Moving to configuration/multi_evt_scheduler.
- 34376b7e - Removed external folder and duplicated code.
- ff85f3eb - Adding more sequences to be compatible with the multi evt scheduler.
- d82d98b1 - Added fast all_producers code.
- 4fddf652 - Added PV and muon sequences.
- 1f5e1b57 - Working lines.
- 545a2f92 - speed up the thang
- 18c39767 - Working on gather selections.
- 8a108a93 - Updated Allen sequence generator to support input aggregates.
- 51b51a51 - Removed old sequence files.
- 1a7347b7 - Add category to parsed algorithms.
- d3c72e4d - Added optional inputs to be aggregate inputs.
- e5bd7204 - Multi evt scheduler generates better default configurations.
- b14304ef - Updated Allen weights and weight assignment logic.
- 30652514 - Simplified control flow generation to pick the first possible option.
- cd62ad4c - Moved event list utils inside definitions and updated all configurations.
- 3a674fef - Removed inside joke.
- d760ee67 - Generated all pregenerated sequences, fixed warning, fixed includes of event_list_utils.
added Build new feature labels
removed Build new feature labels
added Build new feature labels
mentioned in issue Moore#203 (closed)
mentioned in issue Moore#205 (closed)
mentioned in issue Moore#206 (closed)
mentioned in merge request !431 (closed)
- Resolved by Daniel Hugo Campora Perez
@dcampora what's the status here and in !393 (merged) ? is there still work going on?
added 446 commits
- d77e3c4a...3b105ca1 - 427 commits from branch master
- 6e4575d4 - Added one VELO sbt contract.
- f6e057c9 - First Allen contract working.
- 4920e818 - Use the simplified make_vector in VeloTools to fetch arguments.
- b34e5f27 - Refactored contracts, which now use the function require instead of plain asserts.
- 6b407428 - Decouple DNDEBUG with ENABLE_CONTRACTS.
- dbd444e4 - Add ifdef block around demangle function to avoid warning.
- 02ddc55c - Added generic contracts folder.
- 3dc27f8d - Fix Gaudi compilation and early-exit if contracts fail.
- 0a3c580f - Remove include directories of Catch2.
- 20b559e2 - Removed Contract.cuh. Added copyright statement.
- ef2b85fb - Fixed formatting
- 45c3086e - Added documentation on contracts.
- f54ad8ce - Added builds and runs with contracts.
- e1e3d91f - Updated selections.md.
- 4d19d8d7 - Provided a realistic example in contracts.md.
- 5b4efd63 - Added all essential files. CMake generation works.
- 2128e097 - Added missing files.
- 2644e368 - velo sequence compiles and runs.
- ad7b02b3 - Multi event scheduler works as intended.
added 1 commit
- db10081b - Refactored slightly hlt1_pp_default to have more straightforward names.