Re-adding documentation markdown

Commit 5959592a authored 4 years ago by Teng Jian Khoo
--- a/Trigger/TriggerCommon/TriggerMenuMT/python/HLTMenuConfig/Jet/README.md
+++ b/Trigger/TriggerCommon/TriggerMenuMT/python/HLTMenuConfig/Jet/README.md
-# Overview of HLT jet reco configuration modules
+Jet Trigger Configuration Overview
+=====

-## GenerateJetChainDefs.py
+Trigger chains are structured as a series of *steps*, where each step can contain *reconstruction* & *hypothesis* elements, so as to allow for early rejection of uninteresting events.

-Called by the menu code in `[TriggerMenuMT/python/HLTMenuConfig/Menu/GenerateMenuMT.py](https://acode-browser1.usatlas.bnl.gov/lxr/source/athena/Trigger/TriggerCommon/TriggerMenuMT/python/HLTMenuConfig/Menu/GenerateMenuMT.py)` to translate the HLT chain item into a concrete algorithm sequence.
+A typical jet trigger chain may look like:
+```mermaid
+graph TD;

-The menu code creates a chain dictionary from the chain name, of which the jet parts are given to `GenerateJetChainDefs.generateChainConfigs` to be interpreted by `JetChainConfiguration`.
+  L1[Step: L1 seed] --> R1(Reject):::reject
+  L1 --> HLT1;
+  HLT1[Step: HLT calo jet presel] --> R2(Reject):::reject;
+  HLT1 --> HLT2;
+  HLT2[Step: HLT track jet final] --> R3(Reject):::reject;
+  HLT2 --> A(Accept):::accept;

-## JetChainConfiguration.py
+  classDef reject fill:#0dd;
+  classDef accept fill:#dd0;
+```
+Each step contains a reconstruction sequence, responsible for generating the objects to be selected on, and a `HypoAlg` that reads the created (jet) collection. The `HypoAlg` contains one or more `HypoTools`, each of which represents a different selection. Jet `HypoTools` receive all jets in the event and then apply a selection, which may act on individual jets or on groups of jets. This differs from most other trigger signatures, in which objects are always selected independently and then potentially combined.
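
The relationship between one hypo algorithm and its hypo tools can be sketched in plain Python. This is a toy model only: `Jet`, `single_jet_tool`, `ht_tool` and `run_hypo_alg` are hypothetical names invented for illustration, not the `TrigHLTJetHypo` interfaces. The point is that every tool sees the full jet list, so it can select on single jets or on the group as a whole.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Jet:
    pt: float   # transverse momentum in GeV (toy value)
    eta: float  # pseudorapidity (unused by the toy selections below)


# In this sketch a "hypo tool" is a function from the full jet list to pass/fail.
HypoTool = Callable[[List[Jet]], bool]


def single_jet_tool(pt_min: float) -> HypoTool:
    """Selection on individual jets: pass if any jet exceeds pt_min."""
    return lambda jets: any(j.pt > pt_min for j in jets)


def ht_tool(ht_min: float, jet_pt_min: float = 30.0) -> HypoTool:
    """Selection on a group of jets: pass if the scalar pT sum exceeds ht_min."""
    return lambda jets: sum(j.pt for j in jets if j.pt > jet_pt_min) > ht_min


def run_hypo_alg(jets: List[Jet], tools: Dict[str, HypoTool]) -> Dict[str, bool]:
    """One 'hypo alg' per jet collection; each tool gives one chain's decision."""
    return {name: tool(jets) for name, tool in tools.items()}


jets = [Jet(120.0, 0.4), Jet(80.0, -1.2), Jet(45.0, 2.1)]
decisions = run_hypo_alg(
    jets,
    {
        "j100": single_jet_tool(100.0),
        "j200": single_jet_tool(200.0),
        "ht200": ht_tool(200.0),
    },
)
print(decisions)
```

Here the `ht200` selection passes even though no single jet is above 200 GeV, which is only possible because the tool receives all jets at once.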

-Defines the `JetChainConfiguration` object responsible for interpreting the chain dictionary and building a `Chain` object that is returned to the menu.
+Algorithms & data
+-----

-This extracts the reco configuration from the jet chain dictionary, using it to generate a `MenuSequence` that forms the jet `ChainStep`. It may be desirable to implement multiple `ChainSteps` for filtering purposes in the future, mainly to allow fast reco and filtering before slower reco is executed.
+One way to envision the decision process is as an interaction between data and algorithms:
+```mermaid
+sequenceDiagram
+  participant SG as Data (StoreGate)
+  participant Reco as Reconstruction
+  participant Hypo as Hypo selection
+  Hypo ->> Reco: L1 accept
+  activate Reco
+  SG -->> Reco: Topoclusters
+  Reco ->> SG: Calo jets
+  deactivate Reco
+  SG -->> Hypo: Calo jets
+  Hypo ->> Reco: HLT step 1 accept
+  activate Reco
+  SG -->> Reco: Topoclusters, Tracks
+  Reco ->> SG: Particle Flow Objects
+  SG -->> Reco: Particle Flow Objects
+  Reco ->> SG: Particle Flow Jets
+  deactivate Reco
+  SG -->> Hypo: Particle Flow Jets
+  Hypo ->> SG: HLT Decision
+```
+In this (simplified) picture, the HLT chain is implemented as repeated interactions of the form:
+* Upstream hypo decision unlocks step reconstruction
+* Reconstruction reads data from StoreGate, and adds new object collections
+* Hypo reads the collections produced by Reco and decides whether to accept the event for the next step.
+
+An accept decision in the final step causes the event to be written out.
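
The repeated reco/hypo interaction can be written as a short loop. The following is a minimal sketch under stated assumptions: the store is a plain dict standing in for StoreGate, and `run_chain`, `calo_reco` and `pflow_reco` are hypothetical toy functions with made-up thresholds, not Athena code.

```python
from typing import Callable, Dict, List, Tuple

Store = Dict[str, list]
# A step is (name, reco function, hypo function).
Step = Tuple[str, Callable[[Store], Store], Callable[[Store], bool]]


def run_chain(steps: List[Step], store: Store) -> Tuple[bool, List[str]]:
    """Run steps in order; stop at the first hypo that rejects the event."""
    executed = []
    for name, reco, hypo in steps:
        store.update(reco(store))   # reco reads the store and adds new collections
        executed.append(name)
        if not hypo(store):         # hypo reads the collections reco just produced
            return False, executed  # early rejection: later reco never runs
    return True, executed


def calo_reco(store: Store) -> Store:
    # Toy "jet finding": every cluster above 20 GeV becomes a calo jet.
    return {"CaloJets": [e for e in store["Topoclusters"] if e > 20.0]}


def pflow_reco(store: Store) -> Store:
    # Toy "particle flow": scale calo jets by a made-up correction factor.
    return {"PFlowJets": [1.1 * e for e in store["CaloJets"]]}


steps = [
    ("calo presel", calo_reco, lambda s: any(pt > 50.0 for pt in s["CaloJets"])),
    ("tracking final", pflow_reco, lambda s: any(pt > 100.0 for pt in s["PFlowJets"])),
]

accepted, executed = run_chain(steps, {"Topoclusters": [95.0, 30.0, 5.0]})
accepted_soft, executed_soft = run_chain(steps, {"Topoclusters": [40.0, 10.0]})
print(accepted, executed)
print(accepted_soft, executed_soft)
```

The second event fails the calo preselection, so the second (slower) reconstruction step is never executed for it.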

-`JetChainConfiguration` internally calls `JetMenuSequences.jetMenuSequence()` to produce the `MenuSequence` item, which contains a reco sequence followed by a hypo selection.
+Multithreaded algorithm scheduling
+-----

-## JetMenuSequences.py
+Although simple to understand, this linear picture may be misleading. Because the Run 3 ATLAS software runs multithreaded, many algorithms can run in parallel, accessing different pieces of the event data. The reco & hypo algorithms are therefore scheduled based on data dependencies: each algorithm declares its inputs and outputs, which allows the *scheduler* to determine when any given algorithm should run.

-The jet chains are currently made up of only a single step, which runs reco+hypo. The menu sequence is constructed from:
-* a concrete reco sequence, generated by `JetRecoSequences.jetAthSequence()`;
-* the `InputMakerAlg` for this sequence (largely irrelevant as jets work in FullScan rather than in EventViews);
-* a hypo alg (one instance per jet collection i.e. reco config); and
-*  a hypo tool generator function that will define the selection based on the `chainDict` contents (configuration defined in `TrigHLTJetHypo`).
+The scheduler creates a graph showing how the algorithmic flow would proceed for one chain:
+```mermaid
+flowchart TD;
+  L1a(L1Accept):::decision --> TopoClusterMaker:::alg
+  
+  subgraph Step 1
+  
+  TopoClusterMaker:::alg --> Topoclusters[/Topoclusters/]:::data
+  Topoclusters --> CaloJetReco:::alg
+  CaloJetReco --> CaloJets[/CaloJets/]:::data
+  CaloJets --> CaloJetHypo:::alg
+  
+  end 

-The reco sequence and hypo alg names are suffixed with a string summarising the reco options.
+  CaloJetHypo --> HLT1A(HLT step 1 accept):::decision
+  
+  Topoclusters --> PFlow:::alg
+  HLT1A --> FSTrk[FS Tracking]:::alg

-## JetRecoSequences.py
+  subgraph Step 2
+  
+  FSTrk --> Tracks:::data
+  Tracks --> PFlow
+  PFlow --> PFOs[/PFlowObjects/]:::data
+  PFOs --> PFlowJetReco:::alg
+  PFlowJetReco --> PFlowJets[/PFlowJets/]:::data
+  PFlowJets --> PFlowJetHypo:::alg
+  
+  end
+  
+  PFlowJetHypo --> HLT2A(HLT final accept):::decision

-This module provides the `jetAthSequence()` and `jetRecoSequence()` functions that define the reconstruction sequence for any given reconstruction configuration.
+  classDef data fill:#dd0;
+  classDef alg fill:#faf;
+  classDef decision fill:#aff;
+```
+The step decisions function as gates that enable the succeeding algorithms to be activated; within each subgraph, the availability of one algorithm's output data allows the next algorithm to run.

-The `jetAthSequence` holds the full reconstruction as well as an `InputMakerAlg` which for FullScan jet triggers (AFAIK) mostly serves to communicate L1 seed information to determine which hypo tools should be run.
+However, it is more interesting to consider what happens when we have many chains in parallel, e.g. some running only calo reco, some using large-radius jets and others running small-radius jets. This could expand to a situation like the following:
+```mermaid
+flowchart TD;

-The reconstruction sequence is determined from the contents of the reco information in the `chainDict`, and will contain some subset of:
-* Calo reco sequence: cell unpacking and topoclustering -- one instance shared between all jet chains
-* Constituent modifications [optional] -- pileup suppression on topoclusters or corrections to PFlow four-vectors
-* PseudoJetGetters & algs -- conversion of the ATLAS EDM into `fastjet` EDM
-* JetAlgorithm -- holds the jet finder tools and any modifiers e.g. calibration
-The reco sequence may be nested further in the case of reclustering or trimming workflows, in which case the "basic" jet reco from clusters is embedded in a second sequence, which continues by running the second step reconstruction.
-Configuration of the Athena components is handled by the `Reconstruction/Jet/JetRecConfig` package.
\ No newline at end of file
+  Start --> L1A(L1Accept A):::decision
+  Start --> L1B(L1Accept B):::decision
+  Start --> L1C(L1Accept C):::decision
+  
+  L1A --> Filter1a:::filter
+  L1B --> Filter1b:::filter
+  L1C --> Filter1a:::filter
+  
+  subgraph Step 1
+
+    Filter1a -.->|activates| TopoClusterMaker
+    Filter1b -.->|activates| TopoClusterMaker
+
+    Filter1a -.->|activates| CaloJetRecoA4
+    Filter1b -.->|activates| CaloJetRecoA10
+
+
+    TopoClusterMaker:::alg --> Topoclusters:::data
+    Topoclusters --> CaloJetRecoA4:::alg
+    Topoclusters --> CaloJetRecoA10:::alg
+
+    CaloJetRecoA4 --> AntiKt4CaloJets:::data
+    AntiKt4CaloJets --> CaloJetHypoA4:::alg
+    CaloJetRecoA10 --> AntiKt10CaloJets:::data
+    AntiKt10CaloJets --> CaloJetHypoA10:::alg
+
+  end 
+
+  CaloJetHypoA4 --> HLT1A(HLT step 1 accept A):::decision
+  CaloJetHypoA10 --> HLT1B(HLT step 1 accept B):::decision
+  CaloJetHypoA4 --> HLT1C(HLT final accept C):::decision
+
+  
+  Topoclusters --> PFlow:::alg
+  
+  HLT1A --> Filter2a:::filter
+  HLT1B --> Filter2b:::filter
+
+  subgraph Step 2
+
+    Filter2a -.->|activates| FSTrk[FS Tracking]:::alg
+    Filter2a -.->|activates| PFlowJetRecoA4
+
+    Filter2b -.->|activates| FSTrk
+    Filter2b -.->|activates| PFlowJetRecoA10
+
+    FSTrk --> Tracks:::data
+    Tracks --> PFlow
+  
+    PFlow --> PFOs[PFlowObjects]:::data
+    PFOs --> PFlowJetRecoA4:::alg
+    PFOs --> PFlowJetRecoA10:::alg
+
+    PFlowJetRecoA4 --> AntiKt4PFlowJets:::data
+    AntiKt4PFlowJets --> PFlowJetHypoA4:::alg  
+
+    PFlowJetRecoA10 --> AntiKt10PFlowJets:::data
+    AntiKt10PFlowJets --> PFlowJetHypoA10:::alg 
+
+  end
+  
+  PFlowJetHypoA4 --> HLT2A(HLT final accept A):::decision
+  PFlowJetHypoA10 --> HLT2B(HLT final accept B):::decision
+
+  classDef data fill:#dd0;
+  classDef alg fill:#faf;
+  classDef filter fill:#ddd;
+  classDef decision fill:#aff;
+```
+
+This is not an exact description, but it illustrates a few important points:
+* Filters unlock segments of the execution graph.
+* Data may be used multiple times by different parts of the reconstruction.
+* Algorithms can run as soon as (but only when) all of their input data become available.
+
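The three points above can be illustrated with a toy data-driven scheduler. This is a sketch only, not the Gaudi scheduler: the hypothetical `schedule` function treats step decisions as ordinary data keys (a simplification) and groups algorithms into "waves" that could run concurrently once their inputs exist. Algorithm and data names are borrowed from the diagram purely for readability.

```python
from typing import Dict, Iterable, List, Set, Tuple

# name -> (input keys, output keys); decisions are modelled as data keys too.
AlgMap = Dict[str, Tuple[List[str], List[str]]]


def schedule(algs: AlgMap, initial: Iterable[str]) -> Tuple[List[List[str]], List[str]]:
    """Return (waves of runnable algorithms, algorithms that never became runnable)."""
    available: Set[str] = set(initial)
    pending = dict(algs)
    waves: List[List[str]] = []
    while pending:
        # An algorithm is ready once all of its inputs are available.
        ready = sorted(n for n, (ins, _) in pending.items() if set(ins) <= available)
        if not ready:
            break  # whatever is left is blocked, e.g. by a decision that never came
        waves.append(ready)
        for name in ready:
            available.update(pending.pop(name)[1])
    return waves, sorted(pending)


algs: AlgMap = {
    "TopoClusterMaker": (["L1Accept"], ["Topoclusters"]),
    "CaloJetRecoA4": (["Topoclusters"], ["AntiKt4CaloJets"]),
    "CaloJetRecoA10": (["Topoclusters"], ["AntiKt10CaloJets"]),
    "FSTracking": (["Step1Accept"], ["Tracks"]),
    "PFlow": (["Topoclusters", "Tracks"], ["PFlowObjects"]),
}

# Only the L1 decision is present: tracking and particle flow stay blocked.
waves_l1, blocked_l1 = schedule(algs, ["L1Accept"])
# With the step 1 decision also present, everything can be scheduled.
waves_all, blocked_all = schedule(algs, ["L1Accept", "Step1Accept"])
print(waves_l1, blocked_l1)
print(waves_all, blocked_all)
```

`Topoclusters` is consumed by three different algorithms, and `PFlow` only becomes runnable once both of its inputs exist, regardless of the order in which the algorithms were declared.
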
+With this big picture in mind, the following links describe further details of the HLT jet python configuration:
+* [ModuleOverview.md](./docs/ModuleOverview.md) -- Overview of what each python module does
+* [Jet section in SignatureDicts.py](../Menu/SignatureDicts.py#L99-168) -- Annotation of jet `chainParts` in SignatureDicts module
--- a/Trigger/TriggerCommon/TriggerMenuMT/python/HLTMenuConfig/Jet/docs/ModuleOverview.md
+++ b/Trigger/TriggerCommon/TriggerMenuMT/python/HLTMenuConfig/Jet/docs/ModuleOverview.md
+Overview of HLT jet reco configuration modules
+=====
+
+<details>
+<summary>Note on RecoFragmentsPool</summary>
+
+As an intermediate step before the new job configuration (see https://atlassoftwaredocs.web.cern.ch/guides/ca_configuration/) is fully adopted, the [`RecoFragmentsPool`](https://gitlab.cern.ch/atlas/athena/-/blob/master/Trigger/TriggerCommon/TriggerMenuMT/python/HLTMenuConfig/Menu/MenuComponents.py) construct is used to avoid duplication of algorithms within AlgSequences. This is used as follows. Wherever a function returning a sequence is called, the call should be made through the following expression:
+```python
+mySequence = RecoFragmentsPool.retrieve( seqGenerator, configFlags, **kwargs )
+```
+where `seqGenerator` is a function that returns the desired sequence type and must accept one positional argument (`configFlags`) plus an arbitrary number of keyword arguments (`**kwargs`). Internally, `RecoFragmentsPool` caches the result, mapping it to the input arguments. Consequently, the keyword argument values must be hashable types (basically, no dicts). In the jet trigger configuration code, we frequently use the `**` operator to convert between dicts and kwargs.
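
The caching behaviour, and the reason for the hashability requirement, can be reproduced with a small stand-in. `FragmentCache` below is a hypothetical class written for this illustration; it is not the actual `RecoFragmentsPool` implementation, and as a simplifying assumption it keys only on the generator name and the keyword arguments. The kwarg names are illustrative.

```python
class FragmentCache:
    """Toy cache that returns the same object for repeated identical requests."""

    def __init__(self):
        self._cache = {}

    def retrieve(self, generator, flags, **kwargs):
        # The key must be hashable, so every kwarg value must be hashable too.
        key = (generator.__name__, tuple(sorted(kwargs.items())))
        if key not in self._cache:
            self._cache[key] = generator(flags, **kwargs)
        return self._cache[key]


calls = []


def jet_reco_sequence(flags, recoAlg, constitType):
    """Stand-in sequence generator that records how often it is really called."""
    calls.append((recoAlg, constitType))
    return ["Seq", recoAlg, constitType]


pool = FragmentCache()
reco_dict = {"recoAlg": "a4", "constitType": "tc"}

# `**` unpacks the dict into keyword arguments, as described above.
seq1 = pool.retrieve(jet_reco_sequence, None, **reco_dict)
seq2 = pool.retrieve(jet_reco_sequence, None, **reco_dict)
print(seq1 is seq2, len(calls))

# Passing a dict as a kwarg *value* fails, because the cache key cannot be hashed.
try:
    pool.retrieve(jet_reco_sequence, None, recoAlg="a4", constitType={"type": "tc"})
    dict_value_rejected = False
except TypeError:
    dict_value_rejected = True
print(dict_value_rejected)
```

The second `retrieve` call returns the cached object without invoking the generator again, which is what prevents duplicate algorithms from being created.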
+
+</details>
+
+<details>
+<summary>Note on sequence types</summary>
+
+The trigger algorithm sequencing is controlled by sequences and filter algorithms. For a more detailed description of how this functions, see e.g. [these slides](https://cds.cern.ch/record/2642559?ln=en).
+
+Essentially, two types of sequence are used:
+* parOR: executes all children in parallel and returns the result of an OR over these as its filter decision -- used for reco algorithms
+* seqAND: executes all children in sequence and returns the result of an AND over these as its filter decision -- used to activate/deactivate subsequences.
+
+An algorithm may be a child of multiple sequences, but will only execute once.
+
+These are used as depicted in the diagram below. Each step is built with an OR that executes all filters for the step (e.g. these could be electron & muon legs of an e/mu chain). The filters return a pass/fail decision based on preceding hypos. Then the main OR attempts to execute all substeps in parallel. The same filter algorithms are placed in ANDs within each substep, so they block execution if the filter criterion failed (Filter A). Results from the hypos are passed to the next step.
+
+```mermaid
+graph LR;
+
+  topAND[Step 1 AND]:::and --> prefiltOR[Prefilter OR]:::or
+
+  subgraph step1[Step 1]
+
+    prefiltOR --> filtA(Filter A):::filterfail
+    prefiltOR --> filtB(Filter B):::filterpass
+
+    topAND --> stepOR[Main OR]:::or
+    stepOR --> seqA[AND A]:::and
+    stepOR --> seqB[AND B]:::and
+
+    subgraph Substep A
+      seqA --> filtA2(Filter A):::filterfail
+      filtA2 -.->|blocks| orA[OR A]:::or
+      orA -.->recoA([Reco A]):::reco
+      orA -.-> hypoA([Hypo A]):::hypo
+    end
+  
+    subgraph Substep B
+      seqB --> filtB2(Filter B):::filterpass
+      filtB2 --> orB[OR B]:::or
+      orB --> recoB([Reco B]):::reco
+      orB --> hypoB([Hypo B]):::hypo
+    end
+  
+  end
+
+  classDef filterpass fill:#0d0;
+  classDef filterfail fill:#d00;
+
+  classDef reco fill:#ffa;
+  classDef hypo fill:#faa;
+  
+  classDef or fill:#0ff;
+  classDef and fill:#f0f;
+```
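
The structure in the diagram above can be mimicked with two small combinators. This is a sequential toy model under stated assumptions: `seq_and`, `par_or` and `alg` are hypothetical helpers, the "parallel" OR simply runs its children one after another, and a shared `state` dict ensures each algorithm executes only once even when it appears in several sequences.

```python
def alg(name, passes, executions):
    """Toy algorithm: runs at most once per event and caches its filter decision."""
    def run(state):
        if name not in state:
            executions.append(name)
            state[name] = passes
        return state[name]
    return run


def seq_and(children):
    """Run children in order and stop at the first failure (AND of decisions)."""
    def run(state):
        for child in children:
            if not child(state):
                return False
        return True
    return run


def par_or(children):
    """Run all children (sequentially in this sketch) and OR their decisions."""
    def run(state):
        results = [child(state) for child in children]
        return any(results)
    return run


executions = []
filt_a = alg("FilterA", False, executions)  # preceding hypo failed
filt_b = alg("FilterB", True, executions)   # preceding hypo passed
reco_a, hypo_a = alg("RecoA", True, executions), alg("HypoA", True, executions)
reco_b, hypo_b = alg("RecoB", True, executions), alg("HypoB", True, executions)

step1 = seq_and([
    par_or([filt_a, filt_b]),                         # prefilter OR
    par_or([                                          # main OR
        seq_and([filt_a, par_or([reco_a, hypo_a])]),  # substep A (blocked)
        seq_and([filt_b, par_or([reco_b, hypo_b])]),  # substep B
    ]),
])

state = {}
step_passed = step1(state)
print(step_passed, executions)
```

Filter A appears twice in the tree but runs once, and because it fails, the AND in substep A stops before Reco A and Hypo A are reached.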
+
+This scheduling is basically handled by menu construction. As a rule of thumb, we use a `seqAND` as the basis for every `MenuSequence`, and `parOR` for all reconstruction sequences.
+
+</details>
+
+[GenerateJetChainDefs](../GenerateJetChainDefs.py)
+-----
+
+Called by the menu code in [`TriggerMenuMT/python/HLTMenuConfig/Menu/GenerateMenuMT.py`](../../Menu/GenerateMenuMT.py) to translate the HLT chain item into a concrete algorithm sequence.
+
+The menu code creates a chain dictionary from the chain name, of which the jet parts are given to `GenerateJetChainDefs.generateChainConfigs` to be interpreted by `JetChainConfiguration`.
+
+[JetChainConfiguration](../JetChainConfiguration.py)
+-----
+
+Defines the `JetChainConfiguration` object responsible for interpreting the chain dictionary and building a `Chain` object that is returned to the menu. `JetChainConfiguration` extends the [`ChainConfigurationBase`](../../Menu/ChainConfigurationBase.py) type.
+
+Its `assembleChain` function extracts the reco configuration from the jet chain dictionary, using it (via functions from `JetMenuSequences.py`) to generate a `MenuSequence` that forms one or more jet `ChainStep` objects. Multiple `ChainSteps` may be combined into a single chain for filtering purposes, mainly to allow fast reco and filtering before slower reco is executed.
+
+The following possible types of `ChainStep` are defined:
+1. Calo Hypo ChainStep: a single-step chain, with only calo reco defining the hypo selection
+2. Calo Reco ChainStep: a step that performs calo (cell+topocluster) reco with a passthrough hypo, which should be followed by a Tracking Hypo ChainStep. No jets are reconstructed.
+3. Calo Presel ChainStep: a step that performs calo reco with a preselection hypo, which should be followed by a Tracking Hypo ChainStep. Calo jets will be reconstructed and used as input to the preselection hypo. The preselection criteria are determined from the "preselNjX" entry in the chain dictionary.
+4. Tracking Hypo ChainStep: a step that performs (FullScan) tracking reco, possibly including Particle Flow, then reconstructs jets with tracks, and performs the final hypo selection on these. Must follow either a Calo Reco or a Calo Presel ChainStep.
+5. TLA ChainStep: a step that is appended for TLA chains, following the terminal hypo (calo or tracking). It performs no selection, and only flags a subset of the reconstructed jets to be written to file. Configuration for this is in `JetTLASequences.py`.
+
+[JetMenuSequences](../JetMenuSequences.py)
+-----
+
+Defines the `MenuSequence` objects that form the basis of `ChainSteps`. Each `MenuSequence` contains a reco sequence, an `InputMaker` (which defines the Region of Interest for the reco), a hypo algorithm and a hypo tool generator function. These are created as follows:
+* Reco sequence -- defined via functions in `JetRecoSequences.py`, based on the `JetRecoDict` extracted from the chain dictionary.
+* InputMaker -- varies depending on which sequence is needed. The basic fullscan InputMaker is provided by calo code.
+* Hypo algorithm -- In most cases this is a `TrigJetHypoAlgMT` defined in the `TrigHLTJetHypo` package. For the calo reco sequence only, this is a streamer hypo that does no selection.
+* Hypo tool generator -- For standard steps with hypo selection, this is the `TrigHLTJetHypo.TrigJetHypoToolConfig.trigJetHypoToolFromDict()` function, which will interpret the chain dict. For passthrough calo reco steps, this is instead just a function that returns a streamer hypo tool.
+
+*Note: For every jet collection there is exactly one matching hypo algorithm, but every distinct selection on that collection is defined by a different hypo tool that is a child of this hypo algorithm.*
+
+[JetRecoSequences](../JetRecoSequences.py)
+-----
+
+This module provides the functions that define the reconstruction sequences for any given reconstruction configuration.
+
+The reconstruction sequence is determined from the contents of the reco information in the `chainDict`, and will contain some subset of:
+* Calo reco sequence: cell unpacking and topoclustering -- one instance shared between all jet chains
+* Constituent modifications [optional] -- pileup suppression on topoclusters or corrections to PFlow four-vectors
+* PseudoJetGetters & algs -- conversion of the ATLAS EDM into `fastjet` EDM
+* JetAlgorithm -- holds the jet finder tools and any modifiers e.g. calibration
+
+The reco sequence may be nested further in the case of reclustering or trimming workflows, in which case the "basic" jet reco from clusters is embedded in a second sequence, which continues by running the second step reconstruction.
+
+Configuration of the Athena components is handled by the [`Reconstruction/Jet/JetRecConfig`](../../../../../../../Reconstruction/Jet/JetRecConfig) package.
+
+The main `jetRecoSequence()` function forwards to one of:
+* `standardJetRecoSequence()` -- defines standard jet reconstruction from input pseudojets (clusters, PFOs). This in turn calls `standardJetBuildSequence()` to perform the reconstruction of uncalibrated jets, and if calibration is requested additionally schedules a `JetCopier` to shallow-copy and calibrate the jets.
+* `groomedJetRecoSequence()` -- grooms an input jet collection (internally calls `standardJetBuildSequence()` to create the ungroomed jets).
+* `reclusteredJetRecoSequence()` -- reclusters an input jet collection (internally calls `standardJetBuildSequence()` to create the basic jets).
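
The forwarding and nesting described above might be pictured as follows. Everything in this sketch is hypothetical: the `kind`, `ungroomedAlg`, `basicAlg` and `calibrate` keys are invented for illustration and are not the real `JetRecoDict` layout, and the functions return lists of labels rather than Athena sequences. It only shows how the groomed and reclustered paths reuse the standard build step.

```python
def standard_build(reco):
    """Build uncalibrated jets for the requested algorithm."""
    return [f"build:{reco['recoAlg']}"]


def standard_reco(reco):
    """Standard path: build jets, then optionally copy and calibrate them."""
    seq = standard_build(reco)
    if reco.get("calibrate", False):
        seq.append("JetCopier:calibrate")
    return seq


def groomed_reco(reco):
    """Groomed path: build the ungroomed jets first, then groom them."""
    ungroomed = dict(reco, recoAlg=reco["ungroomedAlg"])
    return standard_build(ungroomed) + [f"groom:{reco['recoAlg']}"]


def reclustered_reco(reco):
    """Reclustered path: build the basic jets first, then recluster them."""
    basic = dict(reco, recoAlg=reco["basicAlg"])
    return standard_build(basic) + [f"recluster:{reco['recoAlg']}"]


def jet_reco_sequence(reco):
    """Forward to one of the three paths based on a (hypothetical) 'kind' key."""
    dispatch = {
        "standard": standard_reco,
        "groomed": groomed_reco,
        "reclustered": reclustered_reco,
    }
    return dispatch[reco.get("kind", "standard")](reco)


standard_seq = jet_reco_sequence({"recoAlg": "a4", "calibrate": True})
groomed_seq = jet_reco_sequence({"kind": "groomed", "recoAlg": "a10t", "ungroomedAlg": "a10"})
reclustered_seq = jet_reco_sequence({"kind": "reclustered", "recoAlg": "a10r", "basicAlg": "a4"})
print(standard_seq, groomed_seq, reclustered_seq)
```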
+
+[JetRecoConfiguration](../JetRecoConfiguration.py)
+-----
+
+Helper functions to facilitate the operations in `JetRecoSequences.py`:
+* Extraction & compression of the `JetRecoDict`
+* Translation of the `JetRecoDict` contents into the configuration objects from `JetRecConfig`, e.g. `JetConstituent` and `JetDefinition`.
+* Definition of jet modifier lists
+
+[JetTrackingConfig](../JetTrackingConfig.py)
+-----
+
+Helper functions to configure the jet tracking instance and to define track-related modifiers and collections within the jet domain.
+
+[JetTLASequences](../JetTLASequences.py)
+-----
+
+Helper functions to configure the TLA jet sequence.
+
+[TriggerJetMods](../TriggerJetMods.py)
+-----
+
+Definitions of specialised jet modifier configurations unique to the trigger context.
+
+[generateJet](../generateJet.py)
+-----
+
+Prototype code for the new job configuration.
+