Cleanup btagging configuration
This consolidates the configuration functions we use in several places:
- trigger,
- reconstruction,
- derivations, and
- retagging (which doesn't happen in the Athena repo).
Overview
Pictures are cool:
graph TD;
subgraph BTagAlgs [Configured by One Function]
assoc[JetParticleAssociation] --> jettag & finder
assocm[JetParticleAssociation muons] --> jettag
finder[JetSecVtxFinding] --> vx[JetSecVertexing]
vx --> jettag
jettag[JetBTagging] ==> muaug[BTagMuonAugmenter] & jetaug[BTagJetAugmenter]
muaug & jetaug ==> dl[Machine Learning Algorithms]
end
db[Calibration Database] --> jettag
tr(tracks) -.-> aug & vx & jetaug & dl
mu(Muons) -.-> assocm & muaug
aug[BTagTrackAugmenter] --> assoc
jet(jets) -.-> assoc & assocm & finder & vx & jettag
pv(Primary Vertex) -.-> finder & vx & jettag
json(List of NN Files) -.-> dl
dl ==> btag(BTagging Object)
This is a rough sketch of how information flows through the b-tagging code. Little boxes are algorithms, and the dotted lines indicate where some information needs to be passed into them. Solid lines indicate objects that are created internally. In the solid line case, the name of the object has to be synchronized between algorithms. The BTagging object follows the thick black line.
The key point here is that this logic is implemented in 4 different places.
The problem
Modular stuff is good. But in this case the 4 implementations are all supposed to do the same thing. The underlying code is also a bit crufty and some of the boxes should probably be split or merged, which is really hard to do coherently when the same logic lives in several places.
So for now it's probably better to merge everything. The big gray box holds the algorithms we configure with one function. We want to put everything in the gray box! Once we've done that we can move stuff around to make it easier to pop new boxes in and take old boxes out.
Implementation
The idea was to move everything into the box, but that turned out to be a bit more difficult. Instead, most of the tagging setups now call three functions:
- a calibration database setup,
- track augmentation, and
- the top level b-tagging one (the gray box).
I wasn't able to merge the first two into the last because:
- The conditions database setup function for derivations can't be replaced by JetTagCalibCfg for some reason. It seems to break the muon conditions alg when I try.
- Track augmentation has to be separated from the rest of tagging, because the retagging code uses a view container which is the union of two other containers. Both of these have to be augmented before the containers are merged.
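The ordering constraint in the second point can be sketched with a toy model. Nothing below is Athena API: containers are plain lists of dicts, and `augment_tracks`, `make_view`, and `retag_inputs` are hypothetical stand-ins that only illustrate "augment both source containers, then build the union".

```python
# Toy model only: containers are lists of dicts, not xAOD containers.

def augment_tracks(tracks):
    """Hypothetical stand-in for track augmentation: decorate each track in place."""
    for trk in tracks:
        trk["btag_augmented"] = True


def make_view(*containers):
    """Hypothetical view container: holds references to the original tracks, no copies."""
    return [trk for container in containers for trk in container]


def retag_inputs(standard_tracks, extra_tracks):
    # Augment both source containers first ...
    augment_tracks(standard_tracks)
    augment_tracks(extra_tracks)
    # ... and only then build the union that the tagging chain reads.
    return make_view(standard_tracks, extra_tracks)


standard = [{"pt": 1200.0}, {"pt": 800.0}]
extra = [{"pt": 450.0}]
merged = retag_inputs(standard, extra)
```

Because the view only references the original tracks, every element of `merged` already carries the decoration by the time the main tagging function would see it.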
In the process of implementing this I made some other improvements:
- ATLASRECTS-6635: Some small progress on cleaning up CompFactory calls on imports
- ATLASRECTS-6172: Add soft muon scalars to the BTagging object
- Move configuration functions that call FlavorTagDiscriminants into FlavorTagDiscriminants
- Add or clean up some ConfigFlags:
  - Merged run2TaggersList and Run2TrigTaggers into taggerList. They are the same taggers, but if they ever diverge the ConfigFlags aren't shared between reconstruction, trigger, and derivations anyway.
  - Made the taggerList depend on whether we've enabled RunFlipTaggers
  - Moved calibrationChannelAliases to ConfigFlags, cleaned it up considerably
  - Added a forcedCalibrationChannel option, which tells every tagger to use a specific calibration channel
- Updated the "retagging" store gate renaming functions to be the ones we actually use for retagging
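As a rough illustration of how forcedCalibrationChannel and calibrationChannelAliases might interact, here is a minimal sketch. The `resolve_channel` helper, the `"source->target"` alias string format, and the precedence order are assumptions made for this example, not the code in this merge request.

```python
def resolve_channel(jet_collection, aliases, forced_channel=None):
    """Pick a calibration channel for a jet collection (illustrative only).

    aliases: list of "source->target" strings (assumed format).
    forced_channel: if set, every collection uses this channel.
    """
    # Assumption: a forced channel overrides everything else.
    if forced_channel is not None:
        return forced_channel
    # Otherwise look for an alias that maps this collection elsewhere.
    for alias in aliases:
        source, target = alias.split("->")
        if source == jet_collection:
            return target
    # No alias: the collection uses its own channel.
    return jet_collection


aliases = ["AntiKt4EMPFlow->AntiKt4EMTopo"]
aliased = resolve_channel("AntiKt4EMPFlow", aliases)
unaliased = resolve_channel("AntiKt4EMTopo", aliases)
forced = resolve_channel("AntiKtVR30Rmax4Rmin02Track", [], forced_channel="AntiKt4EMTopo")
```

With an empty alias list and a forced channel, every jet collection maps to the same calibration, whatever its name.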
I also deleted and simplified a lot of unused code.
Validation
This causes no changes in any physics outputs (I checked trigger and derivations). I've done some tests on DAOD_PHYS and DAOD_FTAG1. The only changes to FTAG1 (over a few hundred events) were the addition of softMuon variables.
Built on nightly 2022-03-21T2101
Implications for developers
A few things have moved around, so I'll give a short guide on where to find them now. Everything that runs the main tagging chain will now have a call like
BTagAlgsCfg(cfgFlags, JetCollection, nnList)
which is defined in BTagging/BTagRun3Config.py. The nnList is a list of all dips and dl1 taggers for that specific collection. There are also optional arguments for the trackCollection, muons, and primaryVertices (by default they are the standard offline ones).
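To make the call shape concrete, here is a self-contained stub that mirrors the arguments named above. It is not the real function: the default container names, the network file paths, and the returned dictionary are assumptions chosen for illustration, and the real function returns configured algorithms rather than a summary.

```python
def BTagAlgsCfg(cfgFlags, JetCollection, nnList,
                trackCollection="InDetTrackParticles",  # assumed offline default
                muons="Muons",                          # assumed offline default
                primaryVertices="PrimaryVertices"):     # assumed offline default
    """Stub: record what would be configured instead of building algorithms."""
    return {
        "jets": JetCollection,
        "networks": list(nnList),
        "tracks": trackCollection,
        "muons": muons,
        "vertices": primaryVertices,
    }


# Hypothetical network files for one jet collection.
nn_files = ["path/to/dips/network.json", "path/to/dl1/network.json"]

# Offline-style call: only the three required arguments.
offline = BTagAlgsCfg(None, "AntiKt4EMPFlow", nn_files)

# A caller with non-standard inputs overrides the optional arguments
# (the container names here are made up).
custom = BTagAlgsCfg(None, "MyJets", nn_files,
                     trackCollection="MyTracks",
                     primaryVertices="MyVertices")
```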
A lot more options have also been moved to BTagging/BTaggingConfigFlags.py. These include calibrationChannelAliases and the taggerList, the latter of which has a default that depends on whether the flip taggers are enabled.
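A flag whose default depends on another flag can be sketched with a toy flag container. `ToyFlags` below is a hypothetical stand-in written for this example, not the Athena flags class, and the tagger names and the choice to append (rather than replace) flip taggers are also assumptions.

```python
class ToyFlags:
    """Minimal flag store: a default may be a value or a callable of the flags."""

    def __init__(self):
        self._flags = {}

    def add(self, name, default):
        self._flags[name] = default

    def set(self, name, value):
        self._flags[name] = value

    def get(self, name):
        value = self._flags[name]
        # Callable defaults are evaluated lazily, so they see later overrides.
        return value(self) if callable(value) else value


BASE_TAGGERS = ["IP2D", "IP3D", "SV1", "JetFitterNN"]       # illustrative names
FLIP_TAGGERS = ["IP2DNeg", "IP3DNeg", "SV1Flip", "JetFitterFlip"]  # illustrative names


def default_tagger_list(flags):
    # Assumption: flip taggers are added on top of the standard ones.
    if flags.get("RunFlipTaggers"):
        return BASE_TAGGERS + FLIP_TAGGERS
    return list(BASE_TAGGERS)


flags = ToyFlags()
flags.add("RunFlipTaggers", False)
flags.add("taggerList", default_tagger_list)

without_flip = flags.get("taggerList")
flags.set("RunFlipTaggers", True)
with_flip = flags.get("taggerList")
```

Evaluating the default lazily means a job only has to switch RunFlipTaggers on; the tagger list follows without being set by hand.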
To Do
I left out a few things that should be discussed with the flavor tagging group, or that might depend on external developments:
- Figure out why I can't use the same calibration database setup function in derivations.
- Do track collection merging as part of this function.
- Consider using forcedCalibrationChannel in more places. We might have PFlow specific trainings for taggers that use the calibration database, and I'm pretty sure we don't have anything specific for variable radius track jets. If the trainings are all identical we could replace the channel aliases with an empty list and map everything to one jet collection.
- Enable muon information in reconstruction jobs. Right now the data dependencies for BTagMuonAugmenter aren't correct, which leads to random crashes in Athena MT. Derivations are single thread for now, so the muons still run there.
- Disable MV2c10 in the trigger code. See ATR-25239.