How to define HLT2 (& sprucing) stream configurations?

TL;DR HLT2 (and sprucing) streaming configurations need to be defined in only one place. That place must be chosen.

FYI @mvesteri @poluekt @sstahl @enoomen @shunan @abertoli @nskidmor @cmarinbe

Line authors/RTA WG liaisons need clarity on exactly where to specify if a line is FULL, Turbo, TurCal etc. I've asked for this in RTA WP3 a couple of times and the answer was to do this in hlt2_pp_commissioning.py, which in QEE we have done. I believe this should be done in only 1 place, and ideally that place is not too far away from the definition of the lines themselves.

Currently we have 4 places (to my knowledge) where HLT2 streaming configurations are defined in the project:

Moore/Hlt/Hlt2Conf/python/Hlt2Conf/settings/hlt2_pp_commissioning.py
Moore/Hlt/Hlt2Conf/tests/options/bandwidth/hlt2_bandwidth_{5streams, 16streams, streamless}.py

The three bandwidth files are what the nightly LHCb-PR bandwidth tests use, while hlt2_pp_commissioning.py is, I've been told, what the real productions will start from as the basis for a TCK.

AFAIK (which is not very far) there is not yet a sprucing streaming configuration around.

I think we shouldn't have multiple places (which are also far away from the line definitions) where the HLT2 stream configurations are defined. As far as I know, there is currently no written guidance anywhere on where this should be defined.

We/the coordination should take a decision on this and unify the HLT2 streaming configuration so that all tests use the same configuration. The exception would be tests that deliberately try out or study different configurations. Even there, we should find a way to make sure that the persistency of each line remains the same, i.e. a FULL line is always a FULL line in any configuration, because that is what the line author wants. While we're at it, we can also specify a place where any sprucing streaming configurations will be defined.
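To make the "a FULL line is always a FULL line" requirement concrete, here is a minimal sketch of a cross-configuration check. It is plain Python and does not use any Moore API: the function name, the dictionary layout and the line names are all assumptions made up for illustration.

```python
from collections import defaultdict


def find_persistency_conflicts(configs):
    """Return lines whose persistency differs between configurations.

    `configs` maps a configuration name to a {line_name: persistency} dict.
    This layout is an assumption, not how Moore stores stream configurations.
    """
    seen = defaultdict(dict)  # line_name -> {config_name: persistency}
    for config_name, lines in configs.items():
        for line_name, persistency in lines.items():
            seen[line_name][config_name] = persistency
    # Keep only lines that were given more than one distinct persistency.
    return {
        line_name: per_config
        for line_name, per_config in seen.items()
        if len(set(per_config.values())) > 1
    }


# Hypothetical line names and configurations, for illustration only.
configs = {
    "5streams": {"Hlt2_ExampleA": "full", "Hlt2_ExampleB": "turbo"},
    "16streams": {"Hlt2_ExampleA": "full", "Hlt2_ExampleB": "full"},
}
conflicts = find_persistency_conflicts(configs)
```

A test could call something like this over every configuration it knows about and fail if the returned dictionary is not empty.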

Suggestions for where to define a line's persistency stream:

1. A property of the line itself, like line.persistReco can be, which can just be picked up in any production job.
2. In each WG's hlt2_{wg}.py; we could require that each of these files provides e.g. full_lines, turbo_lines, turcal_lines or a subset of those. Any other way of registering lines can be rejected with a loud error.
3. We have a single hlt2_pp_commissioning.py 5-stream configuration, which is documented appropriately to be the place where all this is defined.
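As a rough sketch of what the second suggestion could look like, the snippet below collects full_lines, turbo_lines and turcal_lines from per-WG modules and fails loudly on anything else. It is plain Python, not the Moore API: collect_streams, the stand-in modules and the line names are hypothetical, and lines are represented by plain strings.

```python
from types import SimpleNamespace

# The only list names a WG module may provide (names taken from the suggestion).
ALLOWED_LISTS = ("full_lines", "turbo_lines", "turcal_lines")


def collect_streams(wg_modules):
    """Build {list_name: [lines]} from a {wg_name: module} mapping."""
    streams = {name: [] for name in ALLOWED_LISTS}
    owner = {}  # line -> "wg.list_name", used to detect double registration
    for wg, module in wg_modules.items():
        # Anything that looks like a line list but is not an allowed name
        # is rejected loudly.
        unexpected = [
            n for n in vars(module)
            if n.endswith("_lines") and n not in ALLOWED_LISTS
        ]
        if unexpected:
            raise ValueError(f"{wg}: unsupported line lists {unexpected}")
        provided = [n for n in ALLOWED_LISTS if hasattr(module, n)]
        if not provided:
            raise ValueError(f"{wg}: provides none of {ALLOWED_LISTS}")
        for list_name in provided:
            for line in getattr(module, list_name):
                if line in owner:
                    raise ValueError(
                        f"{line} registered in both {owner[line]} "
                        f"and {wg}.{list_name}"
                    )
                owner[line] = f"{wg}.{list_name}"
                streams[list_name].append(line)
    return streams


# Stand-ins for hlt2_{wg}.py modules; the line names are made up.
qee = SimpleNamespace(
    full_lines=["Hlt2QEE_ExampleFull"],
    turbo_lines=["Hlt2QEE_ExampleTurbo"],
)
other = SimpleNamespace(turcal_lines=["Hlt2Other_ExampleTurCal"])
streams = collect_streams({"qee": qee, "other": other})
```

The same function would accept real imported modules, since vars() works on them too. Whether the check should key on the "_lines" suffix is a design choice that would need agreeing.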

QEE has provided an example of how to do a combination of the 2nd and 3rd points via hlt2_qee.py and hlt2_pp_commissioning.py. I think that works OK, although we'd need to be alert to any other stream configuration being defined anywhere else, so probably a better name like hlt2_pp_streaming_configuration.py is needed to make it clear that this is the place where streams are defined.

Once the place is decided, I recommend that the WG RTA/DPA liaisons undertake the necessary bookkeeping changes for all their lines. Asking the line authors to do it themselves will IMO add lots of inertia, and the task shouldn't be a large one for the liaisons.

A related point: I think it is still possible to have persistReco=True & stream=Turbo, or persistReco=False & stream=Full. My memory of the Upgrade Computing TDR is that both these cases wouldn't be supported. If so, we probably shouldn't support them.
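If those two combinations are indeed unsupported (this follows my recollection above, not a confirmed rule), a check along the following lines could reject them at configuration time. LineSpec and check_line are hypothetical names in plain Python, not part of Moore, and TurCal is deliberately left unconstrained because nothing above says what it should allow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineSpec:
    """Hypothetical stand-in for a line's streaming-related settings."""
    name: str
    stream: str
    persist_reco: bool


# Assumed unsupported (stream, persist_reco) pairs, per the recollection above.
UNSUPPORTED = {("turbo", True), ("full", False)}


def check_line(line):
    """Raise ValueError if the line uses an assumed-unsupported combination."""
    if (line.stream.lower(), line.persist_reco) in UNSUPPORTED:
        raise ValueError(
            f"{line.name}: persistReco={line.persist_reco} with "
            f"stream={line.stream} is not supported"
        )


# Made-up lines: the first two pass, the last one would raise.
check_line(LineSpec("Hlt2_ExampleTurbo", "Turbo", False))
check_line(LineSpec("Hlt2_ExampleFull", "Full", True))
bad_line = LineSpec("Hlt2_ExampleBad", "Turbo", True)
```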

Edited Jun 27, 2023 by Ross Hunter