[RTA/DPA BW tests]: Add Sprucing jobs for Turbo and TurCal (and many generalisations/simplifications as this necessitated)
Goes with lhcb/Moore!3285 (merged) and lhcb-core/LHCbPR2HD!284 (merged)
Closes #17 (closed)
This huge MR is all aimed at adding two new sprucing jobs (on Turbo and TurCal) for the `hlt2_and_spruce` and `spruce_latest` bandwidth tests.
The main refactoring that this required was to allow multiple `STREAM_CONFIG`s for each `PROCESS`; until now the `STREAM_CONFIG` was determined by the `PROCESS`. Now we have 3 `STREAM_CONFIG`s for sprucing, and all 3 related tests run in `spruce_latest` and `hlt2_and_spruce`.
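
As a rough illustration of the one-to-many relationship described above, here is a minimal sketch (not the code in this MR). The process names `"hlt2"` and `"spruce"` and the config names `"production"` and `"wg"` are assumptions made up for the example; only `"wgpass"` and `"turcal"` are named in this MR.

```python
# Hypothetical mapping from PROCESS to its STREAM_CONFIGs.
# "production" and "wg" are placeholder names assumed for illustration.
STREAM_CONFIGS_BY_PROCESS: dict[str, list[str]] = {
    "hlt2": ["production"],
    "spruce": ["wg", "wgpass", "turcal"],
}


def jobs_for(processes: list[str]) -> list[tuple[str, str]]:
    """Expand each process into one (process, stream_config) job per config."""
    return [(p, sc) for p in processes for sc in STREAM_CONFIGS_BY_PROCESS[p]]


for process, stream_config in jobs_for(["hlt2", "spruce"]):
    print(f"would run: process={process} stream_config={stream_config}")
```

Keeping the mapping in one place means the test scripts can loop over pairs rather than inferring the stream config from the process.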
Descriptions of the changes are hidden in the sections below. I also added a round of comments to the diff to try to explain it, given its size.
Simplifications/generalisations made whilst here (tried to factorise them in !404 (closed), but it got too intertwined)
- Simplify the generation and download of the on-the-fly sprucing config yaml files. (sorts out this discussion in #13)
- Put the list of lines for TISTOS in the on-the-fly config file. Now need to only generate and download 1 file.
- Remove nested Gaudi job that works out the rate denominator. Now we have a second job, after the main trigger job, that explicitly does that and only that.
- The job above is in `read_event_numbers.py`, which is a renaming of `list-event-numbers.py`. It now has two modes: one for counting up events in the input file (up to `EVTMAX`); and the other for storing all event numbers in an output file. The latter is for overlaps,
- Rename `generate_hlt2_fullstream_metadata.py` -> `generate_spruce_input_configs.py` to make the use case clearer.
- Introduce a `tmp/to_eos/` directory into which we put everything that the handler needs to grab and put under `current_hlt2_output` on eos. This means we don't need to align paths here and in the handler.
- Trim down the list of helper functions,
- Remove a load of fancy stuff in `run_bandwidth_test_jobs` that no-one has run since Ella left,
- `make_bandwidth_test_page.py`: no more hard-coded string templates - now build everything with functions using f-strings for cleaner substitutions,
- `make_bandwidth_test_page.py`: lots more modularization/factorization of functions - I pointed stuff out in the large diff
- `make_bandwidth_test_page.py`: lots more (although probs not total) usage of type hints. This is just for ease of autocompletion and code highlighting. You also get some more clarity for free
- Removal of last vestiges of hard-coded string paths in the html file names
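
To make the two modes of the event-number job concrete, here is a minimal stand-in sketch in plain Python. It is not the Gaudi job in `read_event_numbers.py`: the function names, the JSON output format and the "negative `evt_max` means no limit" convention are all assumptions for illustration.

```python
import json
from itertools import islice
from typing import Iterable, Optional


def count_events(event_numbers: Iterable[int], evt_max: Optional[int] = None) -> int:
    """Mode 1: count events in the input, stopping at evt_max if one is set.

    Assumption: None or a negative evt_max means "no limit".
    """
    if evt_max is not None and evt_max >= 0:
        event_numbers = islice(event_numbers, evt_max)
    return sum(1 for _ in event_numbers)


def store_event_numbers(event_numbers: Iterable[int], out_path: str) -> int:
    """Mode 2: write every event number to out_path (hypothetical JSON format).

    Returns how many event numbers were written.
    """
    numbers = list(event_numbers)
    with open(out_path, "w") as f:
        json.dump(numbers, f)
    return len(numbers)
```

The count from the first mode is what would serve as the rate denominator; the stored list from the second mode is what an overlap calculation would compare between streams.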
Summary of changes required to add the two new sprucing jobs
- In many places you'll now see `stream_config` being passed around with `process`,
- The new streaming configurations "wgpass" and "turcal" have been introduced for their respective sprucing jobs,
- `message.txt` -> `message.json` for much cleaner reading/writing,
- `make_bandwidth_test_page.py` is now called once by the top-level script (e.g. `Moore_hlt2_bandwidth.sh`) - it has an internal loop over processes. This fixes this follow-up in #13.
- A few aesthetic clean ups on the html pages, trying to make them easier to read,
- Each bandwidth test now has a top-level landing html page, and links to the per-test pages within.
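
For a feel of the landing-page idea, here is a small sketch of building a top-level page from functions and f-strings. It is not taken from `make_bandwidth_test_page.py`: the function names, labels and file names below are hypothetical.

```python
from html import escape


def link_item(label: str, href: str) -> str:
    """One list entry linking to a per-test page."""
    return f'<li><a href="{escape(href, quote=True)}">{escape(label)}</a></li>'


def landing_page(test_name: str, per_test_pages: dict[str, str]) -> str:
    """Top-level page for one bandwidth test, linking to each per-test page."""
    items = "\n".join(link_item(label, href) for label, href in per_test_pages.items())
    return (
        f"<html><head><title>{escape(test_name)}</title></head>\n"
        f"<body><h1>{escape(test_name)}</h1>\n"
        f"<ul>\n{items}\n</ul>\n"
        f"</body></html>"
    )


# Hypothetical labels and file names, for illustration only.
page = landing_page(
    "hlt2_and_spruce",
    {
        "Sprucing (wgpass)": "spruce__wgpass.html",
        "Sprucing (turcal)": "spruce__turcal.html",
    },
)
print(page)
```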
Other improvements which were unblocked as a result of above
- Error handling should now be more robust: we now save the cumulative error code of each `Moore_bandwidth_test.sh` call to `message.json` before making the html page. This means you should get Gitlab messages of failure now before the HTML page-maker fails.
- We now have an automated bar chart for all streams to disk that appears on the top-level page.
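
The error-handling flow above can be sketched roughly as follows. This is an illustration in Python, not the shell logic in this MR: the `"error_code"` key, the function names, and summing absolute return codes as the way to accumulate are all assumptions.

```python
import json
import subprocess
import sys
from pathlib import Path


def run_all(commands: list[list[str]]) -> int:
    """Run every command and return a cumulative code (0 only if all succeeded)."""
    cumulative = 0
    for cmd in commands:
        # abs() so that a negative code (killed by signal) cannot cancel a positive one.
        cumulative += abs(subprocess.run(cmd).returncode)
    return cumulative


def save_status(path: Path, error_code: int) -> None:
    """Persist the code before any page-making, so a later failure cannot hide it."""
    path.write_text(json.dumps({"error_code": error_code}, indent=2))


def load_status(path: Path) -> int:
    """Read the code back, e.g. in whatever posts the Gitlab message."""
    return json.loads(path.read_text())["error_code"]
```

A usage sketch with stand-in commands in place of the real `Moore_bandwidth_test.sh` calls: `save_status(Path("message.json"), run_all([[sys.executable, "-c", "pass"]]))`, followed only afterwards by the page-making step.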
Follow-ups
- I think `stream_config` can now be a member of `FileNameHelper` - this would require lots of changes but would be a nice simplification,
- Line descriptors: still need to use the same streaming configuration as the rest of the test. We can load up the stream config JSON and use it to filter the lines by name to ensure this,
- hrefs can be handled better in the html maker - we can write a helper in `bandwidth_helpers` for this,
- Try to fix error code for `make_bandwidth_test_page.py` and then use it in the handler for more error checking
- The spruce stream configs should be `["full", "turbo", "turcal"]` - changes required in lots of places, but this would obviate lots of mapping in the pages.
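
For the line-descriptor follow-up, the filtering could look something like the sketch below. It assumes the stream config JSON maps each stream name to a list of line names, which is an assumption about the format; the function names and line names are hypothetical.

```python
import json


def lines_in_stream_config(stream_config_json: str) -> set[str]:
    """Collect every line name that appears in any stream.

    Assumed JSON layout: {"stream_name": ["line_name", ...], ...}
    """
    streams: dict[str, list[str]] = json.loads(stream_config_json)
    return {line for lines in streams.values() for line in lines}


def filter_descriptors(descriptors: dict[str, str], allowed: set[str]) -> dict[str, str]:
    """Keep only the descriptors of lines present in the stream config."""
    return {name: desc for name, desc in descriptors.items() if name in allowed}


# Hypothetical stream config and line names, for illustration only.
config = json.dumps({"streamA": ["LineOne", "LineTwo"], "streamB": ["LineThree"]})
descriptors = {"LineOne": "desc 1", "LineFour": "desc 4"}
print(filter_descriptors(descriptors, lines_in_stream_config(config)))
```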
Draft pages with updated numbers can be found at https://rjhunter-bandwidth.web.cern.ch/
TODO:
- Re-do the test job from scratch and rebuild docs
- Test the handler changes as thoroughly as you can 😨