
Fix selection mask scheduler logic

Christina Agapopoulou requested to merge fix_selection_masks into 2024-patches

This MR ensures that execution masks of selection algorithms are not simplified.

Explanation:

The sequence generator of Allen has the following optimisation logic:

In order to append an algorithm to the sortd order that is created in this function,
the algorithm must fulfill the condition of being dataflow-insertable. An algorithm is
dataflow-insertable if all algorithms in dataflow-dependencies (df_dependencies) have
already been met (i.e. have already been inserted in sortd).
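As a minimal sketch of the dataflow-insertable condition described above (not the actual Allen sequence generator code): the helper name `is_df_insertable`, the dictionary layout of `df_dependencies` and the extra algorithm `d` are assumptions made for illustration only.

```python
# Toy model: each algorithm maps to the algorithms it depends on (dataflow).
# The algorithm "d" is hypothetical and only added to show a non-insertable case.
df_dependencies = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}


def is_df_insertable(algorithm, df_dependencies, sortd):
    """Return True if all dataflow dependencies of `algorithm` are already in `sortd`."""
    return all(dep in sortd for dep in df_dependencies.get(algorithm, ()))


sortd = ["a"]
candidates = [
    alg
    for alg in df_dependencies
    if alg not in sortd and is_df_insertable(alg, df_dependencies, sortd)
]
print(candidates)  # ['b', 'c']
```

Here `d` is not insertable yet because neither `b` nor `c` has been appended to `sortd`.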

Algorithms for which controlflow-dependencies are also fulfilled, that is, algorithms that are
df_insertable and cf_insertable, are preferred over algorithms that are only df_insertable. However, it is
possible to reach a point where there are no algorithms that are cf_insertable. In the following example,
the top algorithm for A is a, the one for B is b, and the one for C is c. Example:

tree: (A & B) | (C & B)
sortd: [a]
df_insertable: [b, c]
cf_insertable: []

B and C both depend on each other to be cf_insertable. In this scenario, either the algorithm(s) in node B
or the algorithm(s) in C could be inserted. A more comprehensive description of this scenario can be found at:
https://codimd.web.cern.ch/jMxAOmYhR4-q9eRxnp6kRA

Therefore, the following rules are followed to determine an order:
* Always prefer algorithms that are both cf_ and df_ insertable over algorithms that are only df_ insertable.
* Use a heuristic to choose among the algorithms that are both cf_ and df_ insertable.
* Use a heuristic to choose among the algorithms that are only df_ insertable.

The current heuristic for the above cases is "execute the least expensive one", according to its weight.
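The ordering rules above could be sketched roughly as follows. This is an illustration under assumptions, not the scheduler's real code: the function name `pick_next`, the `weights` dictionary and the numeric weights are all hypothetical.

```python
def pick_next(df_insertable, cf_insertable, weights):
    """Pick the next algorithm to append to the sorted order.

    Prefer algorithms that are both cf_ and df_ insertable; otherwise fall back
    to those that are only df_ insertable. Within the chosen pool, take the
    least expensive algorithm according to its weight.
    """
    both = [alg for alg in df_insertable if alg in cf_insertable]
    pool = both if both else list(df_insertable)
    if not pool:
        return None  # nothing can be inserted at this point
    return min(pool, key=lambda alg: weights[alg])


weights = {"b": 3.0, "c": 1.0}  # hypothetical weights

# Situation from the example: nothing is cf_insertable, so the cheapest
# df_insertable algorithm is chosen.
print(pick_next(["b", "c"], [], weights))     # c

# If b were cf_insertable, it would win despite being more expensive.
print(pick_next(["b", "c"], ["b"], weights))  # b
```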

When no algorithm is both cf_ and df_ insertable, a mask simplification step removes algorithms whose outcome is unknown and simplifies the dependencies. This is harmless for, say, reconstruction algorithms, since the set of reconstructed objects needed for selections is a subset of the objects reconstructed under the loosened mask. It is a problem for selection algorithms, however, because masks are often used as part of the selection logic of technical lines (technical filter + passthrough line).
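To see why a loosened mask is harmless for reconstruction but not for selections, consider the toy model below. It is only a sketch under strong assumptions: a mask is modelled as a plain conjunction of filter decisions, and the names `simplify_mask`, `mask_for`, `passes` and `technical_filter` are hypothetical, not Allen APIs.

```python
def simplify_mask(mask, known):
    """Drop filters whose outcome is not known yet; this loosens the conjunction."""
    return frozenset(f for f in mask if f in known)


def mask_for(mask, known, is_selection):
    """Keep the full mask for selection algorithms, simplify it otherwise."""
    return mask if is_selection else simplify_mask(mask, known)


def passes(event, mask):
    """An event passes the mask if every filter in it accepted the event."""
    return all(event[f] for f in mask)


# Two toy events with per-filter decisions.
events = [
    {"technical_filter": True, "velo_gec": True},
    {"technical_filter": False, "velo_gec": True},
]
full = frozenset({"technical_filter", "velo_gec"})
known = {"velo_gec"}  # outcome of technical_filter is still unknown

loosened = simplify_mask(full, known)
print([passes(e, full) for e in events])      # [True, False]
print([passes(e, loosened) for e in events])  # [True, True]
```

In this model a passthrough line running under the loosened mask would accept the second event even though the technical filter rejected it, whereas a reconstruction algorithm would merely process a superset of the events it needs. Calling `mask_for(full, known, is_selection=True)` returns the unsimplified mask.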

This issue has always been possible, but it only manifested recently because of a complicated dependency of some technical lines on velo track reconstruction, introduced by the combination of the velo_gec and DisableLinesDuringVPClosing filters. It was not caught by the CI tests because the `DisableLinesDuringVPClosing` logic is currently not enabled by default in `2024-patches` (see !1708 (closed)); it was caught directly in production, where both filters were enabled manually.

Attached here are three log files of a local test on MEPs from run 297202 and the hlt1_pp_matching_no_ut_1000KHz.json sequence:

  • out_v4r11_noVeloClosing.log: release v4r11 with no additional hotfixes (so the DisableLinesDuringVPClosing flag is set to False). This is the configuration before any fix, with the issue not manifesting. Total output rate is 880.38 kHz and the BGI and SMOG line rates are as usual.
  • out_v4r11_withVeloClosing.log: release v4r11 + DisableLinesDuringVPClosing = True. This configuration is similar to the one we used at the pit during the Saturday test (only the Upsilon routing bit is missing) and the issue is reproducible (rate around 9 MHz, large rates of Hlt1SMOG2PassThroughLowMult5, Hlt1SMOG2BELowMultElectrons and the BGI lines).
  • out_fix.log: release v4r11 + DisableLinesDuringVPClosing = True + this MR. Rates are identical to out_v4r11_noVeloClosing.log, and the selection masks look as expected. Some reconstruction masks have changed (simplified), so some differences in reconstruction counters can be expected.

@samarian this should fix the issue you observed in another test.

@ahennequ you may want to take this into account for your pre-scaler MR.

@raaij @rmatev

out_v4r11_withVeloClosing.log out_v4r11_noVeloClosing.log out_fix.log
