[2024-patches] invalid MCTruth information when using production.hlt2

cc @sesen for comment, not sure how this affects plans for centralised Hlt2 MC processing.

Context

@mengzhen noticed in lhcb-datapkg/AnalysisProductions!1111 (comment 8304649) that the Jpsi_TRUEID variable was unexpectedly zero in many cases when following the suggestion from the Example AP's 24c2 options. (These options are themselves based on the Moore docs and were updated following !3620 (merged).)

I spent some time reproducing the result locally and narrowed the issue down to the Hlt2 configuration (i.e. MC->HLT2->Tuple is enough to recreate it). Specifically, it seems to be caused by using production.hlt2 rather than Moore.run_moore.

Moore/v55r11 and DaVinci/v64r8 were used; a quick cross-check with Moore/v55r11p3 showed no noticeable difference.

Further description of the symptoms

When using run_moore directly, Jpsi_TRUEID is filled as expected: {'all events': 54, 'non-zero events': 53, 'non-zero fraction': 0.9814814814814815}

run_moore directly

def make_streams():
    lines = [qee_quarkonia_lines['Hlt2QEE_JpsiToMuMu_Detached']()]
    streams = [Stream(lines=lines, routing_bit=99,)]
    return Streams(streams=streams)


def main(options: Options):
    public_tools = [
        trackMasterExtrapolator_with_simplified_geom(),
        stateProvider_with_simplified_geom()
    ]
    with reconstruction.bind(from_file=False),\
         get_default_hlt1_filter_code_for_hlt2.bind(code=""),\
         config_pp_2024():

        run_moore(options, make_streams, public_tools=public_tools)

When instead using production.hlt2 as suggested for MC Central Productions (and thus suggested within the Moore docs), either of the two configurations below gives {'all events': 54, 'non-zero events': 2, 'non-zero fraction': 0.037037037037037035}

Using options.lines_maker in production.hlt2

def _make_streams():
    lines = [qee_quarkonia_lines['Hlt2QEE_JpsiToMuMu_Detached']()]
    streams = [Stream(lines=lines, routing_bit=99)]
    return Streams(streams=streams)

from typing import Callable
def make_streams(real_make_streams: Callable = _make_streams) -> Streams:
    from Hlt2Conf.settings.hlt2_binds import config_pp_2024
    with config_pp_2024():
        return real_make_streams()

def main(options: Options):
    with reconstruction.bind(from_file=False),\
         get_default_hlt1_filter_code_for_hlt2.bind(code=""):
        from Moore.production import hlt2
        options.lines_maker = make_streams
        hlt2(options)

Using predefined settings + regex in production.hlt2 (lhcb-datapkg/AnalysisProductions!1111 (closed) recommends...)

def _make_regex():
    """Use some QEE lines for the sake of example."""
    # Here we 'brute-force' this regex via a regex of each individual line name we want to consider.
    # A neater format is possible via knowing how the lines are named but this feels almost less robust.
    lnames = ["Hlt2QEE_JpsiToMuMu_Detached"]
    return f"({'|'.join(lnames)})"


def main(options: Options):
    with reconstruction.bind(from_file=False),\
         get_default_hlt1_filter_code_for_hlt2.bind(code=""):
        from Moore.production import hlt2
        # Rather than processing via Moore:run_moore, using Moore.production:hlt2 emulates MC centralised processing.
        hlt2_extra_args = [
            '--settings=hlt2_pp_2024',  # Decides which settings to create the streams from
            f'--lines-regex={_make_regex()}',  # Filters the settings to only run the lines you wish
        ]
        # hlt2 instantiates public_tools and calls run_moore internally
        hlt2(options, *hlt2_extra_args)
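
As a side check on the "brute-force" regex above, the sketch below shows which names such a pattern selects under full matching versus prefix matching. How production.hlt2 actually applies --lines-regex is not something I have verified, so both modes are shown; `select_lines` and the second and third candidate names are hypothetical and only for illustration.

```python
import re


def make_regex(line_names):
    """Build an alternation that lists each wanted line name explicitly."""
    return f"({'|'.join(re.escape(name) for name in line_names)})"


def select_lines(pattern, candidates, full=True):
    """Return the candidates matched by pattern.

    full=True uses re.fullmatch (whole name must match);
    full=False uses re.match (pattern only has to match a prefix).
    """
    matcher = re.fullmatch if full else re.match
    return [name for name in candidates if matcher(pattern, name)]


# Illustrative candidates; only the first name appears in this issue.
candidates = [
    "Hlt2QEE_JpsiToMuMu_Detached",
    "Hlt2QEE_JpsiToMuMu_DetachedExample",  # hypothetical name
    "Hlt2QEE_SomeOtherLine",  # hypothetical name
]

pattern = make_regex(["Hlt2QEE_JpsiToMuMu_Detached"])
exact = select_lines(pattern, candidates, full=True)
prefix = select_lines(pattern, candidates, full=False)
print(pattern, exact, prefix)
```

If the regex turns out to be applied as a prefix or search match, appending `$` to the pattern would be one way to keep the selection to the exact names listed.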

I'm afraid the underlying mechanics of the truth information remain somewhat beyond me, but this strikes me as quite concerning.

Minimal Reproducer

See /eos/lhcb/user/l/lugrazet/public/TruthVarIssue/readme.md for a few scripts and commands.
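
Independently of those scripts, the summary dictionaries quoted above can be computed with a check along these lines. This is a minimal sketch: `summarise_true_ids` is a hypothetical helper, and the input here is a toy list rather than the real tuple; in practice the Jpsi_TRUEID branch would first be read from the DaVinci output into a Python sequence.

```python
def summarise_true_ids(true_ids):
    """Count how many entries of a TRUEID-like branch are non-zero."""
    values = list(true_ids)
    n_all = len(values)
    n_nonzero = sum(1 for value in values if value != 0)
    return {
        "all events": n_all,
        "non-zero events": n_nonzero,
        # Guard against an empty tuple rather than dividing by zero.
        "non-zero fraction": n_nonzero / n_all if n_all else float("nan"),
    }


# Toy input only: 53 candidates truth-matched to a J/psi (PDG ID 443)
# and one unmatched candidate with TRUEID == 0.
toy_true_ids = [443] * 53 + [0]
summary = summarise_true_ids(toy_true_ids)
print(summary)
```

Comparing this dictionary between the run_moore and production.hlt2 outputs on the same input events is enough to see the discrepancy.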

Edited Aug 20, 2024 by Luke Grazette