Lbexec improvements

Chris Burr requested to merge lbexec-improvements into master

This merge request includes improvements to lbexec based on discussions I've had with several people in follow-up to #198 (closed). It also includes qmtexec (!3697 (merged)); if that MR is reviewed before this one, I'll rebase.

For a refresher on lbexec, see the first few slides here.

Allow paths to files as well as module names

lbexec path/to/lbexec_example.py:check_output_file options.yaml

Why? Module names work okay if you're in the same directory as the file, but playing with PYTHONPATH is a pain if you're in a different directory.
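To illustrate the idea, here is a minimal sketch of how a spec such as `path/to/file.py:name` or `module:name` could be resolved to a Python object. This is not the code in this MR: the helper name `load_object` and the rule "treat anything ending in `.py` as a path" are assumptions made for the example.

```python
import importlib
import importlib.util
from pathlib import Path


def load_object(spec: str):
    """Resolve '<module>:<name>' or '<path/to/file.py>:<name>' to an object.

    Hypothetical helper: the real lbexec lookup may differ.
    """
    # Split on the last colon so that paths containing ':' still work.
    target, _, name = spec.rpartition(":")
    if not target or not name:
        raise ValueError(f"Expected '<module or path>:<name>', got {spec!r}")

    if target.endswith(".py"):
        # Assumption: a '.py' suffix means "load this file directly",
        # which avoids any need to modify PYTHONPATH.
        path = Path(target).resolve()
        module_spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(module_spec)
        module_spec.loader.exec_module(module)
    else:
        # Otherwise fall back to a normal import by module name.
        module = importlib.import_module(target)

    return getattr(module, name)
```

With a helper like this, `load_object("path/to/lbexec_example.py:check_output_file")` would work from any directory, while `load_object("lbexec_example:check_output_file")` would still require the module to be importable.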

Allow the options.yaml data to be defined within Python

While both arguments to lbexec are still mandatory, the second one can now be specified in the same way as the function spec. For example, if you have a file named lbexec_example.py:

from GaudiConf.LbExec import Options
from PyConf.application import configure_input

test_options = {
    "simulation": True,
    "data_type": "Upgrade",
    "evt_max": 0,
    # This can be a list or a string. If it is a string, local file globbing and brace expansion are applied.
    "input_files": "root://eoslhcb.cern.ch//eos/lhcb/grid/prod/lhcb/MC/Upgrade/LDST/00076720/0000/00076720_000000{01,02,04,27,36,37,38,39,43,51,57,68}_1.ldst",
}


def check_input_files(options: Options):
    assert options.input_files == [
        'root://eoslhcb.cern.ch//eos/lhcb/grid/prod/lhcb/MC/Upgrade/LDST/00076720/0000/00076720_00000002_1.ldst',
        'root://eoslhcb.cern.ch//eos/lhcb/grid/prod/lhcb/MC/Upgrade/LDST/00076720/0000/00076720_00000004_1.ldst',
        'root://eoslhcb.cern.ch//eos/lhcb/grid/prod/lhcb/MC/Upgrade/LDST/00076720/0000/00076720_00000043_1.ldst',
        'root://eoslhcb.cern.ch//eos/lhcb/grid/prod/lhcb/MC/Upgrade/LDST/00076720/0000/00076720_00000068_1.ldst',
    ]
    return configure_input(options)

The test_options dictionary can be used to define the data put into the Options object instead of providing a separate YAML file. Submitting grid jobs (both DIRAC and Ganga) will require this data to be copied elsewhere. For local running, it can be passed in any of the following ways:

# These are all equivalent
lbexec lbexec_example:check_output_file lbexec_example:test_options
# If no module is specified before the `:` it uses the same one as the function
lbexec lbexec_example:check_output_file :test_options
# Paths can also be used
lbexec path/to/lbexec_example.py:check_output_file :test_options
lbexec path/to/lbexec_example.py:check_output_file path/to/lbexec_example.py:test_options

Why? This makes writing some tests and examples much cleaner. For user documentation, I think we should avoid being too clever when generating the test_options dictionary, so that it stays easy to transition between local and bulk/grid running.
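The rule "if no module is specified before the `:`, use the same one as the function" can be sketched as follows. This is only an illustration of the command-line convention described above: the function name `resolve_options_spec` is hypothetical, and detecting YAML files by their `.yaml` suffix is an assumption.

```python
def resolve_options_spec(function_spec: str, options_spec: str) -> str:
    """Fill in an omitted module/path in the options spec from the function spec.

    Hypothetical helper: the real lbexec argument handling may differ.
    """
    # Assumption: anything ending in '.yaml' is a plain options file.
    if options_spec.endswith(".yaml"):
        return options_spec

    if ":" not in options_spec:
        raise ValueError(f"Expected a YAML file or '[module]:name', got {options_spec!r}")

    target, _, name = options_spec.rpartition(":")
    if not name:
        raise ValueError(f"Missing object name in {options_spec!r}")

    if not target:
        # ':test_options' means "look in the same module or file as the function".
        target = function_spec.rpartition(":")[0]

    return f"{target}:{name}"
```

For example, `resolve_options_spec("path/to/lbexec_example.py:check_output_file", ":test_options")` yields `"path/to/lbexec_example.py:test_options"`, matching the last two commands above.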

Use type hints to detect current application

Replace:

from DaVinci import make_config

def my_function(options):
    return make_config(options, ...)

with (replacing DaVinci with Moore/GaudiConf.LbExec as needed):

from DaVinci import Options, make_config

def my_function(options: Options):
    return make_config(options, ...)

Why? Currently lbexec relies on the GAUDIAPPNAME environment variable, which is fragile. It also has issues with some planned developments (e.g. a monolithic build with all projects combined) and would be problematic if Moore and DaVinci are made to depend on each other, as it would then be ambiguous which project you want to use.
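As a rough sketch of the mechanism, the options class can be read from the annotation on the function's first parameter using the standard library. The classes `DaVinciOptions` and `MooreOptions` below are hypothetical stand-ins for `DaVinci.Options` and `Moore.Options`, and `options_class_for` is an illustrative name rather than the function used in this MR.

```python
import inspect
import typing
from dataclasses import dataclass


@dataclass
class DaVinciOptions:  # hypothetical stand-in for DaVinci.Options
    evt_max: int = -1


@dataclass
class MooreOptions:  # hypothetical stand-in for Moore.Options
    evt_max: int = -1


def options_class_for(function):
    """Return the class annotated on the first parameter of `function`."""
    parameters = list(inspect.signature(function).parameters)
    if not parameters:
        raise TypeError(f"{function.__name__} must accept an options argument")

    first = parameters[0]
    # get_type_hints also resolves annotations written as strings.
    hints = typing.get_type_hints(function)
    if first not in hints:
        raise TypeError(
            f"{function.__name__} must annotate {first!r} with an Options class"
        )
    return hints[first]


def my_function(options: DaVinciOptions):
    return options.evt_max


def untyped_function(options):
    return options
```

Here `options_class_for(my_function)` returns `DaVinciOptions` without consulting any environment variable, while `options_class_for(untyped_function)` raises a `TypeError` asking for the annotation.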