
Introduced infrastructure to read ROOT files in algorithms (allowing MT)

Sebastien Ponce requested to merge sponce_IOAlg into master

Part of set of MR !2993 Rec!2735 Allen!1370 Moore!927 DaVinci!1002 Panoptes!304 Alignment!439 MooreOnline!325 lhcb-datapkg/PRConfig!388 LHCbIntegrationTests!67

Depends on gaudi/Gaudi!1197 (merged) gaudi/Gaudi!1523 (merged) gaudi/Gaudi!1526 (merged)

This also makes it possible to drop the legacy Gaudi way of reading input in favor of Producer algorithms, and introduces multithreaded reads for all ROOT files (the MDF case was already handled).

The new algorithm, RootIOAlg, can handle any number of outputs, creating them dynamically from the list of branches to be read from the original file. For each branch, the name of the associated property has to be given so that the algorithms consuming the data can refer to it.

A fully working example can be found in the test rootioalg, where RootIOAlg is used directly. This is not how end users should handle it, though. For the impatient, the code to read a single item, here the RawEvent, looks like this:

qualifiers = test_file_db['MiniBrunel_2018_MinBias_FTv4_DIGI']
raw_event_location = '/Event/DAQ/RawEvent'
ioalg = RootIOAlg("RootIOAlg",
                  EventBufferLocation=raw_event_location + "Banks",
                  Input=qualifiers.filenames,
                  EventBranches=[("RawEventLocation", raw_event_location)],
                  BufferNbEvents=20,
                  NSkip=20)
countalg = CountBanks("CountBanks", RawEventLocation=raw_event_location)

There are two missing features with this approach:

  • you need to give all EventBranches in one go when creating RootIOAlg; this means collecting the needed inputs from the whole set of algorithms that will run before the RootIOAlg can be created
  • the returned ioalg does not have the expected output DataHandles; namely, you cannot use ioalg.RawEventLocation in CountBanks

These problems are dealt with at the PyConf level, and end users should use the input_from_root_file helper to declare their inputs. This helper takes a property name and a location and returns the DataHandle to use for addressing the associated output of the underlying RootIOAlg. PyConf takes care of instantiating a single RootIOAlg and collecting the different inputs. The RootIOAlg can be configured via the PyConf options, namely:

  • ioalg_buffer_nb_events : number of events in each buffer, defaults to 20
  • input_files : list of input files
  • first_evt : number of events to skip at start, defaults to 0
  • mdf_ioalg_name: in case of MDF input, allows to choose between IOAlgFileRead (default) and IOAlgMemoryMap (or any user supplied one)
  • root_ioalg_name: in case of ROOT input, allows to choose between RootIOAlg (default) and RootIOAlgExt which will read all branches of the input file automatically. Do not use RootIOAlgExt unless needed, e.g. when CopyInputStream is used
  • root_ioalg_opts: allows passing extra options to the RootIOAlg, e.g. IgnorePaths when using RootIOAlgExt. See the properties of RootIOAlg for more details

On top of this, the method set_input_and_conds_from_testfiledb still exists to help when testfiledb is used; it takes care of setting input_files.

Here is a code sample :

options = ApplicationOptions()
options.set_input_and_conds_from_testfiledb('MiniBrunel_2018_MinBias_FTv4_DIGI')
options.first_evt = 20
options.evt_max = 40
config = configure_input(options)
raw_event_location = input_from_root_file("RawEventLocation", "_Event_DAQ_RawEvent.")
countalg = CountBanks("CountBanks", RawEventLocation=raw_event_location)
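
To illustrate the "collect requests, then instantiate a single reader" idea described above, here is a toy, self-contained sketch. It is not the PyConf implementation: ToyDataHandle, ToyInputRegistry, request_input and reader_config are hypothetical names invented for this illustration, and the ODIN property and location are made-up examples. Only the option names (Input, EventBranches, BufferNbEvents, NSkip) mirror those shown in the RootIOAlg example.

```python
# Toy sketch only: hypothetical stand-ins, not PyConf or LHCb code.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToyDataHandle:
    """Stand-in for the handle a consumer uses to address one reader output."""
    property_name: str
    location: str


@dataclass
class ToyInputRegistry:
    """Collects (property name, location) requests for one shared reader."""
    branches: dict = field(default_factory=dict)

    def request_input(self, property_name, location):
        # Register the request; an identical repeated request reuses the entry.
        known = self.branches.setdefault(property_name, location)
        if known != location:
            raise ValueError(f"{property_name} is already bound to {known}")
        return ToyDataHandle(property_name, location)

    def reader_config(self, input_files, buffer_nb_events=20, n_skip=0):
        # Build the single configuration once every request has been collected.
        return {
            "Input": list(input_files),
            "EventBranches": sorted(self.branches.items()),
            "BufferNbEvents": buffer_nb_events,
            "NSkip": n_skip,
        }


registry = ToyInputRegistry()
raw = registry.request_input("RawEventLocation", "/Event/DAQ/RawEvent")
odin = registry.request_input("ODINLocation", "/Event/DAQ/ODIN")  # made-up example
config = registry.reader_config(["input.root"], n_skip=20)
```

In this sketch each consumer gets its handle immediately, while the reader configuration is only built at the end, once all branches are known. That is the ordering problem the helper is meant to hide from end users.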
Edited by Sebastien Ponce
