Skip to content
Snippets Groups Projects

Draft: Migrate PyConf to GaudiConfig2

Closed Alex Pearce requested to merge apearce-pyconf-gaudiconfig2 into master

This allows for the gains discussed in #141. It's also noticeable faster to execute the GaudiConfig2-based configuration than the Configurables-based one! :rocket:

Some specific changes that were necessary:

  • Migrate the DDDBConf and CondDB user configurables over to a GaudiConfig2-based helper method in PyConf.database.
  • Migrate IOHelper to GaudiConfig2-based helper method. and PyConf.io.
  • Introduce a PyConf.Services importer which wraps 'service configurables'.
    • The importer doesn't actually wrap the import objects; I'm not convinced it's necessary to have this importer but it does make it less confusing rather than having to import configurables from two different places (PyConf.X and GaudiConfig2.Configurables).

As a result PyConf.components.setup_component has been removed. I also addressed #127 (closed) as I was touching PyConf a lot anyway.

Obviously this looks like a big change but the goal is to produce exactly the same configuration, modulo some names and locations. I've checked this for the example.py file with an without input, but I would be more comfortable checking against a larger application, i.e. Moore. Of course that means a lot more work, but we need to transition Moore before merging this anyway.

To do

  • Fix pickle deserialisation for diff test.
  • Decide on how to migrate 'traditional' options files (those not import'able).
    • As a stop-gap, to prevent us needing to convert every Moore options file, we could introduce a wrapper which accepts a GaudiConfig2-based configuration dict and instantiates old Configurable objects based on that.
    • gaudi/Gaudi#192 (closed) might resolve this.
  • Decide on a clean merge_config strategy for certain properties (particularly ApplicationMgr.ExtSvc).
  • AFAIK the DD4Hep configuration is not tested, as it's behind a 'feature flag' which no test enables.
  • Handle IOVLock in dataflow_config.apply.
  • Migrate from underscore-based imports (Gaudi__SomeAlg) to package based ones (Gaudi.SomeAlg).
    • GaudiConfig2.Configurables supports the latter, but I added support in the PyConf importers for the former so that I wouldn't have to change all the imported names (being lazy).
Edited by Alex Pearce

Merge request reports

Pipeline #2849023 failed

Pipeline failed for a22a1c17 on apearce-pyconf-gaudiconfig2

Closed by Sebastien PonceSebastien Ponce 2 years ago (May 11, 2022 1:09pm UTC)

Merge details

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
771
772 def merge_configs(*configs):
773 result = {}
774 for config in configs:
775 for name in config:
776 if name in result:
777 if name == "ApplicationMgr/ApplicationMgr":
778 merge_applicationmgrs(result[name], config[name])
779 else:
780 result[name].merge(config[name])
781 else:
782 result[name] = config[name]
783 return result
784
785
786 def merge_applicationmgrs(left, right):
  • Author Contributor

    @clemenci this is the awful hack I came up with to support merging ApplicationMgr.ExtSvc lists. The semantics I want are that of 'set union' but I couldn't figure out how to define this and set it to the value_semantics of that property.

    I think one can alter the C++ to enable this in principle, but perhaps we can consider these ad-hoc semantic changes also as a property of the merging? e.g. allowing something like

    class UnionSemantics(...):
        # impl here
        pass
    
    # Define 'overriding' semantics for this merge
    merge_configs(*configs, semantics={ApplicationMgr.ExtSvc: UnionSemantics})
  • Custom semantics for specific properties are set in C++ via an extra argument to the property constructor. See https://gitlab.cern.ch/gaudi/Gaudi/-/blob/v36r0/GaudiCoreSvc/src/ApplicationMgr/ApplicationMgr.h#L245-246, where ExtSvc is a std::vector<std::string>, but behaves as a vector<Service> (i.e. a each element of the vector must be a configurable for a service). You could, for example, change the semantic to OrderedSet, which concatenates the lists avoiding duplicates... I just realized that that implementation is wrong and we should merge the elements instead of skipping thos already present.

    I like the idea of the override of the semantics for a specific merge invocation.

    Something else we can do (I think ATLAS is doing it) is to add custom semantics to GaudiConfig2.semantics.SEMANTICS (but that only consider property types and not the owner type), or do something like

    ApplicationMgr._descriptors["ExtSvc"].semantics = MySpecialSemantics(cpp_type)
  • Please register or sign in to reply
  • Alex Pearce
  • Alex Pearce
  • Alex Pearce
  • 598 593 INITIAL_TIME = 1433509200
    599 594
    600 595 options.finalize()
    601 config = ComponentConfig()
    596 # TODO(AP) how does one decide whether a service should also be added to the
    597 # app mgr's extsvc list?
    • Author Contributor

      @clemenci I'd love to have this answered! I haven't been able to figure this out, other than 'because the application fails to initialise otherwise' :laughing: Is it unsafe to just stick all instantiated services into ExtSvc?

    • ExtSvc has two purposes:

      • create services that are not explicitly requested by other components (e.g. optional)
      • map service names to service types (e.g. EvtDataSvc/EventDataSvc and DetDataSvc/DetectorDataSvc)

      IMHO the best option is to add there all the services that are configured, assuming you do not configure services that you do not want to use.

    • Author Contributor

      :thumbsup: I'll do that, thanks.

    • Author Contributor

      This is trickier than I thought. Because the application manager explicit creates and destroys each ExtSvc, there seem to be some specific ordering requirements. I don't know exactly what these are but blindly appending to a services list and setting ExtSvc to this can cause problems and initialisation and at finalisation (I got a seg. fault :medal:).

    • Please register or sign in to reply
  • 55 55 type_, name = type_name
    56 IOV = conf.pop(self.iovlockkey, False) # FIXME
    57 configurable = setup_component(
    58 type_, instance_name=name, require_IOVLock=IOV, **conf)
    59 if _is_configurable_algorithm(type_):
    56
    57 # TODO(AP) the logic below mimics setup_component, but that only
    58 # added the IOV lock if the input `type` was a string; was that
    59 # intentional? if so, we don't need this because setup_component was
    60 # only ever called with a string for services (all algorithms were
    61 # instantiated by passing their Configurable type)
    62 # but we still have to pop off this property because it's 'magic'
    63 # and not a configurable property
    64 requires_iov_lock = conf.pop(self.iovlockkey, False)
    65 # if requires_iov_lock and "ExtraInputs" in type_.getDefaultProperties():
    66 # conf["ExtraInputs"] = conf.get("ExtraInputs", []) + [iov_lock_location]
    • Author Contributor

      Every time I come back to look at the IOVLock-related configuration, I forget why it's there :upside_down:

      The PyConf.components.Algorithm wrapper requires the IOV lock by default, so I guess it's important that algorithms do indeed have it. Or maybe it isn't? Idk.

    • It is (partly) there to solve a chicken-and-egg problem. In order to get the right conditions, one needs to know the event time (or, in run3, the run number). But you only get the run number after you start processing the event, and have decoded the ODIN bank. So there needs to be a 'split' between those algorithms which need conditions and those that don't -- and before crossing that boundary, the framework has to insure it has the right conditions, and make sure that those conditions 'hang around' until the processing of the event is finished. So this boundary is implemented by making an algorithm which creates the 'IOVLock', and that algorithm takes the ODIN bank as input, makes sure the valid conditions are available (which may be lazy!) and then it flags that it did its work by writing the 'IOVLock' into the TES - and the destruction at the end of the event of the 'IOVLock' then flags that the conditions used by this event are no longer required.

      Now, the next step is to properly flag which algorithms depend on the IOVLock and which don't -- eg. it is important that the ODIN decoder does not have a dependency on the IOVLock, as otherwise there would be a deadlock due to a circular dependency. In principle this dependency can be recognised by the use of a Conditions Accessor handle but that may not cover 'old' code, so the decision was made to explicitly make everything depend on the IOVLock, except for those algorithms that clearly cannot, such as the ODIN decoder... Ideally this 'whitelisting' should be automated, but in practice making a 'blanket' decision to default everything to depending on the IOVLock doesn't seem to have a large performance impact, and hence automating this (and verifying that the automation does the right thing all the time!) has not been a high priority...

    • For my understanding: what do you call the "right conditions"? Clearly the alignment for instance depends on time/run number. But that can potentially change several times during the processing of a file, so can't affect the configuration. What do you mean?

    • "right conditions" = "valid instance of any conditions for the run number being processed" -- and it is the producer of 'Interval of Validity Lock' which makes sure that any subsequent requested conditions are valid and available (potentially lazy, i.e. on demand) for the run in question (which it knows, because this producer consumes ODIN). So the IOVLock dependency imposes a constraint on the order in which algorithms are allowed to run, and hence it (implicitly) affects the configuration of the scheduler.

      Edited by Gerhard Raven
    • Author Contributor

      Thanks a lot for the explanation @graven.

      Based on that, it sounds like most algorithms, say every HLT2 'selection algorithm' like a filter or combiner, should have a dependency on the IOV lock? Am checking because the configuration dumps shows that this is currently not the case. (Which surprised me; reading the code I thought it would be.)

    • yes, it should -- basically, only algorithms of which one is sure that they never (even implicitly!) use conditions could be 'whitelisted' -- and even then of those, only a few really have to be whitelisted (basically the ODIN decoder, and anything it depends on, such as the algorithm that puts the RawEvent onto the TES).

      With the current scheduling, there is little to be gained in adding to this white list -- which would be different in case we would eg. use the AvalancheScheduler which runs, for every single event, algorithms on multiple threads in parallel 'as soon as possible', as in that case, you really want as many algorithms as possible available for the scheduler to give it more choices -- but as we don't use the AvalancheScheduler this is not something we have to worry about.

    • Please register or sign in to reply
  • Alex Pearce changed the description

    changed the description

  • Alex Pearce added 1 commit

    added 1 commit

    Compare with previous version

  • Alex Pearce changed the description

    changed the description

  • mentioned in merge request gaudi/Gaudi!1230 (merged)

  • Alex Pearce mentioned in merge request Moore!911 (closed)

    mentioned in merge request Moore!911 (closed)

  • Alex Pearce added 50 commits

    added 50 commits

    Compare with previous version

  • Alex Pearce marked the checklist item Fix pickle deserialisation for diff test. as completed

    marked the checklist item Fix pickle deserialisation for diff test. as completed

  • Alex Pearce added 1 commit

    added 1 commit

    • a22a1c17 - Remove pickle support from diffopts.

    Compare with previous version

  • Alex Pearce mentioned in issue #141

    mentioned in issue #141

  • Alex Pearce assigned to @rmatev and unassigned @apearce

    assigned to @rmatev and unassigned @apearce

  • added RTA label

  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Please register or sign in to reply
    Loading