Draft: Migrate PyConf to GaudiConfig2
This allows for the gains discussed in #141. It's also noticeable faster to execute the GaudiConfig2
-based configuration than the Configurables
-based one!
Some specific changes that were necessary:
- Migrate the
DDDBConf
andCondDB
user configurables over to aGaudiConfig2
-based helper method inPyConf.database
. - Migrate
IOHelper
toGaudiConfig2
-based helper method. andPyConf.io
. - Introduce a
PyConf.Services
importer which wraps 'service configurables'.- The importer doesn't actually wrap the import objects; I'm not convinced it's necessary to have this importer but it does make it less confusing rather than having to import configurables from two different places (
PyConf.X
andGaudiConfig2.Configurables
).
- The importer doesn't actually wrap the import objects; I'm not convinced it's necessary to have this importer but it does make it less confusing rather than having to import configurables from two different places (
As a result PyConf.components.setup_component
has been removed. I also addressed #127 (closed) as I was touching PyConf a lot anyway.
Obviously this looks like a big change but the goal is to produce exactly the same configuration, modulo some names and locations. I've checked this for the example.py
file with an without input, but I would be more comfortable checking against a larger application, i.e. Moore. Of course that means a lot more work, but we need to transition Moore before merging this anyway.
To do
-
Fix pickle deserialisation for diff test. - Requires gaudi/Gaudi#196 (closed).
-
Decide on how to migrate 'traditional' options files (those not import
'able).- As a stop-gap, to prevent us needing to convert every Moore options file, we could introduce a wrapper which accepts a
GaudiConfig2
-based configuration dict and instantiates oldConfigurable
objects based on that. - gaudi/Gaudi#192 (closed) might resolve this.
- As a stop-gap, to prevent us needing to convert every Moore options file, we could introduce a wrapper which accepts a
-
Decide on a clean merge_config
strategy for certain properties (particularlyApplicationMgr.ExtSvc
). -
AFAIK the DD4Hep configuration is not tested, as it's behind a 'feature flag' which no test enables. -
Handle IOVLock in dataflow_config.apply
. -
Migrate from underscore-based imports ( Gaudi__SomeAlg
) to package based ones (Gaudi.SomeAlg
).-
GaudiConfig2.Configurables
supports the latter, but I added support in the PyConf importers for the former so that I wouldn't have to change all the imported names (being lazy).
-
Merge request reports
Activity
added Configuration label
771 772 def merge_configs(*configs): 773 result = {} 774 for config in configs: 775 for name in config: 776 if name in result: 777 if name == "ApplicationMgr/ApplicationMgr": 778 merge_applicationmgrs(result[name], config[name]) 779 else: 780 result[name].merge(config[name]) 781 else: 782 result[name] = config[name] 783 return result 784 785 786 def merge_applicationmgrs(left, right): @clemenci this is the awful hack I came up with to support merging
ApplicationMgr.ExtSvc
lists. The semantics I want are that of 'set union' but I couldn't figure out how to define this and set it to thevalue_semantics
of that property.I think one can alter the C++ to enable this in principle, but perhaps we can consider these ad-hoc semantic changes also as a property of the merging? e.g. allowing something like
class UnionSemantics(...): # impl here pass # Define 'overriding' semantics for this merge merge_configs(*configs, semantics={ApplicationMgr.ExtSvc: UnionSemantics})
Custom semantics for specific properties are set in C++ via an extra argument to the property constructor. See https://gitlab.cern.ch/gaudi/Gaudi/-/blob/v36r0/GaudiCoreSvc/src/ApplicationMgr/ApplicationMgr.h#L245-246, where
ExtSvc
is astd::vector<std::string>
, but behaves as avector<Service>
(i.e. a each element of the vector must be a configurable for a service). You could, for example, change the semantic to OrderedSet, which concatenates the lists avoiding duplicates... I just realized that that implementation is wrong and we should merge the elements instead of skipping thos already present.I like the idea of the override of the semantics for a specific merge invocation.
Something else we can do (I think ATLAS is doing it) is to add custom semantics to
GaudiConfig2.semantics.SEMANTICS
(but that only consider property types and not the owner type), or do something likeApplicationMgr._descriptors["ExtSvc"].semantics = MySpecialSemantics(cpp_type)
- Resolved by Rosen Matev
- Resolved by Alex Pearce
- Resolved by Alex Pearce
598 593 INITIAL_TIME = 1433509200 599 594 600 595 options.finalize() 601 config = ComponentConfig() 596 # TODO(AP) how does one decide whether a service should also be added to the 597 # app mgr's extsvc list? @clemenci I'd love to have this answered! I haven't been able to figure this out, other than 'because the application fails to initialise otherwise'
Is it unsafe to just stick all instantiated services intoExtSvc
?ExtSvc
has two purposes:- create services that are not explicitly requested by other components (e.g. optional)
- map service names to service types (e.g.
EvtDataSvc/EventDataSvc
andDetDataSvc/DetectorDataSvc
)
IMHO the best option is to add there all the services that are configured, assuming you do not configure services that you do not want to use.
This is trickier than I thought. Because the application manager explicit creates and destroys each
ExtSvc
, there seem to be some specific ordering requirements. I don't know exactly what these are but blindly appending to aservices
list and settingExtSvc
to this can cause problems and initialisation and at finalisation (I got a seg. fault ).
55 55 type_, name = type_name 56 IOV = conf.pop(self.iovlockkey, False) # FIXME 57 configurable = setup_component( 58 type_, instance_name=name, require_IOVLock=IOV, **conf) 59 if _is_configurable_algorithm(type_): 56 57 # TODO(AP) the logic below mimics setup_component, but that only 58 # added the IOV lock if the input `type` was a string; was that 59 # intentional? if so, we don't need this because setup_component was 60 # only ever called with a string for services (all algorithms were 61 # instantiated by passing their Configurable type) 62 # but we still have to pop off this property because it's 'magic' 63 # and not a configurable property 64 requires_iov_lock = conf.pop(self.iovlockkey, False) 65 # if requires_iov_lock and "ExtraInputs" in type_.getDefaultProperties(): 66 # conf["ExtraInputs"] = conf.get("ExtraInputs", []) + [iov_lock_location] Every time I come back to look at the IOVLock-related configuration, I forget why it's there
The
PyConf.components.Algorithm
wrapper requires the IOV lock by default, so I guess it's important that algorithms do indeed have it. Or maybe it isn't? Idk.It is (partly) there to solve a chicken-and-egg problem. In order to get the right conditions, one needs to know the event time (or, in run3, the run number). But you only get the run number after you start processing the event, and have decoded the ODIN bank. So there needs to be a 'split' between those algorithms which need conditions and those that don't -- and before crossing that boundary, the framework has to insure it has the right conditions, and make sure that those conditions 'hang around' until the processing of the event is finished. So this boundary is implemented by making an algorithm which creates the 'IOVLock', and that algorithm takes the ODIN bank as input, makes sure the valid conditions are available (which may be lazy!) and then it flags that it did its work by writing the 'IOVLock' into the TES - and the destruction at the end of the event of the 'IOVLock' then flags that the conditions used by this event are no longer required.
Now, the next step is to properly flag which algorithms depend on the IOVLock and which don't -- eg. it is important that the ODIN decoder does not have a dependency on the IOVLock, as otherwise there would be a deadlock due to a circular dependency. In principle this dependency can be recognised by the use of a Conditions Accessor handle but that may not cover 'old' code, so the decision was made to explicitly make everything depend on the IOVLock, except for those algorithms that clearly cannot, such as the ODIN decoder... Ideally this 'whitelisting' should be automated, but in practice making a 'blanket' decision to default everything to depending on the IOVLock doesn't seem to have a large performance impact, and hence automating this (and verifying that the automation does the right thing all the time!) has not been a high priority...
"right conditions" = "valid instance of any conditions for the run number being processed" -- and it is the producer of 'Interval of Validity Lock' which makes sure that any subsequent requested conditions are valid and available (potentially lazy, i.e. on demand) for the run in question (which it knows, because this producer consumes ODIN). So the IOVLock dependency imposes a constraint on the order in which algorithms are allowed to run, and hence it (implicitly) affects the configuration of the scheduler.
Edited by Gerhard RavenThanks a lot for the explanation @graven.
Based on that, it sounds like most algorithms, say every HLT2 'selection algorithm' like a filter or combiner, should have a dependency on the IOV lock? Am checking because the configuration dumps shows that this is currently not the case. (Which surprised me; reading the code I thought it would be.)
yes, it should -- basically, only algorithms of which one is sure that they never (even implicitly!) use conditions could be 'whitelisted' -- and even then of those, only a few really have to be whitelisted (basically the
ODIN
decoder, and anything it depends on, such as the algorithm that puts theRawEvent
onto the TES).With the current scheduling, there is little to be gained in adding to this white list -- which would be different in case we would eg. use the
AvalancheScheduler
which runs, for every single event, algorithms on multiple threads in parallel 'as soon as possible', as in that case, you really want as many algorithms as possible available for the scheduler to give it more choices -- but as we don't use theAvalancheScheduler
this is not something we have to worry about.
mentioned in issue gaudi/Gaudi#196 (closed)
mentioned in merge request gaudi/Gaudi!1230 (merged)
mentioned in merge request Moore!911 (closed)
added 50 commits
-
c8fcd584...fb815659 - 46 commits from branch
master
- 89eb49dd - save.
- df1dc9a3 - Support reading files.
- 67175096 - Migrate tests.
- 79a88c36 - Remove cachetools.
Toggle commit list-
c8fcd584...fb815659 - 46 commits from branch
mentioned in issue #141
added RTA label