For me this has lead to misconfigured algorithms (e.g. truth matching) and incorrect outputs in tuples, when running over HLT2 output, causing me to re-run things. Is there a reason the value is not extracted from options.process (or uses this as the default) or that it is not a positional argument?
If this field is important to have, is it not better to raise an error that the field is unset rather than assuming a default?
Many advance thanks!
Edited
Designs
Child items
...
Show closed items
Linked items
0
Link issues together to show that they're related or that one is blocking others.
Learn more.
Activity
Sort or filter
Newest first
Oldest first
Show all activity
Show comments only
Show history only
Ryunosuke O'Neilchanged title from (v61r0) process= keyword arg defaults in various algorithm configurations are not set to DaVinci.options.process to (v61r0) process= keyword arg defaults in various algorithm configurations are not set to DaVinci.options.process (?)
changed title from (v61r0) process= keyword arg defaults in various algorithm configurations are not set to DaVinci.options.process to (v61r0) process= keyword arg defaults in various algorithm configurations are not set to DaVinci.options.process (?)
Hey @roneil, can you be more "quantitive" with a reproducer? In fact, do you still have the issue with the HEAD and lbexec? Just so that we do not "fix" what is obsoleted. Thanks.
however lets say process=options.process is not set in the call to configured_MCTruthAndBkgCatAlg - easy mistake to make especially when going from previous Upgrade versions of DaVinci, where you did have options files that did invoke configured_MCTruthAndBkgCatAlg in this way.
Using this algorithm on hlt2 output without setting process='Hlt2' leads to a "silent failure", producing truth matching variables in the Funtuple that do not make sense (all my candidates have the same BKGCAT when they should not, and TRUEID == 0)
It should not be easy for analysts to fall into traps / assume sensible defaults are set (Spruce is not a sensible default!) when configuring their algorithms that would result in the wrong output being produced (it wastes time, and computing time)
I think it would be better if algorithms required process to be set explicitly by the user, and throw an exception if it is not set, or just take the value from options.process at runtime (that way the process comes from one place)
Furthermore I do not see the point in having options.process at all if you still need to set process in individual algorithms
I agree, the default should be process=Turbo. Spruce is not the most common use case
The process option should be addressed. Upstream there are 3 options for this,
Hlt2 - output straight out of Moore Hlt2 step - in actual data running this will not be used
Turbo - the default for datataking. Hlt2 output has been through Sprucing passthrough, like "Tesla"
Spruce - The sprucing has run in exclusive mode
For most DV applications like MCTruth above Hlt2 and Turbo will mean the same thing (pointing to a TES location at /Event/HLT2) but we should have consistency over the projects and have all 3 options
Thanks both. Yes, it should not be easy to fall into such traps, and any missing configuration should result in a clear failure or something else appropriate. I think we all agree. You found a "path" that is not fully tested and configured the best way.
@amathad and @pkoppenb should comment as more into the nitty gritty details (added as assignees already) but the 2 possible routes are (1) have the default process=Turbo and check all possible values in anything that uses this or (2) have no default and adapt any configuration related function so that process is never defaulted anywhere.
There are pros and cons for both routes, I guess, and route (2) is likely to require more inspection right now to ensure no defaults are given anywhere. In any case, whatever we do, we should have some more tests ...
Furthermore I do not see the point in having options.process at all if you still need to set process in individual algorithms
Right. You have to imagine that Python configuration helpers exist in many places and it is not difficult to go have process set somewhere where that's not the robust thing to do. It is understandable ;-).
Hi, Remember discussing about the default option on mattermost and had choosen to set it to Spruce thinking this was the common case (but Nicole says above the common case is Turbo).
I think in this "development phase", makes sense to settle on option 2 i.e. set default process=options.process and make this a required option to be set in the user yaml file. Can take care of the former, can @dfazzini handle the latter?
The actual culprit of the problem is of-course root_in_tes which is set with the process for MCTruth. Not sure why the underlying tools used in the algorithm does not fail and defaults to doing stupid stuff when root_in_tes is wrong.
Ryunosuke O'Neilchanged title from (v61r0) process= keyword arg defaults in various algorithm configurations are not set to DaVinci.options.process (?) to process= keyword arg defaults in various algorithm configurations are not set to DaVinci.options.process (?)
changed title from (v61r0) process= keyword arg defaults in various algorithm configurations are not set to DaVinci.options.process (?) to process= keyword arg defaults in various algorithm configurations are not set to DaVinci.options.process (?)