Skip to content

Identical datahandles not identical - selective persistency crash when streaming

Follow up from !4682 (merged)

Short summary:

  • When using SP functionality (saving subsets of PersitReco locations) Moore is crashing when using production settings with streaming.
  • Example of crash:
  File "/lzufs/user/msaur/workdir/stack_master/Moore/Hlt/Hlt2Conf/tests/options/bandwidth/hlt2_bandwidth_production_streams.py", line 102, in <module>
    config = run_moore(options, make_main_streams, public_tools=public_tools)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lzufs/user/msaur/workdir/stack_master/Moore/Hlt/Moore/python/Moore/config.py", line 299, in run_moore
    write_streams_attributes_to_json(
  File "/lzufs/user/msaur/workdir/stack_master/Moore/Hlt/Moore/python/Moore/config.py", line 296, in write_streams_attributes_to_json
    json.dump(streams.get_streams_attributes(), f)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lzufs/user/msaur/workdir/stack_master/Moore/Hlt/Moore/python/Moore/streams.py", line 201, in get_streams_attributes
    stream_attributes = {
                        ^
  File "/lzufs/user/msaur/workdir/stack_master/Moore/Hlt/Moore/python/Moore/streams.py", line 202, in <dictcomp>
    stream.name: stream.get_line_attributes() for stream in self.streams
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lzufs/user/msaur/workdir/stack_master/Moore/Hlt/Moore/python/Moore/streams.py", line 118, in get_line_attributes
    line_attributes = {line.name: line.to_dict() for line in self.physics_lines}
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lzufs/user/msaur/workdir/stack_master/Moore/Hlt/Moore/python/Moore/streams.py", line 118, in <dictcomp>
    line_attributes = {line.name: line.to_dict() for line in self.physics_lines}
                                  ^^^^^^^^^^^^^^
  File "/lzufs/user/msaur/workdir/stack_master/Moore/Hlt/Moore/python/Moore/lines.py", line 566, in to_dict
    return self.__dict__
           ^^^^^^^^^^^^^
  File "/lzufs/user/msaur/workdir/stack_master/Moore/Hlt/Moore/python/Moore/lines.py", line 545, in __dict__
    raise ValueError(
ValueError: extra_output ('', DataHandle('/Event/Rec/Calo/Electrons')) for line Hlt2Charm_DpDspToKsPip_LL is not supported in current persistency framework and will not be persisted
  • Running over single stream with options.write_streams_attributes_to_json = True is safe
  • Currently mitigated (hack) by !4682 (merged)
  • Linked to location/datahandle having different producers

Detailed comment by @graven:

so to be a bit more verbose: depending on the configuration, it is normal that a given TES location could be produced in multiple ways (eg. 'create it from scratch' vs. 'read it back from file') -- but there should be one point where that choice is made. That way, anything 'downstream' of this point should get the same answer when asking for this location.

What seems to be happening is that there are multiple ways of producing a location, and those are 'not aware' of each other -- i.e. one bit of code could call say read_it() to get a handle, and another could call say make_it() to get a handle, and those will result in different producers -- with the idea that either read_it() or make_it() should be called within a given configuration -- but then that assumption is broken.

The typical solution is to 'unify' read_it() and make_it() so that there is only one implementation, but which can be configured to either do what read_it() did or do what make_it() did.

Probably something like this has been done already, but that 'single' entry point (say get_it()) probably uses some utilities functions (the underlying read_it() and make_it()) depending on its configuration, and some code bypasses the 'single' entry point (i.e. it should be using get_it()), and instead directly talks to the actual implementation behind that entry point (either make_it() or read_it()) -- and then isn't aware of the switch in configuration, and which point both make_it() and read_it() end up being called.

So the problem is some bit of code using read_it() or make_it() directly instead of asking get_it() -- basically read_it() and make_it() should be 'hidden' by eg. turning them into __read_it() and __make_it() to signify that they should only be used 'if you know what you are doing'.

and @graven on how to proceed further:

the problem with the current 'hotfix' (aka 'hack') is that it eliminates the symptom, which will make it harder to diagnose the underlying problem. It also eliminates only one specific instance of the problem, without knowing if it is the only case where it happens.

So while a short-term hack would be OK, I would prefer something more along the lines of the suggestion I made above, which is to modify __eq__ and (at least) add a warning every time two handles are compared with the same location, but different producer. I would even be happy if such a comparison would throw an exception, and then that would be caught in the place where it originates, and then it makes some choice between the two almost-but-not-quite equal handles.

Or if that is making things too complicated, at least any hack should report it is doing something 'unusual' and point out that there is a potential problem. Completely hiding the problem behind some heuristic will only make it more complicated to debug this properly in the future.

FYI @nskidmor @graven @mveghel @rmatev @lugrazet

Edited by Miroslav Saur