add_handler_remove_hashcode

requested review from @msaur

assigned to @xueting

added 1 commit

3a83b1f9 - Update RootFilePatternHandler.py

@msaur is it understood where are these hashes coming from? Is it the new histograms service in Gaudi which appends them to names/paths automatically to avoid problems when running in concurrent/multithreaded mode? Is it possible to simply configure the underlying application not to have these hashes in the final output? If not, will it not be better to have a generic handler which would rename the paths, cutting out the hashes from them (in this case it can be reused by other apps, such as Gauss and Boole, instead of having something RTA-specific)?

I fully agree that some general handler which would do the parsing and/or renaming would be better, one could pass list of expected string as an input to the algorithm and them jut re-use it anywhere as needed.

I still haven't managed to have a chat with Sebastien, but in fact I think we could ask him directly here. @sponce here we are dealing with the hashes in name of algorithms which then being saved as a path in the output ROOT file. When configuration changes, which triggers change of hashes, also the paths will change which makes problematic to maintain stability of our monitoring tasks.

To give some specific example, we are using PrUpgradeChecking.py which then calls (among many others) PrChecker.cpp which saves histograms for various track and particle categories in paths like Track/BestLongTrackChecker_{hash}/BestLong/

So we are trying to deal with this issue and @dpopov raised a point if it is possible to change configuration so the final file will not save hashes in paths or that is more problematic.

After a quick discussion with @dovombru on WP3 meeting, it seems any general solution within the configuration is rather difficult and would require quite some time, so I would go forward with having a generic handler which could work for all test in LHCbPR.

@xueting I wonder if your new handler could be modified that list of correct paths would be always passed by a handler which needs to rename some paths. So we would have a generic handler which is able to save a new file with a correct paths which would be configured by those handlers which need it. @dpopov Better ideas?

Is passing a list of names necessary? From what I see at a glance -- the hash is always eight symbols long, and is added as a trailing suffix to the directory name, separated (from the algorithm name?) by an underscore. We can double check in Gaudi(?) code how the hash is calculated, if its length is fixed or variable, but it seems it is rather trivial to locate and remove it, unless I miss something.

I think it makes sense indeed to remove it. This is the solution that was adopted for the Allen online monitoring (see this comment). The only thing to be careful about is that one algorithm is not configured twice with different conditions, both of them producing output that would be relevant for the dashboard. In this case only one of the outputs would be produced.

Ah-ha, here we go, so without the hash suffixes there can be duplicate names?

In our case it should not happen as we are running on a single thread, but generally speaking indeed that can happen.

I am not sure what are the plans for running LHCb applications in MT mode, but if in general case this may happen, then it is better to foresee in the code. It does not sound complicated: if a directory with such algorithm name already exists, then the duplicate is stored with some suffix with an counter _variant2, or something like that (counter would make sense if there may be more than one duplicates)?

This is not linked to multi-threading, but rather a question of configuration. One algorithm can be configured twice with different properties or different input... in this case it is important to distinguish between the two instantiations.

Ah, ok, so it is just to distinguish between multiple copies of the same algorithms, but which may have different configurations, I see. This does not change the result .. however.. now I start wondering: is there a way to distinguish between these algorithms only by the contents of the output ROOT file? If they appear in random order -- they will be renamed in random order, and there will be consistency in names between the different results. Do we care about this or not?

I think for most cases of the efficiency plots there is only one algorithm (configured once) producing the output, so it should not be a problem.

As discussed with @dpopov we may go forward with just removing _{hash} from the name of folders and then saving a new file. Naming scheme looks stables so it should be simple to create a generic code which will iterate over all folder and remove _{hash} suffix.

In case of more than once instance of one folder then it can be just renamed as folder_1.

@xueting Can you take a look on this?

_folderX or better _variantX where X ∈ 1, 2, 3 ... N

_variantX sounds best to me.

@xueting Any progress on this MR?

I'm sorry, last week I went on vacation. I'll handle this MR soon.

@xueting Is there any problem preventing finishing this MR? If there are technical problems then let us know, resolving problems with the hashes is urgent as large portion of our monitoring will not work until this is resolved.

Sorry for the delay. A new commit is added to solve this issue. Has it met our expectations?

Thanks for the update @xueting. Can you pass original and modified ROOT files to some public space (eos) so changes could be checked?

Sure, I've put the original and modified files to /eos/lhcb/user/x/xueting/RTA/hash_code

Checking the files at /eos/lhcb/user/x/xueting/RTA/hash_code, for original file I see:

Track/TrackResCheckerBestLong/Long

where in the modified file we have:

Track/TrackResCheckerBestLong/Long_variant2

Similar is happening for quite a few folders as well. This seems to be some byproduct of the required changes, in the moment there is no hash then no change should happen at all. I guess that in the moment we will just rewrite original file this could be simpler.

Thank you for your careful review. I'll follow your suggestion.

In the new commit, this issue has been solved. It happens because that there are identically named subfolders within different parent folders. I changed the way of comparing folder names, and now this problem no longer exists. You can take a look at /eos/lhcb/user/x/xueting/RTA/hash_code to see the newly modified file. Besides, I slightly modified the code. If both input_file_path and output_file_path are provided, modified file will be saved in output_file_path;If only input_file_path is provided, modified file will replace the old file.

Thanks, output file now looks OK. I think this handler is ready to be merged and deployed.

@dpopov Could you take care of it? Thanks.

Ok, I'll try to take care of it today. Once it is merged and working we would like use it with active DC jobs?

Yes, all active DC jobs should use this handler. Thanks.

There is already a corresponding dashboard update: lhcb-core/LHCbPR2FE2!182

I see that PRSummaryChecker-s are using HashRemoveHandler internally .. so there is no need to add HashRemoveHandler to DC jobs?

Seems that @xueting was proactive and already implemented changes to efficiency handlers, however we still need this HashRemoveHandler to fix files which are used as the input for the dashboard (i.e. direct output of every job we are using). So it still needs to be added to each job we are running. Otherwise lhcb-core/LHCbPR2FE2!182 would not work and we would still have problem with showing plots on the dashboard.