adding --branches-of-interest flag to diff_root_files.py
This adds a flag to the diff_root_files.py to compare only specific branches to speed up the comparison process.
In a test comparing a file with itself, there's a huge time difference between comparing all branches and comparing only a single branch: comparing all branches takes 450s for 25 events and 900s for 50 evts, comparing a single branch only takes 33s for 25evts and 39s for 50evts.
I'd like to point out that you can only filter BRANCHES as they are retrieved here:
branches = [b.GetName() for b in tree.GetListOfBranches()]
The script reports if any branches are matching with the regex provided to the flag.
Merge request reports
Activity
assigned to @emoyse
requested review from @amete
Hey guys,
As I mentioned in our private exchange, let me repeat here for others, there are a lot of caveats.
diff-root
tries to be as input agnostic as possible, trying to read and compare things through pure ROOT functionality w/o any knowledge of the dictionaries etc. Therefore, if we're really after specific formats, such asxAODs
I think we can possibly find more custom solutions these days.In any case,
tree.GetListOfBranches()
returns you the list of branches. Depending on thesplit-level
of the branch, and whether or not you're interested in static or dynamic variables in thexAODs
, things might or might not appear there. For example, take Muons. In thexAODs
the payload is inMuonsAux.
. You can only access these guys object-by-object since thesplit-level
is 0 by default inmaster
(you have to read the entire muon to read its pt, eta, phi etc. - i.e. static variables). But the dynamic variables, such asptvarcone20
, will have their own branches asMuonsAuxDyn.ptvarcone20
. So the branches you'll retrieve w/ the above syntax in this case would be['MuonsAux.', 'MuonsAuxDyn.ptvarcone20']
. So, if you want to read only thept
of the muon in this new formalism, it won't work.The actual variables that we compare are a lot more than a few thousand and indeed the number that's reported by
diff-root
only reflects the branches if I'm not mistaken. I think the nomenclature is used interchangeably (leaves/branches) in many places but it really depends on how the file was produced etc.Having said all that, of course reading only a subset of the branches (and turning off those we don't care about) is much, much efficient (as long as we don't read too many). I think
diff-root
didn't support this since it was mainly used to compare files in their entirety to enforce frozen tier-0 policy in practice. Adding this would be nice but all I wanted to say is that people should be aware of the limitations (in general comparing two files can be a really tricky endeavor).Best, Serhan
PS : I'd say a core software meeting would be more fitting than SPOT for this topic.
PSS : It's also possible that the utilities are somewhat out-of-sync, not necessarily by design but due to a culmination of changes over the years. We should of course fix the issues that we bump into.
The core SW weekly agenda for this Thursday, Jan-18, is empty, so we can certainly schedule a discussion about this issue. Is it OK with everyone? @rlangenb shall I add you as a speaker?
@tsulaia sure I can quickly present the issue.
mentioned in merge request !40769 (merged)
I have posted Sebastien's links to the Minutes section of the agenda
added 1 commit
- 10c81b03 - change leaves-of-interest flag to branches-of-interest
added Tools master review-pending-level-1 labels
CI Result FAILURE (hash 10c81b03)Athena AthSimulation AthGeneration AnalysisBase AthAnalysis externals cmake make required tests optional tests Full details available on this CI monitor view
Athena: number of compilation errors 0, warnings 1
AthSimulation: number of compilation errors 0, warnings 1
AthGeneration: number of compilation errors 0, warnings 1
AnalysisBase: number of compilation errors 0, warnings 0
AthAnalysis: number of compilation errors 0, warnings 0
For experts only: Jenkins output [CI-MERGE-REQUEST-CC7 28612]just an FYI, but, the tool I was mentioning at the core s/w meeting (
groot
'sroot-diff
) is there:- http://go-hep.org/x/hep/groot/cmd/root-diff
- https://go-hep.org/dist/v0.27.0/root-diff-linux_amd64.exe
- https://go-hep.org/dist/v0.27.0/root-diff-windows_amd64.exe
- https://go-hep.org/dist/v0.27.0/root-diff-darwin_amd64.exe
interested parties can try it out (the linux version should work on any Linux/64b box, same for the other OS/arch binaries) and report. or, (better?) point me at a set of "known-to-be-same files" and "known-to-differ files", so I can test myself.
added review-user-action-required label and removed review-pending-level-1 label
added review-pending-level-1 label and removed review-user-action-required label
CI Result FAILURE (hash 10c81b03)Athena AthSimulation AthGeneration AnalysisBase AthAnalysis externals cmake make required tests optional tests Full details available on this CI monitor view
Athena: number of compilation errors 0, warnings 1
AthSimulation: number of compilation errors 0, warnings 1
AthGeneration: number of compilation errors 0, warnings 1
AnalysisBase: number of compilation errors 0, warnings 0
AthAnalysis: number of compilation errors 0, warnings 0
For experts only: Jenkins output [CI-MERGE-REQUEST-CC7 29183]added review-user-action-required label and removed review-pending-level-1 label
OverlayTier0Test_required-test fails with:
<class 'AttributeError'> File "/build2/ci-builds/master/Athena/build/Athena/x86_64-centos7-gcc8-opt/python/PyUtils/scripts/diff_root_files.py", line 477, in main ndiff = diff_tree(fold, fnew, args) File "/build2/ci-builds/master/Athena/build/Athena/x86_64-centos7-gcc8-opt/python/PyUtils/scripts/diff_root_files.py", line 239, in diff_tree AttributeError: 'Namespace' object has no attribute 'leaves_of_interest'
-- L1
Edited by Louis-Guillaume Gagnonadded 2 commits
added review-pending-level-1 label and removed review-user-action-required label
CI Result SUCCESS (hash d0c9bb3d)Athena AthSimulation AthGeneration AnalysisBase AthAnalysis externals cmake make required tests optional tests Full details available on this CI monitor view
Athena: number of compilation errors 0, warnings 0
AthSimulation: number of compilation errors 0, warnings 0
AthGeneration: number of compilation errors 0, warnings 0
AnalysisBase: number of compilation errors 0, warnings 0
AthAnalysis: number of compilation errors 0, warnings 0
For experts only: Jenkins output [CI-MERGE-REQUEST-CC7 29242]added review-approved label and removed review-pending-level-1 label
mentioned in commit b9befbb0
added sweep:ignore label