Externals Update, master branch (2021.07.02.)
Switched all projects to atlasexternals-2.0.110. The changes wrt. atlasexternals-2.0.109 are the following (atlasexternals@2.0.109...2.0.110):
- Made it possible to build ROOT with LLVM/Cling in Debug mode for dictionary debugging;
- I did this while debugging ATLASG-1602, which I unfortunately still didn't manage to properly understand...
- I did this while debugging ATLASG-1602, which I unfortunately still didn't manage to properly understand...
- Removed the build of FastJet from all of the LCG based projects;
- At the same time switched the build of FastJet in AnalysisBaseExternals to version 3.4.0, in sync with LCG_100_ATLAS_5;
- Some details about the FastJet updates can be found on SPI-1929;
- Upgraded lwtnn to version 2.12 (for GCC 11 compatibility, see ATLINFR-4147);
- Upgraded prmon to version 2.2.1 (for GCC 11 compatibility, see ATLINFR-4147);
- Removed some unnecessary printouts from the build of dictionaries and converters.
At the same time upgraded all LCG based releases (with the exception of DetCommon of course) to LCG_100_ATLAS_5. To complete the migration to the new FastJet version.
Merge request reports
Activity
added full-build full-unit-tests labels
This merge request affects 8 packages:
- Projects/AnalysisBase
- Projects/AthAnalysis
- Projects/AthDataQuality
- Projects/AthGeneration
- Projects/AthSimulation
- Projects/Athena
- Projects/DetCommon
- Projects/VP1Light
Affected files list will not be printed in this case
Adding @krumnack ,@jchapman ,@akraszna ,@rbianchi as watchers
added Build master review-pending-level-1 labels
CI Result FAILURE (hash 26fe3115)Athena AthSimulation AthGeneration AnalysisBase AthAnalysis DetCommon externals cmake make required tests optional tests Full details available on this CI monitor view
Athena: number of compilation errors 0, warnings 0
AthSimulation: number of compilation errors 0, warnings 0
AthGeneration: number of compilation errors 0, warnings 0
AnalysisBase: number of compilation errors 0, warnings 0
AthAnalysis: number of compilation errors 0, warnings 0
DetCommon: number of compilation errors 0, warnings 0
For experts only: Jenkins output [CI-MERGE-REQUEST-CC7 36349]So... The FastJet guys didn't quite make their code thread-safe, did they...
Well, at least not in the way that I think we were all expecting. See section 10.3 of http://fastjet.fr/repo/fastjet-doc-3.4.0.pdf for a description of what one needs to do for the parallel threads not to mess up each others' random number generators...
@delsart, didn't you test back in the day that the new FastJet version would produce reproducible results without changes from our side? Or did I misunderstand that?
We have 2 options at this point in my mind:
- We say $#@% it, and we go back to patching FastJet 3.4.0 to make the random number generator inside
fasjet::GhostedAreaSpec
be thread specific; - We re-write our code in the non-too-convenient way that the FastJet authors apparently came up with...
What does everybody think? (@elmsheus, @emoyse, @cdelitzs, @mhodgkin, @goetz, @wbalunas, @sawyer, @khoo, @ssnyder)
- We say $#@% it, and we go back to patching FastJet 3.4.0 to make the random number generator inside
Hi Attila,
I did test, but I had to change our code (basically using the area_def.with_fixed_seeds(seeds) construct as the authors recommend, I think we don't need to change how we use BackgroundEstimatorBase, but I'd need to re-check).
Adapting our code to a new version of fastjet, didn't look like an issue to me so I'm only remembering now about this little changes I did. Maybe I'm missing some difficulties about changing our own code in this case ?
It should be a little patch, I can send it directly to you, or put it on a branch on gitlab if you prefer.
(tagging @mswiatlo so he's also aware of the discussion).
Cheers,
P-A
Hi P-A,
If you already went through the exercise of adapting our code to the new FastJet interface once, then we should definitely just go with that. Yes, please send me your modifications in some form. If you'd like, I could give you modification rights to this branch in my repository. But if the changes are not excessively large, just sending them to me through some other means would be perfectly fine. (File attached to e-mail, modified package uploaded to AFS, pick your poison.
)Cheers, Attila
P.S. It took me some time to realise that 2 of the test failures were because of !44895 (merged)... So it seems that it's "just" the MT irreproducibility that's plaguing this MR. (Hopefully the timeout in one of the tests was just a glitch.)
added 28 commits
-
26fe3115...60e46f96 - 26 commits from branch
atlas:master
- 2f53e747 - Switched all projects to atlasexternals-2.0.110 and LCG_100_ATLAS_5.
- 895ac987 - Updated the code to make use of FastJet 3.4.0's thread safety.
-
26fe3115...60e46f96 - 26 commits from branch
This merge request affects 9 packages:
- Projects/AnalysisBase
- Projects/AthAnalysis
- Projects/AthDataQuality
- Projects/AthGeneration
- Projects/AthSimulation
- Projects/Athena
- Projects/DetCommon
- Projects/VP1Light
- Reconstruction/Jet/JetRec
Affected files list will not be printed in this case
Adding @jchapman ,@goetz ,@akraszna ,@krumnack ,@rbianchi as watchers
added JetEtmiss Reconstruction analysis-review-required labels
Hi P-A,
Unfortunately I was not able to apply your patch file as-is on top of the master branch. I guess you wrote it in 21.2 maybe...?
But the changes were easy enough to translate to the current state of master, so I just did that by hand. (I didn't touch EventDensityTool.cxx in the end. Your patch removed some debug messages from that file. Which I'm completely fine with. But it should not be done as part of this MR.) Making just some minor adjustments to how you wrote the code exactly...
I tested it locally, these seem to be enough to make the test succeed. (I checked that I was able to reproduce the difference wrt. the reference at first, which went away when applying these changes.) Even though I was a bit suspicious at first, as I saw that Reconstruction/Jet/JetRec/Root/FastJetInterfaceTool.cxx also "does some stuff" with random seeds. But apparently that was thread safe already.
Cheers, Attila
A slight pre-warning: the CI MR https://gitlab.cern.ch/atlas-sit/CI/-/merge_requests/239 went into too early and there might be a harmless q431xAODDigest CI failure like in http://atlas-computing.web.cern.ch/atlas-computing/links/distDirectory/ci/CIWebArea/nicos_web_areaMRCIbuilds64BC7G8AthenaOpt/NICOS_TestLog_MR-44914-2021-07-04-02-03/ReleaseTests___q431xAODDigestTest-test__q431xAODDigestTest-test__m.html
Interesiting... Jets are definitely still being produced as part of q431. I only checked that in my local test the file produced by https://gitlab.cern.ch/atlas-sit/CI/-/blob/master/TestConfig/q431xAODDigestTest.citest#L64 would say "No difference". I didn't run the CI script as-is. (I just copy-pasted the relevant lines into my terminal.)
Luckily the printouts from the test should make it clear if the job fails because of this, that it's expecting a different "No difference" output. But it's very good that you warned us!
I should have explained it better in the previous comment: the mentioned CI change should go in in parallel with !44380 (merged) which completely removes the jet/etmiss AOD content. So, luckily there are still jets in the AOD right now otherwise we would have not noticed the fastjet problem here.
Too many things happening in parallel.
CI Result FAILURE (hash 895ac987)Athena AthSimulation AthGeneration AnalysisBase AthAnalysis DetCommon externals cmake make required tests optional tests Full details available on this CI monitor view
Athena: number of compilation errors 0, warnings 0
AthSimulation: number of compilation errors 0, warnings 0
AthGeneration: number of compilation errors 0, warnings 0
AnalysisBase: number of compilation errors 0, warnings 0
AthAnalysis: number of compilation errors 0, warnings 0
DetCommon: number of compilation errors 0, warnings 0
For experts only: Jenkins output [CI-MERGE-REQUEST-CC7 36372]The Jet problem is gone.
The test is still marked as a failure, but this time because of:2c2 < No changes for: nTopo nIdTracks nMuons nElec nTrueElec nFakeElec nPhot nTruePhot nFakePhot --- > No changes for: nTopo nIdTracks nJets nMuons nElec nTrueElec nFakeElec nPhot nTruePhot nFakePhot ==================END q431 Digest Diff with exit code: 1 Sun Jul 4 19:58:16 CEST 2021 The xAODDigest diff failed! The output (>) differs from the reference (<) The q431 TEST WARNING! If differences are expected this is the patch to apply
However the HLT test timed out once again. Is that seen in other MRs as well, or is there something that I'm introducing here that halts the HLT execution?
The Trigger_athenaHLT-test also timed out independently in 2 other MRs earlier today: !44914 (merged) and !44380 (merged)
So from the CI perspective I'd say the MR here looks fine.
For the timeouts, see https://its.cern.ch/jira/browse/ATR-23767 in the CI Status Board
Let's maybe do it in the evening. Once it's pushed in, it would be good to clean all of the CI nodes. @aundrus, will you be able to launch such a cleaning this evening (CERN time)?
Without a cleaning the CI jobs will continue to pick up the old FastJet version from the local disk from under
AthenaExternals
. (Since the CI never removes any files, it can only overwrite them...)Edited by Attila Krasznahorkayremoved review-pending-level-1 label
added review-approved label
mentioned in commit abe4bea0
added sweep:ignore label
mentioned in merge request !44970 (closed)