Scalar dispatch and benchmarks
The branch adds the following parameters in the TestEm3_GV and FullCMS examples:
- The basket activation options config-vectorized-geom, config-vectorized-physics, config-vectorized-MSC and field-basketized now accept, in addition to 0 (deactivated) and 1 (activated), the value 2: basketize, but dispatch baskets in scalar mode.
- config-monitoring 1 activates printing of handler statistics: the number of tracks handled and the number of baskets processed in scalar mode and in vector mode. The list of handlers is sorted by decreasing number of tracks handled.
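To make the three option values concrete, here is a minimal sketch of how such a value could be mapped to a dispatch mode. The names `BasketMode`, `toBasketMode`, `basketizes` and `dispatchesVectorized` are hypothetical and only illustrate the semantics described above; they are not taken from the GeantV sources, and the fallback for unknown values is an assumption.

```cpp
// Hypothetical names for illustration only; the actual GeantV types may differ.
enum class BasketMode {
  kDisabled       = 0, // 0: deactivated, no basketizing
  kVectorDispatch = 1, // 1: activated, baskets dispatched in vector mode
  kScalarDispatch = 2  // 2: basketize, but dispatch baskets in scalar mode
};

// Maps the integer option value to a mode.
// Assumption: any value other than 1 or 2 is treated as "deactivated".
constexpr BasketMode toBasketMode(int optionValue)
{
  return optionValue == 1 ? BasketMode::kVectorDispatch
       : optionValue == 2 ? BasketMode::kScalarDispatch
                          : BasketMode::kDisabled;
}

// True when tracks are collected into baskets at all (values 1 and 2).
constexpr bool basketizes(BasketMode mode) { return mode != BasketMode::kDisabled; }

// True only when the baskets are also processed with the vector code path.
constexpr bool dispatchesVectorized(BasketMode mode)
{
  return mode == BasketMode::kVectorDispatch;
}
```

With this split, value 2 keeps the basketizing overhead but removes the vector kernels, which is what makes it useful as a comparison point between 0 and 1.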
There are new scripts in benchmarks/FullCMS and benchmarks/TestEm3 with names ending in _GV. These take as input the bench_GV.sh script from the same folder and should be run like:
FullCMS_GV.run bench_GV.sh
This runs the GeantV simulation corresponding to the settings in bench_GV.sh N times, printing only the final timings. It is meant for benchmarking different GeantV configurations, especially the config-vectorized-... options set to 0 (scalar) versus 2 (baskets with scalar dispatch), but also 1 (vectorized).
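The run-N-times-and-report pattern can be sketched in a few lines. This is not the content of the _GV scripts, which is not shown here; `averageRunSeconds` is a hypothetical helper that times an arbitrary callable with `std::chrono::steady_clock` and returns the mean wall-clock time per run.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper, not part of the GeantV benchmarks: runs `job`
// `repetitions` times and returns the mean wall-clock time per run in seconds.
template <typename Job>
double averageRunSeconds(Job &&job, int repetitions)
{
  assert(repetitions > 0 && "need at least one run to average");
  using Clock = std::chrono::steady_clock; // monotonic, suitable for intervals
  double totalSeconds = 0.0;
  for (int i = 0; i < repetitions; ++i) {
    const auto start = Clock::now();
    job(); // one full run of the workload under test
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    totalSeconds += elapsed.count();
  }
  return totalSeconds / repetitions;
}
```

Called once per configuration (for example with the option values 0, 2 and 1), this gives one averaged timing per setting that can be compared directly.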
Activity
Jenkins Build FAILURE. Results available at: Jenkins [GeantV-gitlab #631]
The avx2+fma failure on slc6 is due to an "illegal instruction" and does not seem to be related to any of the modifications: http://cdash.cern.ch/viewTest.php?onlyfailed&buildid=596360
I see the trace below, so the failure might be related to the change of types done in the models for the scalar build. @pcanal can you have a look?
#4 #5 0x00007ffbbf0fe014 in _GLOBAL__sub_I_SauterGavrilaPhotoElectricModel.cc () from /var/build/jenkins/workspace/GeantV-continuous/BACKEND/avx2+fma/BUILDTYPE/Release/COMPILER/gcc62/LABEL/slc6/OPTION/NONE/build/lib/libRealPhysics.so
On the failing machine, I can reproduce the problem and it is seemingly 'very' old. With the following commit:
commit 81162b178a4072347caf8a3803615bbcc649091c
Author: kumawat <kr@pcphsft110.dyndns.cern.ch>
Date: Mon Apr 23 16:56:46 2018 +0200
the failure is the same.
Note that the failure appears solely with Release build and the failure is different with RelWithDebInfo.
With the release build it fails in:
#0 _GLOBAL__sub_I_SauterGavrilaPhotoElectricModel.cc () at /cvmfs/sft-nightlies.cern.ch/lcg/views/devgeantv/Wed/x86_64+avx2+fma-slc6-gcc62-opt/include/Vc/version.h:115
which is the wrapper implementing:
112 static struct runLibraryAbiCheck {
113     runLibraryAbiCheck()
114     {
115         checkLibraryAbi(Vc_LIBRARY_ABI_VERSION, Vc_VERSION_NUMBER, Vc_VERSION_STRING);
116     }
117 } _runLibraryAbiCheck;
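For readers unfamiliar with the pattern: a file-scope object with a non-trivial constructor is initialized when the library is loaded, and GCC groups such initializers of one translation unit into a synthesized function named `_GLOBAL__sub_I_<file>`, which is why the frame is named after the .cc file rather than after a user function. A minimal sketch of the same shape, using the hypothetical names `RunLoadTimeCheck` and `loadTimeCheckRan` (not part of Vc):

```cpp
#include <cassert>

// Hypothetical flag recording that the load-time check ran; not part of Vc.
inline bool &loadTimeCheckRan()
{
  static bool ran = false;
  return ran;
}

// Same shape as the Vc snippet: a file-scope object whose constructor does
// the work. Its constructor runs during static initialization, before main().
static struct RunLoadTimeCheck {
  RunLoadTimeCheck()
  {
    assert(!loadTimeCheckRan() && "constructor should run exactly once");
    loadTimeCheckRan() = true;
  }
} runLoadTimeCheckInstance;
```

Because every such initializer in the translation unit ends up in the same synthesized function, a fault in any one of them is reported against that single frame.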
and the assembly code is:
Dump of assembler code for function _GLOBAL__sub_I_SauterGavrilaPhotoElectricModel.cc:
   0x00007ffff2fd4fa0 <+0>: 55 push %rbp
   0x00007ffff2fd4fa1 <+1>: 48 8d 15 b8 dc 0f 00 lea 0xfdcb8(%rip),%rdx # 0x7ffff30d2c60
   0x00007ffff2fd4fa8 <+8>: be 06 03 01 00 mov $0x10306,%esi
   0x00007ffff2fd4fad <+13>: bf 05 00 00 00 mov $0x5,%edi
   0x00007ffff2fd4fb2 <+18>: 48 89 e5 mov %rsp,%rbp
   0x00007ffff2fd4fb5 <+21>: 41 55 push %r13
   0x00007ffff2fd4fb7 <+23>: 48 83 ec 08 sub $0x8,%rsp
   0x00007ffff2fd4fbb <+27>: e8 40 a0 ff ff callq 0x7ffff2fcf000 <_ZN4Vc_16Common15checkLibraryAbiEjjPKc@plt>
   0x00007ffff2fd4fc0 <+32>: 48 8d 3d 1a ff 33 00 lea 0x33ff1a(%rip),%rdi # 0x7ffff3314ee1 <_ZStL8__ioinit>
   0x00007ffff2fd4fc7 <+39>: e8 c4 c8 ff ff callq 0x7ffff2fd1890 <_ZNSt8ios_base4InitC1Ev@plt>
   0x00007ffff2fd4fcc <+44>: 48 8b 3d 15 7b 33 00 mov 0x337b15(%rip),%rdi # 0x7ffff330cae8
   0x00007ffff2fd4fd3 <+51>: 48 8d 15 e6 97 33 00 lea 0x3397e6(%rip),%rdx # 0x7ffff330e7c0 <__dso_handle>
   0x00007ffff2fd4fda <+58>: 48 8d 35 00 ff 33 00 lea 0x33ff00(%rip),%rsi # 0x7ffff3314ee1 <_ZStL8__ioinit>
   0x00007ffff2fd4fe1 <+65>: e8 4a ae ff ff callq 0x7ffff2fcfe30 <__cxa_atexit@plt>
   0x00007ffff2fd4fe6 <+70>: 48 8d 3d f3 fe 33 00 lea 0x33fef3(%rip),%rdi # 0x7ffff3314ee0 <_ZL13gVersionCheck>
   0x00007ffff2fd4fed <+77>: be 06 0c 06 00 mov $0x60c06,%esi
   0x00007ffff2fd4ff2 <+82>: e8 99 b7 ff ff callq 0x7ffff2fd0790 <_ZN13TVersionCheckC1Ei@plt>
   0x00007ffff2fd4ff7 <+87>: 48 8b 05 1a 7d 33 00 mov 0x337d1a(%rip),%rax # 0x7ffff330cd18
   0x00007ffff2fd4ffe <+94>: c5 f9 ef c0 vpxor %xmm0,%xmm0,%xmm0
   0x00007ffff2fd5002 <+98>: 48 8d 90 60 09 00 00 lea 0x960(%rax),%rdx
   0x00007ffff2fd5009 <+105>: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
   0x00007ffff2fd5010 <+112>: c5 f8 11 00 vmovups %xmm0,(%rax)
=> 0x00007ffff2fd5014 <+116>: c4 e3 7d 39 (bad)
   0x00007ffff2fd5018 <+120>: 40 10 01 adc %al,(%rcx)
   0x00007ffff2fd501b <+123>: 48 83 c0 60 add $0x60,%rax
   0x00007ffff2fd501f <+127>: c5 f8 11 40 c0 vmovups %xmm0,-0x40(%rax)
   0x00007ffff2fd5024 <+132>: c4 e3 7d 39 (bad)
   0x00007ffff2fd5028 <+136>: 40 d0 01 rex rolb (%rcx)
   0x00007ffff2fd502b <+139>: c5 f8 11 40 e0 vmovups %xmm0,-0x20(%rax)
   0x00007ffff2fd5030 <+144>: c4 e3 7d 39 (bad)
   0x00007ffff2fd5034 <+148>: 40 rex
   0x00007ffff2fd5035 <+149>: f0 01 48 39 lock add %ecx,0x39(%rax)
   0x00007ffff2fd5039 <+153>: c2 75 d4 retq $0xd475
   0x00007ffff2fd503c <+156>: 48 8d 15 7d 97 33 00 lea 0x33977d(%rip),%rdx # 0x7ffff330e7c0 <__dso_handle>
   0x00007ffff2fd5043 <+163>: 48 8d 3d 86 01 09 00 lea 0x90186(%rip),%rdi # 0x7ffff30651d0 <__tcf_0>
   0x00007ffff2fd504a <+170>: 31 f6 xor %esi,%esi
   0x00007ffff2fd504c <+172>: c5 f8 77 vzeroupper
   0x00007ffff2fd504f <+175>: 48 83 c4 08 add $0x8,%rsp
   0x00007ffff2fd5053 <+179>: 41 5d pop %r13
   0x00007ffff2fd5055 <+181>: 5d pop %rbp
   0x00007ffff2fd5056 <+182>: e9 d5 ad ff ff jmpq 0x7ffff2fcfe30 <__cxa_atexit@plt>
End of assembler dump.
where obviously the 'bad' opcodes are the problem. The machine does not have valgrind installed, so I have not verified it is not due to a memory over-write (but it is unlikely).
The failing node has:
[sftnight@p01001533x71310 geant-manual]$ g++ --version
g++ (GCC) 6.2.0
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[sftnight@p01001533x71310 geant-manual]$ cat /etc/redhat-release
Scientific Linux CERN SLC release 6.10 (Carbon)
-- COMPILATION FLAGS ARE - -fabi-version=0 -mavx2 -pipe -m64 -fsigned-char -fPIC -pthread -std=c++14 -W -Wall -O2 -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wshadow -Wno-long-long -pedantic -
while the machine is a Xeon E5-2630L @ 2.00GHz, which definitely does not support avx2.
@mato can we remove the machine lcgapp-slc6-physical2 from the avx2+FMA partition?
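A mismatch like this (binary built with -mavx2, node without avx2) can be caught before running by inspecting the CPU feature flags, for instance the "flags" line of /proc/cpuinfo on Linux. The sketch below is only an illustration: `hasCpuFlag` is a hypothetical helper, it requires C++17 for `std::string_view`, and the flag strings in the usage note are made-up examples, not the actual flags of the failing node.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: returns true if `flag` appears as a whole
// whitespace-separated token in `flagsLine` (for example the value of a
// "flags" line from /proc/cpuinfo). Whole-token matching matters because
// "avx" is a prefix of "avx2".
constexpr bool hasCpuFlag(std::string_view flagsLine, std::string_view flag)
{
  std::size_t pos = 0;
  while (pos < flagsLine.size()) {
    // Skip separators between tokens.
    while (pos < flagsLine.size() && (flagsLine[pos] == ' ' || flagsLine[pos] == '\t')) ++pos;
    // Find the end of the current token.
    std::size_t end = pos;
    while (end < flagsLine.size() && flagsLine[end] != ' ' && flagsLine[end] != '\t') ++end;
    if (end > pos && flagsLine.substr(pos, end - pos) == flag) return true;
    pos = end;
  }
  return false;
}
```

For example, `hasCpuFlag("fpu sse4_2 avx", "avx2")` is false while `hasCpuFlag("fpu sse4_2 avx avx2 fma", "avx2")` is true, so a node reporting only the first set of flags should not be scheduled for an avx2+fma build.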
added 1 commit
- 217d85db - Fixed scalar dispatch for MSC. GV benchmarks produce a timing average.
Jenkins Build FAILURE. Results available at: Jenkins [GeantV-gitlab #632]
added 1 commit
- 239ee9d3 - Info printed from handlers during initialization.
Jenkins Build FAILURE. Results available at: Jenkins [GeantV-gitlab #633]
added 1 commit
- 672516fd - Fixed geometry file for cms examples and benchmarks. Fixing scripts.
Jenkins Build SUCCESS. Results available at: Jenkins [GeantV-gitlab #634]
added 2 commits
Jenkins Build FAILURE. Results available at: Jenkins [GeantV-gitlab #635]
Jenkins Build ABORTED. Results available at: Jenkins [GeantV-gitlab #636]