Refactor FieldCache::getB to enable multiversioning for the vectorized part
Note that I keep it at avx2 and avoid avx2+fma, as FMA could affect the numerical stability here.
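For reviewers less familiar with multiversioning, here is a minimal sketch of the idea, not the actual FieldCache code. It assumes GCC's `target_clones` attribute on x86-64 with glibc; the macro `SKETCH_MULTIVERSION` and the function `weightedSum` are hypothetical names used only for illustration. Listing "avx2" without "fma" means the AVX2 clone cannot use fused multiply-add instructions.

```cpp
#include <cstddef>

// Hypothetical guard for this sketch: target_clones relies on GCC-style
// ifunc dispatch, so fall back to a plain function everywhere else.
#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__) && \
    defined(__ELF__) && defined(__GLIBC__)
#define SKETCH_MULTIVERSION __attribute__((target_clones("avx2", "default")))
#else
#define SKETCH_MULTIVERSION
#endif

// Hypothetical stand-in for the vectorizable part of an interpolation:
// a weighted sum over n values. With the attribute active, the compiler
// emits one clone built for AVX2 (FMA not enabled) and one baseline clone.
SKETCH_MULTIVERSION
double weightedSum(const double* values, const double* weights, std::size_t n)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    sum += values[i] * weights[i];
  }
  return sum;
}
```

The call site does not change: callers just call `weightedSum(values, weights, n)`, and on a supporting toolchain the choice between the AVX2 and default clone is made from the CPU features of the machine the job runs on.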
From what I am looking at, this plays a role in the InDet fitters that use the full field (not just the fast/ZR one).
I have run with and without the change 4 times each on the usual ttbar, with calo + tracking + egamma, on a Skylake machine, and the PerfMonMT numbers seem to be:
before
PerfMonMTSvc INFO Execute 99 61886 InDetAmbiguitySolver
PerfMonMTSvc INFO Execute 99 57843 InDetExtensionProcessor
PerfMonMTSvc INFO Execute 99 9740 EMBremCollectionBuilder
after
PerfMonMTSvc INFO Execute 99 60181 InDetAmbiguitySolver
PerfMonMTSvc INFO Execute 99 56359 InDetExtensionProcessor
PerfMonMTSvc INFO Execute 99 9300 EMBremCollectionBuilder
So it seems it actually helps...
Edited by Christos Anastopoulos