Commit 0138988f authored by Christos Anastopoulos

GSFFindIndexOfMinimum : Documentation fixes

parent 58819a66
Merge request !71656 GSFFindIndexOfMinimum : Documentation fixes
@@ -14,13 +14,12 @@
* possible implementation
*
* The issues are described in ATLASRECTS-5244
* Some timing improvements in the overall time
* for the algorithm
* Some timing improvements in the overall
* GSF refitting algorithm time can be found at :
* https://gitlab.cern.ch/atlas/athena/-/merge_requests/67962
*
* In general, a slow implementation can
* significantly slow down
* the overall algorithm.
* In general, a slow implementation can significantly
* increase the time for the GSF refitting
* algorithm.
*
* There is literature on the internet,
* namely in blogs by Wojciech Mula
@@ -29,25 +28,23 @@
* integers using intrinsics and various
* AVX levels.
*
* In Atlas currently we need to solve it for float.
* In ATLAS currently we need to solve it for float.
* Furthermore, after discussion with Scott Snyder
* we opted for using the gnu vector types.
* we opted for using the gnu vector types from "CxxUtils/vec.h".
* And we target x86_64-v2.
* In these implementations a vec<float,4> or vec<int,4>
* is a 4-wide register, and we do operations explicitly
* 4 elements at a time.
*
* For completeness and future comparisons
* we collect
*
* - A "C" implementation
* - A "STL" implementation
* - A "Vec" implementation always tracking the index
* - A "C" implementation.
* - A "STL" implementation.
* - A "Vec" implementation always tracking the index.
* - A "Vec" implementation that updates the index when a new minimum is
* found. This can be faster than the above when the inputs are not ordered.
* - A "Vec" implementation that finds the minimum and then
* finds the index. This should be faster in most cases
*
* In the vec implementations a vec<float,4> or vec<int,4>
* is a 4-wide register, and we do operations explicitly, 4 elements at a time.
* Still probably much more readable than using intrinsics.
* - A "Vec" implementation that first finds the minimum and then
* finds the index. This can be faster in many cases.
*
* We provide a convenient entry method
* to select at compile time an implementation
@@ -344,7 +341,7 @@ float vecFindMinimum(const float* distancesIn, int n) {
return minvalue;
}
ATH_ALWAYS_INLINE
int32_t vecIdxofValue(const float value, const float* distancesIn, int n) {
int32_t vecIdxOfValue(const float value, const float* distancesIn, int n) {
using namespace CxxUtils;
const float* array =
std::assume_aligned<GSFConstants::alignment>(distancesIn);
@@ -368,10 +365,12 @@ int32_t vecIdxofValue(const float value, const float* distancesIn, int n) {
// 4
vload(values4, array + i + 12); // 12-15
vec<int, 4> eq4 = values4 == target;
//See if we have the value in any
//of the vectors
vec<int, 4> eq12 = eq1 || eq2;
vec<int, 4> eq34 = eq3 || eq4;
vec<int, 4> eqAny = eq12 || eq34;
//If yes then use scalar code to locate it
if (vany(eqAny)) {
for (int32_t idx = i; idx < i + 16; ++idx) {
if (distancesIn[idx] == value) {
@@ -389,7 +388,7 @@ int32_t vecMinThenIdx(const float* distancesIn, int n) {
const float* array =
std::assume_aligned<GSFConstants::alignment>(distancesIn);
const float min = vecFindMinimum(array, n);
return vecIdxofValue(min, array, n);
return vecIdxOfValue(min, array, n);
}
} // namespace findIdxOfMinDetail
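
For readers without the full file at hand, the "find the minimum, then find its index" idea from the hunks above can be sketched in a self-contained form. This is only an illustration under stated assumptions, not the Athena code: the typedefs `vfloat4` and `vint4` are hypothetical stand-ins for `vec<float,4>` and `vec<int,4>` built directly on the GNU vector extension (GCC or Clang), the `sketch*` and `stlIdxOfMin` names are made up here, the loops step 4 elements at a time rather than 16, and `n` is assumed to be positive and a multiple of 4.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <iterator>

// Hypothetical stand-ins for vec<float,4> / vec<int,4> (GNU vector extension).
typedef float vfloat4 __attribute__((vector_size(16)));
typedef int vint4 __attribute__((vector_size(16)));

// "STL" style reference: first index of the smallest element.
inline std::int32_t stlIdxOfMin(const float* in, int n) {
  return static_cast<std::int32_t>(
      std::distance(in, std::min_element(in, in + n)));
}

// Pass 1: smallest value. Assumes n > 0 and n % 4 == 0.
inline float sketchFindMinimum(const float* in, int n) {
  vfloat4 best;
  std::memcpy(&best, in, sizeof(best));  // load elements 0-3
  for (int i = 4; i < n; i += 4) {
    vfloat4 v;
    std::memcpy(&v, in + i, sizeof(v));
    // Lane-wise minimum, written per lane for portability.
    for (int k = 0; k < 4; ++k) {
      best[k] = v[k] < best[k] ? v[k] : best[k];
    }
  }
  // Horizontal reduction of the 4 lanes.
  float minvalue = best[0];
  for (int k = 1; k < 4; ++k) {
    if (best[k] < minvalue) {
      minvalue = best[k];
    }
  }
  return minvalue;
}

// Pass 2: first index holding `value`, or -1 if it is absent.
inline std::int32_t sketchIdxOfValue(float value, const float* in, int n) {
  const vfloat4 target = {value, value, value, value};
  for (int i = 0; i < n; i += 4) {
    vfloat4 v;
    std::memcpy(&v, in + i, sizeof(v));
    const vint4 eq = (v == target);  // a lane is -1 where equal, else 0
    // See if any lane matched; if yes, use scalar code to locate it.
    if (eq[0] | eq[1] | eq[2] | eq[3]) {
      for (std::int32_t idx = i; idx < i + 4; ++idx) {
        if (in[idx] == value) {
          return idx;
        }
      }
    }
  }
  return -1;
}

// Two passes: minimum first, then its index.
inline std::int32_t sketchMinThenIdx(const float* in, int n) {
  return sketchIdxOfValue(sketchFindMinimum(in, n), in, n);
}
```

The second pass compares for exact equality, which is safe here because the target value was read from the same array. With ties, both the STL reference and the sketch return the first matching index.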