20% speed up in pixel tracking by splitting and early breaking of pair creation for forward/backward velo tracks
This merge was reverted by !832 (merged) and has thus not been applied to master. It will be applied to the TDR branch instead, and will make it to master when TDR is merged to master.
- Forward velo tracks can be identified by having all hits at z > -100 mm.
- Backward velo tracks can be identified by having all hits at z < 100 mm.
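The two criteria can be written as a small classifier. This is a minimal sketch, assuming a track is represented only by the z positions of its hits in mm; the names `Z_CENTRAL_MM` and `classify_track` are illustrative and are not taken from the pixel tracking code.

```python
Z_CENTRAL_MM = 100.0  # half-width of the central region quoted above (assumed constant name)


def classify_track(z_hits):
    """Return (is_forward, is_backward) for the z positions (mm) of a track's hits."""
    if not z_hits:
        raise ValueError("a track needs at least one hit")
    is_forward = all(z > -Z_CENTRAL_MM for z in z_hits)  # all hits at z > -100 mm
    is_backward = all(z < Z_CENTRAL_MM for z in z_hits)  # all hits at z < 100 mm
    return is_forward, is_backward
```

Note that a track whose hits all lie inside [-100, 100] mm satisfies both criteria, which is why the central region is shared between the two searches.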
In the algorithm one can check from dx, dy that dr/dz > 0 (< 0) when looking at pairs of hits at z > -100 mm (z < 100 mm), and kill combinatorics early in the pair creation. The original velo pixel tracking searched for pairs scanning from LastModule (the most downstream) to FirstModule (the most upstream). For forward tracks this order is kept; for the backward track search it is best to reverse the loop order.
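One possible way to do the dr/dz sign check without a square root is sketched here. It relies on the identity r_b^2 - r_a^2 = dx (x_a + x_b) + dy (y_a + y_b), so the sign of dr follows from dx and dy directly. This is an assumption about how the check could be written, not the actual code of this merge request; hits are taken as plain (x, y, z) tuples in mm and both function names are hypothetical.

```python
def radial_slope_sign(hit_a, hit_b):
    """Sign of dr/dz for two hits given as (x, y, z) in mm; 0 if undefined."""
    xa, ya, za = hit_a
    xb, yb, zb = hit_b
    dx, dy, dz = xb - xa, yb - ya, zb - za
    # r_b^2 - r_a^2 = dx*(xa + xb) + dy*(ya + yb): same sign as dr, no sqrt needed
    dr2 = dx * (xa + xb) + dy * (ya + yb)
    product = dr2 * dz
    return (product > 0) - (product < 0)


def keep_pair(hit_a, hit_b, forward):
    """Early rejection: forward pairs need dr/dz > 0, backward pairs dr/dz < 0."""
    sign = radial_slope_sign(hit_a, hit_b)
    return sign > 0 if forward else sign < 0
```

For example, `keep_pair((1.0, 0.0, 0.0), (2.0, 0.0, 100.0), forward=True)` accepts the pair, while the same pair is rejected in the backward search.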
Indeed, the most time-consuming part is the central region, z in [-100, 100] mm, which is shared between the forward and backward track searches. In a sequential approach where hits are flagged starting from downstream (upstream), you likely save computing time, because towards the end of the execution the hits in [-100, 100] mm are found already flagged.
According to the flagging of the hits, one can first perform a forward track search and then a backward track search. In the current implementation one can still switch back to the original, simple search.
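The loop ordering and the hit flagging can be illustrated with a toy sketch. It assumes modules are indexed from the most upstream (0) to the most downstream, that hits are plain integer identifiers, and that pairs are formed only between consecutive modules in the scan order; none of these simplifications, nor the function names, come from the real implementation.

```python
def module_order(n_modules, forward):
    """Module indices in the order they are scanned when creating pairs."""
    # Forward search: start from the most downstream module (original order).
    # Backward search: start from the most upstream module (reversed order).
    return list(range(n_modules - 1, -1, -1)) if forward else list(range(n_modules))


def seed_pairs(hits_by_module, used, forward):
    """Return candidate pairs from consecutive modules, skipping flagged hits."""
    order = module_order(len(hits_by_module), forward)
    pairs = []
    for m0, m1 in zip(order, order[1:]):
        for a in hits_by_module[m0]:
            if a in used:
                continue  # hit already flagged by an earlier search
            for b in hits_by_module[m1]:
                if b not in used:
                    pairs.append((a, b))
    return pairs
```

Running the forward search first and adding the hits of the tracks it finds to `used` shrinks the number of pairs the backward search has to consider, most of all in the shared central region.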
From a first check, this brings a 25% speed-up in the actual calculations, with a tiny drop in efficiencies.
Comparison of performance on 1000 minBias events with seeding and velo tracking (independent algorithms):
Default
lb-run --nightly-cvmfs --nightly --lhcb-future Brunel/future
PrChecker2Fast INFO Results
PrChecker2 INFO Results
PrChecker2.Velo INFO **** Velo 212406 tracks including 3741 ghosts [ 1.76 %], Event average 1.26 % ****
PrChecker2.Velo INFO 01_velo : 105761 from 107069 [ 98.78 %] 3226 clones [ 2.96 %], purity: 99.81 %, hitEff: 95.00 %
PrChecker2.Velo INFO 02_long : 61612 from 61845 [ 99.62 %] 775 clones [ 1.24 %], purity: 99.85 %, hitEff: 97.80 %
PrChecker2.Velo INFO 03_long>5GeV : 39500 from 39626 [ 99.68 %] 342 clones [ 0.86 %], purity: 99.86 %, hitEff: 98.36 %
PrChecker2.Velo INFO 04_long_strange : 3107 from 3139 [ 98.98 %] 32 clones [ 1.02 %], purity: 99.34 %, hitEff: 97.74 %
PrChecker2.Velo INFO 05_long_strange>5GeV : 1503 from 1525 [ 98.56 %] 11 clones [ 0.73 %], purity: 99.15 %, hitEff: 98.25 %
PrChecker2.Velo INFO 06_long_fromB : 64 from 66 [ 96.97 %] 0 clones [ 0.00 %], purity: 99.78 %, hitEff: 98.72 %
PrChecker2.Velo INFO 07_long_fromB>5GeV : 46 from 46 [100.00 %] 0 clones [ 0.00 %], purity: 99.69 %, hitEff: 98.74 %
Forward_tracks + Backward_tracks split (with these changes)
On top of current master branch
PrChecker2Fast INFO Results
PrChecker2 INFO Results
PrChecker2.Velo INFO **** Velo 211305 tracks including 2746 ghosts [ 1.30 %], Event average 0.96 % ****
PrChecker2.Velo INFO 01_velo : 105545 from 107069 [ 98.58 %] 3156 clones [ 2.90 %], purity: 99.83 %, hitEff: 95.01 %
PrChecker2.Velo INFO 02_long : 61536 from 61845 [ 99.50 %] 763 clones [ 1.22 %], purity: 99.86 %, hitEff: 97.79 %
PrChecker2.Velo INFO 03_long>5GeV : 39477 from 39626 [ 99.62 %] 340 clones [ 0.85 %], purity: 99.87 %, hitEff: 98.35 %
PrChecker2.Velo INFO 04_long_strange : 3089 from 3139 [ 98.41 %] 29 clones [ 0.93 %], purity: 99.35 %, hitEff: 97.80 %
PrChecker2.Velo INFO 05_long_strange>5GeV : 1498 from 1525 [ 98.23 %] 11 clones [ 0.73 %], purity: 99.15 %, hitEff: 98.28 %
PrChecker2.Velo INFO 06_long_fromB : 64 from 66 [ 96.97 %] 0 clones [ 0.00 %], purity: 99.78 %, hitEff: 98.72 %
PrChecker2.Velo INFO 07_long_fromB>5GeV : 46 from 46 [100.00 %] 0 clones [ 0.00 %], purity: 99.69 %, hitEff: 98.74 %
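To put a number on the efficiency drop, the efficiencies and ghost rates can be recomputed from the counts in the two PrChecker2.Velo outputs above. The sketch below copies the counts for the three largest categories; the variable names are illustrative.

```python
# (reconstructed by default, reconstructed with these changes, reconstructible),
# copied from the two PrChecker2.Velo outputs above
counts = {
    "01_velo": (105761, 105545, 107069),
    "02_long": (61612, 61536, 61845),
    "03_long>5GeV": (39500, 39477, 39626),
}


def efficiency(found, total):
    """Reconstruction efficiency in percent."""
    return 100.0 * found / total


# Change in efficiency in percentage points (negative means a loss)
delta = {
    name: efficiency(new, total) - efficiency(old, total)
    for name, (old, new, total) in counts.items()
}

# Ghost rates in percent: ghosts / all reconstructed velo tracks
ghost_default = 100.0 * 3741 / 212406
ghost_split = 100.0 * 2746 / 211305

for name, change in delta.items():
    print(f"{name}: {change:+.2f} percentage points")
print(f"ghost rate: {ghost_default:.2f}% -> {ghost_split:.2f}%")
```

With these counts the efficiency loss is about 0.2 percentage points for `01_velo` and smaller for the long-track categories, while the ghost rate decreases.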
Concerning timing, from the nightly:
TimingAuditor.T... INFO Reco | 28.720 | 28.825 | 0.950 444.8 41.05 | 1000 | 28.826 |
TimingAuditor.T... INFO RecoDecodingSeq | 1.110 | 1.188 | 0.426 6.8 0.46 | 1000 | 1.188 |
TimingAuditor.T... INFO createFTClusters | 0.130 | 0.161 | 0.045 0.8 0.07 | 1000 | 0.162 |
TimingAuditor.T... INFO PrStoreFTHit | 0.930 | 1.002 | 0.364 6.6 0.40 | 1000 | 1.003 |
TimingAuditor.T... INFO RecoTrFastSeq | 27.610 | 27.621 | 0.454 441.6 40.75 | 1000 | 27.621 |
TimingAuditor.T... INFO PrPixelTrackingFast | 4.900 | 4.829 | 0.067 34.0 3.68 | 1000 | 4.829 |
TimingAuditor.T... INFO PrHybridSeedingBest | 22.660 | 22.745 | 0.349 419.3 37.95 | 1000 | 22.746 |
With the fix:
TimingAuditor.T... INFO Reco | 27.280 | 27.322 | 0.956 460.6 38.58 | 1000 | 27.322 |
TimingAuditor.T... INFO RecoDecodingSeq | 1.330 | 1.185 | 0.467 4.6 0.38 | 1000 | 1.185 |
TimingAuditor.T... INFO createFTClusters | 0.200 | 0.162 | 0.049 0.6 0.06 | 1000 | 0.162 |
TimingAuditor.T... INFO PrStoreFTHit | 1.110 | 0.998 | 0.392 4.0 0.32 | 1000 | 0.999 |
TimingAuditor.T... INFO RecoTrFastSeq | 25.950 | 26.119 | 0.461 456.0 38.28 | 1000 | 26.119 |
TimingAuditor.T... INFO PrPixelTrackingFast | 3.800 | 3.883 | 0.065 27.3 2.50 | 1000 | 3.884 |
TimingAuditor.T... INFO PrHybridSeedingBest | 22.080 | 22.194 | 0.346 428.6 36.38 | 1000 | 22.194 |
According to this, the speed-up in PrPixelTrackingFast is ~21%.
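The quoted figure can be cross-checked from the PrPixelTrackingFast and RecoTrFastSeq rows of the two tables. The sketch assumes that the first two numeric columns of the TimingAuditor output are the average user and clock times per event; the column meaning is not stated in the logs above.

```python
# (user, clock) columns copied from the TimingAuditor tables above;
# interpreting them as average times per event is an assumption
nightly = {"PrPixelTrackingFast": (4.900, 4.829), "RecoTrFastSeq": (27.610, 27.621)}
with_fix = {"PrPixelTrackingFast": (3.800, 3.883), "RecoTrFastSeq": (25.950, 26.119)}


def relative_gain(before, after):
    """Relative time reduction in percent."""
    return 100.0 * (before - after) / before


gains = {
    name: tuple(relative_gain(b, a) for b, a in zip(nightly[name], with_fix[name]))
    for name in nightly
}

for name, (user_gain, clock_gain) in gains.items():
    print(f"{name}: user {user_gain:.1f}%, clock {clock_gain:.1f}%")
```

For PrPixelTrackingFast this gives roughly 22.4% from the user column and 19.6% from the clock column, which brackets the ~21% quoted above; the gain on the whole RecoTrFastSeq sequence is smaller because PrHybridSeedingBest dominates it.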
People who may be interested in this and may want to comment on the change: @adudziak, @decianm, @graven, @chasse, @raaij, @fpolci, @sponce, @cofitzpa, @gligorov. Let me know if there are corner cases I am missing (for instance tracks at large z with dr/dz < 0?), if what I did here looks reasonable, and if I have to test the timing improvement w.r.t. the current master branch.