Skip to content
Snippets Groups Projects

Add minimum momentum cut in hybrid seeding

Merged Louis Henry requested to merge lohenry-addSeedingPCut into master

See !3409 (comment 6731059)

@jonrob

Without cut:

SeedTrackChecker_6f38a9cb              INFO Results
SeedTrackChecker_6f38a9cb              INFO **** Seed                             9558 tracks including            479 ghosts [ 5.01 %], Event average  4.82 % ****
SeedTrackChecker_6f38a9cb              INFO   01_hasT                        :    6497 from     7781 [ 83.50 %]      0 clones [ 0.00 %], purity: 99.56 %, hitEff: 98.00 %
SeedTrackChecker_6f38a9cb              INFO   02_long                        :    4642 from     4960 [ 93.59 %]      0 clones [ 0.00 %], purity: 99.70 %, hitEff: 98.60 %
SeedTrackChecker_6f38a9cb              INFO   03_long_P>5GeV                 :    3291 from     3420 [ 96.23 %]      0 clones [ 0.00 %], purity: 99.67 %, hitEff: 99.09 %

With cut at 4500:

SeedTrackChecker_6f38a9cb              INFO Results
SeedTrackChecker_6f38a9cb              INFO **** Seed                             6894 tracks including            284 ghosts [ 4.12 %], Event average  3.88 % ****
SeedTrackChecker_6f38a9cb              INFO   01_hasT                        :    4998 from     7781 [ 64.23 %]      0 clones [ 0.00 %], purity: 99.55 %, hitEff: 98.67 %
SeedTrackChecker_6f38a9cb              INFO   02_long                        :    3816 from     4960 [ 76.94 %]      0 clones [ 0.00 %], purity: 99.69 %, hitEff: 99.04 %
SeedTrackChecker_6f38a9cb              INFO   03_long_P>5GeV                 :    3292 from     3420 [ 96.26 %]      0 clones [ 0.00 %], purity: 99.67 %, hitEff: 99.09 %

I will be very curious to see the speedup. The cut is done at the earliest possible level and so the speedup is probably much, much better than nTracks(cut)/nTracks(without cut). I am more expecting a factor 4 or so, and if needed I can provide an even faster version (factor 2 or more), but this will start actually cutting slightly on performance.

Edited by Louis Henry

Merge request reports

Merge request pipeline #5588594 passed

Merge request pipeline passed for 5016fd44

Merged by Sebastien PonceSebastien Ponce 1 year ago (May 17, 2023 9:04am UTC)

Loading

Pipeline #5594295 passed

Pipeline passed for cee96a19 on master

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Louis Henry added 1 commit

    added 1 commit

    Compare with previous version

  • Louis Henry added 1 commit

    added 1 commit

    Compare with previous version

  • Christopher Rob Jones
  • mentioned in merge request Panoptes!262 (merged)

    • So I have started a regular ci-test in Panoptes!262 (merged) which now also using the option added here.

      Running locally though, a longer calibration test, I get the following

      ref-calib-no-seeding-p-cut.log

      ref-calib-with-seeding-p-cut.log

      In short, I see about a factor 4 speed up in the seeding. The end result in terms of the stats used for the calibration is essentially unchanged, as expected.

      The cut I am using is 10GeV. The impact could be even higher in the alignment (@jreich @pnaik @amarshal ) which use much tighter cuts.

      Edited by Christopher Rob Jones
    • Author Developer

      Excellent! Are you interested in further speed-ups? For instance, can you try running with

      L0_tolHp = [280, 540, 0]

      ?

    • I can try, but the seeding is no longer really dominant.

       | "PrKalmanFilterForward_944e4f66"                |     2.14519e+06 |        1912.032 |          891.310 |
       | "PrKalmanFilterMatch_5577cfdb"                  |     2.14519e+06 |        1908.046 |          889.452 |
       | "PrStoreSciFiHits_3cf8a836"                     |     2.14519e+06 |        1194.192 |          556.683 |
       | "FTRawBankDecoder"                              |     2.14519e+06 |        1125.662 |          524.737 |
       | "RichPhotonRecoLong_5c59f803"                   |     2.14519e+06 |         978.414 |          456.096 |
       | "PrHybridSeeding_f8d90f25"                      |     2.14519e+06 |         970.347 |          452.335 |
       | "RichPredPixelSignalLong_de24959d"              |     2.14519e+06 |         871.553 |          406.282 |
       | "RichSIMDPixels_4c6cdf50"                       |     2.14519e+06 |         698.652 |          325.682 |
       | "RichRawDecoder_6bf9d1d9"                       |     2.14519e+06 |         629.969 |          293.665 |
       | "VeloRetinaClusterTrackingSIMD_ceb40d92"        |     2.14519e+06 |         625.379 |          291.526 |
       | "PrForwardTrackingVelo_e05c8dc1"                |     2.14519e+06 |         620.011 |          289.023 |
       | "RichMassConesLong_cbc74925"                    |     2.14519e+06 |         390.147 |          181.870 |
       | "RichPixelClustering_1b99ed52"                  |     2.14519e+06 |         207.254 |           96.613 |
       | "RichTrackSegmentsLong_c41400dc"                |     2.14519e+06 |         138.636 |           64.626 |
       | "TBTCMatch_a0b610d8"                            |     2.14519e+06 |         101.735 |           47.424 |
      <snip>

      If I want to improve further, it would be better to focus on e.g. the track fit, and then PrStoreSciFiHits and FTRawBankDecoder. How much CPU optimisation has been done for these last two ;) ?

      Edited by Christopher Rob Jones
    • Actually... The above times could be biased. I run on a data set which has a lot of run changes, way more than normal, and if these algs have to re-cache conditions each time this could really bias things...

    • the last two were never a bottleneck and are thus not heavily optimised :) the track fit has no low-hanging fruits I am afraid

    • Yeah, I figured the track fit had seen a lot of effort so there was likely nothing easy to gain there, but wasn’t sure about the other two. Good to hear that perhaps with a bit of profiling some gains might be possible there.

    • B.t.w. I ran a local test over a big sample from a single run, so without all the run changes. The results where not massively different to above so PrStoreSciFiHits and FTRawBankDecoder where still two of the main cpu contributors.

    • Author Developer

      FTRawBankDecoder was supposed to be relatively optimised, although there is 'now' a sorting function that could actually be done another way. PrStoreSciFiHits is very non-optimal though.

    • Please register or sign in to reply
  • Sebastien Ponce approved this merge request

    approved this merge request

  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Please register or sign in to reply
    Loading