ARM support

Merged Arthur Marius Hennequin requested to merge ahennequ_armSupport into master

This set of MRs makes the necessary changes to build and run on ARM platforms. Tested on arm64-thunderx2.

Goes with Rec!2205 (merged), Phys!783 (merged), and Allen!434 (closed).

@bcouturi @sponce @clemenci

Edited by Alessandro Scarabotto

Activity

  • Arthur Marius Hennequin changed the description

  • Sebastien Ponce approved this merge request

  • mentioned in merge request Rec!2205 (merged)

  • mentioned in merge request Phys!783 (merged)

  • mentioned in merge request Allen!434 (closed)

    • Resolved by Arthur Marius Hennequin

      Anticipating comments, here is what the timing table looks like with the changes that make it cross-platform, compared with the rdtsc method (running on AMD):

      This branch, on ARM:

      HLTControlFlowMgr                      INFO Average ticks per millisecond: 1000000
      HLTControlFlowMgr                      INFO 
       | Name of Algorithm                               | Execution Count | Total Time / s  | Avg. Time / us   |
       | "LHCb__MDF__IOAlg"                              |           1e+06 |          94.695 |           94.695 |
       | "DummyEventTime"                                |          927873 |           3.947 |            4.253 |
       | "reserveIOV"                                    |          927873 |           4.734 |            5.102 |
       | "VeloClusterTrackingSIMD"                       |          927873 |        4697.825 |         5063.005 |
       | "PrStorePrUTHits"                               |          927873 |          93.556 |          100.828 |
       | "PrVeloUT"                                      |          927873 |         370.566 |          399.371 |
       | "FTRawBankDecoder"                              |          927873 |          98.871 |          106.556 |
       | "SciFiTrackForwardingStoreHit"                  |          927873 |         115.238 |          124.196 |
       | "SciFiTrackForwarding"                          |          927873 |         305.682 |          329.443 |
       | "MuonRawToHits"                                 |          927873 |          78.332 |           84.421 |
       | "MuonIDHlt1AlgPr"                               |          927873 |          52.800 |           56.904 |
       | "VeloKalman"                                    |          927873 |          19.385 |           20.892 |
       | "MakeZip__PrFittedForwardTracks__PrMuonPIDs"    |          927873 |           5.921 |            6.381 |
       | "PrGECFilter"                                   |           1e+06 |           4.659 |            4.659 |
       | "TrackBeamLineVertexFinderSoA"                  |          927873 |         123.007 |          132.569 |

      Master, on x86:

      HLTControlFlowMgr                      INFO Average ticks per millisecond: 2200336
      HLTControlFlowMgr                      INFO 
       | Name of Algorithm                               | Execution Count | Total Time / s  | Avg. Time / us   |
       | "LHCb__MDF__IOAlg"                              |           2e+06 |         282.413 |          141.207 |
       | "DummyEventTime"                                |     1.85577e+06 |           2.659 |            1.433 |
       | "reserveIOV"                                    |     1.85577e+06 |           3.005 |            1.619 |
       | "VeloClusterTrackingSIMDFaster"                 |     1.85577e+06 |         630.727 |          339.874 |
       | "PrStorePrUTHits"                               |     1.85577e+06 |          64.759 |           34.896 |
       | "PrVeloUT"                                      |     1.85577e+06 |         169.711 |           91.450 |
       | "FTRawBankDecoder"                              |     1.85577e+06 |          67.106 |           36.161 |
       | "SciFiTrackForwardingStoreHit"                  |     1.85577e+06 |          60.310 |           32.499 |
       | "SciFiTrackForwarding"                          |     1.85577e+06 |         155.980 |           84.051 |
       | "MuonRawToHits"                                 |     1.85577e+06 |          91.598 |           49.358 |
       | "MuonIDHlt1AlgPr"                               |     1.85577e+06 |          31.238 |           16.833 |
       | "VeloKalman"                                    |     1.85577e+06 |          13.107 |            7.063 |
       | "MakeZip__PrFittedForwardTracks__PrMuonPIDs"    |     1.85577e+06 |           3.919 |            2.112 |
       | "PrGECFilter"                                   |           2e+06 |           2.691 |            1.345 |
       | "TrackBeamLineVertexFinderSoA"                  |     1.85577e+06 |          50.216 |           27.059 |

      IMO, the 1 ns resolution (1000000 ticks per millisecond) should be enough for most algorithms.
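
The "Average ticks per millisecond: 1000000" line in the ARM log corresponds to a clock that ticks once per nanosecond. As a minimal sketch, assuming a `std::chrono`-based clock (the `LHCb::chrono::fast_clock` added in this MR may well be implemented differently), the tick rate and the "Avg. Time / us" column could be derived as below. The `sketch` namespace and both function names are hypothetical and are not taken from the MR.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical illustration only; not the LHCb::chrono::fast_clock from this MR.
namespace sketch {

  // Number of ticks of `Duration` in one millisecond, computed at compile time.
  // For a nanosecond-based clock this is 1000000.
  template <typename Duration>
  constexpr std::int64_t ticks_per_millisecond() {
    return std::chrono::duration_cast<Duration>( std::chrono::milliseconds( 1 ) ).count();
  }

  // Average time per call in microseconds, given the accumulated tick count,
  // the tick rate, and the number of calls. Returns 0 when nothing was called.
  inline double average_us( std::int64_t total_ticks, std::int64_t ticks_per_ms, std::int64_t calls ) {
    assert( ticks_per_ms > 0 ); // a clock must tick at least once per millisecond
    if ( calls == 0 ) return 0.0;
    const double total_us = 1000.0 * static_cast<double>( total_ticks ) / static_cast<double>( ticks_per_ms );
    return total_us / static_cast<double>( calls );
  }

} // namespace sketch
```

With these definitions, `sketch::average_us( 4697825000000, 1000000, 927873 )` gives roughly 5063 us, in line with the "VeloClusterTrackingSIMD" row of the ARM table. Deriving the tick rate from the clock's duration type, rather than calibrating it at runtime as an rdtsc-based timer has to, is what would make the value platform-independent in this sketch.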

      @sponce @nnolte @chasse

      Edited by Arthur Marius Hennequin
    • Resolved by Roel Aaij

      @ahennequ Are the diffs here and in Rec!2205 (merged) really all that is required to build and run the RICH reco? I only ask because, last I heard, Arm had a lot of problems with Vc, which I use in the RICH, and none of the changes here seem to address that.

      Don't get me wrong: if it is, I'm happy. I'm just a little surprised, as something must have changed...

  • added 1 commit

    • 5e40e4da - Create LHCb::chrono::fast_clock and use it

  • added 1 commit

    • af2eb9c8 - Create LHCb::chrono::fast_clock and use it

  • added 1 commit

    • b6fdd299 - Move clock_frequency to Chrono.cpp

  • added 1 commit

    • f3bcc4a9 - Move clock_frequency to Chrono.cpp

  • added 1 commit

    • 5e44e2d8 - Remove duplicate function signature

  • added Build label

  • added 1 commit

    • c8814baf - Fix SOACollection test on arm

  • added 1 commit

    • 5ba6d8ae - Fix SOACollection test on arm

  • added 1 commit

    • 69c3698b - Add Neon backend to SIMDWrapper

  • added 1 commit

    • e7bf2ede - Add Neon backend to SIMDWrapper
