Reduce the amount of memory requested in each HIP stream. (!788) · Merge requests · LHCb / Allen

Daniel Hugo Campora Perez requested to merge dcampora_reduce_memory_hip into master Mar 04, 2022

This MR changes the HIP configuration of tests in two ways:

It reduces the amount of memory requested in each HIP stream from 3000 MB to 2800 MB. This results in a total requested memory of 28000 MB, leaving 4 GB free for Allen constants and other memory. This was prompted by the failure observed in MR !783 (merged), specifically https://gitlab.cern.ch/lhcb/Allen/-/jobs/19982275#L158.
It reduces the number of events processed in each batch from 5000 to 2800. In principle we allocate more than we need, and therefore the previous configuration of 3 GB for batches of 5k events was optimistic. However, given that we observe some issues with this configuration as more algorithms get added (e.g. see !690 (comment 5334944)), the configuration of this MR is more conservative.
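
The memory figures above can be checked with a short calculation. This is only a sketch of the arithmetic: the stream count is inferred from the two quoted totals rather than stated in this MR, "4 GB" is read as 4000 MB, and all names are hypothetical.

```python
# Figures quoted in the MR description.
OLD_MEMORY_PER_STREAM_MB = 3000
NEW_MEMORY_PER_STREAM_MB = 2800
TOTAL_REQUESTED_MB = 28000
FREE_AFTER_REQUEST_MB = 4000  # assumption: "4 GB" read as 4000 MB


def inferred_stream_count(total_mb: int, per_stream_mb: int) -> int:
    """Infer the number of streams from the total and per-stream requests."""
    streams, remainder = divmod(total_mb, per_stream_mb)
    if remainder != 0:
        raise ValueError("total is not a whole multiple of the per-stream request")
    return streams


# Inferred, not stated in the MR: 28000 MB / 2800 MB per stream.
streams = inferred_stream_count(TOTAL_REQUESTED_MB, NEW_MEMORY_PER_STREAM_MB)

# Device capacity implied by "total requested + remaining free".
implied_capacity_mb = TOTAL_REQUESTED_MB + FREE_AFTER_REQUEST_MB

# What the previous setting would request with the same inferred stream count.
old_total_requested_mb = streams * OLD_MEMORY_PER_STREAM_MB
old_free_mb = implied_capacity_mb - old_total_requested_mb

print(f"inferred streams:      {streams}")
print(f"implied capacity:      {implied_capacity_mb} MB")
print(f"old total requested:   {old_total_requested_mb} MB")
print(f"old free after request: {old_free_mb} MB")
```

Under these assumptions, the previous setting would have left roughly half as much headroom for Allen constants and other memory as the new one.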

As a side note, none of these issues exist for the CUDA configurations, where we allocate 500 MB for batches of 500 events, nor for the CPU configuration, where we allocate 100 MB for batches of 100 events.
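
One way to compare these configurations is the memory allocated per event in a batch. The ratio is not stated in this MR; the sketch below derives it from the figures quoted here, and the labels and helper name are hypothetical.

```python
# (memory per stream in MB, events per batch), as quoted in the MR description.
configurations = {
    "HIP (before this MR)": (3000, 5000),
    "HIP (this MR)": (2800, 2800),
    "CUDA": (500, 500),
    "CPU": (100, 100),
}


def mb_per_event(memory_mb: float, events_per_batch: int) -> float:
    """Memory available per event if the stream memory is shared evenly."""
    return memory_mb / events_per_batch


ratios = {
    name: mb_per_event(memory_mb, events)
    for name, (memory_mb, events) in configurations.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} MB per event")
```

Read this way, the previous HIP configuration was the only one budgeting less than 1 MB per event, and this MR brings HIP in line with the CUDA and CPU configurations.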

As a consequence of these changes, a performance reduction on the MI100 is expected.

Edited Mar 04, 2022 by Daniel Hugo Campora Perez
