Add monitoring thread
The goal is to add a monitoring thread to Allen to produce histograms of the HLT1 rates (see https://gitlab.cern.ch/lhcb-parallelization/Allen/issues/102 and https://gitlab.cern.ch/lhcb-parallelization/Allen/issues/103).
The current version addresses https://gitlab.cern.ch/lhcb-parallelization/Allen/issues/102:
- `HostBuffers` managed by a `HostBuffersManager` class
- Indices passed to GPU threads for use by `Stream`
- `HostBuffersManager` keeps queues of `Empty` and `Filled` `HostBuffers` to be passed to GPU and monitoring threads, respectively (a rough sketch of this structure follows below)

Currently there is no monitoring thread, so `Filled` `HostBuffers` are not being processed and emptied.
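To make the description concrete, the queue structure might look roughly like the following. This is a minimal sketch for illustration only: the method names (`assign_buffer`, `mark_filled`, `mark_empty`, `pop_filled`) and the locking scheme are assumptions, not the actual Allen `HostBuffersManager` interface.

```cpp
// Minimal illustrative sketch of the Empty/Filled queue idea; the interface
// and locking here are assumptions, not Allen's actual implementation.
#include <cstddef>
#include <deque>
#include <mutex>

struct HostBuffers {
  // Host-side outputs of one processed batch (decisions, counters, ...).
};

class HostBuffersManager {
public:
  explicit HostBuffersManager(std::size_t n) : m_buffers(n) {
    for (std::size_t i = 0; i < n; ++i) m_empty.push_back(i);
  }

  // Hand an empty buffer index to a GPU thread; allocate a new buffer if
  // none is free (the dynamic fallback discussed in the thread below).
  std::size_t assign_buffer() {
    std::lock_guard<std::mutex> lock(m_mutex);
    if (m_empty.empty()) {
      m_buffers.emplace_back();  // deque growth keeps existing references valid
      return m_buffers.size() - 1;
    }
    const std::size_t i = m_empty.front();
    m_empty.pop_front();
    return i;
  }

  // A Stream is done writing: queue the buffer for the monitoring thread.
  void mark_filled(std::size_t i) {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_filled.push_back(i);
  }

  // Monitoring is done (or skipped): recycle the buffer for the GPU threads.
  void mark_empty(std::size_t i) {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_empty.push_back(i);
  }

  // Monitoring thread polls for the next filled buffer; false if none waiting.
  bool pop_filled(std::size_t& i) {
    std::lock_guard<std::mutex> lock(m_mutex);
    if (m_filled.empty()) return false;
    i = m_filled.front();
    m_filled.pop_front();
    return true;
  }

  HostBuffers& at(std::size_t i) { return m_buffers[i]; }

private:
  std::mutex m_mutex;
  std::deque<HostBuffers> m_buffers;  // buffer storage, indexed access
  std::deque<std::size_t> m_empty;    // indices ready for GPU threads
  std::deque<std::size_t> m_filled;   // indices awaiting monitoring
};
```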
Activity
added 1 commit
- da81f25f - Added (currently trivial) monitoring function; Added monitoring thread(s) to async loop
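For context, a monitoring thread consuming `Filled` buffers might loop roughly as follows. This is purely illustrative and reuses the hypothetical manager interface sketched in the description above, not the actual async-loop code added in this commit.

```cpp
// Illustrative monitoring loop over the hypothetical manager sketched above.
#include <atomic>
#include <chrono>
#include <thread>

void monitor_loop(HostBuffersManager& manager, std::atomic<bool>& done) {
  std::size_t i = 0;
  while (!done.load()) {
    if (manager.pop_filled(i)) {
      // Fill monitoring histograms from manager.at(i) here.
      manager.mark_empty(i);  // recycle the buffer for the GPU threads
    }
    else {
      std::this_thread::sleep_for(std::chrono::milliseconds {1});  // avoid busy-waiting
    }
  }
}
```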
Thanks for implementing this; it looks very good.

The number of `HostBuffers` instances is determined by the speed of the monitoring: if a buffer is not available because the monitoring is not done with it yet, a new `HostBuffers` is allocated. This would result in the host running out of memory if the monitoring is slower than the processing.

I would instead suggest that the number of `HostBuffers` instances is set to `n_streams + n_mon + 1`. The decision whether to monitor a given `HostBuffers` is then made as soon as the processor indicates that it is done: if a monitoring thread is available, it is passed the buffer; if no monitoring thread is available, the buffer is immediately "freed".

We can then tune the number of monitoring threads depending on the availability of resources in the machines: cores, memory bandwidth, etc. I could even imagine that we end up with two types of monitoring work:

- must do, such as rate monitoring,
- optional, such as monitoring of reconstructed quantities,

where the number of threads assigned to each task is set independently.
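A sketch of that hand-off decision, under the same hypothetical interface as above (`on_stream_done` and `idle_monitors` are invented names, not part of Allen):

```cpp
// Sketch of the suggested policy: monitor a finished buffer only if a
// monitoring thread is idle; otherwise free it immediately so the GPU
// threads never block on monitoring. All names are hypothetical.
#include <atomic>
#include <cstddef>

void on_stream_done(HostBuffersManager& manager,
                    std::size_t buffer_index,
                    std::atomic<int>& idle_monitors) {
  int idle = idle_monitors.load();
  // Try to claim an idle monitoring thread by decrementing the counter.
  while (idle > 0 && !idle_monitors.compare_exchange_weak(idle, idle - 1)) {
    // compare_exchange_weak reloads `idle` on failure; retry while any is idle.
  }
  if (idle > 0) {
    manager.mark_filled(buffer_index);  // a monitor will process and recycle it
  }
  else {
    manager.mark_empty(buffer_index);  // no monitor free: skip monitoring this batch
  }
}
```

A monitoring thread would increment `idle_monitors` again once it has finished with a buffer and returned it via `mark_empty`.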
On the issue of the number of `HostBuffers`, I agree that monitoring should be skipped if it lags behind the GPU threads, but it might make sense to keep the ability to make a new buffer. In principle, this could also be needed if the GPU threads lag behind I/O (because buffers are assigned when the PROCESS is sent, not received). In practice, this is controlled by `number_of_slices = n_streams + 1` (should the `1` here be `n_io`?), so perhaps the most future-safe option is to set `number_of_buffers = number_of_slices + n_mon` (perhaps `+ 1` for safety): the same value, but it avoids problems if the number of slices is ever changed.

I think `number_of_buffers = number_of_slices + n_mon (+ 1)` is a good idea. We should anyway measure the performance of the monitoring threads and the (host and device) memory usage. I don't think the dynamic allocation of extra `HostBuffers` is needed, but it can be left in.

Edited by Roel Aaij
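To make the agreed sizing concrete, a minimal numeric sketch (the values are illustrative examples only, not Allen defaults; `n_io` is taken from the question above):

```cpp
// Illustrative numbers only; n_streams, n_io and n_mon are not Allen defaults.
const unsigned n_streams = 4;  // GPU streams
const unsigned n_io      = 1;  // I/O threads (the "+ 1" questioned above)
const unsigned n_mon     = 2;  // monitoring threads

const unsigned number_of_slices  = n_streams + n_io;              // 5 slices in flight
const unsigned number_of_buffers = number_of_slices + n_mon + 1;  // 8, "+ 1" for safety
```

added 1 commit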
- 5d21115f - Added rate monitoring class - histograms currently filled with 1's
added 1 commit
- 44843ff8 - fixed rate histograms for total number of decisions (events still...
added 1 commit
- 652425f6 - rate histograms now give number of selected events for each line
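As an illustration of what "number of selected events for each line" means for the histogram contents, a hypothetical fill might look like this. The decision layout and function are assumptions, not the actual RateMonitor code; ROOT's `TH1D` is used since a later commit mentions ROOT functionality.

```cpp
// Hypothetical per-line rate histogram fill; decisions[line][event] is an
// assumed layout, not Allen's actual HostBuffers format.
#include <TH1D.h>
#include <cstddef>
#include <vector>

void fill_rates(TH1D& h_rates, const std::vector<std::vector<bool>>& decisions)
{
  for (std::size_t line = 0; line < decisions.size(); ++line) {
    for (const bool pass : decisions[line]) {
      // One entry in the line's bin per selected event, so each bin ends up
      // holding the number of events selected by that line.
      if (pass) h_rates.Fill(static_cast<double>(line));
    }
  }
}
```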
added 1 commit
- 13c43ed9 - Removed unused function arguments now HostBuffers are configured via manager
mentioned in issue #102 (closed)
added 1 commit
- bb2aaf78 - moved RateMonitor code to integration/monitoring
- Resolved by Daniel Charles Craik
- Resolved by Daniel Charles Craik
- Resolved by Daniel Charles Craik
- Resolved by Daniel Charles Craik
- Resolved by Daniel Charles Craik
added 1 commit
- 7c4d736e - moved ROOT functionality into main RateMonitor class
added 1 commit
- aa70fb1c - removed unused argument from monitoring thread