StalledEventMonitor not available for multithreaded jobs
StalledEventMonitor is a small, practical service that can detect when an event is taking too long and, if requested, kill the process when that happens.
Unfortunately, it relies on the BeginEvent incident, which is not used in multithreaded jobs. Even if it were used, it would not help StalledEventMonitor, because the incident does not say which event slot is being started.
What we need is a mechanism that keeps track of all events in flight, to see whether one is taking too long. The mechanism could be integrated in the scheduler, or we could define a protocol between the scheduler and the StalledEventMonitor, so that the same protocol can be used by different schedulers.
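
To make the second option more concrete, here is a minimal sketch of what such a protocol could look like. It is only an illustration: `InFlightEventTracker`, `eventStarted`, `eventFinished` and `stalledSlots` are hypothetical names that do not exist in Gaudi, and the sketch assumes a fixed number of event slots identified by an index. The scheduler would report when a slot starts and finishes an event, and the monitor would periodically ask which slots have been busy for longer than a timeout.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical sketch, not an existing Gaudi interface.
using Clock = std::chrono::steady_clock;

class InFlightEventTracker {
public:
  explicit InFlightEventTracker( std::size_t nSlots ) : m_slots( nSlots ) {}

  // Scheduler side: an event starts being processed in `slot`.
  void eventStarted( std::size_t slot, Clock::time_point now = Clock::now() ) {
    assert( slot < m_slots.size() );
    std::lock_guard<std::mutex> lock( m_mutex );
    m_slots[slot].busy  = true;
    m_slots[slot].start = now;
  }

  // Scheduler side: the event in `slot` has completed.
  void eventFinished( std::size_t slot ) {
    assert( slot < m_slots.size() );
    std::lock_guard<std::mutex> lock( m_mutex );
    m_slots[slot].busy = false;
  }

  // Monitor side: indices of slots that have been busy for longer than `timeout`.
  std::vector<std::size_t> stalledSlots( Clock::duration   timeout,
                                         Clock::time_point now = Clock::now() ) const {
    std::lock_guard<std::mutex> lock( m_mutex );
    std::vector<std::size_t>    stalled;
    for ( std::size_t slot = 0; slot < m_slots.size(); ++slot ) {
      if ( m_slots[slot].busy && now - m_slots[slot].start > timeout ) stalled.push_back( slot );
    }
    return stalled;
  }

private:
  struct SlotState {
    bool              busy = false;
    Clock::time_point start{};
  };
  mutable std::mutex     m_mutex; // scheduler and monitor run in different threads
  std::vector<SlotState> m_slots;
};
```

With something like this, any scheduler that calls `eventStarted` and `eventFinished` could be watched by the same monitor, and the monitor would know which slot is stalled rather than just that some event is slow. The explicit `now` argument is only there to make the logic easy to test; a real implementation might just read the clock internally.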