Scheduler performance testing/improvement with many small algorithms
Motivated by an ATLAS discussion about a possible code refactoring, I intend to test scheduler performance when there are many algorithms with few dependencies, relative to the thread pool size.
Possible areas of investigation:
Develop new, or examine existing, stress-test job options (JOs) for scheduler performance.
I have observed better scheduler performance with more deeply nested control-flow (CF) structures, presumably because CF nodes cache their status. Test and verify this, and potentially use it to improve performance elsewhere.
In the past it was suggested that DATAREADY algorithms should be queued, rather than scheduled directly, to avoid inefficiency when the thread pool is full. Prototype this and quantify the improvement.
Scheduling performance could potentially be improved by separating EventView/subSlot graph traversal from the regular system. This was started in !735; it needs to be completed and the improvement quantified.
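To make the DATAREADY queueing idea concrete, here is a toy model in plain Python. It is only a sketch and does not use any Gaudi code: the class `ToyScheduler` and its methods `on_data_ready` and `on_finished` are hypothetical names invented for illustration, and the real scheduler may behave quite differently. Algorithms whose inputs are ready wait in a FIFO queue and are only started when a worker slot is free.

```python
from collections import deque


class ToyScheduler:
    """Toy model (hypothetical, not Gaudi code): DATAREADY algorithms wait
    in a FIFO queue until a worker slot is free, rather than being handed
    to a full thread pool immediately."""

    def __init__(self, pool_size):
        self.pool_size = pool_size   # number of worker slots
        self.running = set()         # algorithms currently occupying a slot
        self.ready = deque()         # DATAREADY algorithms waiting for a slot
        self.start_order = []        # order in which algorithms were started

    def on_data_ready(self, alg):
        # Inputs of `alg` are available: enqueue it, then start what fits.
        self.ready.append(alg)
        self._drain()

    def on_finished(self, alg):
        # `alg` has completed: free its slot, then start queued work.
        self.running.remove(alg)
        self._drain()

    def _drain(self):
        # Start queued algorithms only while a worker slot is free.
        while self.ready and len(self.running) < self.pool_size:
            alg = self.ready.popleft()
            self.running.add(alg)
            self.start_order.append(alg)


if __name__ == "__main__":
    sched = ToyScheduler(pool_size=2)
    for name in ["A", "B", "C", "D"]:
        sched.on_data_ready(name)
    print(sorted(sched.running), list(sched.ready))
    sched.on_finished("A")
    print(sorted(sched.running), list(sched.ready))
```

A prototype along these lines would still need a real measurement against the current behaviour; the toy only shows where the queue sits in the flow.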