Re-introduce parallelism through loky

This MR stabilises the multiprocessing that was introduced to speed up the calculator. We have seen a number of `concurrent.futures.process.BrokenProcessPool` errors, caused by the OpenShift resource limits killing processes that exceeded them. Once this error occurred, the instance had to be restarted before any calculator reports could be generated again 😢. The problem was made worse by the fact that the OS running the calculator reports that it has access to 10 CPUs, regardless of the CPU resource limit in place. loky solves the problem of continuing smoothly when worker processes are terminated, and we now take more control over the number of workers used to distribute the calculator report generation.
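To illustrate the two ideas above, here is a minimal sketch, not the code in this MR. It assumes the container exposes a cgroup v2 `cpu.max` file (cgroup v1 hosts use different files), and the names `parse_cpu_max`, `worker_count` and `run_with_retry` are hypothetical. The retry helper relies on loky's `get_reusable_executor`, which hands back a fresh executor when the previous one was broken by a killed worker.

```python
import math
from concurrent.futures.process import BrokenProcessPool
from pathlib import Path


def parse_cpu_max(text):
    """Parse cgroup v2 cpu.max content: "<quota> <period>" or "max <period>"."""
    quota, period = text.split()
    if quota == "max":
        return None  # no CPU limit configured
    return int(quota) / int(period)


def worker_count(cgroup_file="/sys/fs/cgroup/cpu.max", fallback=1):
    """Derive a worker count from the CPU limit rather than the reported CPU count."""
    try:
        limit = parse_cpu_max(Path(cgroup_file).read_text())
    except (OSError, ValueError):
        limit = None  # file missing or unreadable: assumption is to stay conservative
    if limit is None:
        return fallback
    return max(1, math.floor(limit))


def run_with_retry(func, arg, max_workers, retries=1):
    """Submit one job; if a worker is killed, ask loky for a new executor and retry."""
    from loky import get_reusable_executor  # third-party: pip install loky

    for attempt in range(retries + 1):
        executor = get_reusable_executor(max_workers=max_workers)
        try:
            return executor.submit(func, arg).result()
        except BrokenProcessPool:
            # loky's terminated-worker error derives from BrokenProcessPool.
            if attempt == retries:
                raise
```

With a limit of two CPUs (`cpu.max` containing `200000 100000`), `worker_count()` would return 2 even if the OS reports 10 CPUs.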

Note: I found that nested process pools were causing a slowdown, so I went for a thread pool instead. This MR is about getting stable behaviour that doesn't block the webserver; subsequent MRs will address performance improvements (probably by flattening out the parallelism).
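As a rough sketch of the thread pool approach (an assumption about its shape, not the code in this MR), the outer fan-out can use `concurrent.futures.ThreadPoolExecutor`, so that no process pool is created inside a worker process. `generate_section` and `generate_report` are hypothetical stand-ins for the real report steps.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_section(name):
    """Hypothetical stand-in for one unit of report generation."""
    return f"section:{name}"


def generate_report(section_names, max_threads=4):
    """Fan sections out over threads; map() returns results in input order."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(generate_section, section_names))
```

In a real setup each thread could hand its CPU-heavy work to a single shared process pool, which keeps the process-level parallelism one layer deep.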

Edited by Philip Elson
