
Parallel RT OPT with fork/wait system calls

Stefano Camarda requested to merge rtopt-parallel into master

Implementing OpenMP-based multithreaded parallel calculation with the RT scheme is difficult because of the complicated pattern of memory sharing within QCDNUM. For this reason I have implemented parallel computation of F2 and FL in RT OPT by means of fork/wait calls, which fully duplicate the memory of the parent process into the child processes and so avoid any issue related to memory sharing. The drawback of fork/wait with respect to a multithreaded implementation is that it is likely to have a larger overhead. I had to add some extra flushes of stdout and the I/O buffers to prevent child processes from emptying the same buffer multiple times at exit.

The number of parallel processes is controlled by the option threads (note, however, that these are processes, not actual threads). With threads = 0 the code runs as it did before. I have checked that the results do not depend on the number of parallel processes. It would be nice if someone could check the performance on a machine with a large number of available cores.
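For readers unfamiliar with the pattern, here is a minimal C sketch of fork/wait parallelism as described above. It is not the code in this merge request: `compute_point`, `compute_all`, and `MAX_PROCS` are hypothetical names, the per-point work is a placeholder, and returning results through pipes is an assumption made for the illustration. It shows the two points raised in the description: stdio buffers are flushed before forking so children do not inherit pending output, and a process count of 0 falls back to the serial path.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define MAX_PROCS 64   /* hypothetical upper limit on child processes */

/* Hypothetical stand-in for one expensive evaluation (e.g. one grid point). */
static double compute_point(int i) { return (double)i * (double)i; }

/* Write exactly len bytes; returns 0 on success, -1 on failure. */
static int write_all(int fd, const void *buf, size_t len) {
    const char *p = buf;
    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w < 0) { if (errno == EINTR) continue; return -1; }
        p += w; len -= (size_t)w;
    }
    return 0;
}

/* Read exactly len bytes; returns 0 on success, -1 on error or early EOF. */
static int read_all(int fd, void *buf, size_t len) {
    char *p = buf;
    while (len > 0) {
        ssize_t r = read(fd, p, len);
        if (r < 0) { if (errno == EINTR) continue; return -1; }
        if (r == 0) return -1;
        p += r; len -= (size_t)r;
    }
    return 0;
}

/* First index of slice k when n points are split over nproc processes. */
static int slice_start(int n, int nproc, int k) {
    return (int)((long)n * k / nproc);
}

/* Fill out[0..n) using nproc child processes; nproc <= 0 runs serially.
   Returns 0 on success, -1 if any child or system call failed. */
int compute_all(int n, int nproc, double *out) {
    if (n <= 0) return 0;
    if (nproc <= 0) {                      /* serial path, as before */
        for (int i = 0; i < n; i++) out[i] = compute_point(i);
        return 0;
    }
    if (nproc > MAX_PROCS) nproc = MAX_PROCS;
    if (nproc > n) nproc = n;

    pid_t pids[MAX_PROCS];
    int fds[MAX_PROCS];
    int started = 0, status = 0;

    fflush(NULL);   /* flush all stdio output streams before duplicating them */

    for (int k = 0; k < nproc; k++) {
        int lo = slice_start(n, nproc, k), hi = slice_start(n, nproc, k + 1);
        int p[2];
        if (pipe(p) != 0) { status = -1; break; }
        pid_t pid = fork();
        if (pid < 0) { close(p[0]); close(p[1]); status = -1; break; }
        if (pid == 0) {                    /* child: works on a private copy */
            close(p[0]);
            for (int j = 0; j < k; j++) close(fds[j]);
            for (int i = lo; i < hi; i++) out[i] = compute_point(i);
            int rc = write_all(p[1], out + lo, (size_t)(hi - lo) * sizeof *out);
            close(p[1]);
            _exit(rc == 0 ? 0 : 1);        /* _exit: skip stdio flush and atexit */
        }
        close(p[1]);                       /* parent keeps only the read end */
        pids[k] = pid; fds[k] = p[0];
        started++;
    }

    for (int k = 0; k < started; k++) {    /* collect results, then reap */
        int lo = slice_start(n, nproc, k), hi = slice_start(n, nproc, k + 1);
        int st = 0;
        if (read_all(fds[k], out + lo, (size_t)(hi - lo) * sizeof *out) != 0)
            status = -1;
        close(fds[k]);
        if (waitpid(pids[k], &st, 0) < 0 || !WIFEXITED(st) || WEXITSTATUS(st) != 0)
            status = -1;
    }
    return status;
}
```

A call such as `compute_all(10, 3, out)` should fill `out` with the same values as `compute_all(10, 0, out)`, which is the kind of check mentioned above. Because each child writes into its own copy of `out`, nothing it computes is visible to the parent until it is sent back explicitly; that copying is one source of the overhead compared with threads.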

Edited by Stefano Camarda
