Speed up fetching large numbers of datasets with parallelisation and restarting of stuck requests

Fetching file names can be very slow when a txt file with many dataset names is passed to the fetching function (grid version). In my experience, a significant fraction of the requests gets stuck, which stalls the entire function.

With this commit, it will be much faster due to two improvements:

  • the requests are parallelised with futures
  • the requests that get stuck are retried. I use a timeout of 10 seconds here, although smaller values also seem to work well.
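For illustration, here is a minimal sketch of the two-part approach (parallel submission plus timeout-based retry) using Python's `concurrent.futures`. The `fetch` callable and the helper names (`fetch_with_retry`, `fetch_all`) are hypothetical stand-ins, not the actual functions in this MR:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def fetch_with_retry(fetch, name, timeout=10.0, max_retries=3):
    """Run fetch(name) in a worker thread; retry if it exceeds `timeout`."""
    for _ in range(max_retries):
        pool = ThreadPoolExecutor(max_workers=1)
        future = pool.submit(fetch, name)
        try:
            result = future.result(timeout=timeout)
            pool.shutdown()
            return result
        except FutureTimeout:
            # Abandon the stuck worker thread and submit a fresh request.
            pool.shutdown(wait=False)
    raise RuntimeError(f"{name}: still stuck after {max_retries} attempts")


def fetch_all(fetch, names, timeout=10.0, n_workers=8):
    """Fetch all datasets in parallel, retrying any request that hangs."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = {pool.submit(fetch_with_retry, fetch, n, timeout): n
                   for n in names}
        return {name: fut.result() for fut, name in futures.items()}
```

One caveat of this sketch: `shutdown(wait=False)` does not kill the stuck thread, it only stops waiting for it, so a truly hung request leaves a thread behind until the process exits.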

Can you have a look please, @jspah, @ikrommyd?
