Investigate why the garbage collector takes 17 hours to requeue requests in the stress test
During the stress test, certain timings can cause the garbage collector to iterate through all files archived during the current session of the tape server.
The garbage collector spends 17 hours iterating through the agent's owned objects in the method GarbageCollector::OwnedObjectSorter::sortFetchedObjects. For each of them it logs that it is skipping an object which is not owned by this agent, since the agent does not own any of the jobs of the request.
The number of lines logged (over one million) suggests that these jobs were archived by the tape server but, for some reason, are still owned by the agent when the garbage collector kicks in. This means that the garbage collector spends 17 hours doing useless work before it requeues the failed jobs.
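To illustrate why this pass is costly, here is a minimal sketch of the behaviour described above. It is not the actual CTA code: `ArchiveJob`, `ArchiveRequest`, `SortResult` and `sortFetchedObjectsSketch` are hypothetical, simplified stand-ins, and the real method works on object store entries rather than in-memory vectors. The sketch only shows that every fetched request is inspected, and that a request with no job owned by the agent produces one skip (one log line) and no requeue.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the real object store types.
struct ArchiveJob {
  std::string owner;  // name of the agent or queue currently owning this job
};

struct ArchiveRequest {
  std::string address;           // object address, as listed in the agent's ownership
  std::vector<ArchiveJob> jobs;  // one job per copy to archive
};

struct SortResult {
  std::vector<std::string> toRequeue;  // requests with at least one job owned by the agent
  std::size_t skipped = 0;             // each skip corresponds to one "skipping object" log line
};

// Assumed shape of the sorting pass: every fetched object is visited,
// even when it turns out that nothing needs to be requeued for it.
SortResult sortFetchedObjectsSketch(const std::string& agent,
                                    const std::vector<ArchiveRequest>& fetched) {
  SortResult result;
  for (const auto& request : fetched) {
    bool ownsAJob = false;
    for (const auto& job : request.jobs) {
      if (job.owner == agent) {
        ownsAJob = true;
        break;
      }
    }
    if (ownsAJob) {
      result.toRequeue.push_back(request.address);
    } else {
      ++result.skipped;  // the real code logs here; over one million times in the stress test
    }
  }
  return result;
}
```

In this model the cost grows with the number of objects still listed for the agent, not with the number of jobs that actually need requeueing, which is consistent with the log volume seen in the stress test.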
We must look for a way to prevent the garbage collector from kicking in before the tape server has released finished archive jobs.
First noticed here: https://gitlab.cern.ch/cta/CTA/-/issues/1212