Fix filling of disk buffer when recalling from tapeservers with RAO
Summary
When RAO is enabled, tape drives try to recall the biggest batch of files possible, and they reserve all the space needed for that batch up front in the disk buffer. A typical limit is a batch of at most 2730 files or 1PB. This means that each drive with RAO will try to reserve multiple terabytes in the buffer, quickly filling it up and blocking archives.
Steps to reproduce
Queue a lot of recall jobs to be picked up by drives with RAO enabled. The drives will fill the buffer and trigger backpressure, which puts queues to sleep.
What is the current bug behavior?
Drives with RAO reserve disk buffer space for their whole oversized recall batch up front, so the buffer fills up. Backpressure is triggered, causing other queues to sleep.
What is the expected correct behavior?
The buffer size reserved by the tapeserver should not change because of RAO.
Relevant logs and/or screenshots
See https://gitlab.cern.ch/cta/operations/-/issues/359
Possible fixes
Discussed with @jleduc. The tapeservers with RAO should still pop a large batch of jobs, because the more jobs they have, the more likely they are to get a good read order from the RAO algorithm. They should then order the popped jobs by RAO read order, take only a normal batch size from the top, and requeue the rest. The size of a normal batch is given by the BulkRequestRecallMaxBytes and BulkRequestRecallMaxFiles configuration options in /etc/cta/cta-taped.conf.
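The proposed fix could be sketched roughly as follows. This is an illustrative Python sketch only, not the cta-taped implementation: RecallJob, split_batch_after_rao and the rao_order callable are hypothetical names, and max_files / max_bytes stand in for BulkRequestRecallMaxFiles / BulkRequestRecallMaxBytes. It also assumes that the first job is always kept even if it alone exceeds the byte limit, so that a single large file cannot stall the drive; that choice is an assumption and not something discussed in this issue.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class RecallJob:
    # Hypothetical job record; a real recall job carries far more state.
    file_id: int
    size_bytes: int


def split_batch_after_rao(
    popped_jobs: Sequence[RecallJob],
    rao_order: Callable[[Sequence[RecallJob]], List[RecallJob]],
    max_files: int,
    max_bytes: int,
) -> Tuple[List[RecallJob], List[RecallJob]]:
    """Order a large popped batch by RAO, keep a normal-sized prefix, requeue the rest.

    max_files / max_bytes play the role of BulkRequestRecallMaxFiles /
    BulkRequestRecallMaxBytes. rao_order is a stand-in for the RAO algorithm.
    """
    ordered = rao_order(popped_jobs)

    keep: List[RecallJob] = []
    reserved_bytes = 0
    for job in ordered:
        if len(keep) >= max_files:
            break
        # Assumption: always accept the first job, even if it is larger
        # than max_bytes on its own, so the drive always makes progress.
        if keep and reserved_bytes + job.size_bytes > max_bytes:
            break
        keep.append(job)
        reserved_bytes += job.size_bytes

    # Everything after the kept prefix goes back to the queue, and no
    # disk buffer space is reserved for it.
    requeue = ordered[len(keep):]
    return keep, requeue
```

With this shape, only the jobs in the kept prefix contribute to the disk buffer reservation, so the reserved size stays bounded by the normal batch limits regardless of how many jobs were popped for RAO ordering.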