Add back-off behaviour to cta-rmcd

Problem to solve

On IBM tape libraries, too many simultaneous dismount requests lead to contention, which causes cta-rmcd to give up and return an error to cta-taped. This results in failed sessions even though there is no problem with the drive or the tape.

(Spectra Logic libraries are not affected by this issue, as the DriveIQ feature defers the dismount until the next tape is ready to be mounted.)

Stakeholders

Tape operators: less noise from dealing with the consequences of failed sessions, and less disruption to tape operations, which is especially important out of hours.

Proposal

Replace the current "try 10 times and give up" with an exponential back-off.

The maximum timeout before failure should be a configurable parameter, defaulting to 10 minutes (20 drives dismounting simultaneously per library × 30 seconds for the robotics to perform each dismount = 600 seconds).
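
To make the proposal concrete, here is a minimal sketch of the retry logic in Python. It is illustrative only and is not the cta-rmcd code: the function names, the 1-second initial delay, the doubling factor and the 60-second per-wait cap are all assumptions. Only the 600-second (10-minute) default budget comes from the proposal above.

```python
import time
from typing import Callable, Iterator


def backoff_delays(initial: float = 1.0,      # assumed first wait, in seconds
                   factor: float = 2.0,       # assumed growth factor
                   max_delay: float = 60.0,   # assumed cap on a single wait
                   max_total: float = 600.0   # 10-minute budget from the proposal
                   ) -> Iterator[float]:
    """Yield waits that grow exponentially; their sum never exceeds max_total."""
    elapsed = 0.0
    delay = initial
    while elapsed < max_total:
        # Clamp to the per-wait cap and to whatever is left of the budget.
        wait = min(delay, max_delay, max_total - elapsed)
        yield wait
        elapsed += wait
        delay *= factor


def dismount_with_backoff(try_dismount: Callable[[], bool],
                          max_total: float = 600.0,
                          sleep: Callable[[float], None] = time.sleep) -> bool:
    """Retry try_dismount (hypothetical callback) until it succeeds or the budget is spent."""
    if try_dismount():
        return True
    for wait in backoff_delays(max_total=max_total):
        sleep(wait)
        if try_dismount():
            return True
    return False  # budget exhausted: report the failure to the caller
```

With these assumed defaults the waits are 1, 2, 4, 8, 16 and 32 seconds, then 60 seconds each until the 600-second budget is used up. The budget here counts only the time spent waiting, not the time taken by each dismount attempt. A real implementation might also add random jitter to each wait so that drives which failed together do not all retry at the same moment.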