Prevent free drives from stalling in production

added Object Store, cta: Scheduler, cta: Tape server, priority: medium, type: bug, workflow: backlog labels

Check the mechanism that takes the global lock when a tape is moved from the DISABLED to the REPACKING state. While the tape is in REPACK PENDING, its user jobs are de-queued one by one (or, if a 2nd replica is available, moved to a new queue). Apparently this makes contention on the global lock severe enough to stall production.

The suspected cause is the combination of free tape servers competing for the lock and user requests being re-queued one by one.
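To illustrate the suspected effect, here is a minimal Python sketch. It is not CTA code: `CountingLock`, `requeue_one_by_one`, `requeue_in_batches` and `demo` are hypothetical names, and a plain in-process lock stands in for the global lock. It only shows that moving jobs one at a time takes the lock once per job, whereas moving them in batches takes it once per batch, which leaves far fewer occasions for other lock users (here, free tape servers) to be blocked.

```python
import threading
from collections import deque


class CountingLock:
    """Wraps threading.Lock and counts acquisitions (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False


def requeue_one_by_one(lock, source, target):
    """Move one job per lock acquisition.

    N jobs cost N + 1 acquisitions: the last one finds the queue empty.
    """
    while True:
        with lock:
            if not source:
                return
            target.append(source.popleft())


def requeue_in_batches(lock, source, target, batch_size):
    """Move up to batch_size jobs per lock acquisition."""
    while True:
        with lock:
            if not source:
                return
            for _ in range(min(batch_size, len(source))):
                target.append(source.popleft())


def demo(n_jobs=1000, batch_size=100):
    """Return (acquisitions one-by-one, acquisitions batched, same result?)."""
    lock_a, lock_b = CountingLock(), CountingLock()
    out_a, out_b = deque(), deque()
    requeue_one_by_one(lock_a, deque(range(n_jobs)), out_a)
    requeue_in_batches(lock_b, deque(range(n_jobs)), out_b, batch_size)
    return lock_a.acquisitions, lock_b.acquisitions, list(out_a) == list(out_b)
```

With the default arguments, `demo()` reports 1001 acquisitions for the one-by-one variant and 11 for the batched variant, with identical resulting queues. Whether batching is the right fix for the scheduler depends on what the debugging timeline shows.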

removed priority: medium label

added priority: critical label

@poliverc is tracking the timeline of events and debugging in this CodiMD https://codimd.web.cern.ch/fWvlG7_fSsKi8uAyXd7H4g#

assigned to @poliverc


Activity