Restart failed drive handlers without restarting the tape daemon
Problem to solve
Currently, on multi-drive setups, when one drive handler requests a shutdown, the tape daemon shuts down everything else as well, including healthy drive handler processes. Development work to solve this is under way and should be available in the next public release, although the
During the pre-GDB, it was proposed and accepted that we develop a feature to restart drive handlers without restarting the tape daemon.
Stakeholders
- CTA
- dCache
Proposal
An initial proposal, debated during the dev meeting on 2023-11-10, was to have the SignalHandler receive a specific signal that triggers the recreation of the DriveHandler. An alternative is to use the ProcessManager loop to regenerate the killed drive handler.
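
To make the second alternative concrete, here is a minimal sketch of a supervising loop that re-forks only the handler that failed and leaves the other children untouched. It is not the tape daemon's actual code: `HandlerBody`, `spawnHandler`, `superviseDrives` and the `maxRestarts` limit are hypothetical names introduced for illustration, and the handler bodies are stand-ins that simply return an exit code.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <functional>
#include <map>
#include <string>

// Hypothetical handler body: receives the attempt number, returns an exit code.
using HandlerBody = std::function<int(int attempt)>;

// Fork one child that runs the handler body and exits with its return code.
static pid_t spawnHandler(const HandlerBody& body, int attempt) {
  const pid_t pid = fork();
  if (pid == 0) {
    _exit(body(attempt));  // child: never returns to the supervising loop
  }
  return pid;  // parent: child pid, or -1 if fork failed
}

// Run every handler and re-fork only the ones that fail, up to maxRestarts
// each. Returns the number of restarts performed per drive.
std::map<std::string, int> superviseDrives(
    const std::map<std::string, HandlerBody>& handlers, int maxRestarts) {
  std::map<pid_t, std::string> running;  // live child pid -> drive name
  std::map<std::string, int> restarts;   // drive name -> restart count

  for (const auto& [drive, body] : handlers) {
    restarts[drive] = 0;
    const pid_t pid = spawnHandler(body, 0);
    if (pid > 0) running[pid] = drive;
  }

  while (!running.empty()) {
    int status = 0;
    const pid_t pid = waitpid(-1, &status, 0);  // block until any child ends
    if (pid <= 0) break;  // no children left, or an error this sketch ignores

    const auto it = running.find(pid);
    if (it == running.end()) continue;  // not one of our handlers
    const std::string drive = it->second;
    running.erase(it);

    // A non-zero exit code or death by signal counts as a failure.
    const bool failed = !WIFEXITED(status) || WEXITSTATUS(status) != 0;
    if (failed && restarts[drive] < maxRestarts) {
      ++restarts[drive];
      const pid_t newPid = spawnHandler(handlers.at(drive), restarts[drive]);
      if (newPid > 0) running[newPid] = drive;
    }
  }
  return restarts;
}
```

The sketch blocks in `waitpid` for simplicity. A loop that also has to service other events would more likely poll with `WNOHANG` on each iteration. The signal-based alternative could reuse the same re-fork step, triggered by a flag set when the chosen signal arrives rather than by a child's exit status.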