Expired
Milestone
Jan 22, 2025–Mar 14, 2025
Tape Daemon Roadmap
The purpose of this milestone is to track all the individual tasks needed to refactor the daemon according to the changes proposed in the 2024-07-25 dev meeting (Presentation), along with any additional changes that might be required.
Plan
- Create common daemon/service code following the new-style daemon model defined by systemd
- Split Maintenance Process
  2.1 Review exit logic on maintenance process
  2.2 Improve resilience and update monitoring as needed
  2.3 Measure needs for maintenance process load
  2.4 Deployment roadmap
- Changes to Tape Drive Process
  3.1
  3.2
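To illustrate the first plan item: a new-style daemon does not fork or detach itself, and reports readiness to the service manager through the socket named in the `NOTIFY_SOCKET` environment variable. The following is only a minimal sketch of that notification step in Python, not the CTA implementation; the function name `sd_notify` mirrors the libsystemd call but is reimplemented here with plain sockets as an assumption for illustration.

```python
import os
import socket


def sd_notify(state: str) -> bool:
    """Send a state string such as "READY=1" to the service manager.

    Returns False when NOTIFY_SOCKET is unset, which is the case when the
    process was not started by systemd with notification enabled.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket,
        # which is addressed with a leading NUL byte instead.
        address = b"\0" + path[1:].encode("utf-8")
    else:
        address = os.fsencode(path)
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode("utf-8"), address)
    return True
```

A service would typically call `sd_notify("READY=1")` once initialization is complete and `sd_notify("STOPPING=1")` when it begins shutting down, with the unit configured as `Type=notify`.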
Related tickets
While working on this milestone I have collected a series of tickets related to this problem. The point of this section is to categorize them, assigning each one to the category that will address the issue and close it.
Connectivity problems
- Disk buffer: https://gitlab.cern.ch/cta/operations/-/issues/1132
- Losing connectivity to SchedulerDB brings down drives: https://gitlab.cern.ch/cta/operations/-/issues/661
- Losing connectivity to CatalogueDB brings down drives: https://gitlab.cern.ch/cta/operations/-/issues/1553 ; https://gitlab.cern.ch/cta/operations/-/issues/257
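One possible way to keep a transient database outage from bringing a drive down is to retry the failing call with exponential backoff before giving up. The sketch below is an assumption about the general approach, not what the tickets prescribe; `TransientDbError` and `call_with_retry` are hypothetical names, and the default attempt count and delays are placeholders.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientDbError(Exception):
    """Hypothetical error for a lost SchedulerDB/CatalogueDB connection."""


def call_with_retry(
    op: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run op(), retrying on TransientDbError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return op()
        except TransientDbError:
            if attempt == attempts - 1:
                raise  # Out of retries: let the caller decide what to do.
            # Delay doubles on each retry, capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

Only the error type that signals a lost connection is retried; any other exception propagates immediately, so real faults are not hidden behind the retry loop.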
CTA - TAS Integration
- Better deal with
CTA - Systemd Restart
- https://gitlab.cern.ch/cta/operations/-/issues/1404 ; https://gitlab.cern.ch/cta/operations/-/issues/1206
Possible Features
- Implement SCSI mode page configuration in the daemon, instead of relying on an external script that runs in a custom way specific to us. https://gitlab.cern.ch/cta/operations/-/issues/1411; https://gitlab.cern.ch/cta/operations/-/issues/1328
- Error code system
- Exit procedure for quick shutdown. Currently, when we set a drive down, we have to either wait for the data transfer session to finish or kill the process and let the parent rerun the main logic to detect the desired drive state (down). This is not acceptable and makes updates harder. https://gitlab.cern.ch/cta/operations/-/issues/679
- Always try to dismount tapes even on uncaught exception. https://gitlab.cern.ch/cta/operations/-/issues/1185
- Improve tape location management. What happens when a tape is dismounted but does not make it to the slot?
- Improve management of tape alerts. https://gitlab.cern.ch/cta/operations/-/issues/300
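The quick-shutdown and always-dismount items above can be sketched together: the transfer loop checks a stop flag at each block boundary, and the dismount runs in a `finally` clause so it is attempted on normal completion, early stop, and uncaught exceptions alike. This is a hedged illustration only; `run_session`, `write_block`, and `dismount` are hypothetical names, not functions of the existing daemon.

```python
import threading
from typing import Callable, Iterable


def run_session(
    blocks: Iterable[bytes],
    write_block: Callable[[bytes], None],
    dismount: Callable[[], None],
    stop: threading.Event,
) -> int:
    """Transfer blocks until done or until stop is set; always dismount.

    Returns the number of blocks written.
    """
    written = 0
    try:
        for block in blocks:
            if stop.is_set():
                break  # Drive was set down: leave at a block boundary.
            write_block(block)
            written += 1
    finally:
        # Runs on normal exit, on early stop, and on any exception.
        dismount()
    return written
```

In a real service the flag could be set from a signal handler, for example `signal.signal(signal.SIGTERM, lambda signum, frame: stop.set())`, so that setting a drive down interrupts the session without killing the process.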