Expired
Milestone Jan 22, 2025–Mar 14, 2025

Tape Daemon Roadmap

The purpose of this milestone is to track the individual tasks needed to refactor the daemon according to the changes proposed in the 2024-07-25 dev meeting (Presentation), plus any additional changes that might be required.

Plan

  1. Create common daemon/service code following the new-style daemon model defined by systemd
  2. Split Maintenance Process
    2.1 Review exit logic on maintenance process
    2.2 Improve resilience and update monitoring as needed
    2.3 Measure needs for maintenance process load
    2.4 Deployment roadmap
  3. Changes to Tape Drive Process
    3.1
    3.2
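
For item 1, a new-style daemon in the systemd sense stays in the foreground (no double fork, no PID file), reports readiness to the service manager, and shuts down cleanly on SIGTERM. The following is a minimal sketch of that pattern in Python, not the tape daemon's actual code. The names `sd_notify`, `run_service`, and `stop_requested` are illustrative assumptions; the only thing taken from systemd itself is the notification protocol (a datagram sent to the socket named in `NOTIFY_SOCKET`).

```python
import os
import signal
import socket
import threading

# Set by the SIGTERM handler; the main loop polls it (hypothetical name).
stop_requested = threading.Event()


def sd_notify(state: str) -> bool:
    """Send a state string such as "READY=1" to the service manager.

    Returns False when NOTIFY_SOCKET is unset, i.e. when the process
    was not started by systemd with notifications enabled.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading '@' denotes a Linux abstract-namespace socket.
    if path.startswith("@"):
        addr = b"\0" + path[1:].encode()
    else:
        addr = path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), addr)
    return True


def run_service(poll_seconds: float = 1.0) -> None:
    """Foreground service loop (illustrative; the real work is a placeholder)."""
    # systemd stops a service with SIGTERM by default.
    signal.signal(signal.SIGTERM, lambda signum, frame: stop_requested.set())
    sd_notify("READY=1")
    while not stop_requested.wait(poll_seconds):
        pass  # one iteration of real work would go here
    sd_notify("STOPPING=1")
```

A unit using this pattern would typically declare `Type=notify`, so that systemd considers the service started only once `READY=1` has been received.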

Related tickets

While working on this milestone I collected a series of tickets related to this problem. This section sorts them into categories, so that each ticket can be closed once the work for its category addresses the issue.

Connectivity problems

  • Disk buffer: https://gitlab.cern.ch/cta/operations/-/issues/1132
  • Losing connectivity to SchedulerDB brings down drives: https://gitlab.cern.ch/cta/operations/-/issues/661
  • Losing connectivity to CatalogueDB brings down drives: https://gitlab.cern.ch/cta/operations/-/issues/1553 ; https://gitlab.cern.ch/cta/operations/-/issues/257

CTA - TAS Integration

  • Better deal with

CTA - Systemd Restart

  • https://gitlab.cern.ch/cta/operations/-/issues/1404 ; https://gitlab.cern.ch/cta/operations/-/issues/1206

Possible Features

  • Implement SCSI mode pages configuration from the daemon, instead of an external script that we run in a custom way. https://gitlab.cern.ch/cta/operations/-/issues/1411; https://gitlab.cern.ch/cta/operations/-/issues/1328
  • Error code system
  • Exit procedure for quick shutdown. Currently, when we set a drive down, we have to either wait for the data transfer session to finish or kill the process and let the parent rerun the main logic, which then detects the desired drive state (down). This is not acceptable and makes updates harder. https://gitlab.cern.ch/cta/operations/-/issues/679
  • Always try to dismount tapes even on uncaught exception. https://gitlab.cern.ch/cta/operations/-/issues/1185
  • Improve tape location management. What happens when a tape is dismounted but does not make it to the slot?
  • Improve management of tape alerts. https://gitlab.cern.ch/cta/operations/-/issues/300
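
The quick-shutdown and always-dismount items above could be combined roughly as follows. This is a sketch under stated assumptions, not the daemon's design: `FakeDrive`, `run_transfer_session`, and the `transfer` callback are hypothetical names, and a real session would work with actual drive and scheduler objects. The idea is that the session checks a stop flag between blocks, so setting the drive down does not require waiting for the whole session or killing the process, and a `finally` clause dismounts the tape even when an exception escapes.

```python
import threading


class FakeDrive:
    """Hypothetical stand-in for a tape drive; only tracks mount state."""

    def __init__(self):
        self.mounted = False

    def mount(self):
        self.mounted = True

    def dismount(self):
        self.mounted = False


def run_transfer_session(drive, blocks, stop_requested, transfer):
    """Transfer blocks until finished or until a stop is requested.

    The tape is dismounted on every exit path, including exceptions.
    Returns the number of blocks transferred.
    """
    done = 0
    drive.mount()
    try:
        for block in blocks:
            # Checked between blocks, so a stop takes effect quickly.
            if stop_requested.is_set():
                break
            transfer(block)
            done += 1
    finally:
        # Runs on normal completion, early stop, and uncaught exceptions.
        drive.dismount()
    return done
```

In this sketch the stop flag would be set by whatever mechanism marks the drive as down (for example a signal handler), and the exception still propagates to the caller after the dismount so that it can be logged or mapped to an error code.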

Status

  • 18% complete
  • Start date: Jan 22, 2025
  • Due date: Mar 14, 2025 (Past due)
  • Work items: 11 (Open: 9, Closed: 2)
  • Total weight: None
  • Merge requests: 1 (Open: 0, Closed: 0, Merged: 1)
  • Releases: None
Reference: cta/CTA%"Tape Daemon Roadmap"