Standardize Service/Daemon Handling

Problem to solve

Currently, the CTA services (the gRPC frontend, the tape server, and the future maintenance service) are each set up slightly differently, and they do not handle some system signals cleanly. Since most of this code is either under development or being refactored, now is a good time to settle on a common architecture.

Stakeholders

  • CTA Dev Team

Proposal

Since we will stick with systemd, we should follow its guidelines for New-Style Daemons, described in the daemon(7) man page: https://man7.org/linux/man-pages/man7/daemon.7.html
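As a rough illustration of what daemon(7) asks of a new-style daemon (stay in the foreground, exit cleanly on SIGTERM, reload configuration on SIGHUP, return a meaningful exit code), here is a minimal C++ sketch. All names (`sketch`, `installHandlers`, `runLoop`, the two callbacks) are hypothetical and not taken from the CTA code base; a real service would likely use `sigaction` and a richer shutdown path.

```cpp
#include <csignal>
#include <functional>

namespace sketch {

// Flags written by the signal handler. volatile sig_atomic_t is the type
// the C++ standard guarantees to be safe to write from a signal handler.
volatile std::sig_atomic_t g_stop = 0;
volatile std::sig_atomic_t g_reload = 0;

// The handler only records which signal arrived; real work happens in the loop.
extern "C" void onSignal(int signo) {
  if (signo == SIGTERM) g_stop = 1;    // daemon(7): shut down and exit cleanly
  if (signo == SIGHUP)  g_reload = 1;  // daemon(7): reload configuration
}

// Hypothetical helper: register the handler for both signals.
void installHandlers() {
  std::signal(SIGTERM, onSignal);
  std::signal(SIGHUP, onSignal);
}

// Hypothetical main loop: no fork(), no PID file, no detaching from the
// terminal; systemd supervises the foreground process directly.
// Returns the exit code the process should report to systemd.
int runLoop(const std::function<void()>& reloadConfig,
            const std::function<void()>& doWork) {
  while (!g_stop) {
    if (g_reload) {
      g_reload = 0;
      reloadConfig();
    }
    doWork();
  }
  return 0;  // clean shutdown requested via SIGTERM
}

}  // namespace sketch
```

A service's `main()` would call `sketch::installHandlers()` once and then return the value of `sketch::runLoop(...)`, so that all three services share the same signal behaviour.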

On top of that:

  • Daemons/Services should never exit on their own; if one exits because of a crash, systemd must restart it.
  • Introduce log messages with a higher priority than CRIT (we can take [1] as reference) for daemons that fail before reaching their main logic loop, and build monitoring/alerting around those messages.
  • Upgrade procedures must be fast and transparent (no waiting for a drive to finish a data transfer session, which in some cases requires forcefully killing the drive process).
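The logging point above could look roughly like the following C++ sketch. It assumes the daemon logs to stderr using the `<N>` priority prefixes described in sd-daemon(3), that stderr is connected to the journal, and that `SyslogLevelPrefix=` is left at its default. The names `Priority`, `formatLine`, `log` and `startDaemon` are hypothetical; the numeric values are the standard syslog priorities, where ALERT (1) and EMERG (0) rank above CRIT (2).

```cpp
#include <exception>
#include <functional>
#include <iostream>
#include <string>

namespace sketch {

// Standard syslog priorities: a lower number means a higher priority.
enum class Priority : int {
  Emerg = 0, Alert = 1, Crit = 2, Err = 3,
  Warning = 4, Notice = 5, Info = 6, Debug = 7
};

// Build a line with the "<N>" prefix that sd-daemon(3) defines for stderr logging.
std::string formatLine(Priority p, const std::string& msg) {
  return "<" + std::to_string(static_cast<int>(p)) + ">" + msg;
}

void log(Priority p, const std::string& msg) {
  std::cerr << formatLine(p, msg) << std::endl;
}

// Hypothetical wrapper: if initialisation throws, the daemon never reached
// its main loop, so log at ALERT and return a non-zero exit code.
int startDaemon(const std::function<void()>& init,
                const std::function<int()>& mainLoop) {
  try {
    init();
  } catch (const std::exception& e) {
    log(Priority::Alert,
        std::string("startup failed before main loop: ") + e.what());
    return 1;
  }
  return mainLoop();
}

}  // namespace sketch
```

Returning a non-zero exit code on startup failure lets a `Restart=` policy in the unit file (for example `Restart=on-failure` or `Restart=always`) handle the restart, while alerting can key on the ALERT priority alone.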

[1] https://github.com/openbsd/src/blob/master/sys/sys/syslog.h#L53-L60