Trigger cleanup session if taped child process did not exit with success code
For more details, see ops issue https://gitlab.cern.ch/cta/operations/-/issues/994#note_6451613.
Problem
- The main `cta-taped` process launches child processes to handle different tape sessions.
- In `DriveHandler.cpp:673`, it checks whether any child process exited abruptly (by not calling `exit()` or `return` from `main`).
  - If it did not exit correctly, it sets the `PreviousSession::Crashed` flag, which will cause the next forked child process to run a cleaner session before starting.
int rc = ::waitpid(m_pid, &processStatus, WNOHANG);
[...]
if (rc) {
  [...]
  if (WIFEXITED(processStatus)) {
    [...]
  } else {
    [...]
    resetToDefault(PreviousSession::Crashed);
    [...]
  }
}
- However, all exceptions thrown by the Tape Daemon process are caught in `TapeDaemon.cpp:57` and replaced by a `return 1`, which `main` then returns gracefully.
- As a result, the parent `cta-taped` process never sets the `PreviousSession::Crashed` flag, and the next tape session does not run the cleaner session.
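The behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the actual `cta-taped` code: `runSession`, `childMain`, `ChildOutcome` and `observeChild` are hypothetical names, and a blocking `waitpid` is used instead of `WNOHANG` to keep the example short. It shows that a child which catches its exception and returns `1` is still reported as a normal exit by `WIFEXITED`.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <stdexcept>

// Hypothetical stand-in for a tape session that fails unexpectedly.
void runSession() {
  throw std::runtime_error("simulated tape session failure");
}

// Hypothetical stand-in for the child's entry point: any exception is caught
// and converted into an ordinary non-zero return value.
int childMain() {
  try {
    runSession();
    return 0;
  } catch (...) {
    return 1;
  }
}

struct ChildOutcome {
  bool exitedNormally;  // result of WIFEXITED
  int exitCode;         // result of WEXITSTATUS, or -1 if not a normal exit
};

// Forks a child running childMain() and reports how the parent sees it end.
ChildOutcome observeChild() {
  const pid_t pid = ::fork();
  if (pid < 0) {
    throw std::runtime_error("fork failed");
  }
  if (pid == 0) {
    // Child: leave immediately with the value childMain() returned.
    ::_exit(childMain());
  }
  int processStatus = 0;
  if (::waitpid(pid, &processStatus, 0) != pid) {
    throw std::runtime_error("waitpid failed");
  }
  ChildOutcome outcome{};
  outcome.exitedNormally = WIFEXITED(processStatus);
  outcome.exitCode = outcome.exitedNormally ? WEXITSTATUS(processStatus) : -1;
  return outcome;
}
```

In this sketch the parent sees a normal exit with code `1`, so a check based only on `WIFEXITED` cannot tell the failed session from a successful one.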
Solution
[Alternative #1] (Simpler and probably better):
- After finding that a child process exited correctly, check whether its return code is non-zero with `WEXITSTATUS(processStatus)`.
- If the return value was not `0` (zero), set the `PreviousSession::Crashed` flag.
  - By looking into `TapeDaemon::main()`, we know that a child process will only return a non-zero value if an unexpected exception caused it to fail.
- This can be done quickly by adding a few lines of code inside `if (WIFEXITED(processStatus)) { ... }`.
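A minimal sketch of the check proposed in [Alternative #1] is given below. It is an illustration under assumptions, not the actual `DriveHandler.cpp` change: `needsCleanerSession`, `runChildAndGetStatus` and the three sample child bodies are hypothetical names, and the real code would call `resetToDefault(PreviousSession::Crashed)` rather than return a boolean.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstdlib>
#include <stdexcept>

// Hypothetical helper: decides whether the next session should start with a
// cleaner session, given the status filled in by waitpid().
bool needsCleanerSession(int processStatus) {
  if (WIFEXITED(processStatus)) {
    // Normal exit: only a zero exit code counts as success.
    return WEXITSTATUS(processStatus) != 0;
  }
  // Killed by a signal or otherwise ended abnormally.
  return true;
}

// Hypothetical sample child bodies used to exercise the helper.
int exitsCleanly() { return 0; }
int failsWithCaughtException() { return 1; }  // mimics the `return 1` path
int crashes() { std::abort(); }               // ends the child with SIGABRT

// Forks a child that runs childBody() and returns the raw wait status.
int runChildAndGetStatus(int (*childBody)()) {
  const pid_t pid = ::fork();
  if (pid < 0) {
    throw std::runtime_error("fork failed");
  }
  if (pid == 0) {
    ::_exit(childBody());
  }
  int processStatus = 0;
  if (::waitpid(pid, &processStatus, 0) != pid) {
    throw std::runtime_error("waitpid failed");
  }
  return processStatus;
}
```

With this helper, a child that returns `0` does not trigger a cleaner session, while a child that returns `1` or is killed by a signal does.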
[Alternative #2]
- Simply let the unhandled exception propagate up and cause the process to fail.
- This will cause `WIFEXITED(processStatus)` to return `false`, so the existing code properly handles the failed child process and sets the `PreviousSession::Crashed` flag.
- It is slightly more complicated and changes how tape server processes may terminate. The end result is the same as [Alternative #1].
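The effect of [Alternative #2] can be sketched as follows. This is an illustration only, with hypothetical names (`runSession`, `childMainWithoutCatch`, `ChildTermination`, `observeUncaughtException`). It assumes the default terminate handler, which calls `std::abort()`; the `noexcept` specifier is used here to mimic an exception escaping `main`, since both cases end in `std::terminate`.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <csignal>
#include <stdexcept>

// Hypothetical stand-in for a tape session that fails unexpectedly.
void runSession() {
  throw std::runtime_error("simulated tape session failure");
}

// Hypothetical child entry point with no try/catch. Because it is noexcept,
// the escaping exception calls std::terminate, as it would when leaving main.
[[noreturn]] void childMainWithoutCatch() noexcept {
  runSession();
  ::_exit(0);  // not reached in this sketch
}

struct ChildTermination {
  bool exitedNormally;   // result of WIFEXITED
  bool killedBySigabrt;  // WIFSIGNALED and WTERMSIG == SIGABRT
};

// Forks a child that lets the exception escape and reports how it ended.
ChildTermination observeUncaughtException() {
  const pid_t pid = ::fork();
  if (pid < 0) {
    throw std::runtime_error("fork failed");
  }
  if (pid == 0) {
    childMainWithoutCatch();
  }
  int processStatus = 0;
  if (::waitpid(pid, &processStatus, 0) != pid) {
    throw std::runtime_error("waitpid failed");
  }
  ChildTermination result{};
  result.exitedNormally = WIFEXITED(processStatus);
  result.killedBySigabrt =
      WIFSIGNALED(processStatus) && WTERMSIG(processStatus) == SIGABRT;
  return result;
}
```

Here the parent no longer sees a normal exit, so the existing `else` branch would run without any change to the `WIFEXITED` check. Depending on system settings, the aborted child may also leave a core dump.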
Edited by Joao Afonso