Add log param `mountAttempted` to DataTransferSession
Summary
Currently, CTA reports the tape session status as failed when a DataTransferSession
fails before even attempting to mount a tape. If the cause is not a one-off incident (e.g. an FST not being reachable), we can end up with all drives and tapes disabled.
The simplified logic of a DataTransferSession is:
1. Loop until we find a queue from which to pop some work.
2. Before anything else, try to fetch an initial job.
   - 2.1. If 2 succeeds: start all the required threads (this triggers the mount of the tape) and run until success or until we hit a problem.
   - 2.2. If 2 fails: abort the session (more details in https://gitlab.cern.ch/cta/operations/-/issues/1539).
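
The flow above can be sketched roughly as follows. This is an illustrative Python sketch, not CTA code: `run_session`, `SessionOutcome`, and the three callables are hypothetical names chosen only to make the two branches explicit.

```python
from enum import Enum
from typing import Callable, Optional


class SessionOutcome(Enum):
    # Hypothetical labels, used only for this sketch.
    COMPLETED = "completed"                        # case 2.1: threads started, mount triggered
    ABORTED_BEFORE_MOUNT = "aborted_before_mount"  # case 2.2: no mount was attempted


def run_session(
    find_queue: Callable[[], Optional[str]],
    fetch_initial_job: Callable[[str], Optional[str]],
    start_threads: Callable[[str], None],
) -> SessionOutcome:
    # Step 1: loop until a queue with work is found.
    queue = None
    while queue is None:
        queue = find_queue()

    # Step 2: before anything else, try to fetch an initial job.
    job = fetch_initial_job(queue)
    if job is None:
        # Case 2.2: abort the session; the tape was never mounted.
        return SessionOutcome.ABORTED_BEFORE_MOUNT

    # Case 2.1: starting the threads is what triggers the tape mount.
    start_threads(job)
    return SessionOutcome.COMPLETED
```

The point of the sketch is that the abort in case 2.2 happens strictly before anything that could touch the drive or the tape.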
Case 2.2 makes TAS start disabling tapes and drives without a good reason.
The problem lies in the CTA-TAS communication.
Possible solutions
We will add a new field, `mountAttempted`, to the logs.
1. We could introduce an `aborted` state along with `success` and `failure`, and only set it for aborts of the data transfer session.
2. Classify all these cases as `success` from the `DataTransferSession` status point of view.
At present, I do not see a valid use case for option 1, as we will get an ERROR/WARNING depending on the root cause, so we can just go for option 2.
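
One possible reading of option 2 combined with the new log parameter is sketched below in Python. Only `mountAttempted`, `success`, and `failure` come from this issue; the function name `session_log_params`, the `status` key, and the `transfer_ok` flag are assumptions made for illustration and do not reflect the actual CTA log schema.

```python
def session_log_params(mount_attempted: bool, transfer_ok: bool) -> dict:
    """Build hypothetical end-of-session log parameters.

    Under option 2, a session that aborted before any mount attempt is
    classified as success; mountAttempted records that no mount took place.
    """
    if mount_attempted and not transfer_ok:
        status = "failure"
    else:
        # Covers both a normal successful session and an abort before mount.
        status = "success"
    return {"mountAttempted": mount_attempted, "status": status}
```

With this shape, a consumer of the logs could tell a genuine successful transfer apart from an early abort by checking `mountAttempted`, without needing a third status value.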