Log when a file cannot be deleted from the Catalogue
UPDATE 2024-11-25: The problem described is the intended behaviour according to the docs https://eoscta.docs.cern.ch/v4/lifecycle/Delete/ and we rely on a reconciliation process to resolve the discrepancies (I personally do not like this; I think we should fail the deletion instead).
The proper solution is to allow EOS to delete a file from its namespace even if CTA fails to delete the actual tape files. This will only result in temporary dark tape data which is not a critical problem. CTA can asynchronously reconcile its tape file catalogue with the EOS namespace at a later point in time.
So, the solution for now is to at least log a WARNING message in the frontend logs stating that the file was not deleted from the Catalogue.
Summary
Upon deletion of a file in the EOS namespace, the WFE notifies the Frontend to delete the file. The Frontend returns a success response even when an exception is thrown while trying to delete the file from the catalogue, including when the exception is caused by the Catalogue not being reachable.
Steps to reproduce
- Set up the dev environment.
- Archive some file.
- Kill catalogue pod/connection.
- Issue the removal of the file in the EOS namespace. It succeeds, but it should not.
Possible causes
We should handle the cause of the exception better when contacting the Catalogue:
// Delete the file from the catalogue or from the objectstore if archive request is created
utils::Timer t;
log::TimingList tl;
try {
  request.archiveFile = m_catalogue.ArchiveFile()->getArchiveFileById(request.archiveFileID);
  tl.insertAndReset("catalogueGetArchiveFileByIdTime",t);
} catch (exception::Exception&){
  log::ScopedParamContainer spc(m_lc);
  spc.add("fileId", request.archiveFileID);
  m_lc.log(log::DEBUG, "Ignoring request to delete archive file from the catalogue, because it does not exist");
}
m_scheduler.deleteArchive(m_cliIdentity.username, request, m_lc);
tl.insertAndReset("schedulerTime",t);
// Create a log entry
log::ScopedParamContainer params(m_lc);
params.add("fileId", request.archiveFileID)
      .add("address", (request.address ? request.address.value() : "null"))
      .add("filePath",request.diskFilePath);
tl.addToLog(params);
m_lc.log(log::INFO, "In WorkflowEvent::processDELETE(): archive file deleted.");
// Set response type
response.set_type(xrd::Response::RSP_SUCCESS);
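
The catch block above treats every exception as "file does not exist" and logs it at DEBUG. A minimal, self-contained sketch of how the two cases could be told apart is shown below. It does not use the CTA code base: `FileNotFound`, `CatalogueUnreachable`, `FakeCatalogue`, `LogEntry` and `lookUpBeforeDelete` are hypothetical names invented for illustration, and whether the real Catalogue throws distinct exception types for these cases is an assumption that would need to be checked.

```cpp
#include <cstdint>
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception types; the real CTA exception hierarchy may differ.
struct FileNotFound : std::runtime_error {
  using std::runtime_error::runtime_error;
};
struct CatalogueUnreachable : std::runtime_error {
  using std::runtime_error::runtime_error;
};

enum class LogLevel { DEBUG, WARNING };

struct LogEntry {
  LogLevel level;
  std::string message;
};

// Hypothetical stand-in for the catalogue lookup used before a deletion.
struct FakeCatalogue {
  bool reachable = true;
  bool hasFile = true;

  void getArchiveFileById(std::uint64_t /*archiveFileId*/) const {
    if (!reachable) throw CatalogueUnreachable("catalogue connection lost");
    if (!hasFile) throw FileNotFound("no such archive file");
  }
};

// Looks the file up and returns the log entries that would be emitted.
// A missing file stays a DEBUG message; any other failure becomes a WARNING
// saying that the file was not deleted from the Catalogue.
std::vector<LogEntry> lookUpBeforeDelete(const FakeCatalogue& catalogue,
                                         std::uint64_t archiveFileId) {
  std::vector<LogEntry> log;
  try {
    catalogue.getArchiveFileById(archiveFileId);
  } catch (const FileNotFound&) {
    log.push_back({LogLevel::DEBUG,
                   "Ignoring request to delete archive file from the "
                   "catalogue, because it does not exist"});
  } catch (const std::exception& ex) {
    log.push_back({LogLevel::WARNING,
                   std::string("Archive file was not deleted from the "
                               "Catalogue: ") + ex.what()});
  }
  return log;
}
```

With this shape, an unreachable catalogue produces a single WARNING entry, while a file that is missing keeps the current DEBUG behaviour. The response type is left untouched here, in line with the documented behaviour of letting EOS delete the file and relying on reconciliation.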