Failed to queue successfully transferred job batch to reporting stage
Summary
Dmitry (FNAL) reported on the community forum that he was seeing "In ArchiveMount::reportJobsBatchTransferred(): got an exception" errors while archiving files.
This error occurs when a drive process tries to send a batch of jobs that were successfully written to tape to the reporting stage.
We have hit this problem a couple of times in the past:
- https://gitlab.cern.ch/cta/operations/-/issues/1486 : unable to fully debug due to the log incident during the summer.
- https://gitlab.cern.ch/cta/operations/-/issues/462 : the root cause was different from what FNAL is seeing.
This time the root cause appears to be different again, and is probably related to dCache-specific behaviour.
Steps to reproduce
- Unknown
Relevant logs and/or screenshots
All details in: https://cta-community.web.cern.ch/t/in-archivemount-reportjobsbatchtransferred-got-an-exception/338/4
"commit problem committing the DB transaction: Database library reported: ERROR: duplicate key value violates unique constraint \"archive_file_din_dfi_un\"DETAIL: Key (disk_instance_name, disk_file_id)=(eoscta, 0000AEC65EE5F77D43B89B1E759F9D0B4400) already exists. (DB Result Status:7 SQLState:23505)"