Remove wrong 'job owned after destruction' INFO message after request is requeued

For more details, see ops issue https://gitlab.cern.ch/cta/operations/-/issues/994#note_6451855.


Problem

  • The code in OStoreDB.cpp does not set m_jobOwned = false after flushing requests back to a queue.
  • As a result, the following INFO message is logged when the job is destroyed, even though the object is no longer owned by the process:
[1676515360.020552000] Feb 16 03:42:40.020552 tpsrv045.cern.ch cta-taped: LVL="INFO" PID="27200" TID="27200" MSG="In OStoreDB::RetrieveJob::~RetrieveJob(): will leave the job owned after destruction." agentObject="DriveProcess-I3600523-tpsrv045.cern.ch-27200-20230216-03:28:18-0" jobObject="RetrieveRequest-Frontend-ctaproductionfrontend01.cern.ch-23457-20230126-14:50:53-0-4068867"
  • This happens inside of OStoreDB::RetrieveMount::requeueJobBatch(...) and any other function that flushes requests back to the queue with sorter.flushAll(..) or sorter.flushOneRetrieve(..) (sorter.flushOneArchive(..) is not used directly).

Solution

  • Set m_jobOwned = false once an object is known to have been handed back to a queue successfully.
    • This is probably guaranteed after sorter.flushAll(..), sorter.flushOneRetrieve(..) or sorter.flushOneArchive(..) have returned successfully.
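
The idea can be sketched with a small self-contained mock. This is not the actual CTA code: MockRetrieveJob, MockSorter, g_log and messagesAfterRequeue are hypothetical stand-ins, and the real requeueJobBatch(...) has a different signature. The sketch only assumes, as stated above, that a flush which returns normally means the jobs are back in a queue, and that a failed flush throws.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Collects the messages that the real destructor would send to the logger.
static std::vector<std::string> g_log;

// Hypothetical stand-in for OStoreDB::RetrieveJob; only the ownership flag is modelled.
struct MockRetrieveJob {
  explicit MockRetrieveJob(std::string jobId) : id(std::move(jobId)) {}
  ~MockRetrieveJob() {
    if (m_jobOwned) {
      g_log.push_back("will leave the job owned after destruction: " + id);
    }
  }
  std::string id;
  bool m_jobOwned = true;
};

// Hypothetical stand-in for the sorter: flushAll() returns normally or throws.
struct MockSorter {
  bool failOnFlush = false;
  std::vector<std::string> queued;

  void insert(const MockRetrieveJob& job) { pending.push_back(job.id); }

  void flushAll() {
    if (failOnFlush) throw std::runtime_error("flush failed");
    queued.insert(queued.end(), pending.begin(), pending.end());
    pending.clear();
  }

 private:
  std::vector<std::string> pending;
};

// Sketch of the fix: release ownership only after the flush has returned.
void requeueJobBatch(std::list<std::unique_ptr<MockRetrieveJob>>& jobs,
                     MockSorter& sorter) {
  for (auto& job : jobs) sorter.insert(*job);
  sorter.flushAll();  // throws on failure, so the loop below is skipped
  for (auto& job : jobs) job->m_jobOwned = false;
}

// Requeues two jobs, destroys them, and returns the number of
// "still owned" messages produced by their destructors.
std::size_t messagesAfterRequeue(bool failOnFlush) {
  g_log.clear();
  MockSorter sorter;
  sorter.failOnFlush = failOnFlush;
  {
    std::list<std::unique_ptr<MockRetrieveJob>> jobs;
    jobs.push_back(std::make_unique<MockRetrieveJob>("job-1"));
    jobs.push_back(std::make_unique<MockRetrieveJob>("job-2"));
    try {
      requeueJobBatch(jobs, sorter);
    } catch (const std::runtime_error&) {
      // Flush failed: the jobs are still owned, so the message is legitimate.
    }
  }  // jobs are destroyed here
  return g_log.size();
}
```

In this mock, messagesAfterRequeue(false) yields no messages because the flag is cleared after a successful flush, while messagesAfterRequeue(true) still yields one message per job, which is the case where the INFO line remains accurate.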