Create a tool to change the storage class of data which has already been written to tape

Problem to solve

Sometimes we want to change the storage class of an archived file. The main use case is to separate files into a new tapepool during repack operations. We may also want to change the number of copies of a stored file.

Stakeholders

CTA operators who need to reorganise data and repack tapes.

Additional information

The storage class of files and directories is stored in three places:

  • Extended attributes of directories in the EOS namespace: new files created in the directory inherit this value, which is used for archival only.
  • Extended attributes of a file: set on file creation and used to select the tapepool for archival. It is not used for retrieval. It is believed not to be used during repack, but this should be checked.
  • The ARCHIVE_FILE table in the CTA catalogue: this is believed to be the value used during repack (this should also be checked).

Proposal

  • It should only be possible to change the storage class of files which are already archived to tape. The storage class of files which are in flight must not be changed.
  • As this needs to change both the EOS namespace and the CTA catalogue, it should be a standalone tool (like cta-verify-file or cta-send-event)
  • The tool should accept a list of files as input
  • For each file, update the storage class in the ARCHIVE_FILE table in the CTA catalogue and the extended attribute sys.archive.storage_class in the namespace
  • In the case of directories, only the extended attribute needs to be updated
  • Include a "consistency check" option which reads the storage class from the catalogue and checks/sets the correct value in the namespace
  • Coherency of the storage class between the namespace and the DB should be included in the reconciliation checks
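
The proposed flow can be sketched as follows. This is only an illustration under stated assumptions, not the tool's implementation: the `Catalogue` and `Namespace` classes are hypothetical in-memory stand-ins for the CTA catalogue and the EOS namespace, and a file is treated as "in flight" when it has no archive file ID known to the catalogue. Only the attribute name sys.archive.storage_class and the ARCHIVE_FILE table come from the proposal itself.

```python
from dataclasses import dataclass

# Extended attribute named in the proposal.
ATTR = "sys.archive.storage_class"


@dataclass
class Catalogue:
    """Hypothetical stand-in for the CTA catalogue."""
    storage_classes: set   # names of storage classes that exist
    archive_file: dict     # archive file ID -> storage class (ARCHIVE_FILE)


@dataclass
class Namespace:
    """Hypothetical stand-in for the EOS namespace."""
    xattrs: dict           # path -> {attribute name: value}
    archive_ids: dict      # file path -> archive file ID (absent: not archived)


def change_storage_class(paths, new_class, catalogue, namespace):
    """Update archived files in both places; skip files that are in flight."""
    # Check that the new storage class exists before touching the namespace.
    if new_class not in catalogue.storage_classes:
        raise ValueError(f"unknown storage class: {new_class}")
    updated, skipped = [], []
    for path in paths:
        archive_id = namespace.archive_ids.get(path)
        if archive_id is None or archive_id not in catalogue.archive_file:
            skipped.append(path)  # assumed to be in flight: leave untouched
            continue
        namespace.xattrs.setdefault(path, {})[ATTR] = new_class
        catalogue.archive_file[archive_id] = new_class
        updated.append(path)
    return updated, skipped


def change_directory_storage_class(directory, new_class, catalogue, namespace):
    """For directories, only the extended attribute is updated."""
    if new_class not in catalogue.storage_classes:
        raise ValueError(f"unknown storage class: {new_class}")
    namespace.xattrs.setdefault(directory, {})[ATTR] = new_class


def consistency_check(namespace, catalogue, fix=False):
    """Return paths whose namespace value differs from the catalogue value.

    With fix=True, the namespace is set to the catalogue value.
    """
    mismatches = []
    for path, archive_id in namespace.archive_ids.items():
        expected = catalogue.archive_file.get(archive_id)
        if expected is None:
            continue  # not known to the catalogue, nothing to compare against
        if namespace.xattrs.get(path, {}).get(ATTR) != expected:
            mismatches.append(path)
            if fix:
                namespace.xattrs.setdefault(path, {})[ATTR] = expected
    return mismatches
```

In this sketch the catalogue is treated as the source of truth for the consistency check, matching the option described above. A real tool would also need to handle a failure between the two updates, which the sketch ignores.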

Checklist

  • Check that the new storage class exists in the catalogue before changes are made in EOS
  • Update storage class name in EOS
  • Update storage class name in CTA
  • Add support for a filename as input
  • Avoid changing files which are in flight
  • Add more info to log
  • Improve error handling when a file is not updated in EOS
  • Restrict the frequency of requests
  • Add man page
Edited by Lasse Tjernaes Wardenaer