Proposal to change EOS disk fid to fxid

Problems to solve

  1. The EOS diskFileId value in CTA is stored as a decimal string. However, we cannot assume that every disk file ID is a number, because dCache uses UUID-style disk IDs. We should therefore remove this assumption from our code.

  2. Some confusion has been caused by converting between hex and decimal representations of disk file IDs. We could take this opportunity to change to the hex representation throughout. This would also make the code a bit simpler as we don't have to convert between the two representations in various places.

More details

Currently, when we archive a file, EOS passes the disk file ID as a uint64. The value is converted to a decimal string before being stored in the objectstore as string diskfileid. This string is passed around for logging and is eventually stored in the CTA Catalogue as a string:

CREATE TABLE ARCHIVE_FILE(
...
  DISK_FILE_ID            VARCHAR(100)    CONSTRAINT ARCHIVE_FILE_DFI_NN  NOT NULL,

I think the only place where we convert the string value back into a number is here:

  • In xroot_plugins/GrpcEndpoint.cpp, we dereference it to do the EOS namespace query.

A few methods in the Frontend are doing hex string to dec string conversion:

  • In xroot_plugins/XrdSsiCtaRequestMessage.cpp, a couple of admin commands convert the hex string to a dec string.
  • In xroot_plugins/XrdCtaRecycleTapeFileLs.hpp and xroot_plugins/XrdCtaRecycleTapeFileLs.hpp, we convert it from a hex string to a number, check that it is a valid number, then convert it back into a string.
  • In xroot_plugins/XrdCtaTapeFileLs.hpp, cta-admin converts a list of EOS fxids (hex) to fids (dec). But in the case of a single fxid on the command line, the Frontend converts it from a hex string to a dec string.
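For illustration, the hex string to dec string conversion described in the bullets above could look roughly like the following. This is a minimal sketch and not the actual CTA code: the helper names parseFxid and fxidToFid are hypothetical, and it assumes C++17 (std::from_chars) and that a fxid may or may not carry a 0x prefix.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper (not part of CTA): parse an EOS fxid given as a hex
// string, with or without a "0x" prefix. Returns std::nullopt if the input
// is empty, contains non-hex characters, or does not fit in 64 bits.
std::optional<std::uint64_t> parseFxid(std::string_view fxid) {
  if (fxid.size() >= 2 && fxid[0] == '0' && (fxid[1] == 'x' || fxid[1] == 'X')) {
    fxid.remove_prefix(2);
  }
  if (fxid.empty()) {
    return std::nullopt;
  }
  std::uint64_t value = 0;
  const char* first = fxid.data();
  const char* last = first + fxid.size();
  const auto result = std::from_chars(first, last, value, 16);
  // Reject overflow, invalid input and trailing garbage.
  if (result.ec != std::errc() || result.ptr != last) {
    return std::nullopt;
  }
  return value;
}

// Hypothetical helper: convert a hex fxid string to a decimal fid string,
// validating it on the way (hex string -> number -> dec string).
std::optional<std::string> fxidToFid(std::string_view fxid) {
  const auto value = parseFxid(fxid);
  if (!value) {
    return std::nullopt;
  }
  return std::to_string(*value);
}
```

With these helpers, fxidToFid("1a2b") yields the decimal string "6699", while an input such as "xyz" yields no value. Every call site that needs this kind of round trip is a place where the two representations can get mixed up, which is the confusion the proposal aims to remove.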

Proposal

  1. Update all diskFileId strings in CTA to use the hex fxid throughout instead of the decimal fid.
  2. In the CTA Catalogue, from now on, store the diskFileId in hex with a 0x prefix. Existing files will keep the decimal string; new files will have the hexadecimal string.
  3. When doing SELECT FROM ARCHIVE_FILE, convert decimal strings into hexadecimal ones. (We can't do this, as a dCache disk file ID can be an arbitrary string.)
  4. Update GrpcEndpoint.cpp to treat the diskFileId as hex when dereferencing. All the other methods can simply use the hex value with no conversions.

The dec strings in the catalogue can optionally be converted to hex at a later stage.
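As a rough sketch of how points 2 and 4 of the proposal might fit together, the code below formats a new EOS file ID as a 0x-prefixed hex string and recovers the numeric ID from a stored diskFileId that may be either a legacy decimal string or a new hex string. The function names toDiskFileId and eosFidFromDiskFileId are hypothetical and are not taken from the CTA code base. The sketch assumes C++17 and that the 0x prefix is what distinguishes the new format from the legacy one.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper: format an EOS file ID as a lowercase hex string with
// a "0x" prefix, as proposed for new entries in the catalogue.
std::string toDiskFileId(std::uint64_t fid) {
  char buf[16];  // 16 hex digits are always enough for a 64-bit value
  const auto result = std::to_chars(buf, buf + sizeof(buf), fid, 16);
  return "0x" + std::string(buf, result.ptr);
}

// Hypothetical helper: recover the numeric EOS file ID from a stored
// diskFileId. Strings starting with "0x" are read as hex (new format); all
// other strings are read as decimal (legacy format). Anything that is not a
// valid 64-bit number in the chosen base, such as a UUID-style ID, yields
// std::nullopt.
std::optional<std::uint64_t> eosFidFromDiskFileId(std::string_view id) {
  int base = 10;
  if (id.size() >= 2 && id[0] == '0' && id[1] == 'x') {
    id.remove_prefix(2);
    base = 16;
  }
  if (id.empty()) {
    return std::nullopt;
  }
  std::uint64_t value = 0;
  const char* first = id.data();
  const char* last = first + id.size();
  const auto result = std::from_chars(first, last, value, base);
  // Reject overflow, invalid input and trailing garbage.
  if (result.ec != std::errc() || result.ptr != last) {
    return std::nullopt;
  }
  return value;
}
```

Under these assumptions, toDiskFileId(6699) gives "0x1a2b", and both "0x1a2b" and the legacy "6699" map back to the same number. One limitation: a non-EOS ID that happens to consist only of decimal digits would still be read as a number, so a check like this should only be applied where the disk instance is known to be EOS.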

Edited by Michael Davis