Fix disappointing `cta-taped` read performance

Problem

As mentioned in https://gitlab.cern.ch/cta/operations/-/issues/1010, there is a small delay (around 2 seconds) between consecutive file reads from tape, even when the files are stored in successive blocks.

For large files this delay is diluted and barely affects the final throughput. However, when recalling a large number of small files, it can reduce the average read speed to half of what a raw dd read of the same tape achieves.
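To illustrate how a fixed per-file delay is diluted for large files but dominates for small ones, here is a minimal sketch. The 2-second delay comes from the issue above; the raw rate of 300 MB/s and the file sizes are assumptions chosen only for illustration, and `effective_throughput` is a hypothetical helper, not part of CTA.

```python
def effective_throughput(file_size_bytes: float,
                         raw_rate_bps: float,
                         per_file_delay_s: float) -> float:
    """Average read rate when every file pays a fixed delay on top of its transfer time."""
    transfer_time_s = file_size_bytes / raw_rate_bps
    return file_size_bytes / (transfer_time_s + per_file_delay_s)


RAW_RATE = 300e6  # bytes/s, ASSUMED raw dd rate (not a measured value)
DELAY = 2.0       # seconds, approximate per-file delay reported in the issue

# ASSUMED file sizes: a small file, a mid-sized file, and a very large file.
for size in (10e6, 600e6, 100e9):
    rate = effective_throughput(size, RAW_RATE, DELAY)
    print(f"{size / 1e6:>9.0f} MB file: {rate / 1e6:6.1f} MB/s "
          f"({rate / RAW_RATE:.0%} of raw)")
```

Under these assumptions, the effective rate falls to half the raw rate when a file's transfer time equals the delay, which is a 600 MB file at 300 MB/s. Smaller files do worse, and much larger ones approach the raw rate.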

Because dd reads the same tape much faster, we believe there is room to improve the `cta-taped` read path.

Objective

Investigate what is causing this performance degradation and provide a solution.

Look into the recent specifications and see whether better SCSI calls can improve read performance (link).

References

CTA label format document:

IBM reference document explaining how tape drives work:

Edited by Joao Afonso