
feature: test: Introduce python-based regression suite

Giovanna Lazzari Miotto requested to merge glazzari/feature/test-suite into main

This suite will replace our CI pipeline's current bash script (scripts/test_filedma.sh) and support further software stack development.

Caveat: configuration is not very cohesive in this test suite (UPDATE: this has since improved, but it remains far from final). Regardless, I see this MR as a sufficient improvement to defer changes until requirements mature.

Some features:

  • Replaces a bash script with several Python 3.6 files that use the standard library and pyjson5
    • Some dependencies to include in the future (if deemed useful): json5, inotify
      • UPDATE: I went ahead and set up the tooling to pip install Python dependencies from test/suite/requirements.txt to have a native JSON5 loader in the suite. This meant updating the scdaq Dockerfile in scouting-docker-images!25 (merged) and bumping to Python 3.9, because pyjson5 needs Python >= 3.8.
    • It's possible to set up inotify as a bash command run by the new CommandDispatcher, but it's cumbersome to keep track of and remove daemon watches, so I commented it out for now
  • Much improved (and more readable!) logging capabilities - color-coded, timestamped and more explicit.
  • SCDAQ outputs (stdout/stderr) are stored in full as a file
    • Prevents the output from exceeding the limit of GitLab's CI terminal, so we can run SCDAQ at peak verbosity
  • Automatic upload of test artifacts
    • SCDAQ output files, stdout/stderr streams, configuration files, etc., kept for reference and debugging
    • Artifacts stay in GitLab for 3 days -- at around 100 MB per primitive output, artifact size becomes a storage concern
  • A make-believe version of the Function Manager to start and stop SCDAQ at will
  • The advent and blessings of modular programming (give or take)
    • I am optimistic that this effort will pay off one day
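
As a rough illustration of the logging item above, here is a minimal sketch of a color-coded, millisecond-timestamped formatter built on the standard logging module. ColorFormatter is the class name used in this MR, but the body below, the color choices, the format string, and the make_logger helper are assumptions made for illustration, not the code in the suite.

```python
import logging


class ColorFormatter(logging.Formatter):
    """Illustrative sketch: colors each formatted record by log level."""

    # ANSI escape codes per level (the color choices are an assumption)
    COLORS = {
        logging.DEBUG: "\033[36m",       # cyan
        logging.INFO: "\033[32m",        # green
        logging.WARNING: "\033[33m",     # yellow
        logging.ERROR: "\033[31m",       # red
        logging.CRITICAL: "\033[1;31m",  # bold red
    }
    RESET = "\033[0m"

    def __init__(self):
        # Timestamp with milliseconds, level, file name, line number, function
        super().__init__(
            fmt="%(asctime)s.%(msecs)03d %(levelname)-8s "
                "%(filename)s:%(lineno)d [%(funcName)s] %(message)s",
            datefmt="%Y-%m-%d %H:%M:%S",
        )

    def format(self, record):
        text = super().format(record)
        color = self.COLORS.get(record.levelno)
        # Unknown levels are left uncolored
        return f"{color}{text}{self.RESET}" if color else text


def make_logger(name="suite"):
    """Hypothetical helper: returns a logger with one colored stream handler."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(ColorFormatter())
        logger.addHandler(handler)
    return logger
```

Calling make_logger().warning("hash mismatch") would print a yellow line carrying the timestamp, level, source location, and message.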

Some classes:

  • FileTestSuite, ScdaqTest, myriad Configs, TestController
    • classes to manage and trigger our current batch of file-based tests
  • FileManager:
    • handles functions in the FS domain, such as:
      • hashing file contents (UPDATE: moved to its own class, HashValidator)
      • globbing (collecting) output files of a given pattern
      • moving/removing/copying files and directories etc.
  • ArtifactCollection:
    • Tracks test artifacts and produced files and stashes them somewhere safe
  • TestLogger, ColorFormatter:
    • Structured and color-coded logging for readability
      • Custom logging logger / handler / formatter
      • Includes timestamp down to the millisecond, log level, filename, line number, and functional context
      • It's very pretty
  • CommandDispatcher:
    • Nonblocking process handler for bash commands, e.g. ./build/scdaq
  • FunctionManager: a simple mock-up for sending start and stop commands
    • Available either as class or command-line script
      • Needed to trigger SCDAQ runs in more recent versions, but currently bypassed because the target branch (main) is quite outdated!
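
To make the CommandDispatcher idea concrete, here is a minimal sketch of a nonblocking process handler that writes the child's stdout and stderr to a file in full. The class name comes from this MR. The method names (start, is_running, wait, stop), the constructor arguments, and the choice to merge stderr into stdout are assumptions for illustration and may differ from the real class.

```python
import subprocess
import sys
from pathlib import Path


class CommandDispatcher:
    """Illustrative sketch: runs a command without blocking the caller."""

    def __init__(self, argv, log_path):
        self.argv = list(argv)          # e.g. ["./build/scdaq", ...]
        self.log_path = Path(log_path)  # file that receives all output
        self._log = None
        self._proc = None

    def start(self):
        self._log = self.log_path.open("wb")
        # Popen returns immediately; stderr is merged into stdout so the
        # file keeps both streams in the order they were written.
        self._proc = subprocess.Popen(
            self.argv, stdout=self._log, stderr=subprocess.STDOUT
        )

    def is_running(self):
        return self._proc is not None and self._proc.poll() is None

    def wait(self, timeout=None):
        """Block until the process exits and return its exit code."""
        code = self._proc.wait(timeout=timeout)
        self._log.close()
        return code

    def stop(self, timeout=5.0):
        """Ask the process to terminate, killing it if it does not comply."""
        if self._proc.poll() is None:
            self._proc.terminate()
            try:
                self._proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                self._proc.kill()
                self._proc.wait()
        self._log.close()
        return self._proc.returncode


if __name__ == "__main__":
    # Stand-in command; the suite would launch SCDAQ here instead.
    d = CommandDispatcher([sys.executable, "-c", "print('hello')"], "out.log")
    d.start()
    print("exit code:", d.wait())
```

Because the output goes straight to a file instead of the CI terminal, the child can run at full verbosity, and the log can be uploaded as an artifact afterwards.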

Possible TODOs:

  • I'd like to hash the output files in chunks (e.g., 1 MiB each), yielding something like 20-100 hashes per test. As now, these would stay in a single file per test. The main benefit is that we would no longer be tied to the output size used when the reference hashes were generated, which is currently 100 MiB for every test except BMTF, which produces much sparser output and is therefore hashed after 1 MiB. This would make testing a lot more robust: compute time varies with each execution and with the node that happens to pick up the job, and even this small set of test cases already has exceptions in the expected number of hashed bytes.
    • UPDATE: Refactored the hash-checking functionality to make this change trivial in a future MR (will require re-computing ground truth hashes).
  • Unit tests and integration tests with Matteo's code (e.g. a mock of CMSSW daemons that feed the processing nodes)
  • Python environment with non-standard libraries like inotify, json5 (a proper json5 loader would be required for unit tests)
    • UPDATE: done and integrated json5 already, easier than expected
  • Mock board that sends packets over TCP to SCDAQ to cover non-file-based pipelines -- albeit sourced from a file.
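
The chunked-hashing TODO above could look roughly like the sketch below. It is not the HashValidator code: the function names, the use of SHA-256, and the policy of ignoring a trailing partial chunk and comparing only the chunks both sides have are all assumptions for illustration.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1 MiB, as suggested in the TODO


def chunk_hashes(path, chunk_size=CHUNK_SIZE, max_chunks=None):
    """Return one SHA-256 hex digest per complete chunk of the file."""
    hashes = []
    with open(path, "rb") as f:
        while max_chunks is None or len(hashes) < max_chunks:
            chunk = f.read(chunk_size)
            if len(chunk) < chunk_size:
                # End of file or trailing partial chunk: skip it, because
                # where the output ends depends on how long the run lasted.
                break
            hashes.append(hashlib.sha256(chunk).hexdigest())
    return hashes


def matches_reference(actual, reference):
    """Compare the chunks both lists have; require at least one chunk."""
    n = min(len(actual), len(reference))
    return n > 0 and actual[:n] == reference[:n]
```

With this shape, a run that produces less (or more) output than the reference run still validates as long as every chunk it shares with the reference matches, so the check no longer depends on a fixed byte count per test.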
Edited by Giovanna Lazzari Miotto
