
Use CTest fixtures and --repeat

Merged Rosen Matev requested to merge rmatev/Gaudi:use-ctest-fixtures into master
All threads resolved!

Use FIXTURES_REQUIRED for QMTest test dependencies

For a nice introduction to FIXTURES_* in CTest see here.

FIXTURES_REQUIRED is set in addition to DEPENDS for every test that defines prerequisites. Using FIXTURES_SETUP, a fixture is defined for every test that another test depends on, with the fixture name identical to the test name.
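
For illustration, here is a minimal CMake sketch of this scheme (the test names and commands are hypothetical, not the actual Gaudi configuration):

    # Hypothetical tests: a "write" test doubles as the setup of a fixture
    # named after itself, and a "read" test both depends on it and requires
    # that fixture.
    add_test(NAME root_io.write COMMAND run_write_test)
    add_test(NAME root_io.read  COMMAND run_read_test)

    set_tests_properties(root_io.write PROPERTIES FIXTURES_SETUP root_io.write)
    set_tests_properties(root_io.read PROPERTIES
      DEPENDS           root_io.write    # kept for ordering, as before
      FIXTURES_REQUIRED root_io.write    # pulls in the writer with -R, skips the reader if the writer fails
    )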

The main advantage of using fixtures is usability:

  1. ctest -R <regex> will also run the dependencies, unless explicitly ignored (e.g. with -FA .*).
  2. Moreover, when a dependency (fixture) fails, the dependent test is skipped ("Not Run").

CI: Use ctest --repeat instead of retrying jobs

With ctest --repeat until-pass:3, each test is run until it passes, at most 3 times. Tests are retried on any failure (timeout or other).

NB: ideally we should only retry after timeouts, with ctest --repeat after-timeout:3. The problem is that timeouts are handled by our test wrapper, so from the CTest point of view they look like ordinary failures. There is a workaround that would let us keep capturing stack traces and still get the Timeout status, but the caveat is that the execution time would be reported as 0. Cleaner solutions (and the workaround) are discussed at https://gitlab.kitware.com/cmake/cmake/-/issues/17288.

Miscellaneous

  • Fix edge case in QMTest discovery (gaudi.qms/qmt_args.qmt was wrongly contracted to gaudi_args)
  • CI: Do not use deprecated globally-defined variables
  • Looser conversion from QMTest to CTest test name
  • Fail when a QMTest prerequisite test does not exist
Edited by Rosen Matev

Merge request reports


Activity

  • Author Developer

    @clemenci I'm curious what you think regarding handling timeouts.

  • Author Developer

    Some ctest examples when using fixtures.

    All tests pass
    $ ctest -T test -R root_io.coll.read -j 10
       Site: lbquantaperf02
       Build name: Linux-g++
    Create new tag: 20210313-1954 - Experimental
    Test project /home/rmatev/stackx/Gaudi/build.x86_64-centos7-gcc9-opt
        Start 253: GaudiExamples.root_io.write
    1/4 Test #253: GaudiExamples.root_io.write ........   Passed   19.34 sec
        Start 246: GaudiExamples.root_io.coll.write
        Start 252: GaudiExamples.root_io.read
    2/4 Test #246: GaudiExamples.root_io.coll.write ...   Passed    4.58 sec
    3/4 Test #252: GaudiExamples.root_io.read .........   Passed    4.73 sec
        Start 245: GaudiExamples.root_io.coll.read
    4/4 Test #245: GaudiExamples.root_io.coll.read ....   Passed    4.73 sec
    
    100% tests passed, 0 tests failed out of 4
    
    Label Time Summary:
    Gaudi            =  33.38 sec*proc (4 tests)
    GaudiExamples    =  33.38 sec*proc (4 tests)
    QMTest           =  33.38 sec*proc (4 tests)
    
    Total Test time (real) =  28.94 sec

    Ignore the fixtures with `-FA .*` (prerequisites already ran before, so the test passes)
    $ ctest -T test -R root_io.coll.read -j 10 -FA .*
       Site: lbquantaperf02
       Build name: Linux-g++
    Create new tag: 20210313-1956 - Experimental
    Test project /home/rmatev/stackx/Gaudi/build.x86_64-centos7-gcc9-opt
        Start 245: GaudiExamples.root_io.coll.read
    1/1 Test #245: GaudiExamples.root_io.coll.read ...   Passed   16.97 sec
    
    100% tests passed, 0 tests failed out of 1
    
    Label Time Summary:
    Gaudi            =  16.97 sec*proc (1 test)
    GaudiExamples    =  16.97 sec*proc (1 test)
    QMTest           =  16.97 sec*proc (1 test)
    
    Total Test time (real) =  17.00 sec

    Ignore the fixtures but run in a clean directory (the test fails)
    $ ctest -T test -R root_io.coll.read -j 10 -FA .*
       Site: lbquantaperf02
       Build name: Linux-g++
    Create new tag: 20210313-1958 - Experimental
    Test project /home/rmatev/stackx/Gaudi/build.x86_64-centos7-gcc9-opt
        Start 245: GaudiExamples.root_io.coll.read
    1/1 Test #245: GaudiExamples.root_io.coll.read ...***Failed   16.19 sec
    
    0% tests passed, 1 tests failed out of 1
    
    Label Time Summary:
    Gaudi            =  16.19 sec*proc (1 test)
    GaudiExamples    =  16.19 sec*proc (1 test)
    QMTest           =  16.19 sec*proc (1 test)
    
    Total Test time (real) =  16.22 sec
    
    The following tests FAILED:
            245 - GaudiExamples.root_io.coll.read (Failed)
    Errors while running CTest

    A dependency (fixture) fails, so the requested test is not run
    $ ctest -T test -R root_io.coll.read -j 10
       Site: lbquantaperf02
       Build name: Linux-g++
    Create new tag: 20210313-2001 - Experimental
    Test project /home/rmatev/stackx/Gaudi/build.x86_64-centos7-gcc9-opt
        Start 253: GaudiExamples.root_io.write
    1/4 Test #253: GaudiExamples.root_io.write ........   Passed   18.50 sec
        Start 246: GaudiExamples.root_io.coll.write
        Start 252: GaudiExamples.root_io.read
    2/4 Test #252: GaudiExamples.root_io.read .........***Failed    2.97 sec
    3/4 Test #246: GaudiExamples.root_io.coll.write ...   Passed    4.25 sec
        Start 245: GaudiExamples.root_io.coll.read
    Failed test dependencies: GaudiExamples.root_io.read
    4/4 Test #245: GaudiExamples.root_io.coll.read ....***Not Run   0.00 sec
    
    50% tests passed, 2 tests failed out of 4
    
    Label Time Summary:
    Gaudi            =  25.72 sec*proc (4 tests)
    GaudiExamples    =  25.72 sec*proc (4 tests)
    QMTest           =  25.72 sec*proc (4 tests)
    
    Total Test time (real) =  22.82 sec
    
    The following tests FAILED:
            245 - GaudiExamples.root_io.coll.read (Not Run)
            252 - GaudiExamples.root_io.read (Failed)
    Errors while running CTest
  • Rosen Matev added 2 commits

    • 88633a6b - Looser conversion from QMTest to CTest test name
    • d79408b7 - Fail when a QMTest prerequisite test does not exist

  • Rosen Matev changed the description

    • Author Developer
      Resolved by Marco Clemencic

      When extract_qmtest_metadata.py exits with a non-zero return code, we only get a CMake warning. That's inconsistent with the error handling inside the script (exits immediately when a problem is encountered).

      I think we should either have an error from CMake (easier) or have the script continue on problems.
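
      A minimal sketch of the "error from CMake" option (the script arguments and variable names here are hypothetical, not the actual Gaudi code, and Python_EXECUTABLE assumes find_package(Python) has been called):

      # Hypothetical invocation: turn a non-zero exit code of the metadata
      # extraction script into a hard configure-time error instead of a warning.
      execute_process(
        COMMAND ${Python_EXECUTABLE} extract_qmtest_metadata.py ${qmtest_directory}
        OUTPUT_VARIABLE qmtest_metadata
        RESULT_VARIABLE qmtest_metadata_rc
      )
      if(NOT qmtest_metadata_rc EQUAL 0)
        message(FATAL_ERROR "extract_qmtest_metadata.py failed (exit code ${qmtest_metadata_rc})")
      endif()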

    • Resolved by Marco Clemencic

      Let me first comment about the fixtures business.

      When using the actual QMTest scripts, the prerequisites had to pass for the dependent tests to be run. This behaviour was a cause of debate, as some preferred to let the following tests fail on their own. IMHO this view was due to the fact that some of the fixture setup tests we have do not check whether the setup worked; they only check that they managed to print the same messages as in a reference file.

      I prefer the fixture-based approach, but only once the tests have been reviewed and the fixtures are made more reliable. In general I believe it's better if we convert the pseudo-qmtest tests into proper CTest tests, with the freedom to choose whether we want just to control the order of execution or to have proper fixture setup tests. This would also allow us to use CTest to control test timeouts and allow for --repeat after-timeout:3.

      About timeouts, I do not agree that we should repeat only on timeouts. Transient failures could be due to infrastructure problems or to non-(strictly-)reproducible output. Infrastructure problems can present themselves as timeouts or as normal errors. Non-reproducible output issues should not exist, but they are kind of unavoidable as long as we keep looking at the standard output of the jobs. Sometimes we have proper random deadlocks, and I would prefer not to mask them, as they are due to real bugs.

      My opinion is that we should never use --repeat, but there are cases where we cannot do otherwise (at least until we fix the underlying problem), and for those I would prefer we control which tests are allowed to repeat in case of failure.

      Now one more consideration about using CTest to check for timeouts. Our test wrapper is not perfect, but I like the stack trace it produces in case of timeouts a lot, and there are other useful features. CTest is great for unit tests, but our wrapper is tuned for our integration tests, and I'm not sure we are ready to give up some of the features we have.
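
      For reference, a CTest-controlled timeout would be a per-test property; a minimal sketch with a hypothetical test name (not an actual Gaudi test):

      # Hypothetical test: with CTest enforcing the limit, a timed-out run gets
      # the "Timeout" status and would qualify for `ctest --repeat after-timeout:3`.
      add_test(NAME GaudiExamples.some_long_test COMMAND run_some_long_test)
      set_tests_properties(GaudiExamples.some_long_test PROPERTIES TIMEOUT 600)  # seconds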

  • Rosen Matev added 27 commits

    • d79408b7...e4424a93 - 21 commits from branch gaudi:master
    • 199bd2da - Use FIXTURES_REQUIRED for QMTest test dependencies
    • ca75999b - Fix edge case in QMTest discovery
    • 73a7b03e - CI: Do not use deprecated globally-defined variables
    • 3552d3a7 - CI: Use ctest --repeat instead of retrying jobs
    • 685823d3 - Looser conversion from QMTest to CTest test name
    • 6e827b1e - Fail when a QMTest prerequisite test does not exist

  • Marco Clemencic mentioned in issue #177

  • Rosen Matev added 22 commits

    • 6e827b1e...a4ae2988 - 16 commits from branch gaudi:master
    • 08abc192 - Use FIXTURES_REQUIRED for QMTest test dependencies
    • fd4191ad - Fix edge case in QMTest discovery
    • 31ca0b15 - CI: Do not use deprecated globally-defined variables
    • 305d12de - CI: Use ctest --repeat instead of retrying jobs
    • b1dd6122 - Looser conversion from QMTest to CTest test name
    • f25f7143 - Fail when a QMTest prerequisite test does not exist

  • Rosen Matev added 10 commits

    • f25f7143...903647ab - 4 commits from branch gaudi:master
    • 3aa96c87 - Use FIXTURES_REQUIRED for QMTest test dependencies
    • 4ed3f7f2 - Fix edge case in QMTest discovery
    • bd1aa174 - CI: Do not use deprecated globally-defined variables
    • 2bcf0e4f - CI: Use ctest --repeat instead of retrying jobs
    • ef74720b - Looser conversion from QMTest to CTest test name
    • c6476f31 - Fail when a QMTest prerequisite test does not exist

    • Author Developer
      Resolved by Marco Clemencic

      I'd like to move forward with the fixtures change. We constantly have to explain that test B failed because it depends on test A that timed out (as the relationship between the tests is not obvious from the dashboard). Even when one happens to know about the test dependencies, it takes time to interpret test runs with failures.

      /ci-test --merge

  • Edited by Software for LHCb
  • Marco Clemencic changed milestone to %v36r1

  • Marco Clemencic mentioned in issue #199

  • Rosen Matev changed the description

  • Rosen Matev added 100 commits

    • c6476f31...94b59129 - 93 commits from branch gaudi:master
    • b0ab3baf - Use FIXTURES_REQUIRED for QMTest test dependencies
    • f1ced6f0 - Fix edge case in QMTest discovery
    • 3a460c0e - CI: Do not use deprecated globally-defined variables
    • 73e1a36e - CI: Use ctest --repeat instead of retrying jobs
    • d4f126b1 - Looser conversion from QMTest to CTest test name
    • 51a1ddc9 - Fail when a QMTest prerequisite test does not exist
    • ac5e0863 - Fail CMake configuration when extract_qmtest_metadata.py fails

  • Marco Clemencic resolved all threads

  • Marco Clemencic approved this merge request

  • Marco Clemencic mentioned in commit 348b595e

  • mentioned in issue #120 (closed)

  • Rosen Matev mentioned in issue #216

  • Rosen Matev mentioned in merge request !1414 (closed)
