
Write versions to output directory, check git status (and enforce a policy) before running

Pieter David requested to merge piedavid/bamboo:checkandstoregitversions into master

A few changes combined:

  • use setuptools_scm to track the exact bamboo version (plus a bit of setuptools maintenance: most options can go in setup.cfg, which now seems to be the recommended approach since it is a bit simpler)
  • write a version.yml file to the output directory. When running one of the bamboo examples, it looks like the listing that follows. PyYAML sorts the keys alphabetically, which somewhat hides that the version key is the most important one: the rest is additional information, mostly about whether the commit/tag has been pushed, and where. This could be reduced, but may be useful to have; when comparing versions, only the version string is used.
bambooRun_args:
- --module=examples/nanozmumu.py:NanoZMuMu
- examples/test1.yml
- --plotIt=plotIt
- --backend=dataframe
bamboo_version: 0.1.0b4.dev2+g513249f
config_version: &id001
  git_common: /home/users/p/d/pdavid/bamboodev/bamboo/.git
  is_dirty: false
  remote_branches:
  - origin/checkandstoregitversions
  remotes:
    origin:
      url: ssh://git@gitlab.cern.ch:7999/piedavid/bamboo.git
      url_push: ssh://git@gitlab.cern.ch:7999/piedavid/bamboo.git
    upstream:
      url: ssh://git@gitlab.cern.ch:7999/cp3-cms/bamboo.git
      url_push: ssh://git@gitlab.cern.ch:7999/cp3-cms/bamboo.git
  sha1: 261e77c
  tag: v0.1.0b3
  tag_remotes:
  - upstream
  untracked_files: []
  version: v0.1.0b3-7-g261e77c
module_version: *id001
  • added a --git-policy option to bambooRun (and a policy setting in the [git] section of bamboorc) to specify how picky to be. With testing, the version information is still retrieved, but the run proceeds regardless of the outcome; this is the default, so enforcing a policy is opt-in.
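
As a rough illustration of how such a policy could be applied to the status fields shown in the version.yml listing, here is a minimal sketch. Only the testing policy is named in this merge request; the committed and pushed policy names, the check_git_policy function, and the exact rules are assumptions made for the example and may differ from what bamboo actually does.

```python
def check_git_policy(policy, status):
    """Return a list of problems; an empty list means the policy is satisfied.

    `status` is a dict with keys as written to version.yml
    (is_dirty, remote_branches). Only "testing" comes from the merge
    request; "committed" and "pushed" are hypothetical policy names.
    """
    if policy == "testing":
        # Version information is still collected, but never blocks the run.
        return []
    if policy not in ("committed", "pushed"):
        return [f"unknown policy {policy!r}"]
    problems = []
    # A missing is_dirty key is treated as dirty, to stay on the safe side.
    if status.get("is_dirty", True):
        problems.append("working tree has uncommitted changes")
    if policy == "pushed" and not status.get("remote_branches"):
        problems.append("commit is not on any remote branch")
    return problems


# Status copied from the config_version entry of the example version.yml.
example = {
    "is_dirty": False,
    "remote_branches": ["origin/checkandstoregitversions"],
}
print(check_git_policy("pushed", example))
print(check_git_policy("committed", dict(example, is_dirty=True)))
```

Returning a list of problems rather than a boolean makes it easy to report all reasons for a rejected run at once.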

Still to do:

  • improve handling of untracked config files and modules. The current code assumes they are in a tracked directory, or otherwise in a package; if a config file or module is in the list of untracked files, the status is always "dirty", so only --git-policy=testing will pass; other untracked files are listed, but ignored for the version information. Done (18/05).
  • documentation, including some recommendations for analysis packages, e.g. git worktree and an editable install for the common modules (the version.yml above intentionally does not include the worktree; the module and config paths are relative to the repository root). Done (18/05).
  • decide what to do when overwriting results (not --onlypost, but really overwriting an output directory). Currently the versions file is left alone, but overwriting it too may be better. Done (18/05): overwrite, but add a flag.
  • should worker jobs also check for changes? This may become a mess, so I left it out for now, but it could be done (maybe it would benefit from a "quick version check" that only runs git describe to get the version). Done: if the policy is different from testing, worker jobs check that the version is equal.
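
The "quick version check" for worker jobs could look roughly like the sketch below. The function names, the git describe flags, and the committed policy name are assumptions for illustration; the merge request only states that git describe is used and that the versions must be equal unless the policy is testing.

```python
import subprocess


def quick_version(repo_dir, run=subprocess.check_output):
    """Return the `git describe` string for repo_dir.

    `run` is injectable so the function can be tried without a repository.
    The flags are an assumption, not necessarily what bamboo passes.
    """
    out = run(["git", "describe", "--tags", "--always", "--dirty"], cwd=repo_dir)
    return out.decode().strip()


def worker_may_proceed(policy, recorded_version, current_version):
    """With the testing policy nothing is enforced; otherwise versions must match."""
    if policy == "testing":
        return True
    return recorded_version == current_version


def fake_git(cmd, cwd=None):
    # Stand-in for git, returning the version string from the example version.yml.
    return b"v0.1.0b3-7-g261e77c\n"


current = quick_version(".", run=fake_git)
print(worker_may_proceed("committed", "v0.1.0b3-7-g261e77c", current))
```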

On the installed git versions: CentOS7 comes with 1.8.3.1 (which turns eight years old next month), but LCG_99 and higher include a recent one (2.29.2), so that is recommended. I still added support for versions before 2.7 (which lack git remote get-url) and before 2.5 (which lack git rev-parse --git-common-dir).
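
One way to handle the older git versions is to pick the command based on the parsed output of git --version, as in this sketch. The helper names and the fallback commands (git config --get remote.<name>.url and git rev-parse --git-dir) are assumptions about how such a fallback could be written, not necessarily the ones used in this merge request.

```python
import re


def parse_git_version(text):
    """Turn the output of `git --version` into a tuple of integers."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"cannot parse git version from {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def remote_url_command(version, remote):
    """Command that prints the URL of `remote`."""
    if version >= (2, 7):
        return ["git", "remote", "get-url", remote]
    # Assumed fallback for git before 2.7: read the URL from the config.
    return ["git", "config", "--get", f"remote.{remote}.url"]


def common_dir_command(version):
    """Command that prints the git directory shared between worktrees."""
    if version >= (2, 5):
        return ["git", "rev-parse", "--git-common-dir"]
    # Assumed fallback: git before 2.5 has no linked worktrees,
    # so the plain git directory serves the same purpose.
    return ["git", "rev-parse", "--git-dir"]


old = parse_git_version("git version 1.8.3.1")
new = parse_git_version("git version 2.29.2")
print(remote_url_command(old, "origin"))
print(common_dir_command(new))
```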

In principle, if the analysis code is committed, version.yml should contain enough information to create a virtualenv that allows rerunning the exact version that was used to create an output directory (assuming the ROOT, python, and python dependency versions do not affect the result; should those also be recorded?). If this is found useful I could add a script for it (it could also be left as a future extension).
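
A first step for such a script would be recovering the commit to check out from the recorded version string. The sketch below splits a git describe style string, like the version entry in the example version.yml, into its parts; parse_describe is a hypothetical helper, and creating the virtualenv itself is left out.

```python
import re


def parse_describe(version):
    """Split a `git describe` string into tag, distance, sha1 and dirty flag."""
    dirty = version.endswith("-dirty")
    if dirty:
        version = version[: -len("-dirty")]
    match = re.match(r"^(?P<tag>.+)-(?P<distance>\d+)-g(?P<sha1>[0-9a-f]+)$", version)
    if match is None:
        # Exactly on a tag: no distance or hash suffix to split off.
        return {"tag": version, "distance": 0, "sha1": None, "dirty": dirty}
    return {
        "tag": match.group("tag"),
        "distance": int(match.group("distance")),
        "sha1": match.group("sha1"),
        "dirty": dirty,
    }


# Version string from the config_version entry of the example version.yml.
print(parse_describe("v0.1.0b3-7-g261e77c"))
```

A result with dirty set to True would signal that the output cannot be reproduced from the repository alone.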

Edited by Pieter David
