Automated tagging of grid jobs
For reproducibility, it is useful to create a tag every time a significant production of training datasets is run. This adds a CLI for submit-single-btag with the following args:
-h: get help
-c: path to the config file to use
-t: tag the current commit using the supplied string
-d: perform a dry run (no submit)
-f: force submit even if uncommitted changes exist
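As a rough illustration of the interface listed above, here is a minimal Python sketch of an equivalent argument parser. It is not the submit script itself: the destination names (config, tag, dry_run, force) and the example config path are hypothetical, and only the flags and their meanings come from the list above.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags listed above (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="submit-single-btag",
        description="Submit a grid job, optionally tagging the current commit.",
    )
    # -h is provided automatically by argparse.
    parser.add_argument("-c", dest="config", metavar="CONFIG", default=None,
                        help="path to the config file to use")
    parser.add_argument("-t", dest="tag", metavar="TAG", default=None,
                        help="tag the current commit using the supplied string")
    parser.add_argument("-d", dest="dry_run", action="store_true",
                        help="perform a dry run (no submit)")
    parser.add_argument("-f", dest="force", action="store_true",
                        help="force submit even if uncommitted changes exist")
    return parser


# Example invocation with a hypothetical config path.
args = build_parser().parse_args(["-c", "configs/example.json", "-d"])
print(args.config, args.dry_run, args.force)
```

Keeping -d and -f as plain boolean switches means a bare invocation submits normally and refuses to proceed on a dirty working tree only where the script enforces it.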
A few convenience options, like overwriting the default config with -c or adding a readable description to the job output name with -d, are provided. If the user specifies that they want the job to be tagged with -t, the script looks for a tag associated with HEAD. If one isn't found, a tag is created using the current date and the most recent commit hash. By default, the script does not allow the user to run -t if uncommitted changes exist in ./training-dataset-dumper; passing -f overrides this check.
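The tagging behaviour described above can be sketched in Python as follows. This is an assumption-laden illustration, not the script's actual code: the function names are hypothetical, the tag format (ISO date followed by the short hash) is a guess at "current date and most recent commit hash", and how the string supplied to -t enters the tag name is not specified here, so it is left out. The git commands used (status --porcelain, tag --points-at, rev-parse --short) are standard.

```python
import subprocess
from datetime import date


def git(*args, cwd="."):
    """Run a git command in cwd and return its stripped stdout."""
    result = subprocess.run(["git", *args], cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


def has_uncommitted_changes(repo):
    # --porcelain prints one line per modified or untracked file.
    return bool(git("status", "--porcelain", cwd=repo))


def tags_at_head(repo):
    """Return the names of all tags pointing at HEAD (possibly empty)."""
    return git("tag", "--points-at", "HEAD", cwd=repo).splitlines()


def default_tag_name(today, short_hash):
    # Assumed format: <YYYY-MM-DD>-<short hash>.
    return f"{today.isoformat()}-{short_hash}"


def choose_tag(existing_tags, today, short_hash):
    """Reuse a tag already on HEAD, otherwise propose a new one.

    Returns (tag_name, is_new).
    """
    if existing_tags:
        return existing_tags[0], False
    return default_tag_name(today, short_hash), True


def tag_head(repo, force=False):
    """Tag HEAD of repo unless it is dirty and force is not set."""
    if has_uncommitted_changes(repo) and not force:
        raise RuntimeError("uncommitted changes exist; commit them or use -f")
    short_hash = git("rev-parse", "--short", "HEAD", cwd=repo)
    name, is_new = choose_tag(tags_at_head(repo), date.today(), short_hash)
    if is_new:
        git("tag", name, cwd=repo)
    return name
```

Separating choose_tag from the git calls keeps the decision logic testable without a repository; tag_head would be called with the path to ./training-dataset-dumper.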
Things to improve:
- It would be nice to extend this functionality to the other submit scripts too. @dguest has mentioned creating a wrapper job submission script, so the CLI and tagging functionality could move there once that is done.