Grid output dataset names
This is:
- A discussion on whether the names we give output datasets are optimal
- A ticket we can close once these names are documented.
On the first part, the general principle is that we save all the information needed to reproduce a dataset exactly. This is why we strip out the "human readable" simulation description in favor of e.g. a lot of production tags.
I would also prefer to have some kind of hash associated with the current state of the dumper when jobs are submitted, so that at least in theory all of the jobs can be reproduced exactly. We lose this information when branches are squashed into r22, but to me it seems worthwhile to at least keep it in the faint hope that the original branch might be preserved.
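
A minimal sketch of how such a hash could be attached at submission time, assuming the dumper is submitted from a git checkout. The function names `dumper_commit_hash` and `with_state_tag` are hypothetical, and appending the hash as a trailing dot-separated field is only one possible placement, not what the submit script currently does:

```python
import subprocess


def dumper_commit_hash(repo_dir="."):
    """Return the abbreviated hash of the commit checked out in repo_dir."""
    # Plain git call; raises CalledProcessError if repo_dir is not a git repo.
    result = subprocess.run(
        ["git", "rev-parse", "--short", "HEAD"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def with_state_tag(output_name, tag):
    """Append tag to output_name as an extra dot-separated field (hypothetical layout)."""
    # Assumption: fields in the name are dot-separated, so a tag containing a
    # dot would be ambiguous when the name is split back into fields.
    if "." in tag:
        raise ValueError(f"tag must not contain '.': {tag!r}")
    return f"{output_name}.{tag}"
```

At submission this would be called as `with_state_tag(name, dumper_commit_hash())`. Note that the hash only identifies a commit: uncommitted local changes are not captured, and the commit is only recoverable while some branch or tag still points at it.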
On the second part, we should not include an exhaustive explanation of how to e.g. look up a hash in GitLab (we should just direct people to the git / GitLab docs), but a short mapping between the fields in the output name and GitLab / AMI / the ATLAS docs on dataset naming conventions would help.
Maybe @svanstro, @pgadow, @thuffman, or @mguth have opinions here.