Improve release cycle to allow for faster releases
As discussed, we should aim to do more frequent releases of CTA. To do this, we should have a look at the full process of creating a new CTA release and what we can do there to automate things. This ticket is meant as a general overview of the release process and the various improvements to be made to it to ensure the effort to create a new release is minimal.
Currently, the process is as follows:
- Update the changelog
  - Manually trigger the `update-changelog` job in the `main` branch
  - Manually review the entries, fix mistakes and merge the MR created by the job
- Create a tag
  - Determine what the new version number should be
  - Manually create a tag
- Run the stress test on the image created by the tag
  - SSH into the stress test machine
  - Run the instructions on https://gitlab.cern.ch/cta/sandbox/ci_monitoring/-/tree/master?ref_type=heads
  - Wait for the stress test to complete
  - Compare monitoring charts
- Publish the RPMs to a repo
  - Trigger the release job in the tag pipeline
- Create a GitLab release
- Create a community forum post
Overall there are still a lot of time-consuming steps that prevent us from making these releases more frequent. Ideally, we should first aim to automate this as much as possible. The person doing the release should (for now) only have to do:
- Manually trigger the changelog update job
- Review the changelog. There are some improvements to be made here to make this check faster. E.g. checks on each MR to ensure the commits follow the correct format
- Determine the new version number. It would be good to have another commit trailer saying something like `version: major/minor/patch/packaging`. The new release version can then be automatically determined by these trailers and the developer only has to look at the latest changelog entry header.
- Create a tag. This should automatically trigger the stress test and determine whether the performance is good enough. This requires additional automation; we could have e.g. a stress-test job in the CI and have this be triggered on tagged pipelines and automatically compare performance against baseline somehow (this is a bit tricky as we currently compare graphs; TBD). If the test passes, it should automatically push to the repo and create a GitLab release
- Create a community forum post. Perhaps this step can also be automated, but that has lower priority as the benefit might not justify the effort
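
To make the trailer idea more concrete, here is a minimal sketch of how the new version could be derived from the commits since the last release. It assumes a four-component `MAJOR.MINOR.PATCH.PACKAGING` version string and that lower components reset to 0 on a bump; both are assumptions for illustration and should be adjusted to our actual versioning rules. The function names are hypothetical, and for simplicity any line of the form `version: <level>` in a commit message is treated as a trailer.

```python
import re

# Bump levels in order of precedence, highest first (taken from the proposed trailer).
LEVELS = ["major", "minor", "patch", "packaging"]

# Simplification: matches any line of the form "version: <level>", not only real trailers.
TRAILER = re.compile(r"^version:[ \t]*(\w+)[ \t]*$", re.IGNORECASE | re.MULTILINE)


def highest_bump(commit_messages):
    """Return the most significant bump level found in the commit messages, or None."""
    found = set()
    for message in commit_messages:
        for value in TRAILER.findall(message):
            value = value.lower()
            if value not in LEVELS:
                raise ValueError(f"unknown version trailer: {value!r}")
            found.add(value)
    for level in LEVELS:
        if level in found:
            return level
    return None


def next_version(current, bump):
    """Increment the component named by `bump` and reset the lower ones to 0 (assumed rule)."""
    parts = [int(p) for p in current.split(".")]
    if len(parts) != len(LEVELS):
        raise ValueError(f"expected {len(LEVELS)} version components, got {current!r}")
    index = LEVELS.index(bump)
    parts[index] += 1
    for i in range(index + 1, len(parts)):
        parts[i] = 0
    return ".".join(str(p) for p in parts)
```

Calling `highest_bump` on the commit messages since the last tag and passing the result to `next_version` gives the proposed version. In a real job, `git interpret-trailers --parse` would be a more robust way to extract the trailers than the regular expression used here.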
Broadly, we can divide this work into three parts:
- Improve changelog generation
  - Make improvements based on comments in last release: #996
  - Add automated checks to commits before merging to ensure commit template correctness
  - Add commit trailer to automatically determine new CTA version number. Add clear guidelines on when to use which trailer
  - Make the MR for the changelog update extremely quick so that it doesn't have to run the full pipeline
  - It would be nice if this can be included in the tag pipeline, but I don't really see how as it requires adding another commit
- Automate stress test running
  - Register stress test as ci-runner with custom tag
  - Add stress-test job to CI and automatically trigger this on tag pipeline
  - Add automated check to see if performance was good enough
- Automate release publishing
  - Add new unstable repo
  - Automatically push to unstable repo if stress-test passes
  - Automatically create a GitLab release entry (mention unstable)
  - Generate template for community forum post. If possible, do this automatically via the API: https://docs.discourse.org/#tag/Posts/operation/createTopicPostPM
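
For the forum post template, a minimal sketch could look like the following. The function name, the wording of the post and the `repo_name` default are all hypothetical. The `title` and `raw` keys are meant to mirror the fields of the Discourse endpoint linked above, but this should be checked against the API docs before wiring it up; nothing is sent here, the function only builds the payload.

```python
def forum_post(version, changelog_entries, repo_name="unstable"):
    """Build the title and body for a release announcement (wording is a placeholder)."""
    lines = [
        f"CTA {version} has been published to the {repo_name} repository.",
        "",
        "Changes in this release:",
        "",
    ]
    # One bullet per changelog entry of the new release.
    lines += [f"- {entry}" for entry in changelog_entries]
    return {"title": f"CTA {version} released", "raw": "\n".join(lines)}
```

The changelog entries would come from the latest section of the reviewed changelog, so the post needs no extra input from the developer.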
The goal is to have all of this happen with minimal input from a developer. If the above works, the developer only has to trigger the changelog job, review the changelog and create a tag. This would take less than 5 minutes of developer time, meaning we can do more frequent releases.
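
The automated performance check is the part that most needs a concrete design, since the publishing steps are gated on it. As a starting point, and assuming the stress test can export a few numeric metrics instead of only graphs, the comparison against a baseline could be as simple as the sketch below. The metric names, the 10% tolerance and the "higher is better" assumption are all placeholders, not measured values or agreed thresholds.

```python
def performance_regressions(baseline, measured, tolerance=0.10):
    """Return metrics that are missing or more than `tolerance` below their baseline.

    Assumes every metric is "higher is better" (e.g. a throughput).
    """
    regressions = {}
    for name, reference in baseline.items():
        value = measured.get(name)
        if value is None or value < reference * (1 - tolerance):
            regressions[name] = (reference, value)
    return regressions
```

The stress-test job would fail if the returned dictionary is non-empty, for example for a hypothetical baseline of `{"archive_files_per_s": 1000.0}` and a measured value of `850.0`. Metrics where lower is better (latencies, queue times) would need the comparison inverted.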