Prepare the repository for an Apache 2 license
This MR fixes a few issues with the code, including switching to the previous "full year" dataset instead of the partial 2021 one (which seems to be removed every month).
More importantly, it also adds the Apache 2 license, as agreed for the CARA project (of which this repository is an essential standalone part).