Generating a Software Bill of Materials for CTA
What is a Software Bill of Materials (SBOM) and why do we care?
An SBOM is a machine-readable list of all software components, dependencies, and their versions in a project. It helps with vulnerability management, license tracking, and supply chain transparency.
Particularly useful for us: GitLab can use SBOMs to detect known CVEs during CI/CD.
What does an SBOM file look like?
There are two common formats:
- CycloneDX (JSON/XML)
- SPDX
GitLab supports CycloneDX, which also appears to be the simpler of the two formats.
For an example, see the SBOM of cern-lhc-vdm-editor.
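To give a feel for the format, here is a minimal sketch that builds a CycloneDX JSON document in Python. The helper name `minimal_cyclonedx`, the example component and the `pkg:generic` package URL are illustrative assumptions, not taken from any real project; a real SBOM usually carries more metadata (licenses, hashes, supplier information).

```python
import json


def minimal_cyclonedx(components):
    """Build a minimal CycloneDX-style document from (name, version) pairs.

    Hypothetical helper for illustration only.
    """
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": [
            {
                "type": "library",
                "name": name,
                "version": version,
                # Package URL; "generic" is a placeholder type chosen here
                # because no package manager is involved.
                "purl": f"pkg:generic/{name}@{version}",
            }
            for name, version in components
        ],
    }


if __name__ == "__main__":
    # Example component chosen arbitrarily.
    print(json.dumps(minimal_cyclonedx([("zlib", "1.3.1")]), indent=2))
```

Whether a vulnerability scanner can match a `pkg:generic` identifier against its advisory database would need to be checked for the scanner in question.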
Generating an SBOM file for C++ projects
There is a lot of automatic tooling that generates SBOM files for projects using one of the various package managers. See e.g. this list for what GitLab supports.
The problem: we don't use a package manager. Many other C++ projects have exactly the same issue, so we are not alone.
Tooling support is very limited:
- The most widely used tools (Trivy, Syft) do not support C++ without a package manager
- Other tools (cmake-sbom, it-depends) are not widely used and/or lack features
- One recommended tool (RunSafe) is paid
Possible options
- cve-bin-tool
  - Scans the binary and can generate an SBOM file in a given format
  - Actively maintained by Intel: https://github.com/intel/cve-bin-tool
  - Seems to have limited dependency recognition. Most likely won't recognise header-only dependencies
  - We should try this and see what it produces. Perhaps we can use this and augment it with additional information
- Generate our own manually based on the project.json
  - Header-only dependencies will still need to be added manually
  - Might lead to false positives (e.g. unused dependencies still listed in the project.json)
  - Difficult to decide what to include (run vs. build dependencies, run dependencies only available in the spec file)
  - Most solutions analyse the source files, the build process or the produced binary, so this approach would be rather unusual. Since many other C++ projects at CERN will have similar issues, a solution tied to our project.json might not be ideal.
  - Will be challenging to ensure that we miss no dependencies, include no superfluous ones, and have sufficient data for each dependency
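As a rough sketch of the second option, the snippet below turns a manifest into CycloneDX-style component entries. The manifest layout is an assumption made purely for illustration (a `dependencies` list with `name`, `version` and an optional `scope`); the real project.json may look quite different, and `components_from_manifest` is a hypothetical helper.

```python
import json


def components_from_manifest(manifest, scopes=("run",), extra=()):
    """Turn a manifest dict into CycloneDX-style component dicts.

    ASSUMPTION: the manifest has a "dependencies" list whose entries carry
    "name", "version" and an optional "scope" ("build" or "run").
    The real project.json layout may differ.
    """
    components = []
    for dep in manifest.get("dependencies", []):
        # Entries without a scope are treated as run dependencies here.
        if dep.get("scope", "run") not in scopes:
            continue
        components.append(
            {"type": "library", "name": dep["name"], "version": dep["version"]}
        )
    # Header-only libraries have to be supplied by hand via `extra`.
    for name, version in extra:
        components.append({"type": "library", "name": name, "version": version})
    return components


if __name__ == "__main__":
    # Made-up manifest content, for illustration only.
    manifest = json.loads("""
    {"dependencies": [
        {"name": "zlib", "version": "1.3.1", "scope": "run"},
        {"name": "googletest", "version": "1.14.0", "scope": "build"}
    ]}
    """)
    print(components_from_manifest(manifest, extra=[("nlohmann-json", "3.11.3")]))
```

The `scopes` parameter makes the run vs. build decision explicit rather than hiding it, which is one of the open questions listed above. Unused entries in the manifest would still end up in the output, so the false-positive concern remains.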
The best solution will probably be a combination of the two (assuming cve-bin-tool works decently), but the situation is not ideal.
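Combining the two sources could look roughly like the sketch below: components found by a scanner are merged with a manually maintained list, keeping one entry per name and version. `merge_components` is a hypothetical helper, and it assumes both lists use the same component names; real scanner output may name components differently and need normalising first.

```python
def merge_components(scanned, manual):
    """Merge two component lists, keeping one entry per (name, version).

    Scanned entries win on conflict; manual entries fill the gaps.
    ASSUMPTION: names from both sources are comparable case-insensitively.
    """
    merged = {}
    for component in list(scanned) + list(manual):
        key = (component["name"].lower(), component.get("version", ""))
        # setdefault keeps the first entry seen, i.e. the scanned one.
        merged.setdefault(key, component)
    # Sort for a stable, diff-friendly output.
    return sorted(
        merged.values(),
        key=lambda c: (c["name"].lower(), c.get("version", "")),
    )


if __name__ == "__main__":
    # Made-up inputs, for illustration only.
    scanned = [{"type": "library", "name": "zlib", "version": "1.3.1"}]
    manual = [
        {"type": "library", "name": "Zlib", "version": "1.3.1"},
        {"type": "library", "name": "eigen", "version": "3.4.0"},
    ]
    for component in merge_components(scanned, manual):
        print(component["name"], component["version"])
```

Keying on name and version means two different versions of the same library are kept as separate entries, which seems the safer default for vulnerability matching.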
We are not the only ones with this problem, so ideally, we find a solution that (mostly) works for other C++ projects in the department as well.