ETL Worker
Docker image for a Dashboard ETL worker. It contains software performing NXCALS extractions and ETL transformations (using SciPy, Pandas, etc.).
How to use
The entrypoint triggers an mvn call. Mount your Python script folder as a volume at /work and start the Docker image to run PySpark. You can pass arguments directly on the command line:
docker run -ti --rm -v `pwd`:/work gitlab-registry.cern.ch/industrial-controls/services/dash/worker:latest my-script.py
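If your script accepts its own parameters, they can be appended after the script name; assuming the entrypoint forwards trailing arguments to the script, a call could look like this (the --start and --end flags are hypothetical):

docker run -ti --rm -v `pwd`:/work gitlab-registry.cern.ch/industrial-controls/services/dash/worker:latest my-script.py --start 2024-01-01 --end 2024-01-02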
You can also mount /opt/nxcals-spark/work as a persistent volume if you wish to collect the output of your run.
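For instance, to collect that output into a local directory (the directory name is illustrative):

mkdir -p output
docker run -ti --rm -v `pwd`:/work -v `pwd`/output:/opt/nxcals-spark/work gitlab-registry.cern.ch/industrial-controls/services/dash/worker:latest my-script.py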
How to run with NXCALS authentication
- Generate a keytab with:
cern-get-keytab --user --keytab nxcals.keytab
- Provide InfluxDB connectivity environment variables (see the example after this list)
- Provide parameters to your extraction script
- Run:
docker run -e KPRINCIPAL=$USER -v `pwd`/nxcals.keytab:/auth/private.keytab -v `pwd`/myscript.py:/opt/nxcals-spark/work/script.py etlworker
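Putting the steps together, a full invocation could look like the sketch below. The InfluxDB variable names and the trailing script parameter are illustrative assumptions: the exact variable names the image expects, and whether trailing arguments are forwarded to the script, are not documented here.

docker run -e KPRINCIPAL=$USER \
  -e INFLUXDB_HOST=influxdb.example.cern.ch -e INFLUXDB_PORT=8086 \
  -v `pwd`/nxcals.keytab:/auth/private.keytab \
  -v `pwd`/myscript.py:/opt/nxcals-spark/work/script.py \
  etlworker --start 2024-01-01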
How to release
First, start a gitflow release branch and update the version to a non-SNAPSHOT:
export NEW_VERSION=<new version>
git flow release start $NEW_VERSION
mvn versions:set -DnewVersion=$NEW_VERSION
git commit -a -m "Preparing version $NEW_VERSION"
Then, refine the release as needed. When you are ready:
git flow release finish $NEW_VERSION
git push --tags origin
The release will be automatically deployed by GitLab CI.
Once back on the develop branch, update the version and push:
mvn versions:set -DnewVersion=<new SNAPSHOT version>
git commit -a -m "Preparing next SNAPSHOT" && git push
How to build manually
docker build --build-arg FROM="cern/cc8-base" \
--build-arg SPARK_NXCALS_URL="http://photons-resources.cern.ch/downloads/nxcals_testbed/spark/spark-nxcals.zip" \
-t etlworker .
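The resulting image can then be exercised locally in the same way as the published one (the script name is illustrative):

docker run -ti --rm -v `pwd`:/work etlworker my-script.py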