We have four main types of DAGs representing the three steps in the benchmark process:
1. batch_4_always_on / shared_4_always_on: one additional kind of pipeline that combines the previous two steps (ETL + production of scores in MONIT) and runs continuously.
## Our Operators know nothing about the AD algorithms!
We exploit the **BashOperator** to run a Docker container, generally based on the **sparknotebook** image available in the GitLab registry of this same project.
All the intelligence implemented in the **adcern lib** runs in the *sparknotebook* container. This choice allows the same processes to run independently of Airflow: pipelines can therefore be implemented and tested outside Airflow, in particular in Jupyter notebooks, provided the notebook starts from the *sparknotebook* image. An example of this approach is provided in the CI [test](../../../tests/adcern/integration) of adcern.
The reason why we don't use the [official DockerOperator](https://airflow.apache.org/docs/stable/_api/airflow/operators/docker_operator/index.html) of Airflow, and prefer to pass the *docker run* command to the BashOperator via a [script](./scripts/anomaly_detection_tasks.sh), is a limitation of the DockerOperator:<br>
the **--log-driver** Docker option is not supported. We use the log driver to configure the **fluentd logging** service that collects the anomaly detection scores, and by composing the *docker run* command ourselves we can set this option directly. Every Docker container logs its output to the [Fluentd daemon](https://docs.docker.com/config/containers/logging/fluentd/), so that results can then be redirected wherever you want (e.g. the Elasticsearch of the CERN MONIT infrastructure).
In order to authenticate to the **CERN Spark cluster** and to **EOS**, the Kerberos ticket cache is propagated from the Airflow docker-compose to the running *sparknotebook* container.
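To make the mechanism concrete, the sketch below shows a BashOperator composing its own *docker run* command with the fluentd log driver and the forwarded ticket cache. It is only an illustration: the DAG name, image tag, fluentd address, ticket-cache path and the command executed inside the container are assumptions, not the actual content of [anomaly_detection_tasks.sh](./scripts/anomaly_detection_tasks.sh).
```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

with DAG("ad_sketch", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
    t_produce_scores = BashOperator(
        task_id="produce_scores",
        bash_command=(
            "docker run --rm "
            "--log-driver=fluentd "                       # the option the DockerOperator cannot set
            "--log-opt fluentd-address=localhost:24224 "  # assumption: address of the fluentd daemon
            "-v /tmp/krb5cc:/tmp/krb5cc "                 # assumption: forward the Kerberos ticket cache
            "-e KRB5CCNAME=/tmp/krb5cc "
            "sparknotebook:latest "                       # assumption: image name and tag
            "python /work/produce_scores.py"              # hypothetical entry point inside the container
        ),
    )
```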
## How to build a simple task for our DAG
Whenever you want to add a new task to your DAG that executes a python function in a container, you need to go through the following steps (a combined, hypothetical sketch of the three pieces is given right after this list):
1. create a python script in the **LOCAL_SCRIPTS_AREA**. We recommend creating a single script in this area for each group of functionalities (e.g. one python script grouping all ETL related functions, one for all analysis related functions and one for all plotting related functions). Make sure it is a CLI command.
```python
# EXAMPLE OF MINIMAL CLI FILE (see the combined sketch after this list)
```
1. create the new function and decorate it so that the Click library can do the rest to handle its arguments
```python
# EXAMPLE OF FUNCTION (see the combined sketch after this list)
```
1. create a new task in your DAG
```python
# EXAMPLE OF NEW DAG TASK (see the combined sketch after this list)
```
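For reference, here is a minimal, hypothetical sketch of the three pieces above. The script name `my_functions.py`, the command `transform_data`, the `--resource-file` option, the image tag and the mount paths are illustrative assumptions, not the actual adcern conventions.
```python
# my_functions.py -- hypothetical CLI file placed in the LOCAL_SCRIPTS_AREA
import click


@click.group()
def cli():
    """Entry point grouping the ETL / analysis / plotting commands of this script."""


@cli.command("transform_data")
@click.option("--resource-file", required=True,
              help="Path to the YAML file describing this step.")
def transform_data(resource_file):
    """Hypothetical ETL step driven by a resource file."""
    click.echo("processing %s" % resource_file)


if __name__ == "__main__":
    cli()
```
The corresponding DAG task would then invoke that command inside the container, for example through the BashOperator:
```python
# Hypothetical DAG task calling the CLI command inside the sparknotebook container.
from airflow.operators.bash_operator import BashOperator

t_transform_data = BashOperator(
    task_id="transform_data",
    bash_command=(
        "docker run --rm "
        "-v $LOCAL_SCRIPTS_AREA:/scripts "         # assumption: scripts mounted into the container
        "sparknotebook:latest "                    # assumption: image name and tag
        "python /scripts/my_functions.py transform_data "
        "--resource-file /scripts/transform.yaml"  # assumption: config file shipped with the scripts
    ),
    dag=dag,  # the DAG object already defined in your DAG file
)
```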
## Advanced topics
1. How to use Spark
1. How to pass a dictionary as an argument (a sketch is given below)
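These topics are not expanded here; as a minimal illustration of the second one (command and option names are hypothetical), a dictionary can be passed to a Click command by JSON-encoding it on the command line and decoding it inside the function:
```python
import json

import click


@click.command()
@click.option("--hyperparams", default="{}",
              help='JSON-encoded dictionary, e.g. {"n_estimators":100} (no spaces, see the limitation below).')
def train(hyperparams):
    """Hypothetical command receiving a dictionary as a JSON string."""
    params = json.loads(hyperparams)  # back to a python dict
    click.echo(params)


if __name__ == "__main__":
    train()
```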
## Limitations
You cannot pass string arguments containing spaces: the shell splits the value and the Click library interprets the pieces as separate command arguments.
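As a concrete (hypothetical) example of what goes wrong:
```python
# The unquoted space splits the value, so Click sees "--title my" plus a stray token "experiment".
bad_command = "python /scripts/my_functions.py transform_data --title my experiment"

# Safe: the value contains no spaces.
good_command = "python /scripts/my_functions.py transform_data --title my_experiment"
```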
# Run on Kubernetes
The steps to run an algorithm training on the Kubernetes (K8s) cluster are described here.
1. Set up the K8s cluster, following the documentation reported [here](../k8s/README.md).
2. Tell Airflow how to reach the cluster via the configuration file (under the hood it uses the [kubectl](https://kubernetes.io/docs/reference/kubectl/overview/) command), so that the containers that run Airflow (scheduler, etc.) can access the cluster.
<br> In the **secret.sh** file of Airflow, which contains all the credentials, insert this environment variable:
```
export KUBECONFIG=<absolute_path_where_you_created_the_config_file>/config
```
3. Use Pod operators<br>
In your DAG you will then be able to run all the Pod Operators you want (remember to restart your docker-compose so that Airflow picks up the new secrets).<br>
- Example of pod operator in your DAG:
```python
t_bash_ls = KubernetesPodOperator(namespace='default',
image="python:3.6-stretch",
# ... (remaining arguments not shown here)
)
```
- Example of pod operator in your DAG using EOS.
Note that you need to install the Helm chart developed by Spyros and Ricardo, available here: https://gitlab.cern.ch/helm/charts/cern/-/tree/master/eosxd.
```python
t_eos_ls = KubernetesPodOperator(namespace='default',
# ... (remaining arguments not shown here)
)
```
4. Run EOS in K8s
To use EOS in K8s you need to be authenticated via kinit (Kerberos).
The proper way to do that in K8s is to use secrets (a sketch of how a task can then consume the secret is given after the steps).
<br>
Follow these steps:
1. Install the following Helm chart on the K8s cluster so that every container in the cluster can access the EOS folders it is allowed to see:
https://gitlab.cern.ch/helm/charts/cern/-/tree/master/eosxd
1. Create a secret definition in a file named **user_secret.yaml** with this content:
```
apiVersion: v1
kind: Secret
# ... (metadata and remaining fields not shown here)
username: <your-username-kinit>
password: <your-password>
```
1. Save the secret in the K8s cluster by running:
```
kubectl apply -f user_secret.yaml
```
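A possible way for a task to consume that secret, sketched under the assumption that the secret's metadata name is `kinit-secret` (the actual name is the one defined in **user_secret.yaml**) and using the Airflow 1.10 contrib imports; the pod image and the command are illustrative only:
```python
from airflow.contrib.kubernetes.secret import Secret
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

# Expose the stored credentials as environment variables inside the pod,
# so that the container can run kinit and then access EOS.
kinit_user = Secret("env", "KINIT_USER", "kinit-secret", "username")  # assumption: secret name
kinit_password = Secret("env", "KINIT_PASSWORD", "kinit-secret", "password")

t_eos_task = KubernetesPodOperator(
    namespace="default",
    image="python:3.6-stretch",
    cmds=["bash", "-c"],
    arguments=["echo $KINIT_PASSWORD | kinit $KINIT_USER && ls /eos"],  # illustrative only
    secrets=[kinit_user, kinit_password],
    name="eos-task",
    task_id="eos_task",
    get_logs=True,
    dag=dag,  # the DAG object already defined in your DAG file
)
```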