Commit 9e032dbd authored by Domenico Giordano

duplication of what is now in examples

parent 890303c7
%% Cell type:markdown id: tags:
# Test access to Spark via PySpark
%% Cell type:markdown id: tags:
This notebook installs the data-analytics package
and tests its basic functionality.
To run it in SWAN, follow these steps:
1) pass your Kerberos credentials
2) install the package, using a specific tag (qa in this example)
3) run the cells
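Step 1 can be done from inside the notebook. The following is a minimal sketch, not the package's own method: it assumes that the `kinit` command is available in the session, that it accepts the password on standard input, and that the realm is `CERN.CH`. The helper names `kinit_command` and `obtain_kerberos_ticket` are hypothetical.

```python
import getpass
import subprocess


def kinit_command(user, realm="CERN.CH"):
    """Build the kinit command line for user@realm (the default realm is an assumption)."""
    return ["kinit", "{}@{}".format(user, realm)]


def obtain_kerberos_ticket(user, realm="CERN.CH"):
    """Prompt for the password and pass it to kinit on standard input.

    Returns True when kinit exits with status 0.
    """
    password = getpass.getpass("Kerberos password for {}: ".format(user))
    result = subprocess.run(
        kinit_command(user, realm),
        input=(password + "\n").encode(),  # bytes, so this also works on Python 3.6
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode == 0


# Uncomment to request a ticket for the current login name:
# obtain_kerberos_ticket(getpass.getuser())
```

The password is read with `getpass`, so it is never echoed or stored in the notebook output.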
%% Cell type:code id: tags:
``` python
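import getpass
import os, sys
# Prepend the user site-packages directory (target of `pip3 install --user`) to PYTHONPATH.
# os.path.join supplies the '/' between $HOME and '.local'; .get() avoids a KeyError when PYTHONPATH is unset.
user_site = os.path.join(os.environ['HOME'], '.local/lib/python3.6/site-packages/')
os.environ['PYTHONPATH'] = user_site + ':' + os.environ.get('PYTHONPATH', '')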
```
%% Cell type:markdown id: tags:
## Start Spark (click the "star")
%% Cell type:markdown id: tags:
## Install the package (if not done already)
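Whether the installation is still needed can be checked before running pip. This is a minimal sketch, assuming the package exposes the top-level module `etl` (the name imported in the test cells of this notebook). The helper `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("etl"):
    print("data-analytics is already installed, the pip step can be skipped")
else:
    print("data-analytics not found, run the installation cell")
```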
%% Cell type:code id: tags:
``` python
%%bash
# Uncomment the two lines below to install the package from the chosen tag
#install_branch=qa
#pip3 install --user git+https://:@gitlab.cern.ch:8443/cloud-infrastructure/data-analytics.git@${install_branch}
```
%% Cell type:markdown id: tags:
# Test the package
%% Cell type:code id: tags:
``` python
from etl.spark_etl import cluster_utils
```
%% Cell type:code id: tags:
``` python
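# swan_spark_conf is not defined in this notebook; it is expected to come from the SWAN environment.
# getAll() lists its settings as (key, value) pairs.
swan_spark_conf.getAll()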
```
%% Cell type:code id: tags:
``` python
# Test connection to Spark
#sc, spark, conf = cluster_utils.set_spark(swan_spark_conf)
```
%% Cell type:code id: tags:
``` python
sc
```
%% Cell type:code id: tags:
``` python
spark
```
%% Cell type:code id: tags:
``` python
# Test stopping spark session
#cluster_utils.stop_spark(sc,spark)
```