# Multi-model serving
Kubeflow allows you to serve multiple models from a single inference server, so several models share one predictor's resources instead of each model needing its own.
# Create inference service
Edit the `name` field in the `metadata` section of `inferenceservice.yaml`; this is the name you will use later to attach and access your models. Then create the service:
`kubectl apply -f inferenceservice.yaml`
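Before attaching models, you may want to confirm that the service came up. A minimal check, assuming you kept the example name `multi-model-sklearn-example`:
```bash
# Show the service and its READY status
kubectl get inferenceservice multi-model-sklearn-example

# Or block until the Ready condition is reported (gives up after 2 minutes)
kubectl wait --for=condition=Ready \
  inferenceservice/multi-model-sklearn-example --timeout=120s
```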
# Serving models with the newly created inference service
You can serve multiple models by replicating `firstmodel.yaml` and changing only the model name and storage location. Remember to change the `inferenceService` field to the name you chose in the previous step.
`kubectl apply -f firstmodel.yaml`
Follow the same steps for `secondmodel.yaml`. You have now deployed two models on the same inference service.
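For the second model, that comes down to another apply after editing the file:
```bash
# Attach the second model to the same inference service
kubectl apply -f secondmodel.yaml
```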
# Check status
To check the status of these models, run `kubectl get trainedmodel`. You should see the two newly deployed models together with their URLs.
**REMEMBER**: the `READY` status of both models must be `True` before you can access them.
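If you would rather wait on this from a script, `kubectl wait` can poll the same condition; a short sketch assuming the example model names from this repository:
```bash
# Both TrainedModels must report Ready=True before requests will succeed
for m in multimodel-first-example multimodel-second-example; do
  kubectl wait --for=condition=Ready "trainedmodel/$m" --timeout=120s
done
```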
# Access the models and inference service
In the commands below, replace the placeholders as follows:
- `AUTH_SESSION` is the authentication session cookie obtained in the previous step.
- `NAMESPACE` is your personal Kubeflow namespace, shown in the top left corner of the UI.

To get predictions from the first model:
```
curl -H 'Cookie: authservice_session=AUTH_SESSION' \
     -H 'Host: multi-model-sklearn-example.NAMESPACE.example.com' \
     http://ml.cern.ch/v1/models/multimodel-first-example:predict \
     -d @./input.json
```
To get predictions from the second model:
```
curl -H 'Cookie: authservice_session=AUTH_SESSION' \
     -H 'Host: multi-model-sklearn-example.NAMESPACE.example.com' \
     http://ml.cern.ch/v1/models/multimodel-second-example:predict \
     -d @./input.json
```
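Since the two calls differ only in the model name, a small shell sketch can query both models in one go (`AUTH_SESSION` and `NAMESPACE` still need to be filled in; the model names are the ones used in this example):
```bash
AUTH_SESSION="<your session cookie>"
NAMESPACE="<your Kubeflow namespace>"

for MODEL in multimodel-first-example multimodel-second-example; do
  echo "Prediction from ${MODEL}:"
  curl -s \
    -H "Cookie: authservice_session=${AUTH_SESSION}" \
    -H "Host: multi-model-sklearn-example.${NAMESPACE}.example.com" \
    "http://ml.cern.ch/v1/models/${MODEL}:predict" \
    -d @./input.json
  echo
done
```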
The files referenced above are listed below.

`firstmodel.yaml`:
```yaml
apiVersion: "serving.kubeflow.org/v1alpha1"
kind: "TrainedModel"
metadata:
  name: "multimodel-first-example"
spec:
  # Must match the name of the InferenceService created earlier
  inferenceService: "multi-model-sklearn-example"
  model:
    storageUri: "gs://kfserving-samples/models/sklearn/iris"
    framework: "sklearn"
    memory: "256Mi"
```
apiVersion: "serving.kubeflow.org/v1beta1"
kind: "InferenceService"
metadata:
name: "multi-model-sklearn-example"
spec:
predictor:
minReplicas: 1
sklearn:
name: "sklearn-mm-predict"
resources:
limits:
cpu: 100m
memory: 512Mi
requests:
cpu: 100m
memory: 512Mi
\ No newline at end of file
`input.json`:
```json
{
  "instances": [
    [6.8, 2.8, 4.8, 1.4],
    [6.0, 3.4, 4.5, 1.6]
  ]
}
```
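Each inner array is one input sample with the four features the example iris model expects (sepal length, sepal width, petal length, petal width). A successful `:predict` call should return a JSON object of the form `{"predictions": [...]}`, with one predicted class per input row.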
apiVersion: "serving.kubeflow.org/v1alpha1"
kind: "TrainedModel"
metadata:
name: "multimodel-second-example"
spec:
inferenceService: "multi-model-sklearn-example"
model:
storageUri: "gs://kfserving-samples/models/sklearn/iris"
framework: "sklearn"
memory: "256Mi"