Machine Learning Model Deployments using Seldon Core

In this article, we will learn how to deploy machine learning models using an open-source framework called Seldon Core.

Unni P
5 min read · Mar 13, 2023
Image source: Seldon Core documentation

Introduction

Seldon Core is an open-source platform to deploy your machine learning models on Kubernetes at a massive scale.

Seldon Core converts your ML models (TensorFlow, PyTorch, H2O, etc.) or language wrappers (Python, Java, etc.) into production REST/gRPC microservices.

Seldon Core scales to thousands of production machine learning models and provides advanced capabilities out of the box, including advanced metrics, request logging, explainers, outlier detectors, A/B tests, canaries, and more.
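
To get a feel for the language-wrapper approach, the sketch below shows how a Python model can be wrapped for Seldon Core: a class exposing a predict method is served as a REST/gRPC microservice by the seldon-core-microservice CLI from the seldon-core Python package. The class name, file name, and model logic here are placeholders, and the exact CLI flags can vary between Seldon Core versions.

$ cat MyModel.py
# Placeholder Seldon Core Python wrapper; Seldon calls predict() for each request
class MyModel:
    def __init__(self):
        # Load your trained model artifact here (e.g. with joblib)
        pass

    def predict(self, X, features_names=None):
        # X is the request payload as an array/list; return predictions
        return X
$ pip install seldon-core
$ seldon-core-microservice MyModel --service-type MODEL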

Prerequisites

Install the necessary utility packages. Here I'm using the Ubuntu 20.04.5 LTS distribution.

$ sudo apt update

$ sudo apt install curl jq

Install Docker on your machine using the convenience script.

$ curl -fsSL https://get.docker.com -o get-docker.sh

$ sudo sh ./get-docker.sh

Now add your user to the docker group to execute commands without using sudo.

$ sudo usermod -aG docker $USER

Log out from your current session and log back in. Now you can execute docker commands without using sudo.

$ docker info

Install k3d, a lightweight wrapper to run k3s (Rancher Lab’s minimal Kubernetes distribution) in Docker.

$ curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash

$ k3d version

Install kubectl, a command line tool that allows you to run commands against Kubernetes clusters.

$ curl -LO https://dl.k8s.io/release/v1.24.10/bin/linux/amd64/kubectl

$ sudo install kubectl /usr/local/bin/

$ kubectl version --client

Install Helm, a command line tool that helps you to define, install, and upgrade even the most complex Kubernetes application.

$ curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

$ helm version

Install istioctl, a command line tool that allows service operators to debug and diagnose their Istio service mesh deployments.

$ curl -L https://istio.io/downloadIstio | sh -

$ cd istio-1.17.1/bin

$ sudo install istioctl /usr/local/bin/

$ istioctl version

Configuration

Create a k3d Kubernetes cluster using the configuration file below.

$ cat config.yml
apiVersion: k3d.io/v1alpha4
kind: Simple
metadata:
  name: seldon-core
servers: 1
agents: 2
image: rancher/k3s:v1.24.10-k3s1
ports:
  - port: 30000-30100:30000-30100
    nodeFilters:
      - server:*
registries:
  create:
    name: seldon-core
    host: 0.0.0.0
    hostPort: "5000"
options:
  k3s:
    extraArgs:
      - arg: --disable=traefik
        nodeFilters:
          - server:*
$ k3d cluster create --config=config.yml

Once the cluster is created, verify its status.

$ k3d cluster list
NAME          SERVERS   AGENTS   LOADBALANCER
seldon-core   1/1       2/2      true
$ kubectl config use-context k3d-seldon-core
Switched to context "k3d-seldon-core".
$ kubectl -n kube-system get pods
NAME                                      READY   STATUS    RESTARTS   AGE
coredns-7b5bbc6644-krxc6                  1/1     Running   0          42s
local-path-provisioner-687d6d7765-w8mqt   1/1     Running   0          42s
metrics-server-667586758d-b77xx           1/1     Running   0          42s

Deploy the Istio components to our cluster and verify their status.

$ istioctl install --set profile=demo -y
✔ Istio core installed
✔ Istiod installed
✔ Egress gateways installed
✔ Ingress gateways installed
✔ Installation complete
Making this installation the default for injection and validation.

Thank you for installing Istio 1.17.
$ kubectl -n istio-system get pods
NAME                                    READY   STATUS    RESTARTS   AGE
istiod-76cf8b7b8b-675wg                 1/1     Running   0          79s
istio-ingressgateway-8568ffc4d4-qr7pd   1/1     Running   0          53s
istio-egressgateway-8694db4556-wtcpj    1/1     Running   0          53s

Create an Istio Gateway, a load balancer operating at the edge of the mesh that receives incoming or outgoing HTTP/TCP connections.

$ cat gateway.yml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: seldon-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
$ kubectl apply -f gateway.yml
gateway.networking.istio.io/seldon-gateway created
$ kubectl -n istio-system get gateways
NAME             AGE
seldon-gateway   64s

Create a new namespace for deploying Seldon Core Operator.

$ kubectl create namespace seldon-system
namespace/seldon-system created

Deploy Seldon Core Operator to our cluster using Helm.

The Seldon Core Operator controls your SeldonDeployments in the Kubernetes cluster. It watches the SeldonDeployment resources (defined by a Custom Resource Definition, or CRD) applied to the cluster and ensures that all required components, such as Pods and Services, are created. We will check that the CRD is registered right after the install below.

$ helm install seldon-core seldon-core-operator \
--repo https://storage.googleapis.com/seldon-charts \
--set usageMetrics.enabled=true \
--set istio.enabled=true \
--namespace seldon-system
NAME: seldon-core
LAST DEPLOYED: Mon Mar 13 12:25:21 2023
NAMESPACE: seldon-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
$ kubectl -n seldon-system get pods
NAME                                        READY   STATUS    RESTARTS   AGE
seldon-controller-manager-b74d66684-b4r55   1/1     Running   0          53s
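
As an optional check, confirm that the operator's CRD was registered. The resource name below is the standard SeldonDeployment CRD; the command should list it once the install has finished.

$ kubectl get crd seldondeployments.machinelearning.seldon.io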

Create another namespace for our model deployment.

$ kubectl create namespace seldon
namespace/seldon created

Enable Istio injection on our newly created namespace.

When you set the istio-injection=enabled label on a namespace and the injection webhook is enabled, any new pods that are created in that namespace will automatically have a sidecar added to them.

$ kubectl label namespace seldon istio-injection=enabled
namespace/seldon labeled
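
Optionally, confirm that the label is present on the namespace.

$ kubectl get namespace seldon --show-labels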

Deploy a pre-packaged model server for scikit-learn and verify its status.

$ cat iris-model.yml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: iris-model
  namespace: seldon
spec:
  name: iris
  predictors:
  - graph:
      implementation: SKLEARN_SERVER
      modelUri: gs://seldon-models/v1.15.0-dev/sklearn/iris
      name: classifier
    name: default
    replicas: 1
$ kubectl apply -f iris-model.yml 
seldondeployment.machinelearning.seldon.io/iris-model created
$ kubectl -n seldon get seldondeployments
NAME         AGE
iris-model   30s
$ kubectl -n seldon get pods
NAME                                               READY   STATUS    RESTARTS   AGE
iris-model-default-0-classifier-59bd9f6c5d-zbtqw   3/3     Running   0          4m42s
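
Because the operator was installed with istio.enabled=true, it should also create an Istio VirtualService for this deployment that routes traffic through the seldon-gateway we created earlier. You can list it with:

$ kubectl -n seldon get virtualservices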

Verify the deployed model by accessing the Istio ingress gateway load balancer IP.

$ kubectl -n istio-system get svc istio-ingressgateway
NAME                   TYPE           CLUSTER-IP     EXTERNAL-IP                        PORT(S)                                                                       AGE
istio-ingressgateway   LoadBalancer   10.43.182.71   172.19.0.2,172.19.0.3,172.19.0.4   15021:30288/TCP,80:31550/TCP,443:31363/TCP,31400:32342/TCP,15443:32413/TCP   53s
$ curl -s -X POST http://172.19.0.2/seldon/seldon/iris-model/api/v1.0/predictions \
-H 'Content-Type: application/json' \
-d '{ "data": { "ndarray": [[1,2,3,4]] } }' | jq
{
  "data": {
    "names": [
      "t:0",
      "t:1",
      "t:2"
    ],
    "ndarray": [
      [
        0.0006985194531162835,
        0.00366803903943666,
        0.995633441507447
      ]
    ]
  },
  "meta": {
    "requestPath": {
      "classifier": "seldonio/sklearnserver:1.15.0"
    }
  }
}

You can also test the model from the Swagger UI by opening http://172.19.0.2/seldon/seldon/iris-model/api/v1.0/doc/ in your browser.
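
If you prefer calling the endpoint from code, here is a minimal Python sketch using the requests library (assumed to be installed separately); replace the IP address with your own ingress gateway address.

$ cat predict.py
# Minimal client sketch for the deployed iris model; the IP is environment-specific
import requests

URL = "http://172.19.0.2/seldon/seldon/iris-model/api/v1.0/predictions"
payload = {"data": {"ndarray": [[1, 2, 3, 4]]}}

response = requests.post(URL, json=payload)
response.raise_for_status()
print(response.json())
$ python3 predict.py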
