Part 3: Deploying your Data Science Containers to Kubernetes

Liming Tsai
4 min read · Nov 11, 2020
Photo by Markus Spiske on Unsplash

In the previous post, we learnt how to build a PyTorch image and run it to train the model on your laptop using Podman.

Once the model has been trained, we will package it up and deploy it onto Kubernetes.

The repository used in this post is available here.

What is koo-burr-NET-eez?

Kubernetes (Greek for “helmsman” or “pilot” or “governor”) is an open-source system for automating deployment, scaling, and management of containerized applications. You can cluster together groups of hosts running Linux containers, and Kubernetes helps you easily and efficiently manage those clusters for running your applications. (Credit: Red Hat)

For scalable model prediction, we can deploy our trained model onto Kubernetes and let Kubernetes manage the lifecycle of the containers.

In part 2 of this series, we built a training image that trains the model and saves it to disk, but we are still not doing any model prediction.

Let’s wrap the model in a Flask app and expose it via a RESTful API. The API endpoint will accept a base64-encoded PNG image and transform the image to fit the model.

model = Net()
model.load_state_dict(torch.load(os.getenv('model_path', './model/mnist_cnn.pt')))
model.eval()

@app.route('/predict/', methods=['GET', 'POST'])
def predict():
    img = parseImage(request.get_data())
    # Apply the same normalization used during training (MNIST mean and std)
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))
    ])
    result = model(transform(img).unsqueeze(0))
    num = result.max(1).indices.item()
    return jsonify({'result': num, 'data': result.tolist()[0]})
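The helper parseImage is not shown above. A minimal sketch of what it might look like, assuming the frontend sends the canvas contents as a base64 data URL and that Pillow is available (the resize to a 28×28 grayscale image matches what the MNIST model expects — the exact preprocessing in the repository may differ):

```python
import base64
import io
import re


def parse_data_url(data):
    """Strip a 'data:image/png;base64,...' prefix and decode the payload to raw bytes."""
    text = data.decode('utf-8') if isinstance(data, bytes) else data
    b64 = re.sub(r'^data:image/\w+;base64,', '', text.strip())
    return base64.b64decode(b64)


def parseImage(data):
    """Decode the request body into a 28x28 grayscale PIL image for the model."""
    from PIL import Image  # Pillow, imported lazily so the decode helper stays stdlib-only
    png_bytes = parse_data_url(data)
    return Image.open(io.BytesIO(png_bytes)).convert('L').resize((28, 28))
```

Depending on how the frontend draws (black on white vs. white on black), an extra inversion step may also be needed before the image matches the MNIST training data.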

MNIST Draw Frontend

The MNIST Draw frontend is adapted from Bob Hammell and converted to ExpressJS. EJS is used to parameterize the Flask backend URL.

var mnist_server = process.env.MNIST_SERVER || 'http://no-mnist-server-defined';
console.log('Using mnist server ' + mnist_server);

// index page
app.get('/', function(req, res) {
  res.render('pages/index', {
    mnist_server: mnist_server,
  });
});

Bob’s frontend lets a user draw a digit on a canvas and makes a jQuery API call to the Flask app, which returns the scores for the digits 0–9. In this example, you will see that the model predicted “7” with the highest score.
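You can also exercise the backend without the frontend. A small sketch of building the payload the /predict/ endpoint expects (the hostname and file name here are illustrative, not from the repository):

```python
import base64


def make_payload(png_bytes):
    """Wrap raw PNG bytes in the base64 data-URL string the /predict/ endpoint accepts."""
    return 'data:image/png;base64,' + base64.b64encode(png_bytes).decode('ascii')


# POSTing it once the backend is running, e.g. with the requests library:
# import requests
# payload = make_payload(open('seven.png', 'rb').read())
# resp = requests.post('http://mnist-flask:5000/predict/', data=payload)
# print(resp.json()['result'])
```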

Deploy to Kubernetes

With the images built, we can now deploy them to Kubernetes. Prebuilt images are available from my GitHub.

Using a Kubernetes cluster with an NGINX ingress controller, I can deploy using a deployment configuration and have my app exposed to the outside world.

$ kubectl apply -f kube/mnist.yaml
service/mnist-flask created
deployment.apps/mnist-flask created
service/mnist-draw created
deployment.apps/mnist-draw created

In mnist.yaml, I have defined the service, deployment and ingress resources for the frontend and backend.

The Deployment consists of 1 replica pointing to my images hosted at quay.io and a Service that exposes the pods with ClusterIP, which is an internal-cluster IP.

The app is then exposed externally by using Ingress that maps my DNS A record to the Service.
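The manifests in kube/mnist.yaml follow this general shape; the sketch below shows the frontend only, and the image path and container port are illustrative assumptions (the backend Deployment and Service follow the same pattern):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mnist-draw
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mnist-draw
  template:
    metadata:
      labels:
        app: mnist-draw
    spec:
      containers:
      - name: mnist-draw
        image: quay.io/example/mnist-draw:latest   # illustrative image path
        ports:
        - containerPort: 3000                      # assumed ExpressJS port
---
apiVersion: v1
kind: Service
metadata:
  name: mnist-draw
spec:
  type: ClusterIP            # internal-cluster IP only
  selector:
    app: mnist-draw
  ports:
  - port: 3000
    targetPort: 3000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mnist-draw
spec:
  rules:
  - host: mnist-draw.demo.ltsai.com   # the DNS A record points here
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: mnist-draw
            port:
              number: 3000
```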

$ curl -v  http://mnist-draw.demo.ltsai.com
* Rebuilt URL to: http://mnist-draw.demo.ltsai.com/
* Trying 188.166.198.224...
* TCP_NODELAY set
* Connected to mnist-draw.demo.ltsai.com (188.166.198.224) port 80 (#0)
> GET / HTTP/1.1
> Host: mnist-draw.demo.ltsai.com
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.17.10

Scaling

One of the advantages of using Kubernetes is scaling. Based on my workload demand, I can easily scale my pods, and routing traffic to the new pods is handled automatically by the Service resource.

$ kubectl scale --replicas=3 deployments/mnist-draw -n default
deployment.apps/mnist-draw scaled
$ kubectl get pods -n default
NAME READY STATUS RESTARTS AGE
mnist-draw-6ccc79c948-djj9m 0/1 ContainerCreating 0 2s
mnist-draw-6ccc79c948-nbqgd 0/1 ContainerCreating 0 2s
mnist-draw-6ccc79c948-t4ctv 1/1 Running 0 28m
mnist-flask-6f886848bc-5dzpw 1/1 Running 0 34m

Summary

In this post, we discussed how you can take your trained model, deploy it onto Kubernetes, and let Kubernetes manage the lifecycle of your containers.

In the final post, we will explore MLOps and how to do this in a predictable and reproducible manner on an enterprise container platform such as Red Hat OpenShift Container Platform. OpenShift supports Open Data Hub, a community project sponsored by Red Hat that helps build and deploy AI workloads easily on OpenShift.
