TensorFlow Serving - ResNet

TensorFlow Serving is a flexible, high-performance serving system for machine learning models, designed for production environments. TensorFlow Serving makes it easy to deploy new algorithms and experiments, while keeping the same server architecture and APIs. Learn more about TensorFlow Serving on its site: https://www.tensorflow.org/tfx/guide/serving

Quick Start Guide

As usual, running TensorFlow Serving to serve the TensorFlow ResNet model takes a single command.

CPU support:

curl -sfL https://get.k3ai.in | bash -s -- --cpu --plugin_tfs-resnet

GPU support:

curl -sfL https://get.k3ai.in | bash -s -- --gpu --plugin_tfs-resnet

Test the installation

For a full explanation of how to use TensorFlow Serving, please take a look at the documentation site.

Step 1 - Prepare your client environment

To run any experiment against a remote inference server, you need tensorflow-serving-api installed on your machine, as described in the official documentation: https://www.tensorflow.org/tfx/serving/setup#tensorflow_serving_python_api_pip_package

For reference:

pip install tensorflow-serving-api
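
To confirm that the package is visible to the Python interpreter you plan to use, a quick check along these lines may help. This is a minimal sketch: the helper name is_installed is hypothetical, and it assumes the pip package exposes the top-level module tensorflow_serving.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: tensorflow-serving-api installs the top-level module "tensorflow_serving".
    if is_installed("tensorflow_serving"):
        print("tensorflow-serving-api is available")
    else:
        print("tensorflow-serving-api is missing; run: pip install tensorflow-serving-api")
```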

Step 2

Clone the TensorFlow Serving repository, which contains the test scripts:

git clone https://github.com/tensorflow/serving
cd serving

Step 3

Find the IP address where the TensorFlow Serving service is exposed:

kubectl describe service tf-server-service -n tf-serving

You should see output similar to this:

Name:                     tf-server-service
Namespace:                tf-serving
Labels:                   <none>
Annotations:              Selector:  app=tf-serv-resnet
Type:                     LoadBalancer
LoadBalancer Ingress:
Port:                     grpc  8500/TCP
TargetPort:               8500/TCP
NodePort:                 grpc  30525/TCP
Port:                     rest  8501/TCP
TargetPort:               8501/TCP
NodePort:                 rest  30907/TCP
Session Affinity:         None
External Traffic Policy:  Cluster

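Take note of the LoadBalancer Ingress IP. The field is blank in the sample output above; in your cluster it should show the address to use in the next step.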
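
If you would rather read the address programmatically than copy it by hand, the sketch below is one possible approach. It assumes kubectl is on your PATH and that the service name and namespace match the ones shown above. The function names ingress_address and fetch_service_json are hypothetical helpers, not part of any tool.

```python
import json
import subprocess
from typing import Optional


def ingress_address(service_json: str) -> Optional[str]:
    """Extract the first load balancer IP or hostname from a Service JSON document."""
    service = json.loads(service_json)
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    for entry in ingress:
        address = entry.get("ip") or entry.get("hostname")
        if address:
            return address
    return None  # no external address has been assigned yet


def fetch_service_json(name: str = "tf-server-service",
                       namespace: str = "tf-serving") -> str:
    """Ask kubectl for the Service as JSON (requires kubectl and cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "service", name, "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

With cluster access, print(ingress_address(fetch_service_json())) would print the address, or None if the load balancer has not been assigned one yet.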

Step 4

We can now query the service at its external address from our local host.

Using gRPC:

python \
  tensorflow_serving/example/resnet_client_grpc.py \
  --server=<LOADBALANCER INGRESS>:8500

Using the REST API:

python \
  tensorflow_serving/example/resnet_client.py \
  --server=<LOADBALANCER INGRESS>:8501

You should see output similar to this:

Prediction class: 286, avg latency: 87.9074 ms
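
For readers who want to see roughly what a REST prediction call looks like without the example script, here is a minimal sketch using only the standard library. It rests on several assumptions: the model is served under the name resnet, the REST port is 8501 as shown above, and the exported model accepts base64-encoded image bytes. Some ResNet exports expect numeric tensors instead, so the payload may need adapting. The names build_predict_request and predict are hypothetical helpers.

```python
import base64
import json
import urllib.request


def build_predict_request(host: str, image_bytes: bytes,
                          port: int = 8501, model_name: str = "resnet"):
    """Return (url, body) for a TensorFlow Serving REST predict call."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # Binary inputs are sent as base64 strings wrapped in a {"b64": ...} object.
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    body = json.dumps({"instances": [{"b64": encoded}]}).encode("utf-8")
    return url, body


def predict(host: str, image_bytes: bytes) -> dict:
    """Send the request and return the decoded JSON response (needs a reachable server)."""
    url, body = build_predict_request(host, image_bytes)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())
```

Calling predict("<LOADBALANCER INGRESS>", open("cat.jpg", "rb").read()) with your own address and image file (cat.jpg is a placeholder) would return the server's JSON response, provided the assumptions above hold for your deployment.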

