arrow-left

All pages
gitbookPowered by GitBook
1 of 1

Loading...

Tensorflow Operator

Kubeflow Tensorflow-Job Training Operator

TFJob provides a Kubernetes custom resource that makes it easy to run distributed or non-distributed TensorFlow jobs on Kubernetes.

More on the Tensorflow Operator at https://github.com/kubeflow/tf-operatorarrow-up-right****

hashtag
Quick Start

All you have to run is:

hashtag
Test your installation

We present here a sample from Tensorflow Operator on ****

hashtag
Step 1

We first need to add a persistent volume and claim, to do so let's add the two YAML file we need, copy and paste each command in order.

now we add the PVC.

Note: Because we are using local-path as storage volume and we are on a single node cluster we can't use ReadWriteMany as per Rancher local-path provisioner issue __

hashtag
Step 2

Now we deploy the example

You can observe the result of the example with

It should output something similar to this (we show just partially the output here)

k3ai apply tensorflow-op
https://github.com/kubeflow/tf-operatorarrow-up-right
https://github.com/rancher/local-path-provisioner/issues/70#issuecomment-574390050arrow-up-right
kubectl apply -f - << EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: tfevent-volume
  labels:
    type: local
    app: tfjob
spec:
  capacity:
    storage: 10Gi
  storageClassName: local-path
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /tmp/data
EOF
kubectl apply -f - << EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tfevent-volume
  namespace: kubeflow 
  labels:
    type: local
    app: tfjob
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
EOF
kubectl apply -f https://raw.githubusercontent.com/kubeflow/tf-operator/master/examples/v1/mnist_with_summaries/tf_job_mnist.yaml
kubectl logs -l tf-job-name=mnist -n kubeflow --tail=-1
...
Adding run metadata for 799
Accuracy at step 800: 0.957
Accuracy at step 810: 0.9698
Accuracy at step 820: 0.9676
Accuracy at step 830: 0.9676
Accuracy at step 840: 0.9677
Accuracy at step 850: 0.9673
Accuracy at step 860: 0.9676
Accuracy at step 870: 0.9654
Accuracy at step 880: 0.9694
Accuracy at step 890: 0.9708
Adding run metadata for 899
Accuracy at step 900: 0.9737
Accuracy at step 910: 0.9708
Accuracy at step 920: 0.9721
Accuracy at step 930: 0.972
Accuracy at step 940: 0.9639
Accuracy at step 950: 0.966
Accuracy at step 960: 0.9654
Accuracy at step 970: 0.9683
Accuracy at step 980: 0.9685
Accuracy at step 990: 0.9666
Adding run metadata for 999