PyTorch Operator
Kubeflow PyTorch-Job Training Operator
PyTorch is a Python package that provides two high-level features:
Tensor computation (like NumPy) with strong GPU acceleration
Deep neural networks built on a tape-based autograd system
You can reuse your favorite Python packages such as NumPy, SciPy, and Cython to extend PyTorch when needed. More information at https://github.com/kubeflow/pytorch-operator or the PyTorch site https://pytorch.org/
Quick Start
As usual, let's deploy PyTorch with a single one-line command.
If you leverage CPU only:

```shell
curl -sfL https://get.k3ai.in | bash -s -- --cpu --plugin_pytorch-operator
```

If you'd like to use PyTorch with GPU:

```shell
curl -sfL https://get.k3ai.in | bash -s -- --gpu --plugin_pytorch-operator
```

Test Your PyTorch-Job installation
We will use the MNIST example from the Kubeflow PyTorch-Job repo at https://github.com/kubeflow/pytorch-operator/tree/master/examples/mnist
As usual, we want to avoid complexity, so we reworked the sample a bit to make it much easier.
Step 1
You'll see that in the original example a container needs to be built before running the sample; we merged the container commands directly into the YAML file, so it's now a one-click job.
For CPU only
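The manifest itself was not preserved in this page, so here is a minimal sketch of what a CPU-only MNIST PyTorchJob looks like, based on the upstream Kubeflow example (the job name, local file name `mnist-pytorchjob-cpu.yaml`, and image tag are illustrative assumptions, not the exact k3ai-reworked file):

```yaml
# mnist-pytorchjob-cpu.yaml -- illustrative sketch of a PyTorchJob
# modeled on the Kubeflow MNIST example (names/image are assumptions)
apiVersion: kubeflow.org/v1
kind: PyTorchJob
metadata:
  name: pytorch-dist-mnist
spec:
  pytorchReplicaSpecs:
    Master:
      replicas: 1
      restartPolicy: OnFailure
      template:
        spec:
          containers:
            - name: pytorch
              image: gcr.io/kubeflow-ci/pytorch-dist-mnist-test:v1.0
    Worker:
      replicas: 1
      restartPolicy: OnFailure
      template:
        spec:
          containers:
            - name: pytorch
              image: gcr.io/kubeflow-ci/pytorch-dist-mnist-test:v1.0
```

A manifest like this is then submitted with `kubectl apply -f mnist-pytorchjob-cpu.yaml`.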
If you have GPU enabled you may run it this way
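For the GPU case the job spec is the same; each replica's container additionally requests a GPU. A hedged sketch of the container-spec fragment (the `nvidia.com/gpu: 1` limit is the standard Kubernetes device-plugin request; the image name is an illustrative assumption):

```yaml
# Fragment to merge into each replica's container spec for GPU runs
            - name: pytorch
              image: gcr.io/kubeflow-ci/pytorch-dist-mnist-test:v1.0
              resources:
                limits:
                  nvidia.com/gpu: 1   # request one GPU per replica
```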
Step 2
Check if the pods are deployed correctly with
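The command itself was not preserved here; a standard way to do this, assuming the job was submitted to your current namespace, is:

```shell
# List the pods created by the PyTorchJob; add -n <namespace>
# if the job was submitted to a different namespace
kubectl get pods
```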
It should output something like this
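The original output block was lost; an illustrative example (pod names follow the Kubeflow MNIST example's naming convention, and the ages/timings are made up for illustration):

```
NAME                          READY   STATUS    RESTARTS   AGE
pytorch-dist-mnist-master-0   1/1     Running   0          2m
pytorch-dist-mnist-worker-0   1/1     Running   0          2m
```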
Step 3
Check the logs of your training job
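Again the exact command was lost in this page; assuming the pod naming shown by `kubectl get pods`, the master's training output can be followed with:

```shell
# Stream the training logs from the master pod
# (pod name is illustrative -- use the one kubectl get pods reports)
kubectl logs -f pytorch-dist-mnist-master-0
```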
You should observe output similar to this (since we are using 1 master and 1 worker in this case)