
k3ai (keɪ3ai)


Hello-Universe

So we have our cluster and our Jupyter Notebook ready. Let's practice a bit with them.

Hello

A Jupyter Notebook is an interactive Python environment. It's pretty useful because you can "play" with your code and see the results right away. Let's see a couple of examples:

Print something

  1. Open your notebook

  2. Click on the option "New" and select "Python 3"

  3. In the first cell type:

    print("Hello, Earth")

Press CTRL+Enter on your keyboard and you should see something similar to the image below.

(Image: Jupyter notebook first steps)

If you want to learn more about how to use a Jupyter Notebook, there's a great user guide on their site, just click here.

Hello Universe

So the cluster is up and the notebook is ready: it's time to add some spice to the recipe. Whenever you want to develop a real AI application you will have to take care of three major areas: DataSets, Training, and Inference.

DataSet is your data: where it comes from, how you prepare it, and where you store it. We do not currently take care of this in k3ai. We focus on model training and model inference.

Model Training is about how you "teach" your mathematical model to identify specific patterns or produce results from your dataset. For the sake of this guide, we will use Kubeflow pipelines for our training.

Kubeflow Pipelines is a platform for building and deploying portable, scalable machine learning (ML) workflows based on Docker containers - https://www.kubeflow.org/docs/pipelines/overview/pipelines-overview/

So the flow will be this:

  1. Add Kubeflow pipelines to our existing cluster

  2. Download and use the public Kubeflow minimal pipeline example to learn how to use them

  3. Run a slightly more complex pipeline against our fresh Kubeflow pipeline environment

To add Kubeflow pipelines to our existing environment we will type in our terminal:

    k3ai apply kubeflow-pipelines

or, if we are on K3s/K0s and we want to make use of traefik:

    k3ai apply -f kubeflow-pipelines-traefik

Once the deployment is done we may reach the pipelines UI on port 80. In case we did not use traefik we may expose the UI with this command:

    kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80

Now let's proceed to download a sample notebook from the Kubeflow repo. You may grab it from here.

Open the notebook in raw mode, select all, and save it as a file with the .ipynb extension (e.g. demo.ipynb).

  1. Open your Jupyter Notebook UI (http://<YOUR CLUSTER IP>:8888)

  2. Click on Upload and load the notebook you just saved

  3. Once the notebook has been uploaded, open it

  4. Execute the first cell to install the KFP SDK library. This is needed to interact with Kubeflow pipelines

  • In another tab of your browser open the Kubeflow UI (http://<YOUR CLUSTER IP>:8888)

  • Now move to the cell that looks like the one below

    # Specify pipeline argument values
    arguments = {'a': '7', 'b': '8'}
    # Launch a pipeline run given the pipeline function definition
    kfp.Client().create_run_from_pipeline_func(calc_pipeline, arguments=arguments,
                                               experiment_name=EXPERIMENT_NAME)
    # The generated links below lead to the Experiment page and the pipeline run...

and change it to

    # Specify pipeline argument values
    arguments = {'a': '7', 'b': '8'}
    # Launch a pipeline run given the pipeline function definition
    kfp.Client("<YOUR CLUSTER IP>").create_run_from_pipeline_func(calc_pipeline, arguments=arguments,
                                               experiment_name=EXPERIMENT_NAME)
    # The generated links below lead to the Experiment page and the pipeline run...

That's all. Execute all the cells one after another and, after the cell we just changed, you'll see a result similar to this:

(Image: Jupyter Notebook pipeline sample)

At this point click on "Run Details" and you should see something like this:

(Image: Kubeflow pipeline execution)

Congratulations, you now have a full pipeline running on your k3ai playground!
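The only change made to the sample notebook in this walkthrough is passing the cluster address to `kfp.Client`. A minimal sketch of that step is shown here, assuming the pipelines UI answers on port 80 as described in this guide. The helper names `pipeline_host`, `pipeline_arguments` and `launch` are hypothetical and not part of k3ai or the KFP SDK; `kfp` is imported lazily so the helpers can be tried without the SDK installed.

```python
from typing import Callable, Dict


def pipeline_host(cluster_ip: str, port: int = 80) -> str:
    """Build the URL handed to kfp.Client (port 80 is the guide's default)."""
    if port == 80:
        return f"http://{cluster_ip}"
    return f"http://{cluster_ip}:{port}"


def pipeline_arguments(a: int, b: int) -> Dict[str, str]:
    """The sample notebook passes its arguments as strings, so convert here."""
    return {"a": str(a), "b": str(b)}


def launch(pipeline_func: Callable, cluster_ip: str, experiment_name: str,
           a: int, b: int):
    """Submit a run, mirroring the modified notebook cell."""
    import kfp  # imported lazily: requires the KFP SDK from the notebook's first cell

    client = kfp.Client(host=pipeline_host(cluster_ip))
    return client.create_run_from_pipeline_func(
        pipeline_func,
        arguments=pipeline_arguments(a, b),
        experiment_name=experiment_name,
    )
```

If the UI was exposed with the port-forward command instead of traefik, `pipeline_host("127.0.0.1", 8080)` would be the matching address under the same assumptions.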

    What we are trying to solve (a.k.a Our Goals)

    "The great danger for most of us lies not in setting our aim too high and falling short, but in setting our aim too low, and achieving our mark." –Michelangelo

    Identify the problem

    Artificial Intelligence platforms are complex. They combine a multitude of tools and frameworks that help Data Scientists and Data Engineers solve the problem of building end-to-end pipelines.

    But those AI platforms inherit a degree of complexity. Let's take a look at the use case of one of them:

    The end goal of every organization that utilizes machine learning (ML) is to have their ML models successfully run in production and generate value to the business. But what does it take to reach that point?

    Before a model ends up in production, there are potentially many steps required to build and deploy an ML model: data loading, verification, splitting, processing, feature engineering, model training and verification, hyperparameter tuning, and model serving.

    In addition, ML models can require more observation than traditional applications, because your data inputs can drift over time. Manually rebuilding models and data sets is time consuming and error prone.

    Kubeflow project - https://www.kubeflow.org/docs/about/use-cases/

    See the elephant in the room? We all have to struggle with the complexity of a process that looks like the one below

    So here is the first problem we identified (yes, I said first): remove the complexity and give you a straightforward solution.

    Now there are plenty of alternatives when it comes to the infrastructure (local infrastructure) like:

    • Minikube

    • Kind

    • Docker for Windows (Kubernetes)

    • MicroK8s

    And some of them even allow you to install platforms like Kubeflow, but could you cherry-pick AI tools and/or solutions and run them on top of an infrastructure that does not eat up your laptop's entire RAM? Let's say you start by learning the basics of training a model on different platforms and later move on to serving models. You won't have everything running at once; instead you move from one configuration to another quickly.

    Identify the other problem

    If experimentation is one side of the coin, the other is using K3ai in the context of CI/CD.

    Data Engineers and DevOps (or, in a fancier definition, AIOps) face the challenge of building infrastructure pipelines that satisfy the following requirements:

    • Must be FAST to be built and EASY to be destroyed

    • Must be AVAILABLE everywhere no matter if it's on-prem, on-cloud, or in the remote universe

    • Must be REPRODUCIBLE: you want to be able to replicate the scenario again and again without having to re-configure things from scratch every time

    Solving the problem

    K3ai's goal is to provide a micro-infrastructure that removes the complexity of the installation, configuration, and execution of any AI platform so that the user may focus on experimentation.

    We want to satisfy the need of AI citizens and Corporate Scientists to focus on what matters to them and forget the complexity attached to it.

    To do so we have to satisfy a few requirements:

    • We want to be FAST

    • We want to be LIGHTWEIGHT

    • Everything we code has to be SIMPLE enough that anybody can contribute back

    • Everything must live within ONE single command, so that it can easily be integrated into any automation script

    • Everything must be MODULAR. We want to provide the greatest list of AI tools/solutions ever, so people may cherry-pick and create their own AI infrastructure combinations

    • We DO NOT install anything client-side (i.e. we don't want to be invasive) beyond the minimal tools needed to run the solution (e.g. k3s)

    K3ai is for the community, by the community. We want to be the reference for AI professionals, students, and researchers to learn and grow.

  • https://www.kubeflow.org/docs/about/use-cases/

    Quick Start

    First things First

    Start by installing K3ai. Here is what you DON'T need to start:

    • A fancy super-duper computer/server with GPUs, etc. (we managed to install everything on a 4 GB RAM laptop)

    • A cluster: don't worry, we will take care of it if you don't have anything

    • Linux, Mac, Win: we got them all (and we are working to support ARM too)

    • 1000 pages of documentation: nah, just run the command below and move on to Hello-Start

    Ready? Let's go. First pick your flavor. We have a utility script, but if for any reason it fails, go straight to https://github.com/kf5i/k3ai-core/releases and download the binary. Place it in your path and that's it.

    NOTE: Unfortunately not all plugins work with ARM. We will take care of this and provide a way to let you know before installing them.

    All you have to do is download the binary for your operating system, move it to your path (if you like easy things), and use it.

    Linux (Including Microsoft WSL)

    Once downloaded, untar the file and move it to your path.

    Windows

    Once downloaded, unzip the file and move it to your path, or execute it from a folder of your choice (e.g. k3ai.exe -h).

    Mac

    Once downloaded, untar the file and move it to your path.

    Arm64

    Once downloaded, untar the file and move it to your path.
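The untar-and-move step can also be scripted. The sketch below is one possible way to do it in Python, assuming the downloaded archive is a .tar.gz that contains a file named `k3ai`, as the tar/chmod/mv commands in this guide suggest. The function name `install_binary` and the destination directory are hypothetical choices, not part of k3ai.

```python
import stat
import tarfile
from pathlib import Path


def install_binary(archive: Path, dest_dir: Path, name: str = "k3ai") -> Path:
    """Extract `name` from a .tar.gz archive into dest_dir and make it executable."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / name
    with tarfile.open(archive, "r:gz") as tar:
        # Accept both "k3ai" and "./k3ai" style member names.
        members = [m for m in tar.getmembers()
                   if m.isfile() and Path(m.name).name == name]
        if not members:
            raise FileNotFoundError(f"{name} not found in {archive}")
        target.write_bytes(tar.extractfile(members[0]).read())
    # Equivalent of `chmod +x`: add execute permission for user, group and others.
    mode = target.stat().st_mode
    target.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return target
```

For example, `install_binary(Path("k3ai.tar.gz"), Path.home() / "bin")` would place the binary in a per-user directory, which avoids `sudo` as long as that directory is on your PATH.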

    Alternative method for Linux

    If for any reason it fails, go straight to https://github.com/kf5i/k3ai-core/releases and download the binary. Place it in your path and that's it.

    or use the following

    Note: sometimes things take longer than expected resulting in the error below:

    Don't worry! Sometimes the installation takes a few minutes, especially the Vagrant version or if you have limited bandwidth.

    Errors in the utility script? Use this (on Linux)
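The manual download for Linux boils down to reading the latest release tag and building the asset URL from it. Here is a rough Python sketch of that logic, assuming the release asset keeps the `k3ai-core_<version>_linux_amd64.tar.gz` naming used in this guide. The function names are hypothetical and the version number in the usage note is made up for illustration.

```python
import json
import urllib.request

LATEST_RELEASE_API = "https://api.github.com/repos/kf5i/k3ai-core/releases/latest"
DOWNLOAD_BASE = "https://github.com/kf5i/k3ai-core/releases/download"


def version_from_release(payload: str) -> str:
    """Extract the version from the release JSON, dropping a leading 'v'."""
    tag = json.loads(payload)["tag_name"]
    return tag[1:] if tag.startswith("v") else tag


def linux_asset_url(version: str) -> str:
    """Build the download URL for the Linux amd64 tarball of a given version."""
    return f"{DOWNLOAD_BASE}/v{version}/k3ai-core_{version}_linux_amd64.tar.gz"


def latest_linux_asset_url() -> str:
    """Query the GitHub API (network access required) and return the asset URL."""
    with urllib.request.urlopen(LATEST_RELEASE_API) as response:
        return linux_asset_url(version_from_release(response.read().decode()))
```

With a hypothetical tag `v0.2.1`, `linux_asset_url("0.2.1")` produces a URL ending in `v0.2.1/k3ai-core_0.2.1_linux_amd64.tar.gz`. Stripping only the leading `v` keeps the sketch working for version strings of any length.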

    Command Reference

    This is the list of all commands available in k3ai.

    To get the list simply type k3ai -h

     k3ai is a lightweight infrastructure-in-a-box solution specifically built to
     install and configure AI tools and platforms in production environments on Edge
     and IoT devices as easily as local test environments.
    
    Usage:
      k3ai [command]
    
    Available Commands:
      apply       Apply a plugin or a plugin group
      delete      Delete a plugin or a plugin group
      help        Help about any command
      init        Initialize K3ai Client
      list        List all plugins or plugin groups
      version     Print CLI version
    
    Flags:
      -h, --help          help for k3ai-cli
          --repo string   URI for the plugins repository.  
    
    Use "k3ai-cli [command] --help" for more information about a command.

    Apply

    With apply you deploy plugins or plugin groups.

    Init

    With init you may deploy a cluster, either local or remote.

    List

    List is used to retrieve the available plugins or plugin groups.

    Delete

    Delete is used to remove a deployed plugin or plugin group.
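When k3ai is driven from an automation script, it can help to assemble the command line in one place. The sketch below builds argument lists for apply, list and delete using only the flags shown in the help output (`--group` and `--repo`). The helper `k3ai_command` is a hypothetical name, and the resulting list is meant to be handed to something like `subprocess.run`.

```python
import shlex
from typing import List, Optional

ACTIONS_WITH_NAME = {"apply", "delete"}  # both take <plugin_name>
ACTIONS = ACTIONS_WITH_NAME | {"list"}


def k3ai_command(action: str, name: Optional[str] = None,
                 group: bool = False, repo: Optional[str] = None) -> List[str]:
    """Return the argv list for a k3ai apply/delete/list invocation."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    if action in ACTIONS_WITH_NAME and not name:
        raise ValueError(f"{action} needs a plugin or plugin group name")
    argv = ["k3ai", action]
    if action in ACTIONS_WITH_NAME:
        argv.append(name)
    if group:
        argv.append("--group")  # operate on a plugin group instead of a plugin
    if repo:
        argv += ["--repo", repo]  # alternative plugins repository
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe to try anywhere.
    print(shlex.join(k3ai_command("apply", "kubeflow-pipelines")))
```

Returning a list rather than a single string avoids shell quoting problems when a repository URL or plugin name contains special characters.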

    k3ai uses a config file to support the automated deployment of a cluster through the init command.

    The sample template installs KinD by default.

    Apply a plugin or a plugin group
    
    Usage:
      k3ai apply <plugin_name> [flags]
    
    Flags:
      -g, --group   Apply a plugin group
      -h, --help    help for apply
    
    Global Flags:
          --repo string   URI for the plugins repository.  
          (default "https://api.github.com/repos/kf5i/k3ai-plugins/contents/core/")
    https://github.com/kf5i/k3ai-core/releases
    https://github.com/kf5i/k3ai-core/releases
    Initialize K3ai Client, allowing user to deploy a new K8's cluster, 
    list plugins and groups
    
    Usage:
      k3ai init [flags]
    
    Examples:
    k3ai init                                   #Will use config from $HOME/.k3ai/config.yaml and use interactive menus
    k3ai init --config /myfolder/myconfig.yaml  #Use a custom config.yaml in another location(local or remote)
    k3ai init --local k3s                       #Use config target marked local and of type k3s
    k3ai init --cloud civo                      #Use config target marked as cloud and of type civo
    
    Flags:
          --cloud string    Options availabe for cloud providers
          --config string   Custom config file [default is $HOME/.k3ai/config.yaml] (default "/.k3ai/config.yaml")
      -h, --help            help for init
          --local string    Options availabe k3s,k0s,kind
    
    Global Flags:
          --repo string   URI for the plugins repository.  
          (default "https://api.github.com/repos/kf5i/k3ai-plugins/contents/core/")
    # Linux (including Microsoft WSL)
    curl -fL "https://get.k3ai.in" -o k3ai.tar.gz
    tar -xvzf k3ai.tar.gz \
    && chmod +x ./k3ai \
    && sudo mv ./k3ai /usr/local/bin

    # Windows (PowerShell)
    Invoke-WebRequest -Uri "https://get-win.k3ai.in" -OutFile k3ai.zip
    Expand-Archive -Path .\k3ai.zip

    # Mac
    curl -fL "https://get-mac.k3ai.in" -o k3ai.tar.gz
    tar -xvzf k3ai.tar.gz \
    && chmod +x ./k3ai \
    && sudo mv ./k3ai /usr/local/bin

    # Arm64
    curl -fL "https://get-arm.k3ai.in" -o k3ai.tar.gz
    tar -xvzf k3ai.tar.gz \
    && chmod +x ./k3ai \
    && sudo mv ./k3ai /usr/local/bin

    # Alternative method for Linux (also use this if the utility script fails)
    # Set a variable to grab the latest version (strip the leading "v" from the tag)
    Version=$(curl -s "https://api.github.com/repos/kf5i/k3ai-core/releases/latest" | awk -F '"' '/tag_name/{print $4}' | cut -c 2-)
    # Get the binaries
    wget https://github.com/kf5i/k3ai-core/releases/download/v$Version/k3ai-core_${Version}_linux_amd64.tar.gz

    # Error sometimes shown when the installation takes longer than expected
    error: timed out waiting for the condition on xxxxxxx
    List all plugins or plugin groups
    
    Usage:
      k3ai list [flags]
    
    Flags:
      -g, --group   List the plugin groups
      -h, --help    help for list
    
    Global Flags:
          --repo string   URI for the plugins repository.  
          (default "https://api.github.com/repos/kf5i/k3ai-plugins/contents/core/")
    Delete a plugin or a plugin group
    
    Usage:
      k3ai delete <plugin_name> [flags]
    
    Flags:
      -g, --group   Delete a plugin group
      -h, --help    help for delete
    
    Global Flags:
          --repo string   URI for the plugins repository.  
          (default "https://api.github.com/repos/kf5i/k3ai-plugins/contents/core/")
    kind: cluster
    targetCustomizations:
    - name: localK3s #name of the cluster instance not the name of the cluster
      enabled: false
      type: k3s
      config: "/etc/rancher/k3s/k3s.yaml" #default location of config file or your existing config file to copy
      clusterName: demo-wsl-k3s #name of the cluster (this need to be the same as in a config file)
      clusterDeployment: local
      clusterStart: "sudo bash -ic 'k3s server --write-kubeconfig-mode 644 > /dev/null 2>&1 &'"
      spec:
      # If the OS is not needed may be removed so the three below are mutually exclusive, if not needed set them to null or remove it
        wsl: "https://github.com/rancher/k3s/releases/download/v1.19.4%2Bk3s1/k3s"
        mac: 
        linux: "https://get.k3s.io | K3S_KUBECONFIG_MODE=644 sh -s -"
        windows: 
        # Everything from this repo will be ran in this cluster. You trust me right?
      plugins: 
      - repo: 
        name: 
      - repo: 
        name: 
    
    - name: localK0s #name of the cluster instance not the name of the cluster
      enabled: false
      type: k0s
      config: "${HOME}/.k3ai/kubeconfig" #default location of config file or your existing config file to copy
      clusterName: demo-wsl-k0s #name of the cluster (this need to be the same as in a config file)
      clusterDeployment: local
      clusterStart: "k0s default-config | tee ${HOME}/.k3ai/k0s.yaml && sudo bash -ic 'k0s server -c ${HOME}/.k3ai/k0s.yaml --enable-worker > /dev/null 2>&1 &' && sudo cat /var/lib/k0s/pki/admin.conf > $HOME/.k3ai/k0s-config"
      spec:
      # If the OS is not needed may be removed so the three below are mutually exclusive, if not needed set them to null or remove it
        wsl: "https://github.com/k0sproject/k0s/releases/download/v0.8.1/k0s-v0.8.1-amd64"
        mac: 
        linux: "https://github.com/k0sproject/k0s/releases/download/v0.8.1/k0s-v0.8.1-amd64"
        windows:
        # Everything from this repo will be ran in this cluster. You trust me right?
      plugins: 
      - repo: 
        name: 
      - repo: 
        name: 
    
    - name: localKind #name of the cluster instance not the name of the cluster
      enabled: true
      type: kind
      config:  #default location of config file or your existing config file to copy
      clusterName: demo-win-kind #name of the cluster (this need to be the same as in a config file)
      clusterDeployment: local
      clusterStart: "kind create cluster"
      spec:
      # If the OS is not needed may be removed so the three below are mutually exclusive, if not needed set them to null or remove it
        wsl: "https://kind.sigs.k8s.io/dl/v0.9.0/kind-linux-amd64"
        mac: "https://kind.sigs.k8s.io/dl/v0.9.0/kind-darwin-amd64"
        linux: "https://kind.sigs.k8s.io/dl/v0.9.0/kind-linux-amd64"
        windows: "https://kind.sigs.k8s.io/dl/v0.9.0/kind-windows-amd64"
        # Everything from this repo will be ran in this cluster. You trust me right?
      plugins: 
      - repo: 
        name: jupyter-minimal
      - repo: 
        name: 
    
    - name: localK3d #name of the cluster instance not the name of the cluster
      enabled: false
      type: k3d
      config:  #default location of config file or your existing config file to copy
      clusterName: demo-win-k3d #name of the cluster (this need to be the same as in a config file)
      clusterDeployment: local
      clusterStart: "k3d cluster create"
      spec:
      # If the OS is not needed may be removed so the three below are mutually exclusive, if not needed set them to null or remove it
        wsl: "https://kind.sigs.k8s.io/dl/v0.9.0/kind-linux-amd64"
        mac: "https://kind.sigs.k8s.io/dl/v0.9.0/kind-darwin-amd64"
        linux: "https://kind.sigs.k8s.io/dl/v0.9.0/kind-linux-amd64"
        windows: "https://github.com/rancher/k3d/releases/download/v3.4.0-test.0/k3d-windows-amd64.exe"
        # Everything from this repo will be ran in this cluster. You trust me right?
      plugins: 
      - repo: 
        name: jupyter-minimal
      - repo: 
        name: 
    
    - name: remoteK3s #name of the cluster instance not the name of the cluster
      enabled: false
      type: k3s
      config: remote #default location of config file or your existing config file to copy  if Remote will be copy from remote location
      clusterName: demo-cluster-remote #name of the cluster (this need to be the same as in a config file)
      clusterDeployment: cloud
      clusterStart: 
      spec:
      # If the OS is not needed may be removed so the three below are mutually exclusive, if not needed set them to null or remove it
        wsl: 
        mac: 
        linux:
        windows:
        cloudType: civo
        cloudNodes: 1
        cloudSecretPath: $HOME/.k3ai/secret.txt
        # Everything from this repo will be ran in this cluster. You trust me right?
      plugins: 
      - repo: "https://github.com/alfsuse/demo-plugins"
        name: "demo"
      - repo: "https://github.com/alfsuse/demo-plugins-2"
        name: "demo2"
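To make the structure of the config template easier to read, here is a small Python sketch that models a few of the targets above as plain dictionaries and answers two questions: which target is enabled, and which installer URL applies to a given operating system. How k3ai itself resolves these fields is not described in this guide, so the functions `enabled_targets` and `installer_for` are illustrative assumptions only.

```python
from typing import Dict, List, Optional

# A trimmed-down copy of the sample template, keeping only the fields used below.
TARGETS: List[Dict] = [
    {
        "name": "localK3s",
        "enabled": False,
        "type": "k3s",
        "clusterDeployment": "local",
        "spec": {"linux": "https://get.k3s.io | K3S_KUBECONFIG_MODE=644 sh -s -",
                 "mac": None, "windows": None},
    },
    {
        "name": "localKind",
        "enabled": True,
        "type": "kind",
        "clusterDeployment": "local",
        "spec": {
            "wsl": "https://kind.sigs.k8s.io/dl/v0.9.0/kind-linux-amd64",
            "mac": "https://kind.sigs.k8s.io/dl/v0.9.0/kind-darwin-amd64",
            "linux": "https://kind.sigs.k8s.io/dl/v0.9.0/kind-linux-amd64",
            "windows": "https://kind.sigs.k8s.io/dl/v0.9.0/kind-windows-amd64",
        },
    },
    {
        "name": "remoteK3s",
        "enabled": False,
        "type": "k3s",
        "clusterDeployment": "cloud",
        "spec": {"cloudType": "civo", "cloudNodes": 1},
    },
]


def enabled_targets(targets: List[Dict]) -> List[Dict]:
    """Return the targets whose `enabled` flag is true."""
    return [t for t in targets if t.get("enabled")]


def installer_for(target: Dict, os_name: str) -> Optional[str]:
    """Return the installer entry for an OS key (wsl, mac, linux, windows), if set."""
    return target.get("spec", {}).get(os_name)
```

In the sample template only `localKind` is enabled, which matches the statement that the template installs KinD by default.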

    Hello-Start

    Learn k3ai by doing. Here you'll find all you need to become a master of k3ai.

    What will I learn?

    Let's start with the basics. Here's how we structured the examples section:

    Hello-Home: this is the first step. We assume you don't have anything set up, so let's start by installing a cluster you can use and a Jupyter Notebook to work with.

    Hello-Earth: this is the second step. Let's introduce some more concepts. We will teach you how to create a basic pipeline with Kubeflow, but first, let's install it and see how to combine notebooks with pipelines.

    Hello-Universe: this is the third step. After building a pipeline, and learning to write and train a model, we move on to the inference server side. Here you'll learn to interact with the model serving use cases.

    Hello-All: this is the final step. Let's add a mini web-app and create an end-to-end scenario in order to see our results live!

    Hello-All

    We are still working on this so be patient...

    Contributing

    Welcome to the K3ai project! We took the liberty of borrowing these rules from other great OSS projects like Kubeflow, Kubernetes, and so on.

    Getting started as a K3ai contributor‌

    This document is the single source of truth for how to contribute to the code base. We'd love to accept your patches and contributions to this project. There are just a few small guidelines you need to follow.‌

    As you will notice, we do not currently require any CLA signature. This may change in the future, but if so, even that change will follow the contributing guidelines and processes.

    Follow the code of conduct‌

    Please make sure to read and observe our code of conduct.

    Joining the community‌

    Follow these instructions if you want to‌

    • Become a member of the K3ai GitHub org (see below)

    • Be recognized as an individual or organization contributing to K3ai

    ‌

    Joining the K3ai GitHub Org‌

    Before asking to join the community, we ask that you first make a small number of contributions to demonstrate your intent to continue contributing to K3ai.‌

    There are a number of ways to contribute to K3ai‌

    • Submit PRs

    • File issues reporting bugs or providing feedback

    • Answer questions on Slack or GitHub issues

    ‌

    When you are ready to join‌

    • Send a PR adding yourself as a member in ​

    • After the PR is merged an admin will send you an invitation

      • This is a manual process we are a very small team so please be patient

    ‌

    Your first contribution‌

    Find something to work on‌

    Help is always welcome! For example, documentation (like the text you are reading now) can always use improvement. There's always code that can be clarified and variables or functions that can be renamed or commented on. There's always a need for more test coverage. You get the idea - if you ever see something you think should be fixed, you should own it. Here is how you get started.‌

    Starter issues‌

    To find K3ai issues that make good entry points:‌

    • Start with issues labeled good first issue.

    • For issues that require deeper knowledge of one or more technical aspects,

      look at issues labeled help wanted.

    • Examine the issues in any of the

      ​.

    ‌Owners files and PR workflow‌

    Our goal is for the PR workflow to become nearly identical to Kubernetes'. Most of these instructions are a modified version of Kubernetes' guides.

    Overview of OWNERS files

    Nov. 2020: We are not yet at the point where we use OWNERS and/or REVIEWERS, but we plan ahead, so the text below describes the intended future workflow.

    OWNERS files are used to designate responsibility for different parts of the K3ai codebase. They assign the reviewer and approver roles used in our two-phase code review process.

    The velocity of a project that uses code review is limited by the number of people capable of reviewing code. The quality of a person's code review is limited by their familiarity with the code under review. Our goal is to address both of these concerns through the prudent use and maintenance of OWNERS files.

    OWNERS‌

    Each directory that contains a unit of independent code or content may also contain an OWNERS file. This file applies to everything within the directory, including the OWNERS file itself, sibling files, and child directories.‌

    OWNERS files are in YAML format and support the following keys:‌

    • approvers: a list of GitHub usernames or aliases that can /approve a PR

    • labels: a list of GitHub labels to automatically apply to a PR

    • options: a map of options for how to interpret this OWNERS file, currently only one:

    ‌All users are expected to be assignable. In GitHub terms, this means they are either collaborators of the repo, or members of the organization to which the repo belongs.‌

    A typical OWNERS file looks like:

    ‌OWNERS_ALIASES‌

    Each repo may contain at its root an OWNERS_ALIASES file.

    OWNERS_ALIASES files are in YAML format and support the following keys:

    • aliases: a mapping of alias name to a list of GitHub usernames

    ‌We use aliases for groups instead of GitHub Teams, because changes to GitHub Teams are not publicly auditable.‌

    A sample OWNERS_ALIASES file looks like:

    ‌GitHub usernames and aliases listed in OWNERS files are case-insensitive.‌

    The code review process

    ‌The author submits a PR

    • [FUTURE]Phase 0: Automation suggests reviewers and approvers for the PR

      • Determine the set of OWNERS files nearest to the code being changed

      • Choose at least two suggested reviewers, trying to find a unique reviewer for every leaf

    ‌

    Quirks of the process

    ‌There are a number of behaviors we've observed that while possible are discouraged, as they go against the intent of this review process. Some of these could be prevented in the future, but this is the state of today.‌

    • An approver's /lgtm is simultaneously interpreted as an /approve

      • While a convenient shortcut for some, it can be surprising that the same command is interpreted

        in one of two ways depending on who the commenter is

    ‌

    Automation using OWNERS files‌

    ​​

    ‌Prow receives events from GitHub, and reacts to them. It is effectively stateless. The following pieces of prow are used to implement the code review process above.‌

    • ​​

      • per-repo configuration:

        • labels: list of labels required to be present for merge (eg:

    ‌

    Maintaining OWNERS files

    ‌OWNERS files should be regularly maintained.‌

    We encourage people to self-nominate or self-remove from OWNERS files via PRs. Ideally, in the future we could use metrics-driven automation to assist in this process.

    We should strive to:‌

    • grow the number of OWNERS files

    • add new people to OWNERS files

    • ensure OWNERS files only contain org members and repo collaborators

    • ensure OWNERS files only contain people who are actively contributing to or reviewing the code they own

    ‌

    Bad examples of OWNERS usage:‌

    • directories that lack OWNERS files, resulting in too many hitting root OWNERS

    • OWNERS files that have a single person as both approver and reviewer

    • OWNERS files that haven't been touched in over 6 months

    • OWNERS files that have non-collaborators present

    ‌Good examples of OWNERS usage:‌

    • there are more reviewers than approvers

    • the approvers are not i

    K3ai (keɪ3ai)

    K3ai is a lightweight infrastructure-in-a-box specifically built to install and configure AI tools and platforms to quickly experiment and/or run in production over edge devices.

    Ready to experiment?

    All you have to do is download the binary for your operating system, move it to your path (if you like easy things), and use it.

    Linux (including Microsoft WSL)

    Once downloaded, untar the file and move it to your path.

    Windows

    Once downloaded, unzip the file and move it to your path, or execute it from a folder of your choice (e.g. k3ai.exe -h).

    Mac

    Once downloaded, untar the file and move it to your path.

    Arm64

    Once downloaded, untar the file and move it to your path.

    Alternative method for Linux

    If for any reason it fails, go straight to https://github.com/kf5i/k3ai-core/releases and download the binary. Place it in your path and that's it.

    or use the following

    Looking for more interaction? Join our Slack channel.

    What we do support:

    • Windows

    • Linux

    • Mac

    • ARM

    NOTE: Unfortunately not all plugins work with ARM. We will take care of this and make a way to let you know before installing them

    Components of K3ai

    Currently, we install the following components (the list is changing and growing):

    • Kubernetes based on K3s from Rancher:

    • Kubernetes based on K0s from Mirantis: https://k0sproject.io

    • Kubernetes KinD:

    • Kubeflow pipelines:

    Hello-Earth

    Okay, if everything worked as expected you're now a (not yet completely) happy owner. I say not yet completely happy because there isn't any AI tool there yet.

    So, let's do a recap: the cluster is up and running, but you probably don't know:

    1. How to interact with it

    2. How to install things on it

    First things first. Do you remember that utility named k9s we asked you to download and install? Okay, now just type k9s in your terminal.

    Hello-Home

    Hello

    Let's check if we got everything ready:

    NVIDIA gpu

    Quick Start Guide

    To install a GPU-enabled cluster there are a few mandatory steps to prepare in advance.

    Please follow this guide from NVIDIA to install the pre-requisites:

    https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#install-guide

    Once you have completed the prerequisites you may install everything with the following command:

    k3ai apply nvidia-gpu

    MPI operator (WIP)

    The MPI Operator makes it easy to run allreduce-style distributed training on Kubernetes. Please check out this blog post for an introduction to MPI Operator and its industry adoption - https://www.kubeflow.org/docs/components/training/mpi/

    To install the MPI operator simply type:

    k3ai apply mpi-op

    If a week passes without receiving an invitation reach out on k3ai#community​
  • no_parent_owners: defaults to false if not present; if true, exclude parent OWNERS files.

    Allows the use case where a/deep/nested/OWNERS file prevents a/OWNERS file from having any

    effect on a/deep/nested/bit/of/code

  • reviewers: a list of GitHub usernames or aliases that are good candidates to /lgtm a PR
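The way OWNERS files combine (nearest directory first, parents included unless `no_parent_owners` is set, aliases expanded, names compared case-insensitively) can be sketched in a few lines of Python. This is an illustration of the rules described in this document, not the code Prow runs. The function names and the in-memory representation (a dict from directory path to parsed OWNERS content, with "" standing for the repo root) are assumptions made for the example.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Set


def applicable_owners(changed_file: str, owners_files: Dict[str, dict]) -> List[str]:
    """Directories whose OWNERS file applies to changed_file, nearest first."""
    result: List[str] = []
    directory = PurePosixPath(changed_file).parent
    while True:
        key = "" if str(directory) == "." else str(directory)
        owners = owners_files.get(key)
        if owners is not None:
            result.append(key)
            # no_parent_owners stops the walk: parent OWNERS files are excluded.
            if owners.get("options", {}).get("no_parent_owners", False):
                break
        if key == "":
            break
        directory = directory.parent
    return result


def approvers_for(changed_file: str, owners_files: Dict[str, dict],
                  aliases: Dict[str, List[str]]) -> Set[str]:
    """Approvers for a file, with aliases expanded and names lower-cased."""
    alias_map = {name.lower(): users for name, users in aliases.items()}
    names: Set[str] = set()
    for key in applicable_owners(changed_file, owners_files):
        for entry in owners_files[key].get("approvers", []):
            expanded = alias_map.get(entry.lower(), [entry])
            names.update(user.lower() for user in expanded)
    return names
```

With an OWNERS file at `a/deep/nested` that sets `no_parent_owners: true`, a change to `a/deep/nested/bit/of/code` only picks up approvers from that file, while a change elsewhere under `a/` also inherits the root approvers.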

  • OWNERS file, and request their reviews on the PR

  • Choose suggested approvers, one from each OWNERS file, and list them in a comment on the PR

  • Phase 1: Humans review the PR

    • Reviewers look for general code quality, correctness, sane software engineering, style, etc.

    • Anyone in the organization can act as a reviewer with the exception of the individual who

      opened the PR

    • If the code changes look good to them, a reviewer types /lgtm in a PR comment or review;

      if they change their mind, they /lgtm cancel

    • [FUTURE]Once a reviewer has /lgtm'ed, ​

      () applies an lgtm label to the PR

  • Phase 2: Humans approve the PR

    • The PR author /assign's all suggested approvers to the PR, and optionally notifies

      them (eg: "pinging @foo for approval")

    • Only people listed in the relevant OWNERS files, either directly or through an alias, can act

      as approvers, including the individual who opened the PR

    • Approvers look for holistic acceptance criteria, including dependencies with other features,

      forwards/backwards compatibility, API and flag definitions, etc

    • If the code changes look good to them, an approver types /approve in a PR comment or

      review; if they change their mind, they /approve cancel

    • prow (@k8s-ci-robot) updates its comment in the PR to indicate which approvers still need to approve

    • Once all approvers (one from each of the previously identified OWNERS files) have approved, prow (@k8s-ci-robot) applies an approved label

  • Phase 3: Automation merges the PR:

    • If all of the following are true:

      • All required labels are present (eg: lgtm, approved)

      • No blocking labels are present (eg: there is no do-not-merge/hold, needs-rebase)

    • And if any of the following are true:

      • there are no presubmit prow jobs configured for this repo

      • there are presubmit prow jobs configured for this repo, and they all pass after automatically

        being re-run one last time

    • Then the PR will automatically be merged

  • Instead, explicitly write out /lgtm and /approve to help observers, or save the /lgtm for

    a reviewer

  • This goes against the idea of having at least two sets of eyes on a PR, and may be a sign that there are too few reviewers (who aren't also approvers)

  • Technically, anyone who is a member of the K3ai GitHub organization can drive-by /lgtm a

    PR

    • Drive-by reviews from non-members are encouraged as a way of demonstrating experience and

      intent to become a collaborator or reviewer

    • Drive-by /lgtm's from members may be a sign that our OWNERS files are too small, or that the

      existing reviewers are too unresponsive

    • This goes against the idea of specifying reviewers in the first place, to ensure that the author is getting actionable feedback from people knowledgeable about the code

  • Reviewers and approvers are unresponsive

    • This causes a lot of frustration for authors who often have little visibility into why their

      PR is being ignored

    • Many reviewers and approvers are so overloaded by GitHub notifications that @mention'ing

      is unlikely to get a quick response

    • If an author /assign's a PR, reviewers and approvers will be made aware of it on their PR dashboard

    • An author can work around this by manually reading the relevant OWNERS files,

      /unassign'ing unresponsive individuals, and /assign'ing others

    • This is a sign that our OWNERS files are stale; pruning the reviewers and approvers lists

      would help with this

    • It is the PR author's responsibility to drive a PR to resolution. This means that if the PR reviewers are unresponsive, the author should escalate as noted below

      • e.g. ping reviewers in a timely manner to get it reviewed

      • If the reviewers don't respond, look at the OWNERS file in root and ping the approvers listed there

  • Authors are unresponsive

    • This costs a tremendous amount of attention as context for an individual PR is lost over time

    • This hurts the project in general as its general noise level increases over time

    • Instead, close PR's that are untouched after too long (we currently have a bot do this after 30

      days)

  • lgtm
    )
  • missingLabels: list of labels required to be missing for merge (eg: do-not-merge/hold)

  • reviewApprovedRequired: defaults to false; when true, require that there must be at least one approved pull request review present for merge

  • merge_method: defaults to merge; when squash or rebase, use that merge method instead

    when clicking a PR's merge button

  • merges PR's once they meet the appropriate criteria as configured above

  • if there are any presubmit prow jobs for the repo the PR is against, they will be re-run one

    final time just prior to merge

  • ​plugin: assign​

    • assigns GitHub users in response to /assign comments on a PR

    • unassigns GitHub users in response to /unassign comments on a PR

  • ​plugin: approve​

    • per-repo configuration:

      • issue_required: defaults to false; when true, require that the PR description link to

        an issue, or that at least one approver issues a /approve no-issue

      • implicit_self_approve: defaults to false; when true, if the PR author is in relevant

        OWNERS files, act as if they have implicitly /approve'd

    • adds the approved label once an approver for each of the required

      OWNERS files has /approve'd

    • comments as required OWNERS files are satisfied

    • removes outdated approval status comments

  • ​plugin: blunderbuss​

    • determines reviewers and requests their reviews on PR's

  • ​plugin: lgtm​

    • adds the lgtm label when a reviewer comments /lgtm on a PR

    • the PR author may not /lgtm their own PR

  • ​pkg: k8s.io/test-infra/prow/repoowners​

    • parses OWNERS and OWNERS_ALIAS files

    • if the no_parent_owners option is encountered, parent owners are excluded from having

      any influence over files adjacent to or underneath of the current OWNERS file

  • remove inactive people from OWNERS files


  • Argo Workflows: https://github.com/argoproj/argo

  • H2O Community: https://h2o.ai

  • Kubeflow: https://www.kubeflow.org/ - (coming soon)

  • NVIDIA GPU support: https://docs.nvidia.com/datacenter/cloud-native/index.html

  • NVIDIA Triton inference server: https://github.com/triton-inference-server/server/tree/master/deploy/single_server (coming soon)

  • Tensorflow Serving (https://www.tensorflow.org/tfx/serving/serving_kubernetes):

    • ResNet

    • Mnist (coming soon)

  • and many many others...

  • https://github.com/kf5i/k3ai-core/releases
    here
    https://k3s.io/
    https://kind.sigs.k8s.io/
    https://github.com/kubeflow/pipelines
    and you'll see your Kubernetes cluster appear. To close it down, press CTRL+C.

    If k3ai printed something like export=/path/path/file, copy and paste that line into your terminal before typing K9s. It tells your laptop how to reach your freshly installed cluster.
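
    As a rough illustration only: assuming the printed line sets the standard KUBECONFIG environment variable that kubectl and k9s read (both the variable name and the path below are assumptions, not k3ai's actual output), the step could look like this:

```shell
# Hypothetical path: replace it with the one k3ai actually printed.
export KUBECONFIG="$HOME/.k3ai/kubeconfig"

# Confirm the variable is set in the current shell before launching k9s.
echo "KUBECONFIG is set to: $KUBECONFIG"
```

    An export only lasts for the current terminal session, so it has to be repeated in every new terminal.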

    Hello Notebooks

    In order to start playing with AI you need a workspace right? So the first way to get one is using the popular Jupyter Notebooks.

    The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. - https://jupyter.org/

    K3ai follows a simple rule for getting things done:

    everything needs to be done in no more than 3 steps

    That's why we developed the plugins and group-plugins concept, to take shortcuts and get the job done.

    So let's first take a look at the options we have. Type the following in your terminal:

    You should get a result similar to the one below

    That's the list of the current plugins we got, each one represents a single application. Now let's try a different command:

    The result will be different (something similar to):

    What we asked was: "Hey k3ai, give me the list of the group plugins, please". What is a group plugin? It's a combination of plugins. For example, take our Jupyter Notebook: we want the notebook itself, but we also want to publish it through an ingress (traefik) so that we can reach the application immediately.

    Kubernetes Ingress builds on top of Kubernetes Services to provide load balancing at the application layer, mapping HTTP and HTTPS requests with particular domains or URLs to Kubernetes services. Ingress can also be used to terminate SSL / TLS before load balancing to the service.

    NOTE: Not all group plugins work with every cluster, because not all clusters deploy the same configuration. To make things easier, use the table below as a reference.

    Ingress    Where it works out of the box
    Traefik    K3s, K3d
    Calico     K0s

    Earth call Notebook

    Okay, we are ready, let's go for the notebook. Type:

    Or, if you also want the ingress (as in K3s/K3d):

    Once everything has been deployed the notebook is reachable at http://<YOUR CLUSTER IP>:8888

    Note: if you did not use the ingress, you first have to expose the port. Simply type the following in your terminal to reach the notebook:

    Don't close the terminal. Open your browser and point it to http://localhost:8888 or http://<YOUR CLUSTER IP>:8888

    Note: We do not use authentication for our notebook because we expect you to use it for experimentation. It's a standard configuration, though, so if you need to change it, head to the How to Build Your First Plugin section to learn how.

    If everything went right you should see something similar to this

    Jupyter Notebook main page
    Passion for AI in general
  • Okay, if we have everything, let's start. We will use k3ai-cli to do everything, so you don't really have to learn anything other than k3ai. The diagram below shows how the various k3ai-cli commands will drive you through the Hello* guides.

    k3ai-cli various command and their use

    First Step

    The first step is to learn how to install the cluster (unless you already have one). K3ai supports various configurations:

    • Local deployments

    • Cloud deployments

    We make use of a configuration file to drive the various installation steps so, while the main goal of K3ai is not to deploy Kubernetes clusters we aim to make the life of our users as simple as possible.

    Note: are you an expert in automation and cluster deployment (K8s)? Help us and add some nice tooling to K3ai.

    K3ai currently supports the following local clusters:

    • Rancher K3s - https://rancher.com/docs/k3s/latest/en/

    • Mirantis K0s - https://k0sproject.io/

    • KinD - https://kind.sigs.k8s.io/

    • Rancher K3d - https://k3d.io/

    On the cloud side we do offer support for:

    • Civo Cloud - https://www.civo.com/kube100

    • AWS - (Work In Progress)

    This guide will use the local installation.

    Home

    Open a terminal window and simply type the following:

    What will happen is the following:

    1. If it does not exist, a folder named .k3ai will be created under your home directory (e.g. on Linux, /home/yourusername/). Inside this directory we will download a sample config.yaml file.

    2. The config.yaml has a default installation cluster: KinD that requires docker to be installed.

      1. If you don't have docker installed at this point you have to follow this guide

      2. If you don't want to use Kind just go to step 3

    3. If you got docker installed we will deploy Kind automatically and you're ready to move to Hello-Earth. If you don't want kind and/or don't want to install docker keep reading.
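
    Since the default KinD flavor requires docker, a quick check before running k3ai init can save a failed attempt. The snippet below is only a sketch built from standard shell tools; it is not part of k3ai, and the variable names are introduced here for illustration.

```shell
# Check whether the docker binary is on the PATH (KinD needs it).
if command -v docker >/dev/null 2>&1; then
  DOCKER_STATUS="found"
else
  DOCKER_STATUS="missing"
fi
echo "docker: $DOCKER_STATUS"

# Check whether the .k3ai folder already holds the sample config.yaml.
if [ -f "$HOME/.k3ai/config.yaml" ]; then
  CONFIG_STATUS="present"
else
  CONFIG_STATUS="absent"
fi
echo "config.yaml: $CONFIG_STATUS"
```

    If docker is reported as missing, either install it first or pick another cluster flavor as described in the next section.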

    Home Rebuild

    An alternative way to install a cluster, and to choose your favorite flavor, is to use a slightly different command.

    Let's go into more detail. Here's the full list of options:

    • k3ai init --local k3s

    • k3ai init --local k0s

    • k3ai init --local kind

    • k3ai init --local k3d

    In case of Cloud:

    • k3ai init --cloud civo

    Now to sum it up here's a video that shows how it works.

    Home Rebuild - Foundation

    As we mentioned at the beginning of this guide, k3ai supports a config file as well. The config file looks like the one below and is located at <home user folder>/.k3ai/config.yaml, but k3ai also supports a custom location through k3ai init --config <yourpath to config file>

    For cloud there are a couple of extra configs, like the ones below

    Done! Your Hello Home is ready. You may proceed to the Hello-Earth section

    Kubeflow Pipelines

    Kubeflow Pipelines is a platform for building and deploying portable, scalable machine learning (ML) workflows based on Docker containers.

    Quick Start Guide

    We offer two flavors: Kubeflow Pipelines based on Argo Workflows and Kubeflow Pipelines based on the TektonCD engine

    #To install KF based on Argo Workflows
    k3ai apply kubeflow-pipelines
    
    #To install KF based on TektonCD
    k3ai apply kf-pipelines-tekton

    What is Kubeflow Pipelines?

    The Kubeflow Pipelines platform consists of:

    • A user interface (UI) for managing and tracking experiments, jobs, and runs.

    • An engine for scheduling multi-step ML workflows.

    • An SDK for defining and manipulating pipelines and components.

    • Notebooks for interacting with the system using an SDK.

    The following are the goals of Kubeflow Pipelines:

    • End-to-end orchestration: enabling and simplifying the orchestration of machine learning pipelines.

    • Easy experimentation: making it easy for you to try numerous ideas and techniques and manage your various trials/experiments.

    • Easy re-use: enabling you to re-use components and pipelines to quickly create end-to-end solutions without having to rebuild each time.

    Learn more on the Kubeflow website: https://www.kubeflow.org/docs/pipelines/

    H2O

    H2O is an open source, in-memory, distributed, fast, and scalable machine learning and predictive analytics platform that allows you to build machine learning models on big data and provides easy productionalization of those models in an enterprise environment. - http://docs.h2o.ai/h2o/latest-stable/h2o-docs/welcome.html

    In order to install H2O simply type:

    This will deploy a single node instance of the h2o platform.

    You may monitor the status of the pod with:

    To access it type:

    Point your browser to either localhost or your cluster IP (e.g. http://localhost:54321). You should see something like this:

    approvers:  
        - alice  
        - bob     
    # this is a comment
    reviewers:  
        - alice  
        - carol   
    # this is another comment  
        - sig-foo # this is an alias
    aliases:  
        sig-foo:    
            - david    
            - erin  
        sig-bar:    
            - bob    
            - frank
    curl -sfL "https://get.k3ai.in" -o k3ai.tar.gz
    tar -xvzf k3ai.tar.gz \
    && chmod +x ./k3ai \
    && sudo mv ./k3ai /usr/local/bin
    Invoke-WebRequest -Uri "https://get-win.k3ai.in" -OutFile k3ai.zip
     Expand-Archive -Path .\k3ai.zip
    curl -sfL "https://get-mac.k3ai.in" -o k3ai.tar.gz
    tar -xvzf k3ai.tar.gz \
    && chmod +x ./k3ai \
    && sudo mv ./k3ai /usr/local/bin
    curl -sfL "https://get-arm.k3ai.in" -o k3ai.tar.gz
    tar -xvzf k3ai.tar.gz \
    && chmod +x ./k3ai \
    && sudo mv ./k3ai /usr/local/bin
    #Set a variable to grab latest version
    Version=$(curl -s "https://api.github.com/repos/kf5i/k3ai-core/releases/latest" | awk -F '"' '/tag_name/{print $4}' | cut -c 2-6) 
    # get the binaries
    wget https://github.com/kf5i/k3ai-core/releases/download/v$Version/k3ai-core_${Version}_linux_amd64.tar.gz
    k3ai list
    k3ai list
    
    Name                           Description
    argo-workflow                  Argo Workflow plugin
    h2o-single                     H2O.ai
    jupyter-minimal                Minimal Jupyter Configuration
    katib                          Kubeflow Katib
    kf-pipelines-tekton            Kubeflow Pipelines based on Tekton
    kubeflow-pipelines             Kubeflow Pipelines platform agnostic
    mpi-op                         MPI-Operator
    nvidia-gpu                     Nvidia GPU support 
    pytorch-op                     Pytorch op
    tekton                         Kubeflow Pipelines based on Tekton
    tensorflow-op                  Kubeflow Tensorflow
    ...
    ...
    k3ai list -g
    k3ai list -g
    
    Name                           Description
    argo-workflow-traefik          Expose Argo Workflow using traefik HTTP
    jupyter-minimal-traefik        Expose Jupyter Notebooks using traefik HTTP
    katib-traefik                  Expose Katib using traefik HTTP
    kf-pipelines-tekton-traefik    Expose Kubeflow pipelines with Tekton backend using traefik HTTP
    kubeflow-pipelines-traefik     Expose Kubeflow Workflow using traefik HTTP
    pytorch-op-traefik             Expose Pytorch using traefik HTTP
    tensorflow-op-traefik          Expose Tensorflow using traefik HTTP
    ...
    ...
    k3ai apply jupyter-minimal
    k3ai apply -g jupyter-minimal-traefik
    kubectl port-forward -n jupyter deployment/jupyter-minimal 8888:8888
    k3ai init
    k3ai init --local <YourClusterFlavor>
    # The first two (2) lines are used to indicate what the section does. 
    # We use them to group stuff, if you need multi-cluster just copy,paste and 
    # rewrite everything after the first 2 lines
    
    kind: cluster 
    targetCustomizations: 
    # This is what you change typically: name is the k3ai internal instance name,
    # enabled tells k3ai whether you want to install it or not,
    # type means what needs to be installed,
    # config is the cluster flavor's own config file (kubeconfig), if it has one
    - name: localK3s 
      enabled: false # Set it to True to enable the section
      type: k3s 
      config: "/etc/rancher/k3s/k3s.yaml" 
      clusterName: demo-wsl-k3s # This is the name of your cluster 
      clusterDeployment: local 
      # clusterStart is helpful when you install on things like WSL that do not have
      # services etc..
      clusterStart: "sudo bash -ic 'k3s server --write-kubeconfig-mode 644 ...'"
      spec:
      # If the OS is not needed may be removed so the three below
      # are mutually exclusive, if not needed set them to null or remove it
        wsl: "https://github.com/rancher/k3s/releases/download/v1.19.4%2Bk3s1/k3s"
        mac: 
        linux: "https://get.k3s.io | K3S_KUBECONFIG_MODE=644 sh -s -"
        windows: 
        # If you want to add automatically some plugins you may use the group below
      plugins: 
      - repo: #where is your plugin located?
        name: #how it is called?
      - repo: 
        name: 
    - name: remoteK3s 
      enabled: false
      type: k3s
      config: remote #currently we do not copy and merge the kubeconfig
      clusterName: demo-cluster-remote 
      clusterDeployment: cloud #change from local to cloud
      clusterStart: 
      spec:
        wsl: 
        mac: 
        linux:
        windows:
        # Cloud section
        cloudType: civo
        cloudNodes: 1
        cloudSecretPath: $HOME/.k3ai/secret.txt
        # ---end----
      plugins: 
      - repo: 
        name: 
      - repo: 
        name: 
    prow
    @k8s-ci-robot
    prow
    @k8s-ci-robot
    prow
    @k8s-ci-robot
    PR dashboard
    https://www.kubeflow.org/docs/pipelines/
    k3ai apply h2o-single
    kubectl get pod -n h2o
    
    #Output should be similar to this
    NAME                 READY   STATUS    RESTARTS   AGE
    h2o-stateful-set-0   1/1     Running   0          2m19s
     kubectl port-forward -n h2o svc/h2o-service 54321:54321
    http://docs.h2o.ai/h2o/latest-stable/h2o-docs/welcome.html
    here

    Jupyter Notebook

    Project Jupyter exists to develop open-source software, open-standards, and services for interactive computing across dozens of programming languages. - https://jupyter.org/

    We support the current list of Jupyter Stacks as indicated in:

    https://jupyter-docker-stacks.readthedocs.io/

    In order to run Jupyter Notebooks just run the following command:

    k3ai apply jupyter-minimal

    The following notebooks plugins are available:

    • --plugin_jupyter-minimal: (jupyter-minimal) for more information please see here

    We are currently working on:

    • --plugin_jupyter-r: (jupyter-r-notebook)

    • --plugin_jupyter-scipy: (jupyter-scipy-notebook)

    • --plugin_jupyter-tf: (jupyter-tensorflow-notebook)

    Remove k3ai (Work In Progress)

    We are working on this.

    In the meantime just remove the cli from:

    • Linux/WSL: /usr/local/bin

    • Windows: C:\Windows\System32

    • Mac: ideally you placed it in /usr/local/bin if not delete it from where you placed it.
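
    On Linux, WSL, or Mac, the manual removal described above can be sketched as follows. It assumes the binary sits in /usr/local/bin, as in the install instructions; K3AI_BIN is a variable name introduced here for illustration only.

```shell
# K3AI_BIN is a hypothetical helper variable; override it if you
# placed the binary somewhere else.
K3AI_BIN="${K3AI_BIN:-/usr/local/bin/k3ai}"

if [ -f "$K3AI_BIN" ]; then
  # /usr/local/bin is usually owned by root, hence sudo.
  sudo rm -f "$K3AI_BIN"
  echo "removed $K3AI_BIN"
else
  echo "nothing to remove at $K3AI_BIN"
fi
```

    This removes only the cli binary; anything already deployed to your cluster stays in place.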

    Work in Progress (do you like it? Let us know with an issue)

    In order to uninstall k3ai, we provide a simple command to remove all components. All you have to do is launch the following command:

    Inclusivity

    K3ai Inclusivity

    As per our Code of Conduct, the inclusivity document is derived directly from the Kubeflow inclusivity document.

    This document is a guide to inclusivity for leaders involved in the K3ai project. It describes conscious efforts and behavior patterns that contribute to an inclusive environment where K3ai members feel safe to collaborate.

    The K3ai community is a global group of contributors with a wide variety of backgrounds. As such, leaders are expected to take purposeful steps to ensure sustainability of the collaborative space. This is essential for project health and future growth. We expect all community members, and especially leaders, to practice and grow in the areas covered in this document.

  • --plugin_jupyter-datascience: (jupyter-datascience-notebook) for more information please see here

  • --plugin_jupyter-pyspark: (jupyter-pyspark-notebook) for more information please see here

  • --plugin_jupyter-allspark: (jupyter-all-spark-notebook) for more information please see here

    Carve out space

    Carve out well-defined spaces for contribution. Encourage members of the community to engage in these spaces. Reach out to specific individuals to let them know you think they would be a good fit.

    Get out of the way

    Get out of the way when someone steps up. Give them ownership and set expectations for delivery and accountability. Follow up on those expectations. If you have concerns or need more information, increase the frequency of communication rather than taking over or overstepping.

    Make opportunities

    Seek out situations that provide opportunities for members of the community. Examples of this include connecting event organizers with potential speakers, introducing leaders to individual contributors, and inviting others to collaborate. Consciously drive the creation of opportunities in areas that community members want to grow in.

    Ask members where they want to grow

    Find out which areas community members want to grow in. This could be in the form of 1:1 conversations, small groups, or weekly meetings. Ask how you can help.

    Rather than making assumptions and assigning tasks, ask people where they want to contribute and help them figure out how to make the most impact. Stretch them just enough that they can see progress and sustained growth.

    Empower members to say no

    Make it clear that members are empowered to turn down opportunities. Encourage them to define their own boundaries and give them space to assert those boundaries. Communicate that it is their responsibility to balance their commitments and that they will be supported in doing so. Before presenting a specific opportunity to an individual, provide a disclaimer that it is perfectly acceptable to say no.

    Encourage members to ask for what they want

    Encourage members of the community to make requests. That could be for improvements to the product, community, or their own personal growth. Respond to these requests with kindness and fairness.

    Ask for volunteers and make time for the community to bring up topics they care about.

    Explicitly call out challenges

    Name specific challenges that affect members of the community and state your position on how to resolve them. Make statements such as, "I understand how difficult it must be to X," and "I wish you didn't have to face such blatant challenges doing Y." Offer advice on how to deal with them or just be there to commiserate.

    Simply acknowledging the struggle is an act of empathy that makes it easier to face these challenges. This is a means of lightening the load on underrepresented groups by not requiring them to shoulder these burdens silently.

    Give clear, specific, and actionable feedback

    Be proactive about providing feedback, but ask first and be kind. Include concrete steps that can be taken to improve the outcome and steer clear of criticism involving something that cannot be reasonably changed.

    Treat everyone with respect

    Set an example and uphold that standard. Do not tolerate double standards or casual deprecation, even in jest. Ensure that community members understand the group is open to everyone.

    Follow up on complaints

    When you observe a code of conduct violation or become aware of one, follow through on enforcing community standards. Do this with care, showing respect and kindness for everyone involved. These instances have a broader impact than just the involved parties, since they set downstream expectations for the entire community.

    This is a responsibility that the Kubeflow project does not take lightly, since it directly impacts the ability of members to feel safe in the community.

    Indicators of success

    It can be difficult to assess whether these efforts are effective. In many ways, success can be invisible since it involves the prevention of conflict. A few indicators are:

    • Diverse membership across various dimensions (geographic, corporate, level of experience, etc.)

    • Presence of members from frequently marginalized groups

    • Continued engagement by long-term members

    • Sentiment within the community that ideas are heard and contributions valued

    • Accountability of leaders by members

    Attribution

    The origins of this document are an enumeration of efforts by Kubeflow project cofounder David Aronchick. This was not a solo effort and included support from Jeremy Lewi, Michelle Casbon, Edd Wilder-James, and other members of the Kubeflow team.

    here
    k3ai-cli delete
    issue

    Config File Reference

    Starting from K3ai 0.2.0 we introduced a configuration file to allow a user to deploy various flavors of Kubernetes both locally and remotely.

    The config file is typically saved in <yourhomedir>/.k3ai but could be moved around and/or hosted on other locations. We currently do not support remote config files.

    A single config file may host multiple configurations, and can therefore deploy multiple clusters at the same time.

    Generic Section

    The kind definition at the beginning of the config file indicates that everything below it relates to the infrastructure deployment. We did this so that we can later split the config behavior and possibly call other config files.

    Common Sections

    • name: the instance name, used as an internal reference by K3ai. It is not currently used, so it acts as a placeholder for now

    • enabled: if set to true, the section will be used and the cluster will be deployed

    • type: represents the cluster to be installed: k3s, k3d, k0s, kind

    Rancher K3s Config Specifics

    Rancher K3d Config Specifics

    Mirantis K0s Config Specifics

    Do not copy the above; it has been truncated to make it more readable.

    KinD Config Specifics

    Remote Clusters Specifics

    We currently support only Civo Cloud. Notice that cloudSecretPath is a placeholder: we are going to add this feature, but for the time being you'll need to pass the key through the terminal directly.

    How to Build Your First Plugin

    Building plugins for k3ai is very simple. This guide will walk you through the steps required to build a plugin and contribute to the project.

    The first step is to learn the structure of the plugins repository: https://github.com/kf5i/k3ai-plugins

    The repo is structured in a very simple way.

    • core

      • groups

    kind: cluster
    targetCustomizations:
  • clusterName: the name of the cluster (if applicable); it's useful when deploying multiple clusters of the same type

  • clusterDeployment: local or cloud. For the cloud specs see below.

  • spec: for each deployment there are various options and binaries, so depending on where you're going to install, we take care of the right version and location. This is also useful if you want to test a specific version.

  • plugins: repo is the URL where the plugins are hosted; if you are using the public ones you may leave it empty. name is the name of the plugin as it appears in k3ai-cli list. Currently we do not support groups in the config file.

  • plugins

  • common

  • community

  • Core is the root folder; it includes plugins and groups

    Plugins are the actual application to be deployed, for each plugin folder there is a plugin.yaml file.

    Groups are a combination of various plugins to be installed altogether

    Under common you'll find all the manifests or files needed by more than one plugin. Those are a sort of reusable component (e.g. traefik ingress definitions for plugins).

    k3ai supports custom repositories for plugins and groups so this means you may have your own instead of using our public ones.

    Let's create a local repo and deploy a "hello-world" plugin.

    First, we have to create the basic structure. Anywhere on your laptop, create a structure like this:

    • demo

      • core

        • groups

        • plugins

          • demo-plugin

            • plugin.yaml

      • common

        • demo-plugin

          • deployment.yaml
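
    The layout above can be created from a terminal in one go. This is a minimal sketch that assumes you want the demo folder in the current directory; the two files are created empty and filled in during the next steps.

```shell
# Create the folder tree for the local plugin repository.
mkdir -p demo/core/groups
mkdir -p demo/core/plugins/demo-plugin
mkdir -p demo/common/demo-plugin

# Create the two (still empty) files that the next steps fill in.
touch demo/core/plugins/demo-plugin/plugin.yaml
touch demo/common/demo-plugin/deployment.yaml

# Show the resulting tree.
find demo | sort
```

    Note that the plugin.yaml shown in this guide points to ./commons/demo-plugin/deployment.yaml, so make sure the folder name and the url value match.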

    Now let's open the plugin.yaml file and copy the below content in it.

    Save the file and open the deployment.yaml. Copy&Paste the following content.

    We are ready. Let's check the plugin list with:

    You should get something like this

    Great! Now let's apply the plugin to our environment

    Now let's check if the pod is running with

    We are ready so let's execute a command inside our plugin. Copy and Paste the below command into your terminal.

    If everything goes right you should see something like this

    Congratulations, you created your first plugin! To delete it, simply execute:

    https://github.com/kf5i/k3ai-plugins
    kind: cluster
    targetCustomizations:
    - name: localK3s #name of the cluster instance not the name of the cluster
      enabled: false
      type: k3s
      ...
      clusterName: demo-wsl-k3s 
      clusterDeployment: local
      
      spec:
        wsl: "https://github.com/rancher/k3s/releases/download/v1.19.4%2Bk3s1/k3s"
        mac: 
        linux: "https://get.k3s.io | K3S_KUBECONFIG_MODE=644 sh -s -"
        windows: 
      plugins: 
      - repo: 
        name: 
      - repo: 
        name: 
    ...
      type: k3s
        #default location of config file or your existing config file to copy
        config: "/etc/rancher/k3s/k3s.yaml" 
    ...
      clusterStart: "sudo bash -ic 'k3s server --write-kubeconfig-mode 644 > /dev/null 2>&1 &'"
    
     type: k3d
    ...
      clusterStart: "k3d cluster create"
      spec:
        wsl: "https://kind.sigs.k8s.io/dl/v0.9.0/kind-linux-amd64"
        mac: "https://kind.sigs.k8s.io/dl/v0.9.0/kind-darwin-amd64"
        linux: "https://kind.sigs.k8s.io/dl/v0.9.0/kind-linux-amd64"
        windows: "https://github.com/rancher/k3d/releases/download/v3.4.0-test.0/k3d-windows-amd64.exe"
    type: k0s
        #default location of config file or your existing config file to copy
        config: "${HOME}/.k3ai/kubeconfig" 
    ...
      clusterStart: "k0s default-config | tee ${HOME}/.k3ai/k0s.yaml && 
      sudo bash -ic 'k0s server -c ${HOME}/.k3ai/k0s.yaml --enable-worker > /dev/null 2>&1 &' &&
       sudo cat /var/lib/k0s/pki/admin.conf > $HOME/.k3ai/k0s-config"
      spec:
    type: kind
      config:  
    ...
      clusterStart: "kind create cluster"
      spec:
        wsl: "https://kind.sigs.k8s.io/dl/v0.9.0/kind-linux-amd64"
        mac: "https://kind.sigs.k8s.io/dl/v0.9.0/kind-darwin-amd64"
        linux: "https://kind.sigs.k8s.io/dl/v0.9.0/kind-linux-amd64"
        windows: "https://kind.sigs.k8s.io/dl/v0.9.0/kind-windows-amd64"
    enabled: false
      clusterDeployment: cloud
      clusterStart: 
      spec:
        wsl: 
        mac: 
        linux:
        windows:
        cloudType: civo
        cloudNodes: 1
        cloudSecretPath: $HOME/.k3ai/secret.txt
    plugin-name: demo-plugin
    plugin-description: Demo of a custom local plugin
    namespace: "default"
    yaml:
      - url: "./commons/demo-plugin/deployment.yaml"
        type: "file"
    apiVersion: v1
    kind: Pod
    metadata:
      name: shell-demo
    spec:
      volumes:
      - name: shared-data
        emptyDir: {}
      containers:
      - name: nginx
        image: nginx
        volumeMounts:
        - name: shared-data
          mountPath: /usr/share/nginx/html
      hostNetwork: true
      dnsPolicy: Default
    k3ai list --repo "<absolute path to root folder>/"
    
    #k3ai list --repo "/home/user/core/"
    k3ai list --repo "<absolute path to root folder>/"
    
    Name                           Description
    demo-plugin                    A simple demo of a local plugin
    k3ai apply --repo "<absolute path to root folder>/"
    kubectl get pod shell-demo
    
    #Output
    NAME         READY   STATUS    RESTARTS   AGE
    shell-demo   1/1     Running   0          3m9s
    kubectl exec --stdin --tty shell-demo -- /bin/bash -c "apt-get update > /dev/null && apt-get -y install boxes > /dev/null &&  echo 'Hey, this is K3ai! Thanks for use this.' | boxes -d peek"
    /*       _\|/_
             (o o)
     +----oOO-{_}-OOo------------------------+
     |Hey, this is K3ai! Thanks for use this.|
     +--------------------------------------*/
    k3ai delete --repo "<absolute path to root folder>/"

    Argo Workflows

    Quick Start Guide

    You only have to decide if you want CPU support:

    What is Argo Workflows?

    Argo Workflows is an open source container-native workflow engine for orchestrating parallel jobs on Kubernetes. Argo Workflows is implemented as a Kubernetes CRD.
    • Define workflows where each step in the workflow is a container.

    • Model multi-step workflows as a sequence of tasks or capture the dependencies between tasks using a graph (DAG).

    • Easily run compute intensive jobs for machine learning or data processing in a fraction of the time using Argo Workflows on Kubernetes.

    • Run CI/CD pipelines natively on Kubernetes without configuring complex software development products.

    Learn more on the Argo website: https://argoproj.github.io/projects/argo

    k3ai apply argo-workflow
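To make the "each step is a container" and DAG ideas above concrete, here is a minimal sketch of a two-step workflow. It is not taken from the k3ai plugin: the file name `hello-dag.yaml`, the template names, and the `alpine:3.12` image are illustrative assumptions.

```shell
# Write a two-step DAG workflow manifest to hello-dag.yaml (hypothetical file name).
# The quoted 'EOF' stops the shell from expanding the {{...}} Argo placeholders.
cat > hello-dag.yaml << 'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-dag-
spec:
  entrypoint: main
  templates:
  # The DAG template: "second" runs only after "first" has finished.
  - name: main
    dag:
      tasks:
      - name: first
        template: say
        arguments:
          parameters:
          - name: message
            value: "step one"
      - name: second
        dependencies: [first]
        template: say
        arguments:
          parameters:
          - name: message
            value: "step two"
  # Each step is a container; this one just echoes its input parameter.
  - name: say
    inputs:
      parameters:
      - name: message
    container:
      image: alpine:3.12
      command: [echo]
      args: ["{{inputs.parameters.message}}"]
EOF
```

To submit it, something like `kubectl create -n <namespace> -f hello-dag.yaml` should work, where `<namespace>` is whichever namespace the plugin installed Argo into (not stated here, so check your cluster). Note `kubectl create` rather than `kubectl apply`, because the manifest uses `generateName`.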

    Code of Conduct

    K3ai Code of Conduct

    This CoC is derived from the Kubeflow CoC.

    Our Pledge

    In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to make participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.

    Our Standards

    Examples of behavior that contributes to creating a positive environment include:

    • Using welcoming and inclusive language

    • Being respectful of differing viewpoints and experiences

    • Gracefully accepting constructive criticism

    • Focusing on what is best for the community

    • Showing empathy towards other community members

    Examples of unacceptable behavior by participants include:

    • The use of sexualized language or imagery and unwelcome sexual attention or advances

    • Trolling, insulting/derogatory comments, and personal or political attacks

    • Public or private harassment

    • Publishing others’ private information, such as a physical or electronic address, without explicit permission

    • Other conduct which could reasonably be considered inappropriate in a professional setting

    Our Responsibilities

    Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.

    Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.

    Scope

    This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.

    Conflict Resolution

    We do not believe that all conflict is bad; healthy debate and disagreement often yield positive results. However, it is never okay to be disrespectful or to engage in behavior that violates the project’s code of conduct.

    If you see someone violating the code of conduct, you are encouraged to address the behavior directly with those involved. Many issues can be resolved quickly and easily, and this gives people more control over the outcome of their dispute. If you are unable to resolve the matter for any reason, or if the behavior is threatening or harassing, report it. We are dedicated to providing an environment where participants feel welcome and safe.

    Attribution

    This Code of Conduct is adapted from the Contributor Covenant, version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html

    Tensorflow Operator

    Kubeflow Tensorflow-Job Training Operator

    TFJob provides a Kubernetes custom resource that makes it easy to run distributed or non-distributed TensorFlow jobs on Kubernetes.

    More on the Tensorflow Operator at https://github.com/kubeflow/tf-operator

    Quick Start

    All you have to run is:

    k3ai apply tensorflow-op

    Test your installation

    We present here a sample from the Tensorflow Operator repository at https://github.com/kubeflow/tf-operator

    Step 1

    We first need to add a persistent volume and a persistent volume claim (PVC). To do so, apply the two YAML manifests we need; copy and paste each command in order.

    Now we add the PVC.

    Note: Because we use local-path as the storage volume on a single-node cluster, we can't use ReadWriteMany; see the Rancher local-path provisioner issue at https://github.com/rancher/local-path-provisioner/issues/70#issuecomment-574390050

    Step 2

    Now we deploy the example

    You can observe the result of the example with

    It should output something similar to this (only part of the output is shown here)

    PyTorch Operator

    Kubeflow PyTorch-Job Training Operator

    PyTorch is a Python package that provides two high-level features:

    • Tensor computation (like NumPy) with strong GPU acceleration

    • Deep neural networks built on a tape-based autograd system

    You can reuse your favorite Python packages such as NumPy, SciPy, and Cython to extend PyTorch when needed. More information at https://github.com/kubeflow/pytorch-operator or the PyTorch site, https://pytorch.org/

    https://github.com/kubeflow/tf-operator
    https://github.com/rancher/local-path-provisioner/issues/70#issuecomment-574390050
    Quick Start

    As usual, let's deploy PyTorch with a single command

    Test Your PyTorch-Job installation

    We will use the MNIST example from the Kubeflow PyTorch-Job repo at https://github.com/kubeflow/pytorch-operator/tree/master/examples/mnist

    As usual, we want to avoid complexity, so we reworked the sample a bit to make it much easier.

    Step 1

    In the original example, a container needs to be built before running the sample. We merged the container commands directly into the YAML file, so it is now a one-click job.

    For CPU only

    If you have GPU enabled you may run it this way

    Step 2

    Check that the pods are deployed correctly with

    It should output something like this

    Step 3

    Check the logs of your training job

    You should observe an output similar to this (since we are using 1 Master and 1 worker in this case)

    https://github.com/kubeflow/pytorch-operator
    https://pytorch.org/
    kubectl apply -f - << EOF
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: tfevent-volume
      labels:
        type: local
        app: tfjob
    spec:
      capacity:
        storage: 10Gi
      storageClassName: local-path
      accessModes:
        - ReadWriteOnce
      hostPath:
        path: /tmp/data
    EOF
    kubectl apply -f - << EOF
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: tfevent-volume
      namespace: kubeflow 
      labels:
        type: local
        app: tfjob
    spec:
      accessModes:
        # ReadWriteOnce: local-path on a single node does not support ReadWriteMany
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
    EOF
    kubectl apply -f https://raw.githubusercontent.com/kubeflow/tf-operator/master/examples/v1/mnist_with_summaries/tf_job_mnist.yaml
    kubectl logs -l tf-job-name=mnist -n kubeflow --tail=-1
    ...
    Adding run metadata for 799
    Accuracy at step 800: 0.957
    Accuracy at step 810: 0.9698
    Accuracy at step 820: 0.9676
    Accuracy at step 830: 0.9676
    Accuracy at step 840: 0.9677
    Accuracy at step 850: 0.9673
    Accuracy at step 860: 0.9676
    Accuracy at step 870: 0.9654
    Accuracy at step 880: 0.9694
    Accuracy at step 890: 0.9708
    Adding run metadata for 899
    Accuracy at step 900: 0.9737
    Accuracy at step 910: 0.9708
    Accuracy at step 920: 0.9721
    Accuracy at step 930: 0.972
    Accuracy at step 940: 0.9639
    Accuracy at step 950: 0.966
    Accuracy at step 960: 0.9654
    Accuracy at step 970: 0.9683
    Accuracy at step 980: 0.9685
    Accuracy at step 990: 0.9666
    Adding run metadata for 999
    k3ai apply pytorch-op
    kubectl apply -f - << EOF
    apiVersion: "kubeflow.org/v1"
    kind: "PyTorchJob"
    metadata:
      name: "pytorch-dist-mnist-gloo"
      namespace: kubeflow
    spec:
      pytorchReplicaSpecs:
        Master:
          replicas: 1
          restartPolicy: OnFailure
          template:
            metadata:
              annotations:
                sidecar.istio.io/inject: "false"
            spec:
              containers:
                - name: pytorch
                  image: pytorch/pytorch:1.0-cuda10.0-cudnn7-runtime
                  command: ['sh','-c','pip install tensorboardX==1.6.0 && mkdir -p /opt/mnist/src && cd /opt/mnist/src && curl -O https://raw.githubusercontent.com/kubeflow/pytorch-operator/master/examples/mnist/mnist.py && chgrp -R 0 /opt/mnist && chmod -R g+rwX /opt/mnist && python /opt/mnist/src/mnist.py']
                  args: ["--backend", "gloo"]
    
        Worker:
          replicas: 1
          restartPolicy: OnFailure
          template:
            metadata:
              annotations:
                sidecar.istio.io/inject: "false"
            spec:
              containers:
                - name: pytorch
                  image: pytorch/pytorch:1.0-cuda10.0-cudnn7-runtime
                  command: ['sh','-c','pip install tensorboardX==1.6.0 && mkdir -p /opt/mnist/src && cd /opt/mnist/src && curl -O https://raw.githubusercontent.com/kubeflow/pytorch-operator/master/examples/mnist/mnist.py && chgrp -R 0 /opt/mnist && chmod -R g+rwX /opt/mnist && python /opt/mnist/src/mnist.py']
                  args: ["--backend", "gloo"]
    EOF
    kubectl apply -f - << EOF
    apiVersion: "kubeflow.org/v1"
    kind: "PyTorchJob"
    metadata:
      name: "pytorch-dist-mnist-gloo"
      namespace: kubeflow
    spec:
      pytorchReplicaSpecs:
        Master:
          replicas: 1
          restartPolicy: OnFailure
          template:
            metadata:
              annotations:
                sidecar.istio.io/inject: "false"
            spec:
              containers:
                - name: pytorch
                  image: pytorch/pytorch:1.0-cuda10.0-cudnn7-runtime
                  command: ['sh','-c','pip install tensorboardX==1.6.0 && mkdir -p /opt/mnist/src && cd /opt/mnist/src && curl -O https://raw.githubusercontent.com/kubeflow/pytorch-operator/master/examples/mnist/mnist.py && chgrp -R 0 /opt/mnist && chmod -R g+rwX /opt/mnist && python /opt/mnist/src/mnist.py']
                  args: ["--backend", "gloo"]
                  # Change the value of nvidia.com/gpu based on your configuration
                  resources:
                    limits:
                      nvidia.com/gpu: 1 
        Worker:
          replicas: 1
          restartPolicy: OnFailure
          template:
            metadata:
              annotations:
                sidecar.istio.io/inject: "false"
            spec:
              containers:
                - name: pytorch
                  image: pytorch/pytorch:1.0-cuda10.0-cudnn7-runtime
                  command: ['sh','-c','pip install tensorboardX==1.6.0 && mkdir -p /opt/mnist/src && cd /opt/mnist/src && curl -O https://raw.githubusercontent.com/kubeflow/pytorch-operator/master/examples/mnist/mnist.py && chgrp -R 0 /opt/mnist && chmod -R g+rwX /opt/mnist && python /opt/mnist/src/mnist.py']
                  args: ["--backend", "gloo"]
                  # Change the value of nvidia.com/gpu based on your configuration
                  resources:
                    limits:
                      nvidia.com/gpu: 1 
    EOF
    kubectl get pod -l pytorch-job-name=pytorch-dist-mnist-gloo -n kubeflow
    NAME                               READY   STATUS    RESTARTS   AGE
    pytorch-dist-mnist-gloo-master-0   1/1     Running   0          2m26s
    pytorch-dist-mnist-gloo-worker-0   1/1     Running   0          2m26s
     kubectl logs -l pytorch-job-name=pytorch-dist-mnist-gloo -n kubeflow
    Train Epoch: 1 [55680/60000 (93%)]      loss=0.0341
    Train Epoch: 1 [56320/60000 (94%)]      loss=0.0357
    Train Epoch: 1 [56960/60000 (95%)]      loss=0.0774
    Train Epoch: 1 [57600/60000 (96%)]      loss=0.1186
    Train Epoch: 1 [58240/60000 (97%)]      loss=0.1927
    Train Epoch: 1 [58880/60000 (98%)]      loss=0.2050
    Train Epoch: 1 [59520/60000 (99%)]      loss=0.0642
    
    accuracy=0.9660
    
    Train Epoch: 1 [55680/60000 (93%)]      loss=0.0341
    Train Epoch: 1 [56320/60000 (94%)]      loss=0.0357
    Train Epoch: 1 [56960/60000 (95%)]      loss=0.0774
    Train Epoch: 1 [57600/60000 (96%)]      loss=0.1186
    Train Epoch: 1 [58240/60000 (97%)]      loss=0.1927
    Train Epoch: 1 [58880/60000 (98%)]      loss=0.2050
    Train Epoch: 1 [59520/60000 (99%)]      loss=0.0642
    
    accuracy=0.9660
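Instead of polling pods and logs by hand, you may prefer to block until a training job reports completion. The helper below is a minimal sketch, not part of k3ai: `wait_for_job` is a hypothetical name, and it assumes the Kubeflow operators set a `Succeeded` condition on the job resource and that the jobs live in the `kubeflow` namespace, as in the examples above.

```shell
# wait_for_job KIND NAME [NAMESPACE] [TIMEOUT]
# Blocks until the given job resource reports the Succeeded condition,
# or until the timeout expires (kubectl then exits non-zero).
# Assumption: the operator publishes a "Succeeded" entry in status.conditions.
wait_for_job() {
  kind="$1"                     # e.g. tfjob or pytorchjob
  name="$2"                     # e.g. mnist or pytorch-dist-mnist-gloo
  namespace="${3:-kubeflow}"    # default namespace used in this guide
  timeout="${4:-600s}"          # arbitrary default, adjust to your hardware
  kubectl wait --for=condition=Succeeded --timeout="$timeout" \
    -n "$namespace" "$kind/$name"
}
```

For the samples in this guide that would be `wait_for_job tfjob mnist` or `wait_for_job pytorchjob pytorch-dist-mnist-gloo`. A job that fails never reaches `Succeeded`, so in that case the command simply runs into the timeout; check the pod logs if that happens.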

    Civo Cloud

    Civo was born when our small team first came together to create an OpenStack-based cloud for a shared hosting provider. Read their story here: https://www.civo.com/blog/kube100-so-far

    In 2019 we went all in and took Civo in a new direction, launching the world’s first k3s-powered, managed Kubernetes service into beta.

    K3ai works perfectly on Civo. Here is the simplest guide to running k3ai on Civo: three steps and your k3ai is ready!

    Installing k3ai on Civo

    Ready? It requires less than 5 minutes!

    You'll need an account on Civo.com. To get one, simply register here: https://www.civo.com/signup

    Installing

    All you have to do is simply type:

    Wait for the instance to finish the deployment

    Step 2

    Download the kubeconfig file, move it to your preferred location, and set your environment to use it:

    Step 3

    One last thing and then we're done:

    Enjoy your k3ai on Civo: https://civo.com

    https://www.civo.com/signup
    https://civo.com
    k3s deployment
    k3ai init --cloud civo
    export KUBECONFIG="civo-k3ai-kubeconfig"
     k3ai apply <your favorite plugin>
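Step 2 (move the kubeconfig and point your environment at it) can be wrapped in a small helper. This is a sketch under assumptions: `use_kubeconfig` is a hypothetical name, and `$HOME/.kube` is just a common default location, not something k3ai requires.

```shell
# use_kubeconfig FILE [DEST_DIR]
# Moves a downloaded kubeconfig into DEST_DIR (default: $HOME/.kube)
# and exports KUBECONFIG so kubectl and k3ai pick it up in this shell.
use_kubeconfig() {
  src="$1"
  dest_dir="${2:-$HOME/.kube}"
  mkdir -p "$dest_dir"
  mv "$src" "$dest_dir/"
  KUBECONFIG="$dest_dir/$(basename "$src")"
  export KUBECONFIG
}
```

For example, `use_kubeconfig ./civo-k3ai-kubeconfig` followed by `kubectl get nodes` should list the Civo nodes if the file is valid. The export only lasts for the current shell session, so add it to your shell profile if you want it to persist.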

    Contributors

    If you want to see your name on the page just create a PR listing your contributions and we will be happy to add you.

    K3ai Inventors

    • Alessandro Festa

      • GitHub: @alfsuse

      • Twitter: @bringyourownai

      • Org: SUSE

    • Gabriele Santomaggio

      • GitHub: @Gsantomaggio

      • Twitter: @GSantomaggio

      • Org: SUSE

    Design and documentation

    • Kenneth Wimer

      • GitHub: @kwwii

      • Twitter: @kwwii

      • Org: SUSE

    Contributors

    • Saiyam Pathak

      • GitHub: @saiyam1814

      • Twitter: @SaiyamPathak

      • Org: CIVO Cloud

    • Harsimran Singh Maan

      • GitHub: @harsimranmaan

      • Author of: Splunk ML Environment (SMLE) Labs Beta

      • Org: Splunk

    Roadmap

    We maintain the public roadmap at: https://github.com/orgs/kf5i/projects/2

    The roadmap below is a short version of what we release, updated monthly.

    December 2020

    Released

    • Init command to create local and remote clusters

    • Support for Mirantis K0s clusters

    • Support for KinD clusters

    • Support for Rancher K3d clusters

    October 2020

    Released

    • Kubeflow pipelines support

    • Argo Workflows support

    • GPU support

    • Tensorflow Serving for ResNet models support

    Planned

    • WSL improvements

    • Cloud Deployment initial support

    • Tensorflow Serving - MNIST support

    • KFP SDK support

  • Support for Rancher K3s clusters

  • Kubectl is not the default command used

  • Support for Civo automated cluster creation

  • GPU plugin for V2

  • H2O support for V2

  • Windows Subsystem for Linux support

  • Civo Cloud support

  • Jupyter Notebooks support

    Seen in the wild

    Sites that mention k3ai. If you spoke about k3ai and want to be listed let us know.

    • Civo Cloud Blog: https://www.civo.com/learn/running-kubeflow-pipelines

    • Coffee and Cloud Native E51: https://youtu.be/I02GIAKwLBU?t=1314

    • Kubeflow Official Documentation: https://www.kubeflow.org/docs/pipelines/installation/localcluster-deployment/#k3ai-alpha

    • Cloud native Samachaar - Ep2: https://www.youtube.com/watch?v=K5jJRo1jKcY&feature=youtu.be

    • Europe Cloud Conference 2020: https://youtu.be/NliBAWoe5ns?list=PLWWckTbWjUOEf-SqtQA_lc5fwC_67Scst

    • All Things Open 2020: https://www.youtube.com/watch?v=1isURnUkYGY

    • HubStation October Blog: https://hubstation.github.io/newsletter/2020/10/31/october.html