Pipelines automate your machine learning operations on the Valohai ecosystem.
See the Pipeline Fundamentals learning path to learn how to use pipelines.
A pipeline definition has 3 required properties:
- name: name for the pipeline
- nodes: list of all nodes (executions and deployments) in the pipeline
- edges: list of all edges (requirements) between the nodes
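As a quick illustration of the three required properties, here is a minimal Python sketch that checks a pipeline mapping for them. It assumes the pipeline has already been parsed from valohai.yaml into a plain dictionary; the helper name missing_properties and the draft mapping are hypothetical and not part of any Valohai tooling.

```python
REQUIRED_PROPERTIES = ("name", "nodes", "edges")


def missing_properties(pipeline):
    """Return the required properties that are absent from a pipeline mapping."""
    return [key for key in REQUIRED_PROPERTIES if key not in pipeline]


if __name__ == "__main__":
    # Hypothetical draft pipeline that has no edges yet.
    draft = {
        "name": "simple-pipeline",
        "nodes": [
            {"name": "generate-node", "type": "execution", "step": "generate-dataset"},
        ],
    }
    print(missing_properties(draft))  # ['edges']
```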
A simple pipeline could look something like this:
---
- step:
    name: generate-dataset
    image: python:3.6
    command: python preprocess.py
- step:
    name: train-model
    image: tensorflow/tensorflow:2.2.0-gpu
    command: python train.py
    inputs:
      - name: dataset-images
        default: http://...
      - name: dataset-labels
        default: http://...
- pipeline:
    name: simple-pipeline
    nodes:
      - name: generate-node
        type: execution
        step: generate-dataset
      - name: train-node
        type: execution
        step: train-model
      - name: deploy-node
        type: deployment
        deployment: mydeployment
        endpoints:
          - predict-digit
    edges:
      - [generate-node.output.images*, train-node.input.dataset-images]
      - [generate-node.output.labels*, train-node.input.dataset-labels]
      - [train-node.output.model*, deploy-node.file.predict-digit.model]
- endpoint:
    name: predict-digit
    description: predict digits from image inputs ("file" parameter)
    image: tensorflow/tensorflow:1.13.1-py3
    wsgi: predict_wsgi:predict_wsgi
    files:
      - name: model
        description: Model output file from TensorFlow
        path: model.pb
Here we have a pipeline with 3 nodes. The second node, train-node, waits for its inputs to be generated by the first node, generate-node. The third node, deploy-node, deploys the model produced by train-node. All files in /valohai/outputs that start with either images or labels will be passed between the executions.
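To make the file-name matching concrete, the sketch below shows how a script such as preprocess.py might write its outputs. It assumes the trailing * in an edge behaves like a shell-style wildcard, which is consistent with the description above. The file names and the helpers write_dataset and files_for_edge are hypothetical; they only illustrate the idea and are not part of Valohai.

```python
import os
import tempfile
from fnmatch import fnmatch


def write_dataset(output_dir="/valohai/outputs"):
    """Write placeholder dataset files and return their names."""
    os.makedirs(output_dir, exist_ok=True)
    # Hypothetical file names; only the prefixes matter for the edges.
    files = {
        "images-train.bin": b"\x00\x01",
        "labels-train.bin": b"\x01",
        "summary.txt": b"not selected by any edge",
    }
    for name, payload in files.items():
        with open(os.path.join(output_dir, name), "wb") as handle:
            handle.write(payload)
    return sorted(files)


def files_for_edge(output_dir, pattern):
    """List the output files that a wildcard such as 'images*' would select."""
    return sorted(name for name in os.listdir(output_dir) if fnmatch(name, pattern))


if __name__ == "__main__":
    # Use a temporary directory so the sketch also runs outside Valohai.
    with tempfile.TemporaryDirectory() as outputs:
        write_dataset(outputs)
        print(files_for_edge(outputs, "images*"))
        print(files_for_edge(outputs, "labels*"))
```

In this sketch summary.txt is written as an output but matches neither pattern, so it would not be handed to the training node.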
Override default inputs in a pipeline
In the above example:
- The train-model step has two inputs, each with its own default value.
- In the pipeline, we define that the train-model node should use the outputs of generate-dataset as its inputs.
By default, Valohai will include both the files from the default input location and the files generated by the pipeline as the step's inputs. If you instead want the input from the pipeline to replace the default input, you can specify an override in the pipeline.
Note that if you have several inputs and want to override one of them, you will need to add the others with their default values in the override section in the valohai.yaml.

- name: train
  type: execution
  step: train-model
  override:
    inputs:
      - name: dataset-images
      - name: dataset-labels
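The difference between the default behaviour and an override can be sketched as a small toy model. This is only one reading of the description above and not Valohai's implementation. It assumes that an override section replaces the step's input definitions, so an input listed without a default receives only the pipeline files. The function effective_inputs and all file names and URLs below are hypothetical placeholders.

```python
def effective_inputs(step_inputs, edge_files, override_inputs=None):
    """Toy model: map each input name to the files it would receive.

    step_inputs and override_inputs are lists of {"name": ..., "default": ...}
    mappings ("default" is optional); edge_files maps an input name to the
    files passed along pipeline edges.
    """
    # Assumption: when present, the override definitions replace the step's own.
    definitions = override_inputs if override_inputs is not None else step_inputs
    resolved = {}
    for definition in definitions:
        name = definition["name"]
        defaults = [definition["default"]] if "default" in definition else []
        resolved[name] = defaults + list(edge_files.get(name, []))
    return resolved


if __name__ == "__main__":
    # Placeholder values, not real locations.
    step_inputs = [
        {"name": "dataset-images", "default": "default-images-url"},
        {"name": "dataset-labels", "default": "default-labels-url"},
    ]
    edge_files = {
        "dataset-images": ["images-train.bin"],
        "dataset-labels": ["labels-train.bin"],
    }
    # No override: default files and pipeline files are both included.
    print(effective_inputs(step_inputs, edge_files))
    # Override both inputs without defaults: only pipeline files remain.
    override = [{"name": "dataset-images"}, {"name": "dataset-labels"}]
    print(effective_inputs(step_inputs, edge_files, override))
```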