Each step defines a workload type, such as data anonymization, data generation, feature extraction, training, model evaluation, or batch inference.
To run a step, you create an execution; that execution implements the step. Executions are version-controlled, so you can re-execute any past workload as long as its Docker image and inputs still exist.
Because machine learning projects differ widely from one another, Valohai gives you the flexibility to build your data science pipelines however you like.
Usually, separate steps are defined for:
- preprocessing files and uploading them to be used by other steps
- integrating with database services to create version-controlled snapshots of training data
- executing a Python script or C code e.g. to train a predictive model
- validating whether a trained model is ready for production
- deploying the trained model to staging or production
- building application binaries to be used in other steps
The What is Valohai? page lists additional possible “steps” in a machine learning pipeline that we have seen over the years. You can run anything that works inside a Docker container, so the possibilities are virtually endless.
You define the steps of a project in its valohai.yaml file.
Define a step in valohai.yaml
Here is an overview of the valid step properties:
name: name of the step such as “feature-extraction” or “model-training”
image: Docker image that will be used as the runtime environment
command: one or more commands that are run during execution
parameters: (optional) valid parameters that can be passed to the commands
inputs: (optional) files available during execution
environment: (optional) environment slug that specifies hardware and geolocation
environment-variables: (optional) define runtime environment variables
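Putting these properties together, a step definition in valohai.yaml might look like the following sketch. The step name, image tag, script names, parameter values, input URL, and environment slug below are illustrative placeholders, not values from your project:

```yaml
# Example step in valohai.yaml (all specific names/values are placeholders)
- step:
    name: model-training
    image: python:3.9              # Docker image used as the runtime environment
    command:
      - pip install -r requirements.txt
      - python train.py {parameters}
    parameters:                    # optional: parameters passed to the commands
      - name: learning_rate
        type: float
        default: 0.001
    inputs:                        # optional: files made available during execution
      - name: training-data
        default: s3://example-bucket/train.csv
    environment: example-cloud-gpu-environment   # optional: hardware/geolocation slug
    environment-variables:         # optional: runtime environment variables
      - name: EXAMPLE_MODE
        default: "train"
```

When this step is executed, Valohai pulls the Docker image, downloads the declared inputs, substitutes parameter values into `{parameters}`, and runs the commands in order.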