The Compute and Data Layer of Valohai can be deployed to your AWS Account. This enables you to:
Use your own EC2 instances to run machine learning jobs.
Use your own S3 Buckets for storing training artifacts such as trained models, preprocessed datasets, visualizations, etc.
Access databases and data warehouses directly from the workers, which are inside your network.
Valohai doesn’t have direct access to the EC2 instances that execute the machine learning jobs. Instead, it communicates with a static EC2 instance in your AWS account that’s responsible for storing the job queue, job states, and short-term logs.
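The separation described above can be pictured with a small in-memory model. This is only an illustrative sketch, not Valohai's implementation: the `QueueInstance` class, its method names, and the assumption that workers claim jobs from the queue themselves are all hypothetical. The point it shows is that the control plane only ever touches the static instance, never a worker.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class QueueInstance:
    """Hypothetical stand-in for the static EC2 instance in your account.

    It holds the three things the text mentions: the job queue,
    job states, and short-term logs.
    """
    queue: deque = field(default_factory=deque)
    states: dict = field(default_factory=dict)
    logs: dict = field(default_factory=dict)

    def submit(self, job_id: str) -> None:
        # Called by the control plane. It never contacts a worker directly.
        self.queue.append(job_id)
        self.states[job_id] = "queued"
        self.logs[job_id] = []

    def claim(self) -> Optional[str]:
        # Called by a worker (assumption: workers pick up jobs themselves).
        if not self.queue:
            return None
        job_id = self.queue.popleft()
        self.states[job_id] = "running"
        return job_id

    def report(self, job_id: str, line: str, done: bool = False) -> None:
        # Workers push log lines and state changes back to the instance.
        self.logs[job_id].append(line)
        if done:
            self.states[job_id] = "complete"


def run_worker(instance: QueueInstance) -> Optional[str]:
    """Simulate one worker claiming and finishing a single job."""
    job_id = instance.claim()
    if job_id is not None:
        instance.report(job_id, f"running {job_id} on worker", done=True)
    return job_id
```

In this model, reading `instance.states` or `instance.logs` is the only way the control plane learns what happened, which mirrors the idea that job execution stays inside your network.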
You can deploy Valohai to a fresh AWS account using the provided CloudFormation template or Terraform.
If you don’t want to use the templates, you can also follow our manual step-by-step guide: Deploy to AWS manually
Below is an example of a Valohai deployment:
The example below includes RDS and Redshift as sample data sources.
By default, Valohai will use only a single S3 Bucket (valohai-data-1234), but organization admins can configure additional data stores in-app, for example, different data stores for different projects.
The example below shows only two types of workers (including g4dn.4xlarge) as examples. You can use any instance type that you have an AWS quota for.
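To make the default data store more concrete, here is a minimal sketch of an IAM policy document that would let workers list, read, and write objects in a single bucket. The helper name `bucket_policy` and the exact set of actions are assumptions for illustration; the permissions created by the provided templates may differ.

```python
import json


def bucket_policy(bucket: str) -> dict:
    """Build an IAM policy document for one S3 bucket (illustrative only)."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
            },
            {
                # Reading and writing apply to the objects inside it.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Bucket name taken from the example above.
    print(json.dumps(bucket_policy("valohai-data-1234"), indent=2))
```

Adding a second data store for another project would, under the same assumptions, mean generating a second policy with a different bucket name.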
Each Valohai execution runs inside a Docker container. The base image for the execution can be pulled from either a public repository or a private repository such as Amazon ECR.
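Whether an image reference points at a private ECR repository can usually be told from its registry hostname, which follows the pattern `<account-id>.dkr.ecr.<region>.amazonaws.com`. The sketch below checks for that pattern. The function name `is_private_ecr_image` and the sample image names are hypothetical, and the check does not cover other private registries or the `amazonaws.com.cn` domain used in China regions.

```python
import re

# Private ECR registry hostnames look like:
#   <12-digit account id>.dkr.ecr.<region>.amazonaws.com
ECR_PRIVATE = re.compile(r"^\d{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com/")


def is_private_ecr_image(image: str) -> bool:
    """Return True if the image reference targets a private ECR registry."""
    return ECR_PRIVATE.match(image) is not None


if __name__ == "__main__":
    # Hypothetical image references, for illustration only.
    for ref in (
        "python:3.11",
        "123456789012.dkr.ecr.eu-west-1.amazonaws.com/my-team/train:latest",
    ):
        print(ref, is_private_ecr_image(ref))
```

A check like this can help decide whether a worker needs registry credentials before pulling the base image.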