Endpoint definitions are used to set up auto-scaling REST endpoints for real-time predictions.
If you are doing batch predictions that could take multiple minutes to run, we recommend sticking with normal Valohai executions. Read more about Valohai deployment options in the Valohai documentation.
Technically, the endpoint creates a group of Docker containers running HTTP servers deployed to a Kubernetes cluster. You can either use your own Kubernetes cluster or our clusters.
A single project can have multiple endpoints, since different contexts (e.g. different teams working on the project) may have different inference requirements.
The endpoints are defined in the `valohai.yaml` file, which specifies how to start the inference web server, e.g.:

```yaml
- endpoint:
    name: greet-me
    image: python:3.6
    port: 8000
    server-command: python run_server.py
```
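To illustrate what the `server-command` above might start, here is a minimal sketch of a `run_server.py` using only the Python standard library. The handler and greeting are hypothetical, not part of Valohai itself; a real inference server would load a model at startup and return predictions in `do_GET`/`do_POST`.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class GreetHandler(BaseHTTPRequestHandler):
    """Hypothetical handler for the greet-me endpoint sketched above."""

    def do_GET(self):
        # Respond with a plain-text greeting; a real endpoint would run
        # model inference here and return the prediction instead.
        body = b"Hello from the greet-me endpoint!\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Bind to all interfaces so the container's port 8000 is reachable,
    # matching the `port: 8000` declared in valohai.yaml.
    HTTPServer(("0.0.0.0", 8000), GreetHandler).serve_forever()
```

Locally you could smoke-test this with `python run_server.py` in one terminal and `curl localhost:8000` in another before deploying.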
Using Valohai deployments is optional. If you want more control over your deployments, you can also download the trained models and deploy them to your own existing environment. Check out the Valohai APIs to understand how to automate this.