By default, Valohai tries to run as many jobs in parallel as possible. When you launch several executions or a Task, the executions are queued, and each one is picked up as soon as a machine becomes available.
There are a few things that might impact how many parallel executions you can run:
- Your organization's cloud quota. Each cloud provider has a quota that determines how many machines of a certain machine family can be run at a time. Your quota might be limited, especially with large GPU machines. Read more about
- Your Valohai organization's scaling settings. These can be set to allow a lower per-user quota. Read more about Valohai scaling settings at "Manage access and quotas for different environments".
- Cloud provider availability. The provider might be having availability issues for the environment type you specified. This happens every once in a while, especially with larger GPU instances.
If you want to decrease the number of jobs that run in parallel, you can either:
- When creating a Task, specify the Maximum queued executions count to limit the number of parallel jobs.
- Set a per-user quota for your environments under your organization settings.
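The way these limits combine can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not Valohai code: the function `effective_parallel_limit` and its parameter names are hypothetical, and it assumes that the tightest applicable limit determines how many jobs run at once.

```python
from typing import Optional


def effective_parallel_limit(
    requested_jobs: int,
    cloud_quota: Optional[int] = None,
    per_user_quota: Optional[int] = None,
    task_max_queued: Optional[int] = None,
) -> int:
    """Estimate how many jobs could run at once under the given limits.

    All names here are hypothetical. None means that no limit of that
    kind applies. The assumption is that the smallest limit wins.
    """
    limits = [requested_jobs]
    for limit in (cloud_quota, per_user_quota, task_max_queued):
        if limit is not None:
            limits.append(limit)
    # Never report a negative number of parallel jobs.
    return max(0, min(limits))


# 20 jobs requested, cloud quota allows 8 machines, per-user quota allows 4.
print(effective_parallel_limit(20, cloud_quota=8, per_user_quota=4))  # prints 4
```

In this sketch, lowering either the per-user quota or the Task-level limit reduces the result, which mirrors the two options listed above. Temporary cloud availability issues are not modeled.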