Valohai runs all executions inside a Docker container. If a library or package is missing from the Docker image you're using, you'll need to install it inside the execution.
Missing Python modules
For example, for missing Python modules you'll often see an error like ModuleNotFoundError: No module named 'pandas'. To fix the error you'll need to install pandas in your executions.
You can do this in either of two ways:
- Edit your valohai.yaml and add, for example, pip install pandas==1.3.5 to your step's command, before running your Python script.
- Add pandas==1.3.5 to your requirements.txt, if you're using one.
In Jupyter notebooks you can add a new cell at the beginning of your notebook and add
!pip install pandas==1.3.5
You can also run any apt-get install -y commands in your step.
For example, to install graphviz you'd follow their installation instructions:
- Add apt-get install -y graphviz to your step's command, before running pip install -r requirements.txt and your Python script.
- Add graphviz to your requirements.txt
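The graphviz steps above could be sketched in valohai.yaml roughly as follows. The step name render-graph and the script render_graph.py are hypothetical placeholders. The apt-get update line is an assumption: many Docker images ship without package lists, so the lists usually need refreshing before apt-get install can find a package.

```yaml
- step:
    name: render-graph            # hypothetical step name
    image: python:3.9
    command:
      # Assumption: the image has no cached package lists, so refresh them first.
      - apt-get update
      # System package first, so the Python graphviz bindings can find it.
      - apt-get install -y graphviz
      # requirements.txt is expected to list graphviz.
      - pip install -r requirements.txt
      - python ./render_graph.py  # hypothetical script name
```

The order matters: each command runs in sequence, so the installs have to come before the script that imports the packages.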
Custom Docker images
In both of these cases, you're installing new packages during the execution. Depending on the size of the packages, this might take several seconds or minutes every time you run an execution.
We strongly suggest building a custom Docker image that has all the packages you need to run your scripts. Read more in our docs.
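As a rough sketch, once a custom image already contains the packages, the step can point to it and drop the install commands. The image name myorg/preprocess:1.0 is a hypothetical placeholder, not a real image.

```yaml
- step:
    name: preprocess-dataset
    # Hypothetical custom image with the system and Python packages pre-installed.
    image: myorg/preprocess:1.0
    command:
      # No apt-get or pip commands needed at run time.
      - python ./preprocess_dataset.py
```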
Requirements.txt in deployments
Valohai deployments will automatically install all Python packages from your requirements file when building a new deployment endpoint.
- If you have a requirements-deployment.txt file, Valohai will install all the packages listed in it.
- If you don't have a requirements-deployment.txt file, Valohai will check for a requirements.txt and, if one exists, install all the packages listed in it.
Example valohai.yaml step that installs packages during the execution:
---
- step:
    name: preprocess-dataset
    image: python:3.9
    command:
      - apt-get install packagename -y
      - pip install -r requirements.txt
      - pip install numpy valohai-utils
      - python ./preprocess_dataset.py