You can define inputs when creating a new notebook execution.
An input can point to a single file in a cloud data store, or to a collection of files.
Below are a few examples of possible input formats:
- https://valohaidemo.blob.core.windows.net/mnist/mnist.npz
  - Downloads a single file from a public location.
- s3://mybucket/orders/may_2022.csv
  - Downloads a single file from a private S3 Data Store that your Valohai project has access to.
- s3://mybucket/pytorch-sample/*
  - Downloads all the files from the pytorch-sample folder in a private S3 Data Store that your Valohai project has access to.
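The difference between the formats above comes down to the URI scheme and whether the path ends in a wildcard. The following is a minimal sketch of that distinction; `describe_input` is a hypothetical helper written for illustration and is not part of Valohai or valohai-utils. It only inspects the string and does not download anything.

```python
from urllib.parse import urlparse


def describe_input(uri):
    """Return (scheme, is_wildcard) for an input URI.

    Hypothetical helper for illustration only; not a Valohai API.
    """
    scheme = urlparse(uri).scheme
    # In the examples above, a "*" selects every file in a folder.
    is_wildcard = "*" in uri
    return scheme, is_wildcard


example_uris = [
    "https://valohaidemo.blob.core.windows.net/mnist/mnist.npz",
    "s3://mybucket/orders/may_2022.csv",
    "s3://mybucket/pytorch-sample/*",
]

for uri in example_uris:
    scheme, is_wildcard = describe_input(uri)
    kind = "collection of files" if is_wildcard else "single file"
    print(f"{uri}: scheme={scheme}, {kind}")
```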
In your notebook, you can read the downloaded files from the Valohai inputs, either with the valohai-utils toolkit or with plain Python.
Using Python and the valohai-utils toolkit
import csv
import valohai

# Open a single CSV file from Valohai inputs
with open(valohai.inputs("input").path()) as csv_file:
    reader = csv.reader(csv_file, delimiter=',')

See the documentation for valohai.inputs() to learn how to access multiple files or an archive of files using valohai-utils.
Using Python
Start by configuring inputs to your step in valohai.yaml, then update your code to read the data from Valohai inputs rather than directly from your cloud storage.
import os
import pandas as pd

# Get the location of the Valohai inputs directory
VH_INPUTS_DIR = os.getenv('VH_INPUTS_DIR', '.inputs')

# Get the path to an individual input file
# e.g. /valohai/inputs/input/may_2022.csv
path_to_file = os.path.join(VH_INPUTS_DIR, 'input/may_2022.csv')
df = pd.read_csv(path_to_file)
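When an input points to a collection of files (for example a wildcard such as s3://mybucket/pytorch-sample/*), you may not know the file names in advance. The sketch below lists whatever was downloaded for one input using only the standard library. It assumes the files are placed directly under `<VH_INPUTS_DIR>/<input name>/`, following the path layout shown in the example above; `list_input_files` is a hypothetical helper name, not a Valohai function.

```python
import os


def list_input_files(input_name):
    """Return sorted paths of the files found for one input.

    Hypothetical helper; assumes files live directly under
    <VH_INPUTS_DIR>/<input_name>/ and ignores subdirectories.
    """
    inputs_dir = os.getenv('VH_INPUTS_DIR', '.inputs')
    input_dir = os.path.join(inputs_dir, input_name)
    if not os.path.isdir(input_dir):
        # Nothing was downloaded for this input name.
        return []
    return sorted(
        os.path.join(input_dir, name)
        for name in os.listdir(input_dir)
        if os.path.isfile(os.path.join(input_dir, name))
    )


# Print every file available for the input named "input"
for path in list_input_files('input'):
    print(path)
```

Returning an empty list for a missing directory lets the same notebook cell run locally, where no inputs have been downloaded, without raising an error.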