Floyd Config File¶
The Floyd config file (floyd.yml) lets you automate and drive various workflows in FloydHub. It sits at the root directory of your project folder (the directory where you ran floyd init). The config file must be written in standard YAML syntax.
Provide default arguments for floyd run¶
Instead of typing out every command argument each time you create a FloydHub job from the floyd CLI, you can save a set of default arguments in floyd.yml. After that, all you have to type is floyd run.
For example, the following command:
floyd run --gpu --data mckay/datasets/mnist/1:mnist --env tensorflow-1.8 -m "CNN with 0.5 dropout" "python train_and_eval.py"
can be simplified to just floyd run if you have the following floyd.yml in the project root directory:
machine: gpu
input:
  - source: mckay/datasets/mnist/1
    destination: mnist
env: tensorflow-1.8
description: CNN with 0.5 dropout
command: python train_and_eval.py
You can also override individual arguments on the floyd run command line to quickly test a change. For example:
floyd run --cpu2 -m "CNN with 0.25 dropout"
With the floyd.yml shown earlier, this is equivalent to:
floyd run --cpu2 --data mckay/datasets/mnist/1:mnist --env tensorflow-1.8 -m "CNN with 0.25 dropout" "python train_and_eval.py"
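As a sketch of what that override amounts to, the job created by floyd run --cpu2 -m "CNN with 0.25 dropout" behaves as if it had been launched from the config below. This is only an illustrative view: floyd.yml on disk is not modified, and the value machine: cpu2 is an assumption that the config value mirrors the --cpu2 flag name.

```yaml
# Effective settings for: floyd run --cpu2 -m "CNN with 0.25 dropout"
# (illustrative only; floyd.yml itself stays unchanged)
machine: cpu2                          # overridden by --cpu2 (assumed value name)
input:
  - source: mckay/datasets/mnist/1     # from floyd.yml
    destination: mnist
env: tensorflow-1.8                    # from floyd.yml
description: CNN with 0.25 dropout     # overridden by -m
command: python train_and_eval.py      # from floyd.yml
```

Only the arguments given on the command line change; everything else falls back to the saved defaults.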
Predefined tasks for floyd run¶
Often, you don't run the same command over and over for a project. You might need different command arguments for different purposes: trying out a completely different algorithm, creating jobs to test a model, spinning up a serving job, and so on.
To handle this use case, the floyd config file supports the concept of a task. Tasks also make it easy to share and document a set of predefined commands with other collaborators on a project.
Let's say we usually run the following commands within a project:
# create job to train a model
floyd run --gpu2 --env pytorch-0.4 -m "train with lstm" --data foo/datasets/wine-reviews/1:input "train.py /floyd/input/input"
# create job to test a model
floyd run --gpu --env pytorch-0.4 -m "evaluate model1" --data foo/projects/nlp/1:model --data foo/datasets/wine-reviews-test/1:test "python test.py --model /floyd/input/model --data /floyd/input/test"
# create job to serve a model
floyd run --cpu --mode serve --env pytorch-0.4 --data foo/projects/nlp/1:model
We can capture all of these use cases in the floyd config file:
env: pytorch-0.4
task:
  train:
    machine: gpu2
    description: train with lstm
    input:
      - source: foo/datasets/wine-reviews/1
        destination: input
    command: train.py /floyd/input/input
  test:
    machine: gpu
    description: evaluate model1
    input:
      - source: foo/projects/nlp/1
        destination: model
      - source: foo/datasets/wine-reviews-test/1
        destination: test
    command: python test.py --model /floyd/input/model --data /floyd/input/test
  serve:
    machine: cpu
    mode: serve
    input:
      - source: foo/projects/nlp/1
        destination: model
To invoke a specific task, pass the name of the task to the floyd run command through the --task argument:
floyd run --task train
floyd run --task test
floyd run --task serve
Notice that in the floyd config file, the environment pytorch-0.4 is defined globally, outside of the task block. This is how you share settings between all the tasks, so you don't have to repeat the same setting in each task definition.
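To illustrate the sharing, here is a sketch of the settings that floyd run --task train would end up with, assuming global keys are simply combined with the keys of the selected task. It is written as a flat config for readability, not as a file you need to create.

```yaml
# Effective settings for: floyd run --task train (illustrative view)
env: pytorch-0.4                 # inherited from the global section
machine: gpu2                    # from task.train
description: train with lstm
input:
  - source: foo/datasets/wine-reviews/1
    destination: input
command: train.py /floyd/input/input
```

This corresponds to the first floyd run command listed earlier in this section, with every argument spelled out.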
Run on Floyd Button¶
The Floyd config file is also used to bootstrap projects for the Run on Floyd button. You can read more about it in the Run on Floyd Button docs.
Supported attributes¶
You can find a documented sample config with all supported attributes at this link.