Run a Job

Quick Look

floyd run [OPTIONS] [COMMAND]

[OPTIONS]:

[COMMAND]

Running jobs is the core action in the FloydHub workflow. A job pulls together your code and dataset(s), sends them to a deep-learning server configured with the right environment, and runs the command that starts your code. This article takes an in-depth look at the ins and outs of running jobs on FloydHub using the floyd run command.

The floyd run command can be broken down into two main parts: the [OPTIONS] and the [COMMAND]. We'll detail all the [OPTIONS] available to you, as well as how to use the [COMMAND] properly. Use the links below as a quick reference.

Encoding issue!

This is a common issue for Windows users. It occurs when your Windows terminal uses a different encoding from the one expected by the remote machine. Here are some examples:

  1. Backslash and single quote issue: floyd run \ --data alice/datasets/test \ 'python test_Sony.py' is translated as floyd run --data alice/datasets/test '\ \ \ \ '"'"'python test_Sony.py'"'"''
  2. Double quotes issue: floyd run "python test_Sony.py" is translated as floyd run ''"'"'python test_Sony.py'"'"''

Unfortunately, there is no silver bullet for identifying this issue, but the Command view on the job's Overview page can help you debug it. More generally, if you notice this issue, try swapping single quotes for double quotes (or vice versa), and remove the backslashes if you are splitting the command across multiple lines. These changes fix the issue in about 99% of cases. If not, please reach out to us at support@floydhub.com.
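
If you have a Unix-like shell available (for example Git Bash on Windows, which is an assumption on our part and not a requirement), one way to see how your local shell splits a command line before floyd receives it is to print each argument on its own line. The show_args helper below is hypothetical and only for illustration; it does not contact FloydHub.

```shell
# show_args is a hypothetical helper: it prints each argument it receives
# on its own line, wrapped in brackets, so stray quotes or backslashes stand out.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Single-line form of the first example above, with no line-continuation
# backslashes. The quoted [COMMAND] should come out as one argument.
parsed=$(show_args run --data alice/datasets/test 'python test_Sony.py')
printf '%s\n' "$parsed"
```

If the last line printed is anything other than [python test_Sony.py], the quoting is being altered locally, before the job is even submitted.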

Parts of the floyd run Command

[OPTIONS]

[COMMAND]

[OPTIONS]

Instance Type

Specifying the instance type means choosing what kind of FloydHub instance your job will run on. Think of this as a hardware choice rather than a software one. (The software environment is declared with the Environment (--env) OPTION of the floyd run command.)

You have four instance type options to choose from when running a job as detailed below:

floyd run Flag    Instance Type    Description
--gpu             GPU              Tesla K80 GPU machine
--gpu2            GPU              Tesla V100 GPU machine
--cpu             CPU              2 Core low perf CPU machine
--cpu2            CPU              8 Core high perf CPU machine

Important

  • The default instance flag is --cpu. This means that if you don't pass any of the above flags to floyd run, your job will be run on a CPU server.
  • If you pass more than one instance flag, this is the order of precedence: --gpu, --cpu, --gpu2, --cpu2
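
To make the precedence rule concrete, here is a minimal sketch that mirrors the documented order locally. The effective_instance function is a hypothetical helper written for this article; it is not part of the floyd CLI and does not reflect how FloydHub implements the rule.

```shell
# Illustration only: mirrors the documented order of precedence
# (--gpu, --cpu, --gpu2, --cpu2). effective_instance is a hypothetical name.
effective_instance() {
    for candidate in --gpu --cpu --gpu2 --cpu2; do
        for flag in "$@"; do
            if [ "$flag" = "$candidate" ]; then
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    # No instance flag passed: the documented default is --cpu.
    printf '%s\n' "--cpu"
}

effective_instance --cpu2 --gpu2   # prints --gpu2
effective_instance                 # prints --cpu
```

In practice it is simpler to pass exactly one instance flag so that the precedence rule never comes into play.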

Dataset(s)

You can specify up to five datasources (datasets or outputs from previous jobs) to mount to the server that will be running your job. For each datasource, specify the --data flag as detailed below:

--data <name_of_datasource>:<mount_point_on_server>

For more detailed information on mounting data to jobs, see this article.
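
As a rough sketch of how several --data flags combine, the snippet below assembles them from a list of <name_of_datasource>:<mount_point_on_server> pairs and refuses more than five, the limit stated above. The data_flags helper and the mount point name extra are hypothetical; the command is printed and not executed, so you can inspect it first.

```shell
# Hypothetical helper: builds the --data portion of a floyd run command line
# from "<name_of_datasource>:<mount_point_on_server>" pairs and rejects
# more than five of them, the documented limit.
data_flags() {
    if [ "$#" -gt 5 ]; then
        echo "error: at most five datasources can be mounted" >&2
        return 1
    fi
    out=""
    for src in "$@"; do
        out="$out --data $src"
    done
    # Strip the leading space before printing.
    printf '%s\n' "${out# }"
}

# mckay/datasets/mnist/1:mount appears elsewhere in this article;
# the mount point "extra" is invented for illustration.
flags=$(data_flags mckay/datasets/mnist/1:mount alice/datasets/test:extra)
echo "floyd run $flags \"python train.py\""
```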

Mode

FloydHub jobs can currently be run in one of two modes:

  1. --mode job (DEFAULT)
  2. --mode serve

Here is a description of each mode:

--mode job

This is the default mode so there is no need to specify --mode job when running floyd run. You can think of this mode as "regular mode". When you run your job in this mode, your code is sent up to a FloydHub deep-learning server and the [COMMAND] portion of floyd run is executed.

--mode serve

This mode is for serving your machine learning models through API endpoints. See the serving documentation for more information.

Environment

Specifying the environment means choosing which major deep-learning software packages you want available on the server that runs your code. This is not a choice between a CPU server and a GPU server (that's the Instance Type OPTION of floyd run).

FloydHub offers servers with many different deep-learning software packages pre-installed. You can find a list of all the available environments here.

Use the --env flag to specify which environment you would like your job to run in.

Important

It is best practice to pass the entire name of the environment, including the version number, to the --env flag. For example, instead of --env tensorflow, use --env tensorflow-1.3.

Examples

$ floyd run --env tensorflow-1.3 "python train.py"
$ floyd run --env theano-0.8 "python train.py"
$ floyd run --env pytorch-0.2 "python train.py"

Message

Using --message or -m, you can specify a message that describes your job, similar to the way a commit message describes a git commit. The job message is displayed in various places on floydhub.com and is useful when reviewing past jobs that you'd like to iterate on.

Example:

$ floyd run -m "lorem ipsum" "echo 'hello world'"
Creating project run. Total upload size: 195.0B
Syncing code ...
[================================] 1254/1254 - 00:00:00

JOB NAME
--------------------------------
mckay/projects/message-project/1

To view logs enter:
   floyd logs mckay/projects/message-project/1

Here are some examples of where the job message will be displayed on floydhub.com:

[Image: Job Message on Job Detail Page]
[Image: Job Message on Project Detail Page]

Maximum Runtime

The --max-runtime flag lets you set a maximum runtime duration (in seconds) for your job. If a running job exceeds its maximum runtime, FloydHub will stop the job and save any output that was generated up to that point.

This feature is very useful if you want to set an upper bound to the duration of a job.

Currently there is no option to change this duration once it is set at the start of the job.
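
Because the value is in seconds, it can help to compute it from hours before building the command. The sketch below assumes the value follows the flag separated by a space, and it prints the command without running it so you can check the number first. The two-hour limit is an arbitrary example.

```shell
# --max-runtime takes seconds, so convert from hours first.
hours=2
max_runtime=$(( hours * 60 * 60 ))

# Printed, not executed, so the line can be checked before submitting.
echo "floyd run --max-runtime $max_runtime \"python train.py\""
```

Since the duration cannot be changed after the job starts, it is worth double-checking the computed value at this point.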

Follow

The --follow flag displays the logs of the running job immediately, without requiring a separate floyd logs command.

[COMMAND]

The [COMMAND] portion of floyd run is the command that will be executed on the server when your job begins. It will be run in the directory on the server that holds your code. To decide what to put in the [COMMAND], answer this question:

What command would I execute to kick off my code locally?

The answer to that question is what you should put into the [COMMAND] portion of floyd run.

Any valid bash command will work in the [COMMAND] portion of floyd run. For instance, try these simple commands and look at their logs:

$ floyd run "pwd"
$ floyd run "ls"
$ floyd run "python --version"
$ floyd run "echo 'Hello, world!'"

Most commonly you'll be kicking off a Python script with your [COMMAND], with something like this:

$ floyd run "python train.py"

But you can feel free to get creative!

Pro Tip

Try chaining together multiple commands with && like this:

$ floyd run "bash my_setup_script.sh && python train.py"

For more examples, check out our symlinking tutorial.
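
When chaining with &&, keep in mind that the command on the right runs only if the command on the left exits successfully. The sketch below demonstrates this locally with two stand-in functions, setup_ok and setup_fail, which are hypothetical placeholders for a setup script; nothing here is sent to FloydHub.

```shell
# Stand-ins for a setup script: one that succeeds and one that fails.
setup_ok()   { return 0; }
setup_fail() { return 1; }

# The second command runs because the first one succeeded.
ran=$(setup_ok && echo "python train.py would run")

# The second command is skipped because the first one failed.
# The trailing "|| true" only keeps this demo from aborting under "set -e".
skipped=$(setup_fail && echo "python train.py would run" || true)

printf 'after success: [%s]\n' "$ran"
printf 'after failure: [%s]\n' "$skipped"
```

So with bash my_setup_script.sh && python train.py, a failing setup script means training never starts, which is usually the behavior you want.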

Note

Serve mode (--mode serve) does not take a [COMMAND]. You'll kick off your job without passing a [COMMAND], with something like the following:

$ floyd run --env pytorch-0.2 --mode serve --data mckay/datasets/mnist/1:mount