Jobflow Remote quickstart#

After completing the Setup and installation, it is possible to start submitting Flows for execution. If you are not familiar with the concepts of Job and Flow in jobflow, you can start by checking its tutorials.

Any jobflow Flow can be executed with jobflow-remote but, unlike in jobflow's simple examples, the Job functions need to be serializable and accessible by the runner. Simple custom examples based on functions defined on the fly thus cannot be used. For this reason, a few simple Jobs have been prepared for test purposes in the jobflow_remote.utils.examples module.
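
The add Job used below is already provided by that module. As a reference, here is a minimal sketch of how such a Job can be defined in an importable module, following the standard jobflow pattern (the actual implementation in jobflow_remote.utils.examples may differ):

from jobflow import job


@job
def add(a, b):
    # Since this function lives in an installed module, the runner and the
    # worker can import and deserialize it.
    return a + b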

To speed up the execution of the following tutorial, it may be convenient to define a simple worker with type: local and scheduler_type: shell, but any worker is acceptable.
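
As a sketch, such a worker could be defined in the project configuration file as follows (the worker name and the work_dir path are placeholders; see the projects documentation for all the available options):

workers:
  local_shell:
    type: local
    scheduler_type: shell
    work_dir: /path/to/run/folder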

Submit a Flow#

To run a workflow with jobflow-remote the first step is to insert it into the database. A Flow can be created following the standard jobflow procedure. Then it should be passed to the submit_flow function:

from jobflow_remote.utils.examples import add
from jobflow_remote import submit_flow
from jobflow import Flow

# Create two Jobs, with the second taking the output of the first as input
job1 = add(1, 2)
job2 = add(job1.output, 2)

flow = Flow([job1, job2])

# Insert the Flow in the queue database, assigning its Jobs to the "local_shell" worker
print(submit_flow(flow, worker="local_shell"))

This code will print an integer unique id associated with the submitted Jobs.

Note

In addition to the uuid, jobflow's standard identifier for Jobs, jobflow-remote also defines an incremental db_id to help quickly identify different Jobs. A db_id uniquely identifies each Job entry in jobflow-remote's queue database. The same entry is also uniquely identified by the (uuid, index) pair.

Note

On the worker selection:
  • The worker should match the name of one of the workers defined in the project.

  • In this way, all the Jobs of the Flow will be assigned to the same worker.

  • If the argument is omitted, the first worker in the project configuration is used (see the sketch below).

  • In any case, the worker is determined when the Job is inserted in the database.
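
As an illustration of the two alternatives, reusing the flow defined above:

# Explicit worker: the name must match a worker defined in the project
submit_flow(flow, worker="local_shell")

# Worker omitted: the first worker in the project configuration is used
submit_flow(flow)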

Warning

Once the flow has been submitted to the database, any further change to the Flow object will not be taken into account.

It is now possible to use the jf command line interface (CLI):

jf job list

to display the list of Jobs in the database:

                                               Jobs info
┏━━━━━━━┳━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ DB id ┃ Name ┃ State   ┃ Job id  (Index)                           ┃ Worker      ┃ Last updated [CET] ┃
┡━━━━━━━╇━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ 2     │ add  │ WAITING │ 8b7a7841-37c7-4446-853b-ad3c00eb5227  (1) │ local_shell │ 2023-12-19 16:33   │
│ 1     │ add  │ READY   │ ae020c67-72f0-4805-858e-fe48644e4bb0  (1) │ local_shell │ 2023-12-19 16:33   │
└───────┴──────┴─────────┴───────────────────────────────────────────┴─────────────┴────────────────────┘

Note

It is possible to use the -v flag to increase the verbosity of the output. Use -vv or -vvv to further increase the verbosity.

It is also possible to filter and sort the results. Run jf job list -h to see the available options.

One of the Jobs is in the READY state, signaling that it is ready to be executed. The second Job is instead in the WAITING state since it will not start until the first reaches the COMPLETED state. At this point nothing will happen, since the process to handle the Jobs has not been started yet.

The Runner#

Jobflow-remote's Runner is an object that takes care of handling the submitted Jobs. It performs several actions to advance the state of the workflows. For each Job it:

  • Copies files to and from the WORKER (i.e. the Job's inputs and outputs)

  • Interacts with the WORKER’s queue manager (e.g. SLURM, PBS, …), submitting jobs and checking their state

  • Updates the content of the database

Only the actual execution of the Jobs on the Workers is disconnected from the Runner. In all other cases, the state of the Jobs can change only if the Runner is running.

The standard way to execute the Runner is through a daemon process that can be started with the jf CLI:

jf runner start

Since the process starts in the background, you can check that it started properly with the command:

jf runner status

If the Runner started correctly you should get:

Daemon status: running

During the execution of the Jobs it is possible to check their status as done before with jf job list:

                                                 Jobs info
┏━━━━━━━┳━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ DB id ┃ Name ┃ State     ┃ Job id  (Index)                           ┃ Worker      ┃ Last updated [CET] ┃
┡━━━━━━━╇━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ 2     │ add  │ RUNNING   │ 8b7a7841-37c7-4446-853b-ad3c00eb5227  (1) │ local_shell │ 2023-12-19 16:44   │
│ 1     │ add  │ COMPLETED │ ae020c67-72f0-4805-858e-fe48644e4bb0  (1) │ local_shell │ 2023-12-19 16:44   │
└───────┴──────┴───────────┴───────────────────────────────────────────┴─────────────┴────────────────────┘

Note

The Runner checks the states of the Jobs at regular intervals. A few seconds may thus be required before a change in the Job state becomes visible.

The Runner will keep checking the database for the submission of new Jobs and will update the state of each Job as soon as the previous action is completed. If you plan to keep submitting workflows you can keep the daemon running, otherwise you can stop the process with:

jf runner stop

Note

By default the daemon will spawn several processes, each taking care of some of the actions listed above.

Warning

The stop command will send a SIGTERM signal to the Runner processes, which will complete the action currently being performed before actually stopping. This should prevent the presence of inconsistent states in the database. However, if you believe the Runner is stuck, or you need to halt the Runner immediately, you can kill the processes with:

jf runner kill

Results#

As in standard jobflow execution, when a Job is COMPLETED its output is stored in the defined JobStore. For simple cases like the one used in this example the outputs can be fetched directly using the CLI:

jf job output 2

That should print the expected result:

5

Note

For the CLI commands that accept a single Job id, either the uuid or the db_id can be passed. The code will automatically determine which of the two was provided.
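
For example, the two commands below refer to the same Job of this tutorial, by db_id and by uuid respectively:

jf job output 2
jf job output 8b7a7841-37c7-4446-853b-ad3c00eb5227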

For more advanced workflows, the best way to obtain the results is using the JobStore, as is done with standard jobflow outputs. For jobflow-remote, a convenient way to access the JobStore in Python is the get_jobstore helper function:

from jobflow_remote import get_jobstore

# Get the output JobStore defined in the current project and connect to it
js = get_jobstore()
js.connect()

# Fetch the output of a Job from its uuid
print(js.get_output("8b7a7841-37c7-4446-853b-ad3c00eb5227"))

CLI#

On top of the CLI commands shown above, a full list of the available commands, sub-commands and options is accessible through the -h flag. Here we present a few more that can be useful to get started.
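
For example, to explore the available commands and sub-commands:

jf -h
jf job -h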

Job info#

Detailed information from a Job can be obtained running the command:

jf job info 2

that prints a summary of the content of the Job document in the DB:

╭─────────────────────────────────────────────────────────────────────────────────────────────╮
│ created_on = '2023-12-19 16:33'                                                             │
│      db_id = 2                                                                              │
│   end_time = '2023-12-19 16:44'                                                             │
│      index = 1                                                                              │
│   metadata = {}                                                                             │
│       name = 'add'                                                                          │
│    parents = ['ae020c67-72f0-4805-858e-fe48644e4bb0']                                       │
│   priority = 0                                                                              │
│     remote = {'step_attempts': 0, 'process_id': '89838'}                                    │
│    run_dir = '/path/to/run/folder/8b/7a/78/8b7a7841-37c7-4446-853b-ad3c00eb5227_1'          │
│ start_time = '2023-12-19 16:44'                                                             │
│      state = 'COMPLETED'                                                                    │
│ updated_on = '2023-12-19 16:44'                                                             │
│       uuid = '8b7a7841-37c7-4446-853b-ad3c00eb5227'                                         │
│     worker = 'local_shell'                                                                  │
╰─────────────────────────────────────────────────────────────────────────────────────────────╯

Note

In case of failure of the Job, this will also contain the tracked error. Failed Jobs are dealt with in the troubleshooting section.

Flow list#

Similarly to the list of Jobs, a list of Flows and their states can be obtained with:

jf flow list

that returns:

                                            Flows info
┏━━━━━━━┳━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ DB id ┃ Name ┃ State     ┃ Flow id                              ┃ Num Jobs ┃ Last updated [CET] ┃
┡━━━━━━━╇━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ 1     │ Flow │ COMPLETED │ 959ffe14-7061-4b74-a3ad-10c3c12715ad │ 2        │ 2023-12-19 16:43   │
└───────┴──────┴───────────┴──────────────────────────────────────┴──────────┴────────────────────┘

Note

A Flow has its own uuid, while the DB id corresponds to the lowest DB id among the Jobs belonging to the Flow.

Delete Flows#

In case you need to delete some Flows without resetting the whole database, you can use the command:

jf flow delete -did 1

where filters similar to those of the list command can be used to select the Flows to delete.