Using the QueueManager#
The QueueManager provides a high-level API to submit, monitor, and cancel jobs on a queueing system.
Initialization#
A QueueManager requires a SchedulerIO instance. By default, it uses a LocalHost to execute commands on the machine where it is running.
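from qtoolkit.manager import QueueManager
from qtoolkit.io.slurm import SlurmIO
from qtoolkit.host.local import LocalHost

# Scheduler-specific I/O layer (Slurm in this example)
slurm = SlurmIO()
# Host on which the scheduler commands are executed
host = LocalHost()
manager = QueueManager(scheduler_io=slurm, host=host)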
Submitting Jobs#
The submit method handles the complete job submission process:
1. Generating a submission script.
2. Writing the script to a specified directory.
3. Executing the submission command.
4. Parsing the output to retrieve the job ID.
commands = ["module load python", "python script.py"]
result = manager.submit(
    commands=commands,
    work_dir="path/to/workdir",
    script_fname="submit.sh",
    create_submit_dir=True,
)
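The four steps listed above can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions, not qtoolkit's implementation: `build_script`, `write_script`, `parse_job_id`, and `submit_sketch` are hypothetical names, the `run` argument stands in for whatever executes the command on the host, and the parser assumes the `Submitted batch job <id>` reply that Slurm's `sbatch` normally prints.

```python
import re
import tempfile
from pathlib import Path
from typing import Callable, Optional, Sequence


def build_script(commands: Sequence[str]) -> str:
    """Step 1: generate the submission script text (hypothetical helper)."""
    return "\n".join(["#!/bin/bash", *commands]) + "\n"


def write_script(work_dir: Path, script_fname: str, script: str) -> Path:
    """Step 2: write the script into work_dir, creating the directory if needed."""
    work_dir.mkdir(parents=True, exist_ok=True)
    path = work_dir / script_fname
    path.write_text(script)
    return path


def parse_job_id(stdout: str) -> Optional[str]:
    """Step 4: extract the job ID, assuming 'Submitted batch job <id>'."""
    match = re.search(r"Submitted batch job (\d+)", stdout)
    return match.group(1) if match else None


def submit_sketch(commands, work_dir, script_fname, run: Callable[[str], str]):
    """Chain the four steps; `run` executes a command and returns its stdout."""
    path = write_script(Path(work_dir), script_fname, build_script(commands))
    stdout = run(f"sbatch {path.name}")  # Step 3: execute the submission command
    return parse_job_id(stdout)


# Usage with a fake runner, so no scheduler is needed to try the sketch.
with tempfile.TemporaryDirectory() as tmp:
    job_id = submit_sketch(
        ["module load python", "python script.py"],
        work_dir=Path(tmp) / "workdir",
        script_fname="submit.sh",
        run=lambda cmd: "Submitted batch job 12345\n",
    )
    script_text = (Path(tmp) / "workdir" / "submit.sh").read_text()
```

Passing the runner as an argument keeps the sketch testable without a scheduler; in real use the QueueManager performs these steps itself.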
Environment Configuration#
To configure the execution environment, pass an environment dictionary to submit:
env = {
    "modules": ["intel/2021", "python/3.9"],
    "source_files": ["/path/to/env_vars.sh"],
    "conda_environment": "my_env",
    "environ": {"OMP_NUM_THREADS": "4"},
}
result = manager.submit(commands=commands, environment=env)
This will add the following lines to the submission script:
module purge
module load intel/2021
module load python/3.9
source /path/to/env_vars.sh
conda activate my_env
export OMP_NUM_THREADS=4
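The mapping from dictionary keys to script lines can be made explicit with a small sketch. `render_environment` is a hypothetical helper written for illustration, not the function qtoolkit uses internally; it assumes the ordering shown above (modules, then source files, then the conda environment, then exported variables) and that `module purge` is emitted only when modules are requested.

```python
def render_environment(env: dict) -> list:
    """Turn an environment dictionary into shell lines (hypothetical helper)."""
    lines = []
    modules = env.get("modules", [])
    if modules:
        lines.append("module purge")  # start from a clean module set
        lines.extend(f"module load {name}" for name in modules)
    lines.extend(f"source {path}" for path in env.get("source_files", []))
    if env.get("conda_environment"):
        lines.append(f"conda activate {env['conda_environment']}")
    for key, value in env.get("environ", {}).items():
        lines.append(f"export {key}={value}")
    return lines


env = {
    "modules": ["intel/2021", "python/3.9"],
    "source_files": ["/path/to/env_vars.sh"],
    "conda_environment": "my_env",
    "environ": {"OMP_NUM_THREADS": "4"},
}
script_lines = render_environment(env)
print("\n".join(script_lines))
```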
Managing Jobs#
Retrieving Job Information#
Use get_job to get a QJob object for a specific job:
job = manager.get_job("12345")
if job:
    print(f"State: {job.state}")
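A common follow-up is to poll a job until the scheduler stops reporting it. The sketch below assumes that get_job returns None when the job is not found, which the `if job:` check above suggests but does not confirm. `wait_until_gone` is a hypothetical helper, and the state strings in the usage example are stand-ins rather than qtoolkit's actual state values.

```python
import time
from types import SimpleNamespace
from typing import Any, Callable, Optional


def wait_until_gone(
    get_job: Callable[[str], Optional[Any]],
    job_id: str,
    interval: float = 30.0,
    timeout: float = 3600.0,
) -> Optional[Any]:
    """Poll get_job until it returns None; return the last job object seen.

    Raises TimeoutError if the job is still reported after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    last_seen = None
    while True:
        job = get_job(job_id)
        if job is None:
            return last_seen
        last_seen = job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still reported after {timeout} s")
        time.sleep(interval)


# Usage with a stand-in for manager.get_job: two reports, then gone.
replies = iter(
    [SimpleNamespace(state="QUEUED"), SimpleNamespace(state="RUNNING"), None]
)
last = wait_until_gone(lambda job_id: next(replies), "12345", interval=0.0)
print(last.state)
```

With a real manager the call would be `wait_until_gone(manager.get_job, "12345")`; a generous interval avoids overloading the scheduler with status queries.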
Listing Jobs#
Use get_jobs_list to retrieve a list of jobs:
# List all jobs
jobs = manager.get_jobs_list()
# List specific jobs
jobs = manager.get_jobs_list(jobs=["12345", "12346"])
# List jobs for a specific user
jobs = manager.get_jobs_list(user="username")
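The returned list can be summarized per state. This sketch relies only on each job exposing a `state` attribute, as in the get_job example; `count_by_state` is a hypothetical helper, and the objects and state strings below are stand-ins so the snippet runs without a scheduler.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_state(jobs) -> Counter:
    """Count jobs per state, using each job's `state` attribute."""
    return Counter(str(job.state) for job in jobs)


# Stand-in objects; with a real manager, pass manager.get_jobs_list().
jobs = [
    SimpleNamespace(state="RUNNING"),
    SimpleNamespace(state="QUEUED"),
    SimpleNamespace(state="RUNNING"),
]
summary = count_by_state(jobs)
print(summary)
```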
Cancelling Jobs#
Use cancel to terminate a job:
result = manager.cancel("12345")
if result.status.value == "SUCCESSFUL":
    print("Job cancelled.")
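To cancel several jobs and find out which ones were not cancelled, the same status check can be wrapped in a loop. `cancel_all` is a hypothetical helper; it relies only on the `result.status.value == "SUCCESSFUL"` comparison shown above, and the `"FAILED"` value in the stand-in is a placeholder for any non-successful status.

```python
from types import SimpleNamespace


def cancel_all(cancel, job_ids):
    """Cancel each job; return the IDs whose status value is not 'SUCCESSFUL'."""
    not_cancelled = []
    for job_id in job_ids:
        result = cancel(job_id)
        if result.status.value != "SUCCESSFUL":
            not_cancelled.append(job_id)
    return not_cancelled


# Stand-in for manager.cancel: pretend job "12346" cannot be cancelled.
def fake_cancel(job_id):
    value = "FAILED" if job_id == "12346" else "SUCCESSFUL"
    return SimpleNamespace(status=SimpleNamespace(value=value))


leftover = cancel_all(fake_cancel, ["12345", "12346"])
print(leftover)
```

With a real manager the call would be `cancel_all(manager.cancel, ["12345", "12346"])`.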