schrodinger.application.desmond.queue module

class schrodinger.application.desmond.queue.Queue(hosts: str, max_job: int, max_retries: int, periodic_callback=None)

Bases: object

__init__(hosts: str, max_job: int, max_retries: int, periodic_callback=None)
Parameters
  • hosts – String passed to -HOST.

  • max_job – Maximum number of jobs to run simultaneously.

  • max_retries – Maximum number of times to retry a failed job.

  • periodic_callback – Function to call periodically as the jobs run. This can be used to handle the halt message for stopping a running workflow.
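
As an illustration of the periodic_callback parameter above, the following is a minimal sketch of a callback that reacts to a halt request. The names make_halt_callback, halt_event and stop_workflow are hypothetical, and the sketch assumes the callback is invoked with no arguments, which the reference above does not state.

```python
import threading


def make_halt_callback(halt_event, stop_workflow):
    """Build a zero-argument callback suitable for periodic polling.

    halt_event:    threading.Event set by whatever receives the halt message
                   (hypothetical; not part of the documented API).
    stop_workflow: callable that performs the actual shutdown (hypothetical).
    """

    def periodic_callback():
        # Only act when a halt has been requested since the last poll.
        if halt_event.is_set():
            stop_workflow()
            # Clear the flag so the shutdown is triggered once per request.
            halt_event.clear()

    return periodic_callback


# Stand-alone demonstration with stand-in objects.
halt_requested = threading.Event()
stop_calls = []
callback = make_halt_callback(halt_requested, lambda: stop_calls.append("stop"))

callback()            # no halt requested yet, nothing happens
halt_requested.set()  # a halt message arrives
callback()            # the shutdown runs once
callback()            # the flag was cleared, nothing happens
```

In real use, stop_workflow would presumably end up calling something like the queue's stop() method, but that wiring is an assumption.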

run()

Run jobs for all multisim stages.

Starts a separate JobDJ for each multisim stage:

queue.push(jobs)
queue.run()
    while jobs:  <---------------|
        jobdj.run()              |
        multisim_jobs.finish()   |
          stage.capture()        |
          next_stage.push()      |
          next_stage.release()   |
          queue.push(next_jobs) --
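
The loop in the diagram above can be imitated with plain Python objects. This is only a sketch of the control flow, assuming each job is a hypothetical (name, stage_index) tuple; it does not model JobDJ, or what stage.capture(), next_stage.push() and next_stage.release() actually do.

```python
from collections import deque


def run_stages(initial_jobs, num_stages):
    """Simulate the run loop: run every queued job, then queue follow-up jobs.

    Returns the list of batches in the order they were "run".
    """
    queue = deque(initial_jobs)          # queue.push(jobs)
    history = []
    while queue:                         # while jobs:
        batch = list(queue)
        queue.clear()
        history.append(batch)            # stands in for jobdj.run()
        for name, stage in batch:        # stands in for multisim_jobs.finish()
            if stage + 1 < num_stages:
                # Stands in for queue.push(next_jobs): the finished job
                # produces one job for the next stage.
                queue.append((name, stage + 1))
    return history


batches = run_stages([("a", 0), ("b", 0)], num_stages=3)
for index, batch in enumerate(batches):
    print(index, batch)
```

The loop ends when a batch finishes without producing any follow-up jobs, which mirrors the `while jobs` condition in the diagram.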
stop() → int

Attempt to stop the subjobs, but kill them if they do not stop in time.

Returns

Number of subjobs killed due to a failure to stop.
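
The stop-then-kill behavior described above can be sketched as follows. FakeSubjob, stop_or_kill, and the timeout and poll_interval values are all hypothetical stand-ins; the actual timing and job-control calls used by stop() are not documented here.

```python
import time


class FakeSubjob:
    """Stand-in for a subjob; honors_stop controls whether it stops when asked."""

    def __init__(self, honors_stop):
        self.honors_stop = honors_stop
        self.running = True

    def request_stop(self):
        if self.honors_stop:
            self.running = False

    def is_running(self):
        return self.running

    def kill(self):
        self.running = False


def stop_or_kill(subjobs, timeout=1.0, poll_interval=0.01):
    """Ask every subjob to stop, wait up to `timeout` seconds, kill the rest.

    Returns the number of subjobs that had to be killed.
    """
    for job in subjobs:
        job.request_stop()

    # Give the subjobs a bounded amount of time to stop on their own.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline and any(j.is_running() for j in subjobs):
        time.sleep(poll_interval)

    killed = 0
    for job in subjobs:
        if job.is_running():
            job.kill()
            killed += 1
    return killed


jobs = [FakeSubjob(honors_stop=True), FakeSubjob(honors_stop=False)]
num_killed = stop_or_kill(jobs, timeout=0.05)
```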

push(jobs: List[cmj.Job])

property running_jobs: List[schrodinger.application.desmond.queue.JobAdapter]

class schrodinger.application.desmond.queue.JobAdapter(*args, multisim_job=None, **kwargs)

Bases: schrodinger.job.queue.JobControlJob

__init__(*args, multisim_job=None, **kwargs)

Job constructor.

Parameters
  • command – The command that runs the job.

  • command_dir – The directory from which to run the command.

  • name – The name of the job.

  • max_retries – Number of allowed retries for this job. If this is set, it is never overridden by the SCHRODINGER_MAX_RETRIES environment variable. If it is not set, the value of max_retries defined in JobDJ is used, and SCHRODINGER_MAX_RETRIES can be used to override this value at runtime. To prevent this job from being restarted altogether, set max_retries to zero.

  • timeout – Timeout (in seconds) after which the job will be killed. If None, the job is allowed to run indefinitely.

  • launch_timeout – Timeout (in seconds) for the job launch process to complete. If None, a default timeout is used for jobserver and old jobcontrol jobs (see get_default_timeout()), unless a value is passed for the job timeout parameter and that value is not greater than the default timeout.

  • launch_env_variables – A dictionary of environment variables to add when the jobcontrol job is launched. Each key is the name of a variable to set, and each value is the value to assign to it. These are added to any environment variables already present, but removed after the job has been launched.

  • kwargs – Additional keyword arguments. Provided for consistency of interface in subclasses.

  • resource_requirement – Whether the job will require special compute resources, such as GPU.

  • license_requirement – List of license tokens required for the job, used for license checking when the SMART_LICENSE_CHECK feature flag is turned on. This is useful for license checking the first job of the smart distribution launched directly to the localhost without canceling from the queue. The license requirements are not known until the job is launched. Each license token is in the form ‘TOKEN’ or ‘TOKEN:n’, where TOKEN is the name of the license and n is the number of tokens.

  • smart_dist_eligible – Whether this job can be submitted via smart distribution (True) or not (False). This setting only comes into play if all other requirements (such as the resource_requirement, license requirement, number of processors, and smart distribution being turned on) are met. In other words, setting it to True will not force the job to run via smart distribution, but setting it to False will ensure that it does not.
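
Two rules in the parameter list above are easier to follow in code: the precedence used to pick max_retries, and the ‘TOKEN’ / ‘TOKEN:n’ license token form. The helpers below are a hypothetical sketch of those rules, not the library's implementation. In particular, treating a bare ‘TOKEN’ as a count of 1 is an assumption, since the text does not say what the default is.

```python
def resolve_max_retries(job_max_retries, jobdj_max_retries, environ):
    """Pick the effective retry count following the documented precedence.

    job_max_retries:   value set on the job, or None if not set.
    jobdj_max_retries: default defined on the JobDJ.
    environ:           mapping of environment variables (e.g. os.environ).
    """
    # A value set on the job itself is never overridden.
    if job_max_retries is not None:
        return job_max_retries
    # Otherwise the environment variable may override the JobDJ default.
    env_value = environ.get("SCHRODINGER_MAX_RETRIES")
    if env_value is not None:
        return int(env_value)
    return jobdj_max_retries


def parse_license_token(spec):
    """Split 'TOKEN' or 'TOKEN:n' into (name, count).

    Assumption: a bare 'TOKEN' means a count of 1.
    """
    name, sep, count = spec.partition(":")
    return name, (int(count) if sep else 1)


env = {"SCHRODINGER_MAX_RETRIES": "5"}
job_value_wins = resolve_max_retries(0, 3, env)     # job disables restarts
env_overrides = resolve_max_retries(None, 3, env)   # env overrides JobDJ
jobdj_default = resolve_max_retries(None, 3, {})    # nothing overrides JobDJ
```

Passing the environment as a plain mapping keeps the helper easy to test without touching the real process environment.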

getCommand() → List[str]

Return the command used to run this job.

maxFailuresReached(**kwargs)

Print an error summary, including the last 20 lines from each log file in the LogFiles list of the job record.
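
The "last 20 lines from each log file" part of the summary can be sketched with the standard library alone. The function names and the header format below are hypothetical; the sketch takes a plain list of file paths rather than reading the LogFiles list from a job record.

```python
from collections import deque


def tail_lines(path, n=20):
    """Return the last n lines of a text file, without trailing newlines."""
    with open(path, errors="replace") as fh:
        # A bounded deque keeps only the final n lines while streaming the file.
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]


def format_error_summary(log_files, n=20):
    """Build a summary with a header and the last n lines of each log file."""
    parts = []
    for path in log_files:
        parts.append(f"==> {path} (last {n} lines) <==")
        parts.extend(tail_lines(path, n))
    return "\n".join(parts)
```

Streaming through a deque with maxlen avoids loading a large log file into memory just to read its end.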