schrodinger.application.phase.packages.phase_screen_driver_utils module

Module with functionality used by phase_screen_driver.py.

Copyright Schrodinger LLC, All Rights Reserved.

class schrodinger.application.phase.packages.phase_screen_driver_utils.SourceFormat(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: enum.Enum

file = 1
database = 2
project = 3

schrodinger.application.phase.packages.phase_screen_driver_utils.add_hidden_options(parser)

Adds options that the user doesn’t need to know about.

Parameters

parser (argparse.ArgumentParser) – Argument parser object.

schrodinger.application.phase.packages.phase_screen_driver_utils.add_jobcontrol_options(parser)

Adds job control options to the provided parser.

Parameters

parser (argparse.ArgumentParser) – Argument parser object.

schrodinger.application.phase.packages.phase_screen_driver_utils.add_database_options(parser)

Adds database screening options to the provided parser.

Parameters

parser (argparse.ArgumentParser) – Argument parser object.

schrodinger.application.phase.packages.phase_screen_driver_utils.add_matching_options(parser)

Adds matching options to the provided parser.

Parameters

parser (argparse.ArgumentParser) – Argument parser object.

schrodinger.application.phase.packages.phase_screen_driver_utils.add_reporting_options(parser)

Adds reporting options to the provided parser.

Parameters

parser (argparse.ArgumentParser) – Argument parser object.

schrodinger.application.phase.packages.phase_screen_driver_utils.add_scoring_options(parser)

Adds scoring/filtering options to the provided parser.

Parameters

parser (argparse.ArgumentParser) – Argument parser object.
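
As an illustration of the add_*_options pattern, here is a minimal sketch built only on the standard argparse module. The function names ending in _sketch and the option names -debug_level and -match are hypothetical; they are not taken from phase_screen_driver.py, and the real helpers may register different options.

```python
import argparse


def add_hidden_options_sketch(parser):
    # help=argparse.SUPPRESS keeps the option out of the usage and help text.
    # "-debug_level" is a hypothetical option name used only for illustration.
    parser.add_argument("-debug_level", type=int, default=0,
                        help=argparse.SUPPRESS)


def add_matching_options_sketch(parser):
    # An argument group gives related options their own help section.
    group = parser.add_argument_group("Matching Options")
    # "-match" is likewise a hypothetical option name.
    group.add_argument("-match", type=int, default=None,
                       help="Minimum number of sites to match.")


def build_parser_sketch():
    # Each helper mutates the parser in place, so they can be chained freely.
    parser = argparse.ArgumentParser(prog="screen_sketch")
    add_hidden_options_sketch(parser)
    add_matching_options_sketch(parser)
    return parser


parser = build_parser_sketch()
args = parser.parse_args(["-match", "4", "-debug_level", "2"])
help_text = parser.format_help()
```

The hidden option is still parsed normally; it is only omitted from the generated help.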

schrodinger.application.phase.packages.phase_screen_driver_utils.combine_hit_files(args, subjobs)

Combines hit files for the supplied subjobs.

Parameters
  • args (argparse.Namespace) – Command line arguments

  • subjobs (list(str)) – Subjob names

schrodinger.application.phase.packages.phase_screen_driver_utils.distribute_hypos(hypos, num_zip_files, jobname)

Distributes the supplied hypotheses equally over the indicated number of zip files and returns the names of those zip files.

Parameters
  • hypos (list(PhpHypoAdaptor)) – Hypotheses

  • num_zip_files (int) – Number of zip files to create

  • jobname (str) – Job name

Returns

Names of zip files

Return type

list(str)
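
The equal distribution described above could be sketched as follows. This is not the module's implementation: it works on plain Python lists rather than PhpHypoAdaptor objects, writes no zip files, and assumes that no more bins are created than there are items.

```python
def partition_evenly(items, num_bins):
    """Split items into contiguous chunks whose sizes differ by at most one."""
    # Assumption: never create more bins than there are items.
    num_bins = min(num_bins, len(items))
    if num_bins <= 0:
        return []
    base, extra = divmod(len(items), num_bins)
    chunks = []
    start = 0
    for i in range(num_bins):
        # The first `extra` chunks each receive one additional item.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


chunks = partition_evenly(["h1", "h2", "h3", "h4", "h5", "h6", "h7"], 3)
```

With seven items and three bins, the chunk sizes are 3, 2 and 2.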

schrodinger.application.phase.packages.phase_screen_driver_utils.get_common_args(args)

Returns a command containing arguments that are common to all subjobs.

Parameters

args (argparse.Namespace) – Command line options

Returns

Command with common arguments

Return type

list(str)

schrodinger.application.phase.packages.phase_screen_driver_utils.get_hypos(hypo_file)

Reads one or more hypotheses from a .phypo or .zip file.

Parameters

hypo_file – A .phypo or .zip file

Returns

list of one or more hypotheses

Return type

list(PhpHypoAdaptor)

schrodinger.application.phase.packages.phase_screen_driver_utils.get_min_sites(hypo, user_match)

Returns the minimum number of sites that must be matched in the supplied hypothesis. This may come from user_match or from the PHASE_MIN_SITES property in the hypothesis. If neither is specified, it will be the total number of sites in the hypothesis.

Parameters
  • hypo (PhpHypoAdaptor) – pharmacophore hypothesis

  • user_match – User-specified minimum number of sites or None

Returns

Minimum number of sites to match

Return type

int
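
The fallback logic can be sketched with plain values standing in for the hypothesis. The documentation does not state which source wins when both user_match and PHASE_MIN_SITES are present; the sketch assumes the user-specified value takes precedence, which is an assumption and not confirmed behavior.

```python
def min_sites_sketch(total_sites, hypo_min_sites=None, user_match=None):
    """Pick the minimum number of sites to match.

    total_sites:    total number of sites in the hypothesis
    hypo_min_sites: stand-in for the PHASE_MIN_SITES property, or None
    user_match:     user-specified minimum, or None
    """
    # Assumption: a user-specified value overrides the hypothesis property.
    if user_match is not None:
        return user_match
    if hypo_min_sites is not None:
        return hypo_min_sites
    # Neither is specified: every site must be matched.
    return total_sites
```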

schrodinger.application.phase.packages.phase_screen_driver_utils.get_num_subjobs(args)

Returns the number of subjobs requested on the command line via the -NJOBS or -HOST option.

Parameters

args (argparse.Namespace) – Command line options

Returns

Number of subjobs

Return type

int

schrodinger.application.phase.packages.phase_screen_driver_utils.get_parser()

Creates argparse.ArgumentParser with supported command line options.

Returns

Argument parser object

Return type

argparse.ArgumentParser

schrodinger.application.phase.packages.phase_screen_driver_utils.get_source_files(source)

Returns the names of the files/databases/zipped projects to be screened, taking proper account of whether the current process is running under job control.

Parameters

source (str) – A legal source of structures to screen

Returns

Names of files/databases/zipped projects to screen

Return type

list(str)

schrodinger.application.phase.packages.phase_screen_driver_utils.get_source_format(source)

Returns the format of source as a SourceFormat object.

Parameters

source (str) – The name of a file, database or zipped project

Returns

The format of source

Return type

SourceFormat
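
One plausible way to classify a source is by file name suffix. The sketch below mirrors the SourceFormat members listed earlier, but the suffixes ".phdb" and ".phprj.zip" are placeholders chosen for illustration; the rules actually used by get_source_format are not documented here.

```python
import enum


class SourceFormat(enum.Enum):
    # Local mirror of the documented enum so the sketch is self-contained.
    file = 1
    database = 2
    project = 3


# Hypothetical suffix table; the real module may use different rules.
_SUFFIX_TO_FORMAT = (
    (".phdb", SourceFormat.database),
    (".phprj.zip", SourceFormat.project),
)


def guess_source_format(source):
    """Classify a source name by suffix, defaulting to a structure file."""
    lowered = source.lower()
    for suffix, source_format in _SUFFIX_TO_FORMAT:
        if lowered.endswith(suffix):
            return source_format
    return SourceFormat.file
```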

schrodinger.application.phase.packages.phase_screen_driver_utils.prepend_hypos(args)

Prepends pharmacophore hypotheses to the hit file.

Parameters

args (argparse.Namespace) – Command line options

schrodinger.application.phase.packages.phase_screen_driver_utils.remove_output_files(args)

Removes output files that would be created in the launch directory by the parent job.

Parameters

args (argparse.Namespace) – Command line arguments

schrodinger.application.phase.packages.phase_screen_driver_utils.setup_db_screen(args, db_paths)

Does setup for a distributed database screen.

Parameters
  • args (argparse.Namespace) – Command line options

  • db_paths (list(str)) – Databases to screen

Returns

list of subjob commands

Return type

list(list(str))

schrodinger.application.phase.packages.phase_screen_driver_utils.setup_distributed_screen(args)

Does all the setup required to launch distributed subjobs. This includes splitting input files or database subsets, and creating the files <subjob>_inputs.list, which contain the names of the input files for each subjob. Returns a list of subjob commands that can be supplied directly to JobDJ.addJob. The number of commands may be larger than the number of CPUs requested if the -NJOBS option is used to divide the work over a larger number of work units. Conversely, the number of commands may be smaller than requested if the provided source(s) of structures cannot be subdivided as requested (e.g., 2 multi-conformer files cannot be split over more than 2 subjobs).

Parameters

args (argparse.Namespace) – Command line options

Returns

list of subjob commands

Return type

list(list(str))
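
The per-subjob list files mentioned above might look like the following sketch. The one-name-per-line layout is an assumption, and both helper functions are hypothetical; only the <subjob>_inputs.list naming comes from the description.

```python
import os


def write_inputs_list(subjob, input_files, directory):
    """Write <subjob>_inputs.list with one input file name per line."""
    path = os.path.join(directory, f"{subjob}_inputs.list")
    with open(path, "w") as handle:
        for name in input_files:
            handle.write(name + "\n")
    return path


def read_inputs_list(path):
    """Read the names back, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]
```

A subjob would then read its own list file to find the inputs it is responsible for.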

schrodinger.application.phase.packages.phase_screen_driver_utils.setup_fixed_file_screen(args, file_names)

Does setup for a distributed file screen where multiple conformers per molecule are present and thus the files cannot be split. Note that the maximum number of subjobs will not exceed the number of input files, and the load balancing may be less than optimal if the input files differ significantly in their numbers of molecules and/or conformers.

Parameters
  • args (argparse.Namespace) – Command line options

  • file_names (list(str)) – Files to screen with runtime paths

Returns

list of subjob commands

Return type

list(list(str))
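
The cap on the number of subjobs can be illustrated with a small sketch that assigns whole files to subjobs in round-robin order. The subjob naming scheme and the round-robin choice are assumptions; the real function may balance the load differently.

```python
def assign_whole_files(file_names, num_subjobs, jobname):
    """Assign unsplittable files to subjobs, never exceeding the file count."""
    # Files cannot be split, so there can be at most one subjob per file.
    num_subjobs = max(1, min(num_subjobs, len(file_names)))
    groups = [[] for _ in range(num_subjobs)]
    for index, name in enumerate(file_names):
        groups[index % num_subjobs].append(name)
    # Hypothetical subjob naming: <jobname>_sub_<n>.
    return {f"{jobname}_sub_{i + 1}": group for i, group in enumerate(groups)}


assignment = assign_whole_files(["a.maegz", "b.maegz"], 4, "screen")
```

Requesting four subjobs for two files yields only two subjobs, one file each. Because files are assigned whole, a subjob that receives a much larger file will take correspondingly longer.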

schrodinger.application.phase.packages.phase_screen_driver_utils.setup_project_screen(args, project_names)

Does setup for a distributed screen of zipped projects. This workflow is used only by phase_find_common, where a project of actives and a project of decoys are screened against the top-n pharmacophore hypotheses found by the common pharmacophore algorithm. Because we can’t unzip a project and hope that its database lands on a cross-mounted disk, we can’t readily divide the record numbers of the project database over multiple subjobs, as we do for a standard database screen. The most practical approach is to divide the hypotheses equally over the subjobs and have each subjob screen its own local copies of the unzipped project databases.

Parameters
  • args (argparse.Namespace) – Command line options

  • project_names (list(str)) – Zipped projects to screen with runtime paths

Returns

list of subjob commands

Return type

list(list(str))

schrodinger.application.phase.packages.phase_screen_driver_utils.setup_split_file_screen(args, file_names)

Does setup for a distributed file screen with splitting of the input files so that each subjob receives a single file with approximately the same number of structures as the other subjobs.

Parameters
  • args (argparse.Namespace) – Command line options

  • file_names (list(str)) – Files to screen with runtime paths

Returns

list of subjob commands

Return type

list(list(str))

schrodinger.application.phase.packages.phase_screen_driver_utils.validate_args(args)

Checks the validity of command line options.

Parameters

args (argparse.Namespace) – Command line options

Returns

tuple of validity and error message if not valid

Return type

bool, str

schrodinger.application.phase.packages.phase_screen_driver_utils.validate_dbsites(args)

Checks the legality of the -dbsites option with respect to all databases and hypotheses. Should be called only after the job is running on the remote host.

Parameters

args (argparse.Namespace) – Command line options

Returns

tuple of validity and error message if invalid

Return type

bool, str

schrodinger.application.phase.packages.phase_screen_driver_utils.validate_hypo(args)

Checks the validity of the hypothesis or hypotheses.

Parameters

args (argparse.Namespace) – Command line options

Returns

tuple of validity and error message if not valid

Return type

bool, str

schrodinger.application.phase.packages.phase_screen_driver_utils.validate_source(args)

Checks the validity of the source of structures to screen and the validity of the command line options w.r.t. the source type.

Parameters

args (argparse.Namespace) – Command line options

Returns

tuple of validity and error message if not valid

Return type

bool, str
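
Since each validate_* function returns a (bool, str) pair, a caller can chain them and stop at the first failure. The sketch below uses stub validators and a hypothetical "source" attribute in place of the real functions and options, so it shows only the calling pattern, not the module's actual checks.

```python
import argparse


def run_validators(args, validators):
    """Call each validator in turn and return the first failure, if any."""
    for validate in validators:
        ok, message = validate(args)
        if not ok:
            return False, message
    return True, ""


# Stub validators standing in for validate_args, validate_source, etc.
def check_has_source(args):
    if getattr(args, "source", None):
        return True, ""
    return False, "No source of structures was specified"


def check_always_ok(args):
    return True, ""


good = run_validators(argparse.Namespace(source="ligands.maegz"),
                      [check_has_source, check_always_ok])
bad = run_validators(argparse.Namespace(source=None),
                     [check_has_source, check_always_ok])
```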