schrodinger.pipeline.stages.filtering module

Stages related to filtering structures.

Main stage uses $SCHRODINGER/utilities/ligfilter.

Copyright Schrodinger, LLC. All rights reserved.

class schrodinger.pipeline.stages.filtering.LigFilterStage(*args, **kwargs)

Bases: schrodinger.pipeline.stage.Stage

Stage interface for the LigFilter utility.

The keywords specific to this stage are:

FILTER_FILE File name for a Ligfilter criteria file.

CONDITIONS String list of Ligfilter criteria. Ignored if a FILTER_FILE is specified.

The stage takes one set of input structure files and generates one set of corresponding output files.

__init__(*args, **kwargs)

See class docstring.

setupJobs()

Sets up LigFilter jobs, which are distributed via JobDJ. There will be one Ligfilter job for each input file. This method clears the working directory of previous output and log files (if any). Raises a RuntimeError if there is no FILTER_FILE and no CONDITIONS.
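The keyword precedence described above can be sketched in plain Python. This is only an illustration, not the stage's actual code: the function name resolve_criteria and the dictionary-based keyword access are assumptions made for the example, and the criteria strings in the usage note are placeholders rather than real Ligfilter syntax.

```python
def resolve_criteria(keywords):
    """Decide which source of Ligfilter criteria a job would use.

    `keywords` is a plain dict standing in for the stage keywords
    (hypothetical; the real stage reads them from its own configuration).
    Returns ("file", path) or ("conditions", list_of_strings).
    """
    filter_file = keywords.get("FILTER_FILE")
    conditions = keywords.get("CONDITIONS") or []

    if filter_file:
        # FILTER_FILE takes precedence; CONDITIONS is ignored if both are set.
        return ("file", filter_file)
    if conditions:
        return ("conditions", list(conditions))
    # Neither keyword was given, which the documentation describes as an error.
    raise RuntimeError("Either FILTER_FILE or CONDITIONS must be specified")
```

For example, resolve_criteria({"FILTER_FILE": "my.lff", "CONDITIONS": ["<criterion>"]}) selects the file and silently drops the condition list.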

processJobOutputs()

Reports the number of structures that passed the stage. Raises a RuntimeError if any Ligfilter .log file is missing the summary information.

operate()

Perform an operation on the input files. There are setup, running, and post-processing steps, and the stage records its current status so that it can be restarted in that step if there is a failure. Raises a RuntimeError if the JobDJ run() method fails, or if the stage finishes with an improper status.
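The restart behavior described above can be pictured as a small state machine. The sketch below is an assumption-laden illustration, not the Stage implementation: the step names, the run_stage function, and the use of a dict to hold the recorded status are all hypothetical.

```python
STEPS = ("setup", "running", "processing")  # hypothetical step names


def run_stage(state, actions):
    """Run the remaining steps, resuming from the status stored in `state`.

    `state` is a dict that persists between attempts; `actions` maps each
    step name to a zero-argument callable. Both are stand-ins for whatever
    the real stage uses to record status and do its work.
    """
    status = state.get("status", STEPS[0])
    if status == "completed":
        return state
    if status not in STEPS:
        raise RuntimeError("Improper stage status: %r" % (status,))

    for step in STEPS[STEPS.index(status):]:
        state["status"] = step  # recorded before the step runs
        actions[step]()         # if this raises, status stays at this step
    state["status"] = "completed"
    return state
```

If the "running" action raises on the first attempt, state["status"] remains "running", so a second call to run_stage skips setup and resumes from that step.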

class schrodinger.pipeline.stages.filtering.SubSetStage(*args, **kwargs)

Bases: schrodinger.pipeline.stage.Stage

Stage for making a subset of the input files based on a list of titles or other property.

The keywords specific to this stage are:

PROPERTY Property to filter on (default s_m_title).

VALUE_FILE File containing a list of values to keep (one per line).

The stage takes one set of input structure files and generates one set of corresponding output files.

NOTE: SMILES format files can only be filtered on the title.

__init__(*args, **kwargs)

See class docstring.

filterSmilesFile(ligfile, outfile)

Filter the file <ligfile> by title, using the values in VALUE_FILE, and add the output file to the self.output_files list.

filterFile(ligfile, outfile)

Filter the file <ligfile> based on PROPERTY and VALUE_FILE, and add the output file to the self.output_files list.
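The subset logic can be illustrated without any Schrodinger imports. In this sketch, structures are represented as plain dicts of properties, and the helper names read_value_list and filter_records are hypothetical; comparing values as text is also an assumption, since the documentation does not say how non-string properties are matched.

```python
def read_value_list(lines):
    """Collect the non-empty, stripped lines as the set of values to keep."""
    return {line.strip() for line in lines if line.strip()}


def filter_records(records, keep_values, prop="s_m_title"):
    """Return the records whose `prop` value appears in `keep_values`.

    Each record is a dict of property name -> value, standing in for a
    structure. Values are compared as text (an assumption of this sketch).
    """
    kept = []
    for rec in records:
        value = rec.get(prop)
        if value is not None and str(value) in keep_values:
            kept.append(rec)
    return kept
```

A value file with the lines "aspirin" and "caffeine" would keep only the records whose s_m_title is one of those two strings.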

readValueList()

operate()

Perform the filtering operation on the input files.

class schrodinger.pipeline.stages.filtering.DrugLikeSplitStage(*args, **kwargs)

Bases: schrodinger.pipeline.stage.Stage

Stage for splitting the input ligand structures into 3 groups, based on the specified criteria:

Drug-like ligands: -dl.maegz

Coarse ligands: -co.maegz

Left-over ligands: -lo.maegz

If at least one variant of a root matches all criteria for a drug-like ligand, then all variants of that root are included in the dl.maegz output file.

If none match, but at least one variant matches all criteria for a coarse ligand, then all variants are included in the co.maegz file.

Otherwise all variants are included in the lo.maegz file.

All variants for the same root must be listed in blocks (next to each other) in the input file and have the same title.

Also labels variants by setting the s_vsw_variant field to <title>-#, where # is a variant number (1 to n).
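The per-root classification rule can be sketched as follows. This is an illustration under stated assumptions rather than the stage's code: structures are plain dicts, the two predicates are supplied by the caller (the real criteria are not given here), and the function name split_roots and the bucket keys are hypothetical.

```python
from itertools import groupby

DRUG_LIKE, COARSE, LEFT_OVER = "dl", "co", "lo"  # hypothetical bucket keys


def split_roots(records, is_drug_like, is_coarse):
    """Assign every variant of a root to one bucket, deciding per root.

    `records` must list the variants of each root consecutively, with a
    shared "s_m_title". Each record is labelled in place with
    "s_vsw_variant" set to "<title>-<n>", n counting from 1 within the root.
    """
    buckets = {DRUG_LIKE: [], COARSE: [], LEFT_OVER: []}
    for title, block in groupby(records, key=lambda r: r["s_m_title"]):
        variants = list(block)
        for n, rec in enumerate(variants, start=1):
            rec["s_vsw_variant"] = "%s-%d" % (title, n)

        # One qualifying variant is enough to place the whole root.
        if any(is_drug_like(r) for r in variants):
            group = DRUG_LIKE
        elif any(is_coarse(r) for r in variants):
            group = COARSE
        else:
            group = LEFT_OVER
        buckets[group].extend(variants)
    return buckets
```

Because groupby only merges adjacent items, two blocks with the same title that are not next to each other would be treated as separate roots, which is why the input ordering requirement matters.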

__init__(*args, **kwargs)

This is the Stage class. Derive your own class from it.

Parameters
  • stagename – full name for this stage (<jobname>-<stagename>)

  • specs – ConfigObj specification for the supported keywords

  • allow_extra_keywords – Whether to allow keywords that are not in the specification.

  • cleanup – Whether to remove intermediate files

  • inpipeline – Whether the stage is running within a Python Pipeline. If the stage is manually created, do NOT set this flag. Python Pipeline will set it as needed.

  • driver_dir – Directory in which the driver is running.

operate()

Perform an operation on the input files.

qualify(st)

Returns module constant classification for the structure: Drug-like, coarse, etc.

writeRoot(root_sts, current_root, current_root_qualification)

class schrodinger.pipeline.stages.filtering.MergeDuplicatesStage(*args, **kwargs)

Bases: schrodinger.pipeline.stage.Stage

Stage for removing duplicate variants within each root (compound) based on unique SMILES strings.

NOTE: All variants of each root must be in consecutive order in the input.
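A minimal sketch of per-root duplicate removal is shown below. It makes several assumptions that the documentation does not confirm: structures are plain dicts with hypothetical "root" and "smiles" keys, the SMILES strings are already in a canonical (unique) form, and the first variant seen for each SMILES is the one kept.

```python
from itertools import groupby


def merge_duplicates(records):
    """Drop variants that repeat a SMILES string within the same root.

    `records` must list all variants of a root consecutively. The same
    SMILES under two different roots is kept for both, since the
    comparison is reset at each root boundary.
    """
    kept = []
    for _root, block in groupby(records, key=lambda r: r["root"]):
        seen = set()  # SMILES already kept for this root
        for rec in block:
            if rec["smiles"] not in seen:
                seen.add(rec["smiles"])
                kept.append(rec)
    return kept
```

The consecutive-order requirement follows from resetting the seen set at each block: variants of one root split across two blocks would not be compared with each other.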

__init__(*args, **kwargs)

This is the Stage class. Derive your own class from it.

Parameters
  • stagename – full name for this stage (<jobname>-<stagename>)

  • specs – ConfigObj specification for the supported keywords

  • allow_extra_keywords – Whether to allow keywords that are not in the specification.

  • cleanup – Whether to remove intermediate files

  • inpipeline – Whether the stage is running within a Python Pipeline. If the stage is manually created, do NOT set this flag. Python Pipeline will set it as needed.

  • driver_dir – Directory in which the driver is running.

operate()

Perform an operation on the input files.

class schrodinger.pipeline.stages.filtering.GeneratePropsStage(*args, **kwargs)

Bases: schrodinger.pipeline.stage.Stage

Ev:85070

For each structure in the input set, runs: $SCHRODINGER/run generate_ligfilter_properties.py, which generates properties specified via the ligfilter.predefined_function_dict.

__init__(*args, **kwargs)

See class docstring.

setupJobs()

Sets up subjobs, which are distributed via JobDJ. There will be one subjob per input file.

processJobOutputs()

Analyzes the output files from the subjobs. Raises a RuntimeError if there are any errors with the subjobs.

operate()

Perform an operation on the input files. There are setup, running, and post-processing steps, and the stage records its current status so that it can be restarted in that step if there is a failure. Raises a RuntimeError if the JobDJ run() method fails, or if the stage finishes with an improper status.

class schrodinger.pipeline.stages.filtering.ChargeFilterStage(*args, **kwargs)

Bases: schrodinger.pipeline.stage.Stage

Stage-based class for filtering a set of structure files by total charge.

MIN_CHARGE and MAX_CHARGE are the two keywords specific to this stage. If a structure has a total charge within [MIN_CHARGE,MAX_CHARGE] (inclusive), it is retained; otherwise, the structure is filtered out.

The stage takes one input structure file set and generates one set of corresponding output structure files.
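The inclusive range test can be written in one comparison. The sketch below uses plain dicts with a hypothetical "charge" key in place of real structures, and the function name filter_by_charge is an assumption made for the example.

```python
def filter_by_charge(records, min_charge, max_charge):
    """Keep records whose total charge lies in [min_charge, max_charge].

    Both bounds are inclusive, so a structure whose charge equals
    min_charge or max_charge is retained.
    """
    return [r for r in records if min_charge <= r["charge"] <= max_charge]
```

With MIN_CHARGE = -1 and MAX_CHARGE = 1, structures with total charges of -1, 0, and 1 are retained, while those with -2 or 2 are filtered out.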

__init__(*args, **kwargs)

See class docstring.

operate()

Read all the structures in the input files. If a structure’s total charge is between MIN_CHARGE and MAX_CHARGE (inclusive), write it to the corresponding output file. Raises an IOError if there is a problem reading an input file or writing an output file, and raises a SystemExit if there are no output structures.