schrodinger.pipeline.stages.gencodes module

Stage for generating and storing codes for ligands/compounds and variants.

It also recombines the input ligands into a number of subjobs specified via the -NJOBS argument. (If -adjust is used, this number is also adjusted based on the MIN_SUBJOB_STS and MAX_SUBJOB_STS keywords.)

Copyright Schrodinger, LLC. All rights reserved.

class schrodinger.pipeline.stages.gencodes.RecombineStage(*args, **kwargs)

Bases: schrodinger.pipeline.stage.Stage

Stage for generating compound and variant codes, which are stored as properties.

It also recombines the input ligands into a number of subjobs specified via the -NJOBS argument. (If -adjust is used, this number is also adjusted based on the MIN_SUBJOB_STS and MAX_SUBJOB_STS keywords.)

See the specs for the keywords specific to this stage.

A ligand/compound is identified by the UNIQUEFIELD property, and structures with the same value of this property are indexed as variants. If UNIQUEFIELD is not specified, each structure is treated as a compound (i.e., there are no variants), and the label for the compound is just the structure index number as all the files and structures are read.

The output structures will have their OUTCOMPOUNDFIELD set to their compound label (either the UNIQUEFIELD property value or the structure index). If OUTCOMPOUNDFIELD is “s_m_title”, the original titles will be saved to the ‘s_pipeline_title_backup’ property.

The OUTVARIANTFIELD property (‘s_pipeline_variant’ by default) is set to “<compound_label>-<variant_index>”.
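The labeling rules above can be illustrated with a minimal sketch in plain Python. It is not the stage's implementation: structures are modeled as property dictionaries, the function name assign_codes and the example property s_user_id are hypothetical, and starting both the structure index and the variant index at 1 is an assumption that the text does not confirm.

```python
from collections import defaultdict


def assign_codes(structures, compound_field, unique_field=None,
                 variant_field="s_pipeline_variant"):
    """Label each property dict with a compound code and a variant code.

    Hypothetical helper for illustration only. `structures` is a list of
    dicts standing in for structure properties.
    """
    variant_counts = defaultdict(int)
    # Assumption: structure indices start at 1.
    for index, props in enumerate(structures, start=1):
        if unique_field is None:
            # No UNIQUEFIELD: every structure is its own compound.
            label = str(index)
        else:
            if unique_field not in props:
                raise RuntimeError(
                    f"Structure {index} has no '{unique_field}' property")
            label = str(props[unique_field])

        # Assumption: variant indices start at 1 within each compound.
        variant_counts[label] += 1

        # Preserve the original title before it is overwritten.
        if compound_field == "s_m_title":
            props["s_pipeline_title_backup"] = props.get("s_m_title", "")

        props[compound_field] = label
        props[variant_field] = f"{label}-{variant_counts[label]}"
    return structures


ligands = [
    {"s_m_title": "aspirin", "s_user_id": "X"},
    {"s_m_title": "aspirin tautomer", "s_user_id": "X"},
    {"s_m_title": "caffeine", "s_user_id": "Y"},
]
assign_codes(ligands, compound_field="s_m_title", unique_field="s_user_id")
```

In this sketch the second ligand shares the s_user_id value of the first, so it is indexed as the second variant of the same compound, and its original title is kept in ‘s_pipeline_title_backup’.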

The stage takes one input structure file set and generates one set of output structure files, each containing about 100,000 structures.

__init__(*args, **kwargs)

See class docstring.

operate()

Generates ligand compound and variant codes and saves the structures into a smaller number of files (at most 100,000 structures per file, except that a new output file must start with a new compound). Adds the compound label, the compound-and-variant label, and the original title to the output structures as properties. Raises a RuntimeError if the output file format is invalid, if there is a problem reading an input file or writing an output file, or if the UNIQUEFIELD property doesn’t exist for a structure.
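One way to read the file-splitting rule is that a file is closed only at a compound boundary, so a file may run past the limit in order to keep all variants of a compound together. The sketch below illustrates that reading. The function name split_by_compound is hypothetical, and the sketch assumes that variants of the same compound arrive next to each other in the input.

```python
def split_by_compound(records, max_per_file=100_000):
    """Group (compound_label, structure) pairs into per-file lists.

    Hypothetical helper for illustration only. A new file is started once
    the current one has reached max_per_file AND the next record belongs
    to a different compound, so every file begins with a new compound.
    """
    files = []
    current = []
    previous_label = None
    for label, structure in records:
        file_is_full = len(current) >= max_per_file
        if file_is_full and label != previous_label:
            files.append(current)
            current = []
        current.append((label, structure))
        previous_label = label
    if current:
        files.append(current)
    return files


# Small limit so the behavior is easy to follow.
records = [("A", 1), ("A", 2), ("A", 3), ("B", 4), ("C", 5)]
chunks = split_by_compound(records, max_per_file=2)
```

With a limit of 2, the first chunk holds all three variants of compound A, and the second chunk starts with compound B.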

class schrodinger.pipeline.stages.gencodes.RestoreTitlesStage(*args, **kwargs)

Bases: schrodinger.pipeline.stage.Stage

Stage for copying titles for each input ligand from the ‘s_pipeline_title_backup’ property to the ‘s_m_title’ property.

The stage takes one input structure file set and generates one set of output structure files, each containing about 100,000 structures.

__init__(*args, **kwargs)

See class docstring.

operate()

For each structure, sets the value of the DESTINATION_FIELD to the value of the SOURCE_FIELD. If no such value exists, raises a RuntimeError.
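The copy step can be sketched as follows, again with a property dictionary standing in for a structure. The function name restore_title is hypothetical, and the default field names are taken from the class description above, not from the stage's actual keyword defaults.

```python
def restore_title(props,
                  source_field="s_pipeline_title_backup",
                  destination_field="s_m_title"):
    """Copy the source property into the destination property.

    Hypothetical helper for illustration only. Raises RuntimeError when
    the source property is missing, mirroring the documented behavior.
    """
    if source_field not in props:
        raise RuntimeError(f"Structure has no '{source_field}' property")
    props[destination_field] = props[source_field]
    return props


ligand = {"s_m_title": "X", "s_pipeline_title_backup": "aspirin"}
restore_title(ligand)
```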