schrodinger.protein.tasks.clustal module

schrodinger.protein.tasks.clustal.get_clustal_path()

Returns the path to the ClustalW executable.

This function attempts to find the clustalw2 executable in the following locations, in order:

  1. Maestro bin directory based on MAESTRO_EXEC env var.

  2. Maestro bin directory based on $SCHRODINGER/maestro-v*/bin/* path.

  3. User-defined location (CLUSTALW2 env var).

Return type

str or None

Returns

Path to the ClustalW executable file, or None if the executable could not be located.
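
The search order described above can be sketched in plain Python. This is an illustration only, not the implementation of get_clustal_path(): the helper name find_clustalw2 is hypothetical, and treating MAESTRO_EXEC as a directory that directly contains the executable is an assumption.

```python
import glob
import os


def find_clustalw2(env=None, exe_name="clustalw2"):
    """Hypothetical helper mirroring the documented lookup order."""
    env = os.environ if env is None else env
    candidates = []

    # 1. Maestro bin directory taken from MAESTRO_EXEC
    #    (assumed here to be a directory path).
    maestro_exec = env.get("MAESTRO_EXEC")
    if maestro_exec:
        candidates.append(os.path.join(maestro_exec, exe_name))

    # 2. Any directory matching $SCHRODINGER/maestro-v*/bin/*.
    schrodinger = env.get("SCHRODINGER")
    if schrodinger:
        pattern = os.path.join(schrodinger, "maestro-v*", "bin", "*", exe_name)
        candidates.extend(sorted(glob.glob(pattern)))

    # 3. User-defined location from CLUSTALW2.
    user_path = env.get("CLUSTALW2")
    if user_path:
        candidates.append(user_path)

    # Return the first candidate that is an executable file, else None.
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None
```

Because the function can return None, callers should check the result before using it as a path.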

class schrodinger.protein.tasks.clustal.AbstractAlignmentJob(aln, second_aln=None)

Bases: PyQt6.QtCore.QObject

Abstract class that defines behavior common to alignment jobs.

Cvar

progressMade: A signal emitted with the number of lines output by the clustal job.

progressMade

pyqtSignal(*types, name: str = …, revision: int = …, arguments: Sequence = …) -> PYQT_SIGNAL

types is normally a sequence of individual types. Each type is either a type object or a string that is the name of a C++ type. Alternatively each type could itself be a sequence of types each describing a different overloaded signal. name is the optional C++ name of the signal. If it is not specified then the name of the class attribute that is bound to the signal is used. revision is the optional revision of the signal that is exported to QML. If it is not specified then 0 is used. arguments is the optional sequence of the names of the signal’s arguments.

__init__(aln, second_aln=None)

setGapPenalties(gapopen, gapext)

run()

Should be implemented by subclasses.

cancel()

class schrodinger.protein.tasks.clustal.ClustalJob(aln, second_aln=None, profile_mode='profile', matrix=None, gapopen=None, gapext=None, quicktree=True, output_fname=None, clustering=None)

Bases: schrodinger.protein.tasks.clustal.AbstractAlignmentJob

Class to run a Clustal job.

__init__(aln, second_aln=None, profile_mode='profile', matrix=None, gapopen=None, gapext=None, quicktree=True, output_fname=None, clustering=None)

This class can use one of three available alignment modes:

  • regular multiple sequence alignment,

  • profile-profile alignment where two alignments are aligned to each other, but both alignments remain unchanged,

  • profile-sequence alignment where several sequences are iteratively aligned to an existing alignment.

Parameters
  • aln (ProteinAlignment) – Input sequence alignment.

  • second_aln (ProteinAlignment) – Second alignment for profile-profile and profile-sequence alignment.

  • profile_mode (str) – Determines profile alignment mode. Can be “profile” for profile-profile alignment, or “sequences” for profile-sequence alignment.

  • matrix (str or None) – Substitution matrix family (“BLOSUM”, “PAM”, “GONNET”, “ID”). If None, the default matrix (GONNET) is used.

  • gapopen (float or None) – Gap opening penalty. If None, default value is used.

  • gapext (float or None) – Gap extension penalty. If None, default value is used.

  • quicktree (bool) – Use fast algorithm for building guide tree.

  • output_fname (str or None) – Path of the file to save the clustalw2 standard output to. If None, the output is not saved.
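
To make the relationship between these parameters and the three alignment modes concrete, the sketch below maps them onto ClustalW2 command-line options. It is not how ClustalJob is implemented: the function name build_clustalw2_args and the use of file paths in place of ProteinAlignment objects are assumptions made for illustration. Only the ClustalW2 flags themselves are standard.

```python
def build_clustalw2_args(infile, second_infile=None, profile_mode="profile",
                         matrix=None, gapopen=None, gapext=None,
                         quicktree=True, exe="clustalw2"):
    """Hypothetical mapping of the documented parameters to ClustalW2 flags."""
    if second_infile is None:
        # Regular multiple sequence alignment.
        args = [exe, f"-INFILE={infile}", "-ALIGN"]
    else:
        if profile_mode not in ("profile", "sequences"):
            raise ValueError("profile_mode must be 'profile' or 'sequences'")
        # -PROFILE merges two alignments; -SEQUENCES adds the sequences
        # of the second input to the first alignment one by one.
        mode_flag = "-PROFILE" if profile_mode == "profile" else "-SEQUENCES"
        args = [exe, f"-PROFILE1={infile}", f"-PROFILE2={second_infile}",
                mode_flag]

    # Options left as None are omitted so ClustalW2 uses its defaults.
    if matrix is not None:
        args.append(f"-MATRIX={matrix}")
    if gapopen is not None:
        args.append(f"-GAPOPEN={gapopen}")
    if gapext is not None:
        args.append(f"-GAPEXT={gapext}")
    if quicktree:
        args.append("-QUICKTREE")
    return args
```

Passing None for matrix, gapopen, or gapext leaves the corresponding flag off the command line, which matches the "default value is used" wording in the parameter descriptions.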

run()

Run the clustal job.

Raises

RuntimeError if no Clustal executable can be found.

Returns

Output alignment. The sequences are output in the same order as input. Sequence attributes are preserved. The tree is in Newick format. This function returns None if the job is canceled.

Return type

ProteinAlignment or None
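
Callers of run() have three outcomes to handle: an alignment, None after cancellation, and a RuntimeError. The following is a minimal sketch of one way to do that. The helper run_alignment_job and its status strings are hypothetical and not part of the module; it accepts any object with a run() method that follows the contract described above.

```python
def run_alignment_job(job):
    """Run a job and return (alignment, status).

    status is "ok", "canceled", or "error: <message>".
    """
    try:
        result = job.run()
    except RuntimeError as exc:
        # Documented case: no Clustal executable could be found.
        return None, f"error: {exc}"
    if result is None:
        # Documented case: the job was canceled.
        return None, "canceled"
    return result, "ok"
```

Returning a status alongside the result keeps the None-on-cancel case distinct from a failure, which a bare None return value would not.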