schrodinger.ui.sequencealignment.sequence_group module

Implementation of SequenceGroup class.

Copyright Schrodinger, LLC. All rights reserved.

class schrodinger.ui.sequencealignment.sequence_group.SequenceGroup[source]

Bases: object

This class is a container for the sequences displayed in the sequence viewer. The class performs various operations on sets of sequences or on the entire group. After the group is modified, the SequenceViewer update() method should be called.

__init__()[source]
build_mode

Homology modeling mode.

identity_in_columns

Calculate sequence identity in selected columns only

copyForUndo(attributes_only=True)[source]
clear()[source]

Deletes the entire contents of self.

addSequence(sequence)[source]

Adds a new sequence to self.

hasSelectedResidues()[source]

Check if the group has any residues selected.

Return type

bool

Returns

True if there are any residues selected within the group, False otherwise

countSelectedResidues()[source]

Counts selected residues.

Return type

int

Returns

Number of selected residues.

hasSelectedSequences(exclude_reference=False, check_children=False)[source]

Check if there are any selected sequences within the group.

Parameters

check_children (bool (default=False)) – optional parameter, if True, the function will also check child sequences.

Return type

bool

Returns

True if there are any sequences selected in the group, False otherwise

selectColumns(start, end, select=True)[source]

Select all columns within the alignment position range from start to end.

Parameters
  • start (int) – first column to be selected

  • end (int) – last column to be selected
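A minimal sketch of range-based column selection, not the library implementation. It assumes a hypothetical selection mask stored as one list of booleans per sequence, and it treats start and end as inclusive, which the parameter descriptions suggest but do not state outright.

```python
def select_columns(mask, start, end, select=True):
    """Set the selection flag for columns start..end (inclusive) in every row."""
    for row in mask:
        last = min(end, len(row) - 1)  # do not run past the end of short rows
        for pos in range(max(start, 0), last + 1):
            row[pos] = select
    return mask


# Hypothetical mask: two sequences of different length, nothing selected yet.
mask = [[False] * 5, [False] * 3]
select_columns(mask, 1, 3)
```

Passing select=False over the same range would clear the flags again.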

unselectAll(make_active=False)[source]

Unselect all residues within the entire group.

anchorSelection()[source]
clearAnchors()[source]
hideSelected()[source]

Hide all selected sequences (including children).

hideSequences(sequences)[source]

Hide all sequences passed in list.

Parameters

sequences (list of sequences) – sequences to hide

showSequences(sequences)[source]

Show all sequences passed in list.

Parameters

sequences (list of sequences) – sequences to show

showAll()[source]

Makes all sequences within the group visible.

deleteSelected()[source]

Deletes all selected sequences.

Note

The tree will be removed.

updateReference(ignore_maestro=False)[source]

This function updates the reference sequence.

fillGaps()[source]

Replaces all selected residues in visible sequences with gaps.

removeAllGaps()[source]

Removes all gaps from the visible sequences.

deleteSelectedResidues()[source]

Deletes selected residues from the visible sequences.

isColumnEmpty(position)[source]

Checks if a column at given position is empty.

minimizeAlignment(query_only=False)[source]

Minimizes the alignment, i.e. removes all gaps from the gap-only columns.
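The idea of removing gap-only columns can be sketched as follows. This is an illustration only: it assumes the alignment is a plain list of strings with "-" as the gap character, whereas the class itself operates on Sequence objects. The function name minimize_alignment is hypothetical.

```python
def minimize_alignment(rows, gap="-"):
    """Return the rows with every gap-only column removed."""
    if not rows:
        return []
    width = max(len(r) for r in rows)
    padded = [r.ljust(width, gap) for r in rows]  # treat missing tail as gaps
    # Keep a column if at least one row has a residue there.
    keep = [i for i in range(width) if any(r[i] != gap for r in padded)]
    return ["".join(r[i] for i in keep) for r in padded]


minimized = minimize_alignment(["AC--GT", "A---GT", "AC--G-"])
```

Columns that contain a gap in some rows but a residue in at least one other row are left untouched.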

lockGaps()[source]

Locks gaps in visible sequences.

unlockGaps()[source]

Unlocks gaps in visible sequences.

selectAlignedBlocks(selected_only=False)[source]

Selects aligned blocks (regions without gaps).

selectStructureBlocks(selected_only=False)[source]

Selects aligned blocks (regions without gaps).

selectIdentities()[source]

Selects identical residues in the alignment, ignores gaps.
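One way to read "identical residues, ignoring gaps" is: a column counts as identical when all of its non-gap residues are the same. The sketch below implements that reading on plain strings of equal length. Both the reading and the helper name identical_columns are assumptions, not the library's definition.

```python
def identical_columns(rows, gap="-"):
    """Return indices of columns whose non-gap residues are all the same."""
    columns = []
    for index, column in enumerate(zip(*rows)):
        residues = {c for c in column if c != gap}  # gaps are ignored
        if len(residues) == 1:
            columns.append(index)
    return columns


identity_positions = identical_columns(["ACG-T", "AC-AT", "ATGAT"])
```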

selectResidues(start_row, start_pos, end_row, end_pos, select)[source]

Selects residues within a specified region.

Parameters
  • start_row ((Sequence, int, int, int)) – initial row

  • start_pos (int) – initial position

  • end_row ((Sequence, int, int, int)) – final row

  • end_pos (int) – final position

grabAndDrag(start_row, start_pos, end_row, end_pos, lock_down, gap_insert_mode=False, slide_sequence=False, block_length=0)[source]

This method performs a grab-and-drag operation on a specified region.

Parameters
  • start_row ((Sequence, int, int, int)) – initial row

  • start_pos (int) – initial position

  • end_row ((Sequence, int, int, int)) – final row

  • end_pos (int) – final position

  • lock_down (bool) – indicates if the sequence downstream should be locked

tmpLockGaps(row, start_pos, end_pos)[source]

Temporarily locks gaps in a specified region.

Parameters
  • row ((Sequence, int, int, int)) – sequence viewer row

  • start_pos (int) – initial position

  • end_pos (int) – final position

tmpUnlockGaps(row, start_pos, end_pos)[source]

Unlocks temporarily locked gaps.

Parameters
  • row ((Sequence, int, int, int)) – sequence viewer row

  • start_pos (int) – initial position

  • end_pos (int) – final position

grabAndDragBlock(rows, start_row, start_pos, end_row, end_pos, block_length, lock_down)[source]

This method performs grab-and-drag operation on selected blocks in “Select And Slide” mode.

Parameters
  • rows (list of (Sequence, int, int, int)) – sequence viewer rows

  • start_row ((Sequence, int, int, int)) – initial row

  • start_pos (int) – initial position

  • end_row ((Sequence, int, int, int)) – final row

  • end_pos (int) – final position

  • lock_down (bool) – indicates if the sequence downstream should be locked

insertGap(start_row, start_pos)[source]

Inserts a single gap at a specified position.

Parameters
  • start_row ((Sequence, int, int, int)) – sequence viewer row

  • start_pos (int) – position where the gap will be inserted

removeGap(start_row, start_pos)[source]

Removes a single gap at a specified position.

Parameters
  • start_row ((Sequence, int, int, int)) – sequence viewer row

  • start_pos (int) – position where the gap will be removed

colorBySequenceSimilarity()[source]

This method colors the sequences by sequence similarity.

colorByIdentity()[source]

This method colors the sequences by identity.

colorByDifference()[source]

This method colors the sequences by difference.

calculateProfile(ignore_query=False)[source]

Calculates global sequence profile and associated information.

Parameters

ignore_query (bool) – Tells if query sequence should be included in the calculation or it should be ignored.

getConsensus()[source]

Returns a consensus sequence.
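As a rough illustration of how a profile relates to a consensus, the sketch below counts residues per column and picks the most frequent one. It assumes equal-length strings with "-" for gaps and a simple majority rule. The actual profile calculation and consensus rule used by the class are not documented here and may differ.

```python
from collections import Counter


def column_profile(rows, gap="-"):
    """Count the residues in each alignment column, skipping gaps."""
    return [Counter(c for c in column if c != gap) for column in zip(*rows)]


def consensus(rows, gap="-"):
    """Pick the most frequent residue per column; gap-only columns stay gaps."""
    result = []
    for counts in column_profile(rows, gap):
        result.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(result)


consensus_string = consensus(["ACGT-", "ACCT-", "A-GA-"])
```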

updateConsensus(consensus)[source]

Updates the consensus sequence using a pre-calculated profile.

Parameters

consensus (Sequence) – consensus sequence to be updated

updateSymbols(symbols)[source]

Updates the consensus symbols string using a pre-calculated profile.

Parameters

symbols (Sequence) – consensus symbols string to be updated

updateMeanHydrophobicity(seq)[source]

Updates mean hydrophobicity global annotation sequence.

Parameters

seq (Sequence) – mean hydrophobicity sequence
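A per-column mean hydrophobicity could be computed along these lines. The sketch assumes the Kyte-Doolittle scale and plain strings with "-" for gaps. Which scale the annotation really uses is not stated in this reference, so treat the scale choice as an assumption.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(rows):
    """Average the scale values in each column; unknown symbols are skipped."""
    means = []
    for column in zip(*rows):
        values = [KYTE_DOOLITTLE[c] for c in column if c in KYTE_DOOLITTLE]
        means.append(sum(values) / len(values) if values else 0.0)
    return means


means = mean_hydrophobicity(["AI-", "AL-"])
```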

updateMeanPI(seq)[source]

Updates mean isoelectric point global annotation sequence.

Parameters

seq (Sequence) – mean isoelectric point sequence

colorByTaylorScheme()[source]

Colors all amino acid sequences using Taylor scheme.

colorAverageColumnColors()[source]

Averages colors in columns.

Note

This feature works particularly well with the Taylor scheme.

colorWeightByAlignmentStrength(min_weight_identity, max_weight_identity)[source]

Weights the sequences by alignment strength.

findMaxLength(exclude_consensus=False)[source]

Finds the length of the longest sequence.

Parameters

exclude_consensus (bool) – if True, the consensus sequence will not count

Return type

int

Returns

maximum sequence length in the entire group

unselectAllSequences()[source]

Unselects all sequences in the group.

selectAllSequences()[source]

Selects all sequences in the group. Child selection remains unchanged.

selectAll()[source]

Selects all residues.

deselectAll()[source]

Deselects all residues.

invertSequenceSelection()[source]

Inverts sequence selection range.

invertSelection()[source]

Inverts residue selection range.

deleteAnnotations()[source]

Removes all annotations.

deleteGlobalAnnotations()[source]

Removes all global annotations.

deletePredictions()[source]

Removes all predictions.

addSeparator()[source]

Adds a separator sequence.

addRuler(update_only=False)[source]

Adds a ruler sequence.

Return type

Sequence

Returns

A ruler sequence.

removeRuler()[source]

Removes a ruler sequence.

addConsensusSequence(toggle=False)[source]

Creates a consensus sequence and adds it to the group. If the annotation already exists, just expands the sequence and makes it visible.

Return type

Sequence

Returns

consensus sequence

addConsensusSymbols(toggle=False)[source]

Creates a consensus symbols string and adds it to the group. If the annotation already exists, just expands the sequence and makes it visible.

Return type

Sequence

Returns

consensus symbols sequence

addMeanHydrophobicity(toggle=False)[source]

Creates a mean hydrophobicity annotation and adds it to the group. If the annotation already exists, just expands the sequence and makes it visible.

Return type

Sequence

Returns

mean hydrophobicity annotation

addMeanPI(toggle=False)[source]

Creates a mean isoelectric point annotation and adds it to the group. If the annotation already exists, just expands the sequence and makes it visible.

Return type

Sequence

Returns

mean isoelectric point annotation

Creates a sequence logo annotation and adds it to the group. If the annotation already exists, just expands the sequence and makes it visible.

Return type

Sequence

Returns

sequence logo annotation object

colorByResidueType()[source]
colorByMaestro()[source]

Colors the sequences by Maestro colors.

colorByKyteDoolittle()[source]

Colors the sequences by Kyte-Doolittle hydrophobicity scale.

colorByHoppWoods()[source]

Colors the sequences by Hopp-Woods hydrophilicity scale.

colorByGray()[source]

Colors the sequences using plain gray color. This scheme is useful in combination with “color by alignment strength.”

colorByWhite()[source]

Colors the sequences using white color.

colorByCustomColor(color)[source]

Colors the sequences using custom color.

colorByCustomAnnotations()[source]

Colors the sequences using custom annotation color.

colorByBfactor()[source]

Colors the sequences using Bfactor values.

colorByPosition()[source]

Colors the sequences using residue position.

colorBySecondary()[source]

Colors the sequences using secondary structure annotation. If SSA is available, it will be used for coloring; otherwise a consensus of the available SSP annotations will be used.

colorByColorBlocks(scheme)[source]

Colors the sequences using plain gray color. This scheme is useful in combination with “color by alignment strength.”

colorSequences(mode=None, color=None)[source]

Colors the sequences using a specified mode.

Parameters

mode (int) – coloring mode

hideAllChildren()[source]

Hides all children, effectively collapsing the sequences.

showAllChildren()[source]

Shows all children, effectively expanding the sequences.

updateVariableSequences()[source]

Updates global variable sequences, i.e. consensus plot and “mean” annotations.

padAlignment()[source]

Pads the alignment with additional gaps, so all sequences have identical length equal to the length of the longest sequence.
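Padding can be illustrated on plain strings. The sketch assumes "-" as the gap character and appends gaps on the right, which is one natural reading of the description. The helper name pad_alignment is hypothetical.

```python
def pad_alignment(rows, gap="-"):
    """Append gaps so that every row is as long as the longest one."""
    width = max((len(r) for r in rows), default=0)
    return [r + gap * (width - len(r)) for r in rows]


padded = pad_alignment(["ACGT", "AC", "A-G"])
```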

removeTerminalGaps()[source]

Removes terminal gaps from the alignment.

encode(seq)[source]

Encodes the sequence to include annotation data required by pattern search. Return an ungapped version of the pattern.

findPattern(pattern)[source]

Finds a specified PROSITE pattern in all sequences.
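For readers unfamiliar with PROSITE syntax, the sketch below translates the basic constructs into a Python regular expression and scans ungapped sequences. It is not the library's search routine: it ignores the annotation-aware encoding mentioned under encode(), handles only the core syntax, and the names prosite_to_regex and find_pattern are made up for the example.

```python
import re


def prosite_to_regex(pattern):
    """Translate basic PROSITE syntax into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")  # a trailing period ends the pattern
    translations = [
        ("-", ""),    # element separator
        ("x", "."),   # any residue
        ("{", "[^"),  # {P} lists residues that must not occur
        ("}", "]"),
        ("(", "{"),   # x(2,4) becomes a repetition count
        (")", "}"),
        ("<", "^"),   # N-terminal anchor
        (">", "$"),   # C-terminal anchor
    ]
    for old, new in translations:
        pattern = pattern.replace(old, new)
    return pattern


def find_pattern(pattern, sequences):
    """Return (start, matched text) pairs for each named ungapped sequence."""
    regex = re.compile(prosite_to_regex(pattern))
    return {name: [(m.start(), m.group()) for m in regex.finditer(residues)]
            for name, residues in sequences.items()}


# N-glycosylation motif searched in two made-up sequences.
hits = find_pattern("N-{P}-[ST]-{P}.", {"seq1": "AANGSAANPSA", "seq2": "GGGG"})
```

Note that finditer reports non-overlapping matches only; overlapping motif occurrences would need a lookahead-based search.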

sortKeyName(sequence)[source]

Returns sequence name sort key.

Return type

string

Returns

sequence name sort key

sortKeyChain(sequence)[source]

Returns sequence chain ID sort key.

Return type

string

Returns

sequence chain ID sort key

sortKeyLength(sequence)[source]

Returns sequence length sort key.

Return type

int

Returns

sequence length sort key

sortKeyGaps(sequence)[source]

Returns sequence number of gaps sort key.

Return type

int

Returns

sequence number of gaps sort key

sortKeyIdentity(sequence)[source]

Returns sequence identity with consensus sequence sort key.

Return type

float

Returns

sequence identity with consensus sequence sort key

sortKeyHomology(sequence)[source]

Returns sequence homology with reference sequence sort key.

Return type

float

Returns

sequence homology with reference sequence sort key

sortKeySimilarity(sequence)[source]

Returns sequence similarity to the consensus sequence as a sort key.

Return type

float

Returns

sequence similarity to the consensus sequence sort key

sortKeyScore(sequence)[source]

Returns sequence score sort key.

Return type

float

Returns

sequence score sort key

getSortableSequences(ignore_reference=True)[source]

Returns a list of “sortable” sequences, i.e. all parent sequences that are not auxiliary objects and not global annotations.

Parameters

ignore_reference (boolean) – Normally, the reference is not considered a sortable sequence. If ignore_reference is set to False, the reference is included in the sortable group.

Return type

list of Sequence

Returns

list of sortable sequences

replaceSortableSequences(sequence_list, ignore_reference=True)[source]

This method replaces all “sortable” sequences with a list of sorted sequences. The given list is expected to contain as many items as there are sortable sequences. This method is used to replace the sequences in the group after a sorting operation.

Parameters
  • sequence_list (list of Sequence) – replacement list of sequences

  • ignore_reference (boolean) – Normally, the reference is not considered a sortable sequence. If ignore_reference is set to False, the reference is included in the sortable group.

sort(order, reverse_order=False)[source]

This method sorts the sequences according to a specified order. Optionally, the sequences can be sorted in a reverse order.

Parameters
  • order (int) – sort order

  • reverse_order (bool) – if True, the sequences will be reverse sorted (default=False, i.e. sorting from smallest to largest key)
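The relationship between an integer sort order, the sortKey* methods, and reverse_order can be pictured with Python's built-in sorted(). Everything specific here is assumed for illustration: the Seq record, the integer codes in SORT_KEYS, and the choice to measure length without gaps. The real order constants are not listed in this reference.

```python
from dataclasses import dataclass


@dataclass
class Seq:
    """Hypothetical stand-in for a sequence: a name and a gapped string."""
    name: str
    residues: str


def sort_key_name(seq):
    return seq.name


def sort_key_length(seq):
    return len(seq.residues.replace("-", ""))  # assumption: gaps not counted


def sort_key_gaps(seq):
    return seq.residues.count("-")


# Hypothetical order codes; the library defines its own constants.
SORT_KEYS = {0: sort_key_name, 1: sort_key_length, 2: sort_key_gaps}


def sort_sequences(seqs, order, reverse_order=False):
    """Return a new list sorted by the key function selected by `order`."""
    return sorted(seqs, key=SORT_KEYS[order], reverse=reverse_order)


seqs = [Seq("b", "AC--"), Seq("a", "ACGT"), Seq("c", "A---")]
by_name = [s.name for s in sort_sequences(seqs, 0)]
by_gaps_desc = [s.name for s in sort_sequences(seqs, 2, reverse_order=True)]
```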

moveUp()[source]

Moves selected sequences to the top of the group.

moveDown()[source]

Moves selected sequences to the bottom of the group.

moveTop(target=None)[source]

Moves selected sequences to the top of the group.

If target is specified, move only the target sequence to top.

moveBottom()[source]

Moves selected sequences to the bottom of the group.

sortByTreeOrder()[source]

Sorts the sequences by the tree order. If there is no tree, returns False.

Returns

True on success, False if valid tree doesn’t exist.

Return type

boolean

duplicateSelectedSequences()[source]

Duplicates selected sequences.

getGlobalAnnotation(annotation_type)[source]
addAnnotation(annotation_type, remove=False)[source]

Adds an annotation sequence to selected sequences or to all sequences if no sequence is selected.

Parameters

annotation_type (int) – type of the annotation sequence

addCustomAnnotation(sequence=None, title='Custom Annotation', name='', region_list=[])[source]

This function adds or updates a custom annotation in the specified sequence.

Parameters

region_list – List of residue regions. Each item should include: (first_res_id, last_res_id, label, color)
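The region tuple layout can be illustrated with a small helper that expands regions into per-residue labels. The sketch assumes integer residue IDs, an inclusive last_res_id, and RGB tuples for color. None of these details are confirmed by this reference, and label_residues is a hypothetical name.

```python
def label_residues(region_list):
    """Map each residue ID to (label, color); later regions win on overlap."""
    labels = {}
    for first_res_id, last_res_id, label, color in region_list:
        for res_id in range(first_res_id, last_res_id + 1):
            labels[res_id] = (label, color)
    return labels


# Each item follows the documented order: first ID, last ID, label, color.
regions = [(10, 12, "loop", (255, 0, 0)), (12, 13, "helix", (0, 0, 255))]
labels = label_residues(regions)
```

Because the signature uses a list as the default value of region_list, passing an explicit list on every call is the safer habit in Python.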

selectRegions(sequence, region_list)[source]

Select residue subsets based on a region list.

selectRedundantSequences(value, columns=False, reference=None)[source]

Selects sequences below a specified identity threshold value.

clearConstraints()[source]
removeConstraints()[source]
hasConstraints()[source]
addConstraint(seq1, pos1, seq2, pos2, for_prime=False)[source]

Adds a pairwise alignment constraint. At least one of the specified sequences has to be a reference sequence.

Return type

boolean

Returns

True if the constraint was successfully added, False otherwise

statistics()[source]
setReference(ref=None)[source]

Sets a reference sequence.

removeMaestroSequences()[source]

Removes all Maestro sequences.

removeLonelyRuler()[source]

Remove ruler but only if there is nothing else left.

Return type

bool

Returns

True if the ruler was removed

toggleHistory()[source]

Adds or removes a sequence modification history string.

resetHistory()[source]

This function resets the change tracking string.

updateHistory(start, end)[source]

This function updates an internal history string.

Return type

bool

Returns

True if the history was updated, False otherwise.

setConsiderGaps(value)[source]

Sets value of consider gaps flag. If set to True, gaps will be included in calculation of local sequence similarity measures.

Parameters

value (bool) – Should we consider gaps for sequence identity calculations.
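The effect of a consider-gaps flag on an identity measure can be shown with a small pairwise calculation. This is a sketch under assumptions: sequences are aligned strings with "-" for gaps, columns where both sequences have a gap are always skipped, and the function name percent_identity is hypothetical. The library's own measures may be defined differently.

```python
def percent_identity(a, b, consider_gaps=False, gap="-"):
    """Percent of compared columns in which both sequences carry the same symbol."""
    matches = compared = 0
    for x, y in zip(a, b):
        if x == gap and y == gap:
            continue  # a shared gap carries no information
        if not consider_gaps and (x == gap or y == gap):
            continue  # residue-versus-gap columns are skipped
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


without_gaps = percent_identity("AC-GT", "ACAGA")
with_gaps = percent_identity("AC-GT", "ACAGA", consider_gaps=True)
```

With gaps considered, the residue-versus-gap column counts as a mismatch, so the identity drops.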

updateSSA(remove=False)[source]

Updates secondary structure assignments for all structure-coupled sequences.

expandSelection()[source]

Expands selection to include entire columns.

expandSelectionRef()[source]

Expands selection from the reference sequence to include entire columns.

hideColumns(unselected=False)[source]

Hides selected or un-selected columns.

isColumnSelected(pos, weak=False)[source]

Returns True if the specified column is selected, False otherwise.

Parameters

weak (bool) – If weak is False (default) treat the column as selected only if all residues in the column are selected.
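The strict and weak tests can be contrasted in a few lines. The sketch assumes a hypothetical selection mask of booleans per sequence and assumes that weak=True means "at least one residue in the column is selected", which is implied but not spelled out above.

```python
def is_column_selected(selection, pos, weak=False):
    """Strict mode needs every residue selected; weak mode needs at least one."""
    column = [row[pos] for row in selection if pos < len(row)]
    if not column:
        return False
    return any(column) if weak else all(column)


# Column 0 is fully selected, column 1 only partly, column 2 not at all.
selection = [[True, True, False], [True, False, False]]
strict = [is_column_selected(selection, p) for p in range(3)]
weak = [is_column_selected(selection, p, weak=True) for p in range(3)]
```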

hideColumn(pos)[source]

Hides the specified column.

showAllResidues()[source]

Makes all residues in all sequences visible.

markResidues(rgb)[source]
clearMarkedResidues()[source]
cropSelectedResidues()[source]
markTemplateRegion()[source]
premarkTemplateRegion()[source]
unmarkTemplateRegions()[source]
getStructureList(omit_reference=False)[source]

Returns a list of visible sequences associated with structures.

getMaestroSequencesForEntryId(entry_id)[source]
colorSequenceNames(color)[source]
selectFirstTemplate(n_templates=1)[source]

Selects the first available valid template from a template list. If a template is already selected, do nothing. Optionally, select n_templates valid templates instead of just the first one.

hasValidTemplates()[source]
getTemplates(selected_only=False)[source]

Return a list of sequences (schrodinger.ui.sequencealignment.sequence.Sequence) that are valid templates.

copySequences(group)[source]
calculateMatrix()[source]

Calculates a substitution matrix based on the current alignment.

alignByResidueNumbers()[source]

Aligns the sequences so that identical residue numbers are lined up in columns.
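Aligning by residue number amounts to giving every distinct residue number its own column. The sketch below assumes each sequence is a list of (residue_number, one_letter_code) pairs with unique integer numbers and ignores insertion codes. The data layout and the function name are hypothetical.

```python
def align_by_residue_numbers(sequences, gap="-"):
    """Place residues with the same number in the same column."""
    numbers = sorted({num for seq in sequences for num, _ in seq})
    rows = []
    for seq in sequences:
        by_number = dict(seq)  # residue number -> one-letter code
        rows.append("".join(by_number.get(n, gap) for n in numbers))
    return rows


aligned = align_by_residue_numbers([
    [(1, "A"), (2, "C"), (4, "T")],
    [(2, "C"), (3, "G"), (4, "T")],
])
```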

addCustomSequence()[source]
getSequenceNameList()[source]

Returns a list of sequence names.

repair()[source]

Repairs the group by setting sequence-residue associations for all sequences in group. Also, adds missing attributes (using default values) to the group.

Useful if the group comes from a corrupted project file.