schrodinger.ui.sequencealignment.fileio module
File I/O handling routines for the multiple sequence viewer.
Copyright Schrodinger, LLC. All rights reserved.
- schrodinger.ui.sequencealignment.fileio.partition_by_predicate(arr, pred)
Utility function that groups a list into sublists, with each sublist beginning with an element that matches the supplied predicate.
Note that many file reading functions below would benefit from using this function.
- Parameters
arr (list) – A list to split into sublists
pred (function) – A function that takes a list item and returns True if the item meets a criterion and False otherwise
The implementation loops through the array twice, which is not optimal, but this is unlikely to matter in practice.
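The grouping behavior described above can be illustrated with a minimal sketch. This is not the module's implementation: the name partition_by_predicate_sketch is hypothetical, it uses a single pass, and it assumes that any items appearing before the first match are collected into a leading sublist (the source does not say how that case is handled).

```python
def partition_by_predicate_sketch(arr, pred):
    """Group arr into sublists, starting a new sublist at each item where pred is true.

    Assumption: items before the first matching item form a leading sublist.
    """
    groups = []
    for item in arr:
        if pred(item) or not groups:
            # Start a new sublist at a match (or at the very first item).
            groups.append([item])
        else:
            # Otherwise extend the sublist that is currently open.
            groups[-1].append(item)
    return groups


# Example: split FASTA-like lines into one sublist per record.
lines = [">seq1", "MKV", "LLA", ">seq2", "GGT"]
records = partition_by_predicate_sketch(lines, lambda s: s.startswith(">"))
```

Here records holds one sublist per header line, which is the pattern the note about file reading functions alludes to.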
- schrodinger.ui.sequencealignment.fileio.load_fasta_file(sequence_group, file_name, text=None)
Load a sequence file in FASTA format, create sequences and append them to the sequence group. Splits sequence name from the FASTA header.
- Parameters
sequence_group (SequenceGroup) – sequence group to which the sequences will be added
file_name (str) – name of input FASTA file
text (list of str) – optional text in FASTA format used instead of the input file, split by newline char into a list of lines
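To make the text parameter concrete, the following sketch parses FASTA content that has already been split into a list of lines. It is an illustration only: parse_fasta_lines is a hypothetical helper, it returns plain tuples instead of adding sequences to a SequenceGroup, and it assumes the sequence name is the first whitespace-delimited token of the header, which may differ from how load_fasta_file splits the header.

```python
def parse_fasta_lines(lines):
    """Return a list of (name, description, sequence) tuples from FASTA lines.

    Assumption: the name is the first whitespace-delimited token of the header.
    """
    records = []
    name, description, chunks = None, "", []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the record that was being read, if any.
            if name is not None:
                records.append((name, description, "".join(chunks)))
            header = line[1:].strip()
            name, _, description = header.partition(" ")
            chunks = []
        elif name is not None:
            # Sequence data may span several lines; collect and join later.
            chunks.append(line)
    if name is not None:
        records.append((name, description, "".join(chunks)))
    return records


# The text is split on newlines first, matching the documented list-of-lines form.
text = ">seqA first example\nMKVL\nAAGT\n>seqB\nGG-T\n"
parsed = parse_fasta_lines(text.split("\n"))
```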
- schrodinger.ui.sequencealignment.fileio.load_PIR_file(sequence_group, file_name)
Load a sequence file in PIR format, create sequences and append them to the sequence group.
- Parameters
sequence_group (SequenceGroup) – sequence group to which the sequences will be added
file_name (string) – name of input PIR file
- schrodinger.ui.sequencealignment.fileio.load_GCG_file(sequence_group, file_name)
Load a sequence file in GCG format, create sequences and append them to the sequence group.
- Parameters
sequence_group (SequenceGroup) – sequence group to which the sequences will be added
file_name (string) – name of input GCG file
- schrodinger.ui.sequencealignment.fileio.load_EMBL_file(sequence_group, file_name)
Load a sequence file in EMBL format, create sequences and append them to the sequence group.
- Parameters
sequence_group (SequenceGroup) – sequence group to which the sequences will be added
file_name (string) – name of input EMBL file
- schrodinger.ui.sequencealignment.fileio.load_swissprot_file(sequence_group, file_name, text=None)
Load a sequence file in SWISSPROT format, create sequences and append them to the sequence group. Tries to split sequence name from the
- Parameters
sequence_group (SequenceGroup) – sequence group to which the sequences will be added
file_name (string) – name of input SWISSPROT file
text (string) – optional text in SWISSPROT format used instead of the input file
- schrodinger.ui.sequencealignment.fileio.save_fasta_file(sequence_group, file_name, for_clustal=False, file=None, target_sequence=None, skip_gaps=False, save_annotations=False, selected_only=False, start=-1, end=-1, as_text=False, save_similarity=False)
Writes the contents of a sequence group to a FASTA file.
- Parameters
sequence_group (SequenceGroup) – Sequence group to be written to a file
file_name (string) – Name of the output file
for_clustal (bool) – Optional parameter indicating if the output file will be used for Clustal alignment
file (file) – Optional file handle, if not None, this handle will be used to write the sequences rather than creating a new file (file_name parameter would be ignored)
target_sequence (Sequence) – Optional sequence to be saved. If not specified, all sequences will be written to the output file.
skip_gaps (bool) – Optional parameter deciding if gaps should be written to the FASTA file.
save_annotations (bool) – Optional parameter for saving annotations (default is False).
save_similarity (bool) – Saves similarity when set to True.
start (int) – Optional starting position of the subset residues to save.
end (int) – Optional ending position of the subset residues to save.
selected_only (bool) – Save only (partially) selected columns if True
- schrodinger.ui.sequencealignment.fileio.save_clustal_file(sequence_group, file_name, file=None, start=-1, end=-1, ss_constraints=False, subset=None, ignore_selection=False)
Writes the contents of a sequence group to a Clustal ALN file.
- Parameters
sequence_group (SequenceGroup) – Sequence group to be written to a file
file_name (string) – Name of the output file
start (int) – Optional starting position of the subset residues to save.
end (int) – Optional ending position of the subset residues to save.
ss_constraints (bool) – Optional secondary structure constraints.
- schrodinger.ui.sequencealignment.fileio.load_DND_tree(file_name, sequence_group)
Load a Newick-formatted tree file produced by a multiple sequence alignment program. The function was tested using the outputs of ClustalW and T-Coffee.
- Parameters
file_name (string) – name of the input file
sequence_group (SequenceGroup) – target sequence group
- Returns
True if operation succeeded, False otherwise
- Return type
boolean
- schrodinger.ui.sequencealignment.fileio.parse_DND_string(dndstring, tree)
Parse a DND-formatted string, generate a tree, and append its branches to a given tree.
- Parameters
dndstring (string) – tree in DND format
tree (TreeNode) – target tree
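For readers unfamiliar with the DND (Newick) notation, the sketch below shows one way such a string could be turned into a nested structure. It is an assumption-laden illustration, not the logic of parse_DND_string: parse_newick_sketch and leaf_names are hypothetical helpers, nodes are plain dictionaries rather than TreeNode objects, and quoted labels and comments are not handled.

```python
def parse_newick_sketch(text):
    """Parse a simple Newick string into nested dicts (name, length, children)."""
    # Tree files are often wrapped over several lines, so drop all whitespace.
    s = "".join(text.split()).rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if pos < len(s) and s[pos] == "(":
            pos += 1  # consume "("
            while True:
                children.append(parse_node())
                if pos < len(s) and s[pos] == ",":
                    pos += 1  # another sibling follows
                elif pos < len(s) and s[pos] == ")":
                    pos += 1  # end of this group of children
                    break
                else:
                    raise ValueError("malformed tree near position %d" % pos)
        # Optional node label.
        start = pos
        while pos < len(s) and s[pos] not in ",():":
            pos += 1
        name = s[start:pos]
        # Optional branch length after ":".
        length = None
        if pos < len(s) and s[pos] == ":":
            pos += 1
            start = pos
            while pos < len(s) and s[pos] not in ",()":
                pos += 1
            length = float(s[start:pos])
        return {"name": name, "length": length, "children": children}

    return parse_node()


def leaf_names(node):
    """Collect leaf labels in left-to-right order."""
    if not node["children"]:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(leaf_names(child))
    return names


tree = parse_newick_sketch("(A:0.1,\n(B:0.2,C:0.3):0.4);")
```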
- schrodinger.ui.sequencealignment.fileio.load_clustal_file(sequence_group, file_name, replace=False, start=-1, end=-1)
Load a sequence alignment in Clustal format. Add sequences to a specified sequence group. By default, this method doesn’t replace the old residues, but only introduces gaps according to the alignment. Thus, all residue meta-data (e.g. Maestro information) will be preserved after doing the alignment.
- Parameters
sequence_group (SequenceGroup) – target sequence group
file_name (string) – input file name
replace (boolean (default=False)) – optional parameter, if True, replace existing sequences
- Return type
bool
- Returns
True on success, False otherwise
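The default (replace=False) behavior, where gaps are introduced without replacing residues, can be pictured with a small sketch. This is not the module's code: apply_alignment_gaps is a hypothetical helper, residues are stood in for by plain dictionaries, and a gap is represented by None and written as "-" in the aligned string.

```python
GAP = None  # hypothetical placeholder for a gap position


def apply_alignment_gaps(residues, aligned):
    """Return a new list that reuses the existing residue objects, with gaps inserted.

    residues: existing residue objects, in sequence order
    aligned:  the aligned row for the same sequence, using "-" for gaps
    """
    ungapped_length = len(aligned) - aligned.count("-")
    if ungapped_length != len(residues):
        raise ValueError("aligned row does not match the existing sequence length")
    remaining = iter(residues)
    # Reuse the original objects so any attached metadata survives the alignment.
    return [GAP if ch == "-" else next(remaining) for ch in aligned]


# Residues carrying extra metadata (the "meta" key is illustrative only).
residues = [{"code": "M", "meta": 1}, {"code": "K", "meta": 2}, {"code": "V", "meta": 3}]
gapped = apply_alignment_gaps(residues, "M-K-V")
```

Because the same objects are reused, metadata attached to each residue is untouched; only the positions change.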
- schrodinger.ui.sequencealignment.fileio.pdb_create_sequence(pdb_id, chain_id, sequence_string)
Creates a new sequence out of sequence string read from a PDB file.
- Parameters
pdb_id (str) – PDB ID (4-letter code)
chain_id (str) – single-letter chain ID
sequence_string (str) – single-letter code amino acid string to be converted to the sequence
- Return type
Sequence
- Returns
Created sequence.
- schrodinger.ui.sequencealignment.fileio.load_PDB_file(sequence_group, file_name, requested_chain_id=None, given_pdb_id=None, align_func=None)
Reads a PDB file, extracts relevant data and creates the sequence and annotations.
- Parameters
sequence_group (SequenceGroup) – target sequence group
file_name (str) – name of the file to be read
requested_chain_id (str) – Optional parameter. If specified, only the chain ID equal to this parameter will be read.
- Return type
bool
- Returns
True on success, False otherwise.
- schrodinger.ui.sequencealignment.fileio.load_file(sequence_group, file_name, format=None, align_func=None)
Loads a file. The file format can be inferred from the file name extension, or can be explicitly given.
- Parameters
file_name (string) – input file name
format (string) – format of the input file
- Return type
bool
- Returns
True if file successfully read; otherwise False
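The idea of inferring the format from the file name extension unless one is given explicitly can be sketched as follows. The extension table, the format names, and the infer_format helper are all hypothetical; the source does not list which extensions or format strings load_file recognizes.

```python
import os

# Hypothetical mapping; the extensions load_file actually recognizes are not documented here.
EXTENSION_TO_FORMAT = {
    ".fasta": "fasta",
    ".fa": "fasta",
    ".aln": "clustal",
    ".pir": "pir",
    ".pdb": "pdb",
}


def infer_format(file_name, format=None):
    """Return the explicit format if given, else guess from the extension.

    Returns None when the extension is not in the (hypothetical) table.
    """
    if format is not None:
        return format.lower()  # an explicit format takes precedence
    _, extension = os.path.splitext(file_name)
    return EXTENSION_TO_FORMAT.get(extension.lower())


guessed = infer_format("alignment.ALN")
forced = infer_format("sequences.txt", format="FASTA")
unknown = infer_format("notes.txt")
```

A caller could treat a None result the same way the documented return value treats a failed read, by reporting False.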