schrodinger.application.bioluminate.epitope.psc_workflow module
- schrodinger.application.bioluminate.epitope.psc_workflow.save_mkdir(odir, overwrite)
- schrodinger.application.bioluminate.epitope.psc_workflow.gen_fastas(spreadsheet_file=None, output_dir=None, overwrite=False)
Read an Excel spreadsheet that contains at least three columns named ‘ID’, ‘VH’ and ‘VL’, which hold the antibody names/labels, the heavy chain sequences and the light chain sequences, respectively.
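As a rough illustration of the expected input, the sketch below checks that each row carries the ‘ID’, ‘VH’ and ‘VL’ columns and renders one two-record FASTA string per antibody. The helper name `rows_to_fastas`, the record headers (`<ID>_VH`, `<ID>_VL`) and the one-entry-per-antibody layout are assumptions made for illustration; the actual output format of gen_fastas is not documented here.

```python
REQUIRED_COLUMNS = ("ID", "VH", "VL")


def rows_to_fastas(rows):
    """Map each antibody ID to FASTA text holding its VH and VL sequences.

    `rows` is an iterable of dicts keyed by column name (a hypothetical input
    shape; the real function reads these columns from an Excel spreadsheet).
    """
    fastas = {}
    for row in rows:
        missing = [col for col in REQUIRED_COLUMNS if col not in row]
        if missing:
            raise ValueError(f"missing required column(s): {missing}")
        ab_id = str(row["ID"]).strip()
        vh = str(row["VH"]).strip()
        vl = str(row["VL"]).strip()
        # Assumed layout: heavy chain record first, then light chain.
        fastas[ab_id] = f">{ab_id}_VH\n{vh}\n>{ab_id}_VL\n{vl}\n"
    return fastas
```

The rows could come from something like `pandas.read_excel(spreadsheet_file).to_dict('records')`, which yields one dict per spreadsheet row.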
- class schrodinger.application.bioluminate.epitope.psc_workflow.psc_workflow(project_name, fastas_dir, overwrite_data=False)
Bases: object
Class that creates a paratope similarity clustering workflow object.
- util_scripts_dir = './utils/'
- numbering_scheme = 'EnhancedChothia'
- nmodels = 1
- S = None
- D = None
- D1d = None
- L = None
- N_fab = None
- ID_fab = None
- plot_width = 7
- plot_height = 6
- dpi = 300
- __init__(project_name, fastas_dir, overwrite_data=False)
- fastas_dir = None
- project_name = None
- overwrite_data = None
- data_dir = None
- fab_dir = None
- build_ab_log_dir = None
- faux_epi_dir = None
- mif_dir = None
- safe_mkdir(dir)
- gen_file_list(dir)
- gen_names(dir)
- gen_jobname(filename)
Remove ‘.fasta’ from the FASTA file name and use the remaining base name as the job name.
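A minimal sketch of this naming rule is shown below. The function name `jobname_from_fasta` is hypothetical, and dropping any leading directory is an assumption; the real method may expect a bare file name.

```python
import os


def jobname_from_fasta(filename):
    """Return the file's base name with a trailing '.fasta' removed."""
    base = os.path.basename(filename)
    suffix = ".fasta"
    # Only strip the suffix when it is actually present.
    if base.endswith(suffix):
        base = base[: -len(suffix)]
    return base
```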
- save_mat_csv(mat, IDs, mat_name)
- build_ab(fasta_file, jobname)
Given a FASTA file containing the sequences of the VH and VL domains, create an antibody model.
- gen_faux_epitopes()
Generate faux epitopes from the Fab Maestro files in fab_dir.
- gen_mifs()
Generate MIFs.
- gen_sim_mat()
Compute the pairwise similarity matrix between binding sites using the Phase Shape approach. All i,j pairs of MIFs in mif_dir are compared.
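The all-pairs comparison can be pictured with the sketch below. The `score` callable is a hypothetical stand-in for the Phase Shape comparison, which is not shown in this reference, and `pairwise_similarity` is an illustrative name rather than part of the module.

```python
def pairwise_similarity(items, score):
    """Fill an N x N matrix with score(items[i], items[j]) for every i, j."""
    n = len(items)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Both (i, j) and (j, i) are scored, so the result need not be
            # symmetric when the scoring function depends on argument order.
            sim[i][j] = score(items[i], items[j])
    return sim
```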
- sim2dist()
Convert the similarity matrix S into a symmetric, normalized distance matrix D.
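The exact formula used by sim2dist is not given here. One plausible conversion, assuming non-negative similarities, averages S with its transpose, scales by the largest value and takes the complement; the sketch below follows that assumption, and the name `similarity_to_distance` is illustrative only.

```python
import numpy as np


def similarity_to_distance(S):
    """Turn a similarity matrix into a symmetric distance matrix in [0, 1]."""
    S = np.asarray(S, dtype=float)
    # Symmetrize by averaging each (i, j) score with its (j, i) counterpart.
    S_sym = 0.5 * (S + S.T)
    # Normalize so the largest similarity maps to distance 0.
    s_max = S_sym.max()
    if s_max <= 0:
        raise ValueError("similarity matrix must contain a positive value")
    D = 1.0 - S_sym / s_max
    # An item is at distance 0 from itself.
    np.fill_diagonal(D, 0.0)
    return D
```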
- cond_1d_dist_mat(triag='upper')
Convert the fully symmetric distance matrix into a condensed 1D distance matrix. triag = ‘upper’ or ‘lower’ specifies whether the upper or lower triangle is used. The choice shouldn’t matter: D is fully symmetric, so both should yield identical results.
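A sketch of the condensation step, under the assumption that pairs are listed in row-major order of the upper triangle (the same layout SciPy uses for condensed distance vectors). The function name `condense` is hypothetical. Reading the lower triangle through the transpose keeps the pair order the same, which is why both choices agree for a symmetric D.

```python
import numpy as np


def condense(D, triag="upper"):
    """Flatten the off-diagonal triangle of a square matrix into a 1D array."""
    D = np.asarray(D, dtype=float)
    if triag == "upper":
        source = D
    elif triag == "lower":
        # Transposing lets the lower triangle be read in the same pair order.
        source = D.T
    else:
        raise ValueError("triag must be 'upper' or 'lower'")
    rows, cols = np.triu_indices(D.shape[0], k=1)
    return source[rows, cols]
```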
- plot_mat(data, plot_title, cmap)
- generate_dendrogram(cutoff)
Perform clustering using the distance matrix.
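The clustering algorithm behind generate_dendrogram is not stated in this reference. As a hedged sketch, hierarchical clustering with SciPy could take a condensed distance vector and a distance cutoff as follows; the average-linkage method and the name `cluster_from_condensed` are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cluster_from_condensed(D1d, cutoff, method="average"):
    """Cluster items from a condensed distance vector and cut at `cutoff`."""
    # linkage accepts a condensed 1D distance vector directly.
    Z = linkage(np.asarray(D1d, dtype=float), method=method)
    # Items whose cophenetic distance is at most `cutoff` share a label.
    labels = fcluster(Z, t=cutoff, criterion="distance")
    return Z, labels
```

The linkage matrix `Z` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw the tree.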
- build_antibody_models(debug)
Build antibody models from the FASTA sequences.
- run_full_workflow(cutoff, debug=False)
Main driver that executes the entire workflow.
- clean()