schrodinger.application.bioluminate.pose_filtering.hdx_io module¶
Utilities for reading and validating HDX-MS CSV files from different vendors.
# Code Vocabulary
Column definition: The column names for a specific HDX-MS CSV file format. Different vendors may use different column names for the same data.
# Scientific Vocabulary
HDX-MS: Hydrogen Deuterium eXchange Mass Spectrometry. This measures the exchange of hydrogen atoms with deuterium atoms in proteins.
Protein State: A unique state of the protein, e.g. bound or unbound. In practice, these can take arbitrary names that users designate as bound or unbound.
Time Slice: The amount of time that the proteins in the HDX-MS experiment are exposed to deuterium.
Uptake: The amount of deuterium that has been incorporated into the protein.
- class schrodinger.application.bioluminate.pose_filtering.hdx_io.DynamXColumns(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases:
enum.StrEnum
Expected column names for HDX-MS CSV files produced by the DynamX software. The actual CSV file may contain additional columns.
- PROTEIN_STATE = 'State'¶
- TIME_SLICE = 'Exposure'¶
- START = 'Start'¶
- END = 'End'¶
- SEQUENCE = 'Sequence'¶
- UPTAKE = 'Uptake'¶
- MAX_UPTAKE = 'MaxUptake'¶
- class schrodinger.application.bioluminate.pose_filtering.hdx_io.HDExaminerColumns(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases:
enum.StrEnum
Expected column names for HDX-MS CSV files produced by the HDExaminer software. The actual CSV file may contain additional columns.
- PROTEIN_STATE = 'Protein State'¶
- TIME_SLICE = 'Deut Time (sec)'¶
- START = 'Start'¶
- END = 'End'¶
- SEQUENCE = 'Sequence'¶
- PERCENT_D = '%D'¶
- class schrodinger.application.bioluminate.pose_filtering.hdx_io.StandardColumns(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases:
enum.StrEnum
Column names that must be present in all HDX-MS datasets in order to perform analysis. Actual data may contain more columns.
START: The start index of the fragment.
END: The end index of the fragment.
SEQUENCE: The 1-letter residue code sequence of the fragment.
PERCENT_D: The percent deuterium uptake of the fragment.
CHAIN: The chain name(s) of the fragment in a given pose.
- START = 'Start'¶
- END = 'End'¶
- SEQUENCE = 'Sequence'¶
- PERCENT_D = '%D'¶
- CHAIN = 'Chain'¶
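The DynamX column definition exposes `Uptake` and `MaxUptake`, while the standard definition expects `%D`. This page does not say how the module converts between them. The sketch below assumes the simplest reading, that percent uptake is `100 * Uptake / MaxUptake`. The `percent_d` helper, the local enum stand-in, and the sample row are all hypothetical and are not part of `hdx_io`.

```python
from enum import Enum


class DynamXColumns(str, Enum):
    """Local stand-in for the documented DynamX column names.

    Uses the (str, Enum) mixin instead of enum.StrEnum so the sketch
    also runs on Python versions older than 3.11.
    """
    UPTAKE = 'Uptake'
    MAX_UPTAKE = 'MaxUptake'


def percent_d(row: dict) -> float:
    """Hypothetical helper: uptake as a percentage of the maximum uptake.

    Assumption: %D = 100 * Uptake / MaxUptake. The real module may
    compute this differently.
    """
    uptake = float(row[DynamXColumns.UPTAKE.value])
    max_uptake = float(row[DynamXColumns.MAX_UPTAKE.value])
    if max_uptake <= 0:
        raise ValueError("MaxUptake must be positive")
    return 100.0 * uptake / max_uptake


# Made-up row in the shape of a DynamX CSV record.
row = {'Start': '1', 'End': '8', 'Sequence': 'MKTAYIAK',
       'Uptake': '2.5', 'MaxUptake': '5.0'}
print(percent_d(row))
```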
- class schrodinger.application.bioluminate.pose_filtering.hdx_io.SubstructureType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases:
enum.Enum
- RECEPTOR = 'Receptor'¶
- LIGAND = 'Ligand'¶
- exception schrodinger.application.bioluminate.pose_filtering.hdx_io.InvalidSubstructureTypeError(substructure_type: schrodinger.application.bioluminate.pose_filtering.hdx_io.SubstructureType)¶
Bases:
Exception
- __init__(substructure_type: schrodinger.application.bioluminate.pose_filtering.hdx_io.SubstructureType)¶
- exception schrodinger.application.bioluminate.pose_filtering.hdx_io.InvalidPropertyError(name: str, value: schrodinger.application.bioluminate.pose_filtering.hdx_io.T, valid_properties: set[T])¶
Bases:
Exception
Base class for invalid HDX-MS properties.
- __init__(name: str, value: schrodinger.application.bioluminate.pose_filtering.hdx_io.T, valid_properties: set[T])¶
- exception schrodinger.application.bioluminate.pose_filtering.hdx_io.InvalidProteinStateError(protein_state: str, valid_states: set[str])¶
Bases:
schrodinger.application.bioluminate.pose_filtering.hdx_io.InvalidPropertyError
- __init__(protein_state: str, valid_states: set[str])¶
- exception schrodinger.application.bioluminate.pose_filtering.hdx_io.InvalidTimeSliceError(time_slice: float, valid_slices: set[float])¶
Bases:
schrodinger.application.bioluminate.pose_filtering.hdx_io.InvalidPropertyError
- __init__(time_slice: float, valid_slices: set[float])¶
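The signatures above suggest that `InvalidProteinStateError` and `InvalidTimeSliceError` specialize `InvalidPropertyError` by fixing the property name. The following is a minimal sketch of such a hierarchy, built only from the documented constructor signatures. The stored attributes, the message wording, and the `check_time_slice` helper are assumptions made for illustration.

```python
class InvalidPropertyError(Exception):
    """Stand-in base class; the message format is an assumption."""

    def __init__(self, name, value, valid_properties):
        self.name = name
        self.value = value
        self.valid_properties = valid_properties
        super().__init__(
            f"Invalid {name} {value!r}; "
            f"expected one of {sorted(valid_properties)}")


class InvalidProteinStateError(InvalidPropertyError):
    def __init__(self, protein_state, valid_states):
        super().__init__("protein state", protein_state, valid_states)


class InvalidTimeSliceError(InvalidPropertyError):
    def __init__(self, time_slice, valid_slices):
        super().__init__("time slice", time_slice, valid_slices)


def check_time_slice(time_slice, valid_slices):
    """Hypothetical validator: raise if the time slice is not in the data."""
    if time_slice not in valid_slices:
        raise InvalidTimeSliceError(time_slice, valid_slices)


caught = None
try:
    check_time_slice(45.0, {10.0, 60.0, 600.0})
except InvalidPropertyError as err:
    # Catching the base class also handles invalid protein states.
    caught = err
```

With a shared base class, callers that only want to report a bad user selection can catch `InvalidPropertyError` once instead of handling each subclass separately.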
- exception schrodinger.application.bioluminate.pose_filtering.hdx_io.MissingColumnsError(missing_columns: set[str])¶
Bases:
Exception
Raised when a DataFrame is missing required columns.
- __init__(missing_columns: set[str])¶
- class schrodinger.application.bioluminate.pose_filtering.hdx_io.HdxMsProperties(protein_states_recep: set[str], protein_states_lig: set[str], time_slices: set[float])¶
Bases:
NamedTuple
Properties of an HDX-MS dataset.
- Variables
protein_states_recep – All unique protein states for the receptor.
protein_states_lig – All unique protein states for the ligand. May be empty if no ligand data was supplied.
time_slices – Unique time slices present in both the receptor and ligand datasets.
- protein_states_recep: set[str]¶
Alias for field number 0
- protein_states_lig: set[str]¶
Alias for field number 1
- time_slices: set[float]¶
Alias for field number 2
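As an illustration of the three fields, the sketch below rebuilds an equivalent `NamedTuple` locally and fills `time_slices` with the intersection of the receptor and ligand time slices, following the "present in both" wording above. The state names and times are made up. How the real module fills `time_slices` when no ligand file is supplied is not stated here, so that case is not covered.

```python
from typing import NamedTuple


class HdxMsProperties(NamedTuple):
    """Local stand-in with the documented field names and order."""
    protein_states_recep: set[str]
    protein_states_lig: set[str]
    time_slices: set[float]


# Made-up values, as they might be read from two CSV files.
recep_states = {'apo', 'complex'}
lig_states = {'free', 'complex'}
recep_times = {10.0, 60.0, 600.0}
lig_times = {60.0, 600.0, 3600.0}

# Keep only the time slices that appear in both datasets.
props = HdxMsProperties(recep_states, lig_states, recep_times & lig_times)

# Fields are available by name or by position.
print(props.time_slices)
print(props[0])
```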
- schrodinger.application.bioluminate.pose_filtering.hdx_io.get_hdx_ms_properties(fname_recep: str, fname_lig: str | None = None) schrodinger.application.bioluminate.pose_filtering.hdx_io.HdxMsProperties ¶
Convenience function for getting the properties of an HDX-MS dataset.
- schrodinger.application.bioluminate.pose_filtering.hdx_io.get_hdx_adapter(fname_recep: str, fname_lig: str | None = None) schrodinger.application.bioluminate.pose_filtering.hdx_io._HdxAdapter ¶
Return the appropriate HDX adapter for the given input files.
- Parameters
fname_recep – The filename of the receptor HDX-MS CSV file.
fname_lig – The filename of the ligand HDX-MS CSV file. May be None.
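This page does not describe how the adapter is chosen. One plausible approach, sketched here as an assumption, is to compare the CSV header against each vendor's column definition and pick the first definition whose columns are all present. The `detect_column_def` function and the sample CSV text are hypothetical. The enums are local stand-ins that copy the documented values.

```python
import csv
import io
from enum import Enum


class DynamXColumns(str, Enum):
    PROTEIN_STATE = 'State'
    TIME_SLICE = 'Exposure'
    START = 'Start'
    END = 'End'
    SEQUENCE = 'Sequence'
    UPTAKE = 'Uptake'
    MAX_UPTAKE = 'MaxUptake'


class HDExaminerColumns(str, Enum):
    PROTEIN_STATE = 'Protein State'
    TIME_SLICE = 'Deut Time (sec)'
    START = 'Start'
    END = 'End'
    SEQUENCE = 'Sequence'
    PERCENT_D = '%D'


def detect_column_def(csv_text):
    """Hypothetical: return the vendor definition matching the CSV header."""
    header = set(next(csv.reader(io.StringIO(csv_text))))
    for column_def in (DynamXColumns, HDExaminerColumns):
        # Extra columns are allowed; every expected column must be present.
        if {col.value for col in column_def} <= header:
            return column_def
    raise ValueError("Unrecognized HDX-MS CSV header")


# Made-up HDExaminer-style file with one extra column.
sample = (
    "Protein State,Deut Time (sec),Start,End,Sequence,%D,Confidence\n"
    "apo,10,1,8,MKTAYIAK,12.5,High\n"
)
detected = detect_column_def(sample)
print(detected.__name__)
```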
- schrodinger.application.bioluminate.pose_filtering.hdx_io.get_missing_columns(df: pandas.core.frame.DataFrame, column_def: enum.StrEnum) set[enum.StrEnum] ¶
Return any column headers from the supplied column definition that are missing from the DataFrame.
- Parameters
df – A DataFrame containing unprocessed HDX-MS CSV data.
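The check amounts to a set difference between the column definition and the headers actually present. The sketch below shows the idea on a plain list of header strings so that it runs without pandas. With a DataFrame, the headers would come from `df.columns`. The `missing_columns` function is a hypothetical stand-in, not the module's implementation.

```python
from enum import Enum


class StandardColumns(str, Enum):
    """Local stand-in with the documented standard column names."""
    START = 'Start'
    END = 'End'
    SEQUENCE = 'Sequence'
    PERCENT_D = '%D'
    CHAIN = 'Chain'


def missing_columns(headers, column_def):
    """Return the members of column_def whose value is not in headers."""
    present = set(headers)
    return {col for col in column_def if col.value not in present}


# Made-up header row: extra columns are ignored, absent ones are reported.
missing = missing_columns(['Start', 'End', 'Sequence', 'Extra'],
                          StandardColumns)
print(sorted(col.value for col in missing))
```

A non-empty result is the natural input for `MissingColumnsError`, whose constructor takes the set of missing column names.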