schrodinger.application.bioluminate.pose_filtering.hdx_io module

Utilities for reading and validating HDX-MS CSV files from different vendors.

# Code Vocabulary

Column definition: The column names for a specific HDX-MS CSV file format. Different vendors may use different column names for the same data.

# Scientific Vocabulary

HDX-MS: Hydrogen Deuterium eXchange Mass Spectrometry. This measures the exchange of hydrogen atoms with deuterium atoms in proteins.

Protein State: A unique state of the protein, e.g. bound or unbound. In practice, these can take arbitrary names that users designate as bound or unbound.

Time Slice: The amount of time that the proteins in the HDX-MS experiment are exposed to deuterium.

Uptake: The amount of deuterium that has been incorporated into the protein.

class schrodinger.application.bioluminate.pose_filtering.hdx_io.DynamXColumns(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: enum.StrEnum

Expected column names for HDX-MS CSV files produced by the DynamX software. The actual CSV file may contain additional columns.

PROTEIN_STATE = 'State'

TIME_SLICE = 'Exposure'

START = 'Start'

END = 'End'

SEQUENCE = 'Sequence'

UPTAKE = 'Uptake'

MAX_UPTAKE = 'MaxUptake'

class schrodinger.application.bioluminate.pose_filtering.hdx_io.HDExaminerColumns(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: enum.StrEnum

Expected column names for HDX-MS CSV files produced by the HDExaminer software. The actual CSV file may contain additional columns.

PROTEIN_STATE = 'Protein State'

TIME_SLICE = 'Deut Time (sec)'

START = 'Start'

END = 'End'

SEQUENCE = 'Sequence'

PERCENT_D = '%D'

class schrodinger.application.bioluminate.pose_filtering.hdx_io.StandardColumns(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: enum.StrEnum

Column names that must be present in all HDX-MS datasets in order to perform analysis. The actual data may contain additional columns.

START: The start index of the fragment.
END: The end index of the fragment.
SEQUENCE: The 1-letter residue code sequence of the fragment.
PERCENT_D: The percent deuterium uptake of the fragment.
CHAIN: The chain name(s) of the fragment in a given pose.

START = 'Start'

END = 'End'

SEQUENCE = 'Sequence'

PERCENT_D = '%D'

CHAIN = 'Chain'
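To illustrate how a vendor-specific row might be brought into the standard column layout, here is a minimal sketch. The enum classes are local stand-ins that copy the documented values (the real classes derive from enum.StrEnum), to_standard_row is a hypothetical helper, the sample data is made up, and the formula %D = 100 * Uptake / MaxUptake is an assumption that this page does not confirm.

```python
import csv
import io
from enum import Enum


# Stand-ins mirroring the documented values. The real classes derive from
# enum.StrEnum; a str/Enum mixin is used here only for portability.
class DynamXColumns(str, Enum):
    PROTEIN_STATE = "State"
    TIME_SLICE = "Exposure"
    START = "Start"
    END = "End"
    SEQUENCE = "Sequence"
    UPTAKE = "Uptake"
    MAX_UPTAKE = "MaxUptake"


class StandardColumns(str, Enum):
    START = "Start"
    END = "End"
    SEQUENCE = "Sequence"
    PERCENT_D = "%D"
    CHAIN = "Chain"


def to_standard_row(row, chain):
    """Hypothetical helper: map one DynamX-style row to the standard columns.

    ASSUMPTION: percent deuteration is 100 * Uptake / MaxUptake. The module's
    actual conversion is not documented on this page.
    """
    uptake = float(row[DynamXColumns.UPTAKE.value])
    max_uptake = float(row[DynamXColumns.MAX_UPTAKE.value])
    return {
        StandardColumns.START.value: int(row[DynamXColumns.START.value]),
        StandardColumns.END.value: int(row[DynamXColumns.END.value]),
        StandardColumns.SEQUENCE.value: row[DynamXColumns.SEQUENCE.value],
        StandardColumns.PERCENT_D.value: 100.0 * uptake / max_uptake,
        # The chain comes from the pose, not the CSV, so it is passed in.
        StandardColumns.CHAIN.value: chain,
    }


# Made-up sample data in a DynamX-like layout.
SAMPLE_CSV = """State,Exposure,Start,End,Sequence,Uptake,MaxUptake
bound,10,5,12,AKLVDEFG,1.5,6.0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
standard = to_standard_row(rows[0], chain="A")
```

The chain is supplied by the caller because, per the description above, it depends on the pose rather than on the CSV contents.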

class schrodinger.application.bioluminate.pose_filtering.hdx_io.SubstructureType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: enum.Enum

RECEPTOR = 'Receptor'

LIGAND = 'Ligand'

exception schrodinger.application.bioluminate.pose_filtering.hdx_io.InvalidSubstructureTypeError(substructure_type: schrodinger.application.bioluminate.pose_filtering.hdx_io.SubstructureType)

Bases: Exception

__init__(substructure_type: schrodinger.application.bioluminate.pose_filtering.hdx_io.SubstructureType)

exception schrodinger.application.bioluminate.pose_filtering.hdx_io.InvalidPropertyError(name: str, value: schrodinger.application.bioluminate.pose_filtering.hdx_io.T, valid_properties: set[T])

Bases: Exception

Base class for invalid HDX-MS properties.

__init__(name: str, value: schrodinger.application.bioluminate.pose_filtering.hdx_io.T, valid_properties: set[T])

exception schrodinger.application.bioluminate.pose_filtering.hdx_io.InvalidProteinStateError(protein_state: str, valid_states: set[str])

Bases: schrodinger.application.bioluminate.pose_filtering.hdx_io.InvalidPropertyError

__init__(protein_state: str, valid_states: set[str])

exception schrodinger.application.bioluminate.pose_filtering.hdx_io.InvalidTimeSliceError(time_slice: float, valid_slices: set[float])

Bases: schrodinger.application.bioluminate.pose_filtering.hdx_io.InvalidPropertyError

__init__(time_slice: float, valid_slices: set[float])
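Because InvalidProteinStateError and InvalidTimeSliceError share InvalidPropertyError as a base, a caller can catch either with a single except clause. The sketch below uses local stand-ins whose constructor signatures follow the ones documented above; the stored attributes, the message format, and the check_selection helper are assumptions made for illustration.

```python
class InvalidPropertyError(Exception):
    """Stand-in for the documented base class."""

    def __init__(self, name, value, valid_properties):
        # ASSUMPTION: attribute names and message format are illustrative;
        # the real class may store and format these differently.
        self.name = name
        self.value = value
        self.valid_properties = valid_properties
        super().__init__(
            f"Invalid {name} {value!r}; expected one of "
            f"{sorted(valid_properties)}"
        )


class InvalidProteinStateError(InvalidPropertyError):
    def __init__(self, protein_state, valid_states):
        super().__init__("protein state", protein_state, valid_states)


class InvalidTimeSliceError(InvalidPropertyError):
    def __init__(self, time_slice, valid_slices):
        super().__init__("time slice", time_slice, valid_slices)


def check_selection(protein_state, time_slice, valid_states, valid_slices):
    """Hypothetical validator for a user's protein state and time slice."""
    if protein_state not in valid_states:
        raise InvalidProteinStateError(protein_state, valid_states)
    if time_slice not in valid_slices:
        raise InvalidTimeSliceError(time_slice, valid_slices)


caught = None
try:
    # "apo" is not among the valid states, so the first check fails.
    check_selection("apo", 10.0, {"bound", "unbound"}, {10.0, 60.0})
except InvalidPropertyError as err:
    caught = type(err).__name__
```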

exception schrodinger.application.bioluminate.pose_filtering.hdx_io.MissingColumnsError(missing_columns: set[str])

Bases: Exception

Raised when a DataFrame is missing required columns.

__init__(missing_columns: set[str])

class schrodinger.application.bioluminate.pose_filtering.hdx_io.HdxMsProperties(protein_states_recep: set[str], protein_states_lig: set[str], time_slices: set[float])

Bases: NamedTuple

Properties of an HDX-MS dataset.

Variables

protein_states_recep – All unique protein states for the receptor.
protein_states_lig – All unique protein states for the ligand. May be empty if no ligand data was supplied.
time_slices – Unique time slices present in both the receptor and ligand datasets.

protein_states_recep: set[str]

Alias for field number 0

protein_states_lig: set[str]

Alias for field number 1

time_slices: set[float]

Alias for field number 2
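The field descriptions above can be made concrete with a small sketch. The NamedTuple is a local stand-in with the documented field names, collect_properties is a hypothetical helper, and the rows are assumed to be dictionaries keyed by DynamX-style headers ('State', 'Exposure'). Following the description of time_slices, only the time slices common to both datasets are kept when ligand data is supplied.

```python
from typing import NamedTuple


class HdxMsProperties(NamedTuple):
    """Stand-in with the documented field names and order."""
    protein_states_recep: set
    protein_states_lig: set
    time_slices: set


def collect_properties(recep_rows, lig_rows=None):
    """Hypothetical helper: summarize rows given as dicts.

    ASSUMPTION: rows use DynamX-style keys 'State' and 'Exposure'.
    """
    states_recep = {row["State"] for row in recep_rows}
    slices = {float(row["Exposure"]) for row in recep_rows}
    states_lig = set()
    if lig_rows is not None:
        states_lig = {row["State"] for row in lig_rows}
        # Keep only time slices present in both datasets.
        slices &= {float(row["Exposure"]) for row in lig_rows}
    return HdxMsProperties(states_recep, states_lig, slices)


# Made-up rows for illustration.
recep_rows = [
    {"State": "bound", "Exposure": "10"},
    {"State": "unbound", "Exposure": "60"},
    {"State": "bound", "Exposure": "600"},
]
lig_rows = [
    {"State": "free", "Exposure": "10"},
    {"State": "free", "Exposure": "60"},
]

props = collect_properties(recep_rows, lig_rows)
recep_only = collect_properties(recep_rows)
```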

schrodinger.application.bioluminate.pose_filtering.hdx_io.get_hdx_ms_properties(fname_recep: str, fname_lig: str | None = None) → schrodinger.application.bioluminate.pose_filtering.hdx_io.HdxMsProperties

Convenience function for getting the properties of an HDX-MS dataset.

schrodinger.application.bioluminate.pose_filtering.hdx_io.get_hdx_adapter(fname_recep: str, fname_lig: str | None = None) → schrodinger.application.bioluminate.pose_filtering.hdx_io._HdxAdapter

Return the appropriate HDX adapter for the given input files.

Parameters

fname_recep – The filename of the receptor HDX-MS CSV file.
fname_lig – The filename of the ligand HDX-MS CSV file. May be None.
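This page does not describe how the adapter is chosen. One plausible approach, sketched below purely as an assumption, is to read the CSV header row and pick the first vendor column definition whose names are all present. The enum classes are local stand-ins copying the documented values, and detect_column_def is a hypothetical helper that returns a column definition rather than an adapter object.

```python
import csv
import io
from enum import Enum


# Stand-ins mirroring the documented values (the real classes are StrEnums).
class DynamXColumns(str, Enum):
    PROTEIN_STATE = "State"
    TIME_SLICE = "Exposure"
    START = "Start"
    END = "End"
    SEQUENCE = "Sequence"
    UPTAKE = "Uptake"
    MAX_UPTAKE = "MaxUptake"


class HDExaminerColumns(str, Enum):
    PROTEIN_STATE = "Protein State"
    TIME_SLICE = "Deut Time (sec)"
    START = "Start"
    END = "End"
    SEQUENCE = "Sequence"
    PERCENT_D = "%D"


def detect_column_def(header):
    """Hypothetical: return the first definition fully covered by header.

    ASSUMPTION: this is one possible detection strategy, not the module's
    documented behavior.
    """
    present = set(header)
    for column_def in (DynamXColumns, HDExaminerColumns):
        if {col.value for col in column_def} <= present:
            return column_def
    raise ValueError("Unrecognized HDX-MS CSV header")


def read_header(text):
    """Return the first row of CSV text as a list of column names."""
    return next(csv.reader(io.StringIO(text)))


# Made-up header rows; extra columns are allowed in both formats.
dynamx_text = "Protein,State,Exposure,Start,End,Sequence,Uptake,MaxUptake\n"
hdexaminer_text = "Protein State,Deut Time (sec),Start,End,Sequence,%D,Score\n"

dynamx_def = detect_column_def(read_header(dynamx_text))
hdexaminer_def = detect_column_def(read_header(hdexaminer_text))
```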

schrodinger.application.bioluminate.pose_filtering.hdx_io.get_missing_columns(df: pandas.core.frame.DataFrame, column_def: enum.StrEnum) → set[enum.StrEnum]

Return any column headers from the supplied column definition that are missing from the DataFrame.

Parameters

df – A DataFrame containing unprocessed HDX-MS CSV data.
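The check amounts to a set difference between the names in a column definition and the headers actually present. The sketch below is a stand-in, not the module's implementation: it takes an iterable of header names instead of a DataFrame (with pandas, the headers would come from df.columns), and StandardColumns is a local copy of the documented values.

```python
from enum import Enum


# Stand-in mirroring the documented values (the real class is a StrEnum).
class StandardColumns(str, Enum):
    START = "Start"
    END = "End"
    SEQUENCE = "Sequence"
    PERCENT_D = "%D"
    CHAIN = "Chain"


def missing_columns(headers, column_def):
    """Hypothetical stand-in: members of column_def absent from headers."""
    present = set(headers)
    return {col for col in column_def if col.value not in present}


# Made-up header list that lacks the '%D' and 'Chain' columns.
headers = ["Start", "End", "Sequence", "Notes"]
missing = missing_columns(headers, StandardColumns)
```

Extra headers such as 'Notes' are ignored, which matches the statement that actual data may contain additional columns.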