schrodinger.application.transforms.chemio module
- class schrodinger.application.transforms.chemio.WriteBatchesOfStructures(file_prefix: Union[str, Path])
Bases: _LocalOnlyPTransform, WriteBatchesOfStructures
A PTransform that writes each element (which is a list of structures) to a separate file with a unique filename. Note that this transform is an identity step that outputs the input PCollection.
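The "unique filename" behaviour described above could be sketched as follows. This is only an illustration under stated assumptions: `unique_batch_path`, the UUID-based naming, and the `.maegz` suffix are hypothetical and are not taken from the Schrodinger implementation.

```python
import uuid
from pathlib import Path
from typing import Union


def unique_batch_path(file_prefix: Union[str, Path], suffix: str = ".maegz") -> Path:
    """Hypothetical helper: derive a collision-resistant filename for one batch.

    The real transform may use a different naming scheme; a random UUID is
    simply one easy way to avoid clashes between batches.
    """
    prefix = Path(file_prefix)
    # Keep the prefix's directory and extend its final component.
    return prefix.with_name(f"{prefix.name}_{uuid.uuid4().hex}{suffix}")


first = unique_batch_path("out/batch")
second = unique_batch_path("out/batch")
print(first.name != second.name)
```

Because the transform is an identity step, a sketch like this would only decide where each batch is written; the batch itself would be passed downstream unchanged.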
- class schrodinger.application.transforms.chemio.ReadFileToStructures(input_file: str)
Bases: PTransform
Read Structure objects from a file.
This is a source transform that reads structures from a file.
Example usage in YAML:
    pipeline:
      - type: ReadFileToStructures
        config:
          input_file: "ligands.maegz"
- Parameters:
input_file – Path to the structure file to read.
- __init__(input_file: str)
- expand(pbegin: PBegin)
- class schrodinger.application.transforms.chemio.WriteStructures(output_file: str)
Bases: PTransform
Write Structure objects to a file.
This is a sink transform that writes structures to a file. It passes through the input structures unchanged for chaining.
Example usage in YAML:
    pipeline:
      - type: ReadFileToStructures
        config:
          input_file: "ligands.maegz"
      - type: LigFilter
        config:
          criteria: ["Num_heavy_atoms > 5"]
      - type: WriteStructures
        config:
          output_file: "filtered.maegz"
- Parameters:
output_file – Path to the output structure file.
- __init__(output_file: str)
- class schrodinger.application.transforms.chemio.CreateStructuresFromSmiles(elements: Iterable[str])
Bases: PTransform
Create Structure objects from a list of SMILES strings.
This is a source transform that creates structures from SMILES.
Example usage in YAML:
    pipeline:
      - type: CreateStructuresFromSmiles
        config:
          elements:
            - "c1ccccc1"
            - "CCO"
- Parameters:
elements – List of SMILES strings to convert to structures.
- __init__(elements: Iterable[str])
- expand(pbegin: PBegin)
- class schrodinger.application.transforms.chemio.StructureToRow(label: Optional[str] = None)
Bases: PTransform
Convert Structure objects to Beam Rows.
This transform wraps each Structure in a Row with a ‘structure’ field, enabling use of Row-based transforms like MapToFields.
Example usage in YAML:
    pipeline:
      - type: CreateStructuresFromSmiles
        config:
          elements:
            - "c1ccccc1"
      - type: LigFilter
        config:
          criteria: ["Num_heavy_atoms > 3"]
      - type: StructureToRow
      - type: MapToFields
        config:
          fields:
            title:
              callable: "lambda row: row.structure.title"
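To make the shape of the wrapped element concrete, here is a small stand-in sketch of what the `MapToFields` callable in the example accesses. It deliberately uses `types.SimpleNamespace` in place of both a Beam Row and a Schrodinger Structure, so `structure_to_row` and `fake_structure` are hypothetical names for illustration only, not part of this module.

```python
from types import SimpleNamespace


def structure_to_row(structure):
    """Stand-in for the wrapping step: expose the object as a 'structure' field."""
    return SimpleNamespace(structure=structure)


# A fake object with a title attribute, standing in for a Structure.
fake_structure = SimpleNamespace(title="benzene")
row = structure_to_row(fake_structure)

# The same callable as in the YAML example, applied to the wrapped element.
get_title = lambda row: row.structure.title
title = get_title(row)
print(title)
```

The point of the wrapper is that downstream Row-based transforms can address the structure by field name rather than receiving a bare object.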
- class schrodinger.application.transforms.chemio.ReadMolsFromFile(input_file: str, silent: bool = False)
Bases: PTransform
Read RDKit Mol objects from a SMILES file.
This is a source transform that reads molecules from a file containing newline-separated SMILES strings.
Example usage in YAML:
    pipeline:
      - type: ReadMolsFromFile
        config:
          input_file: "ligands.smi"
- Parameters:
input_file – Path to the SMILES file to read.
silent – If True, suppress warnings for invalid SMILES.
- __init__(input_file: str, silent: bool = False)
- expand(pbegin: PBegin)
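As a rough illustration of the "newline-separated SMILES" input format, the sketch below splits such content into individual SMILES strings using only the standard library. `iter_smiles` is a hypothetical helper; the actual transform additionally converts each string to an RDKit Mol, and how it treats blank or invalid lines is an assumption here.

```python
from typing import Iterable, Iterator


def iter_smiles(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical helper: yield one SMILES string per non-empty line."""
    for line in lines:
        text = line.strip()
        if text:  # assumption: blank lines are skipped
            yield text


# Lines as they might be read from a file such as "ligands.smi".
smiles = list(iter_smiles(["c1ccccc1\n", "\n", "CCO\n"]))
print(smiles)
```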
- class schrodinger.application.transforms.chemio.WriteMolsToSmilesFile(output_file: str)
Bases: PTransform
Write RDKit Mol objects to a SMILES file.
This is a sink transform that writes molecules to a file as newline-separated SMILES strings. It passes through the input molecules unchanged for chaining.
Example usage in YAML:
    pipeline:
      - type: CreateMolsFromSmiles
        config:
          elements:
            - "c1ccccc1"
      - type: WriteMolsToSmilesFile
        config:
          output_file: "output.smi"
- Parameters:
output_file – Path to the output SMILES file.
- __init__(output_file: str)
- expand(pcoll: PCollection[Mol])
- class schrodinger.application.transforms.chemio.CreateMolsFromSmiles(elements: Iterable[str])
Bases: PTransform
Create RDKit Mol objects from a list of SMILES strings.
This is a source transform that creates molecules from SMILES.
Example usage in YAML:
    pipeline:
      - type: CreateMolsFromSmiles
        config:
          elements:
            - "c1ccccc1"
            - "CCO"
- Parameters:
elements – List of SMILES strings to convert to molecules.
- __init__(elements: Iterable[str])
- expand(pbegin: PBegin)
- class schrodinger.application.transforms.chemio.ReadFromPDB(pdb_codes: list[str], preserve_caps: bool = False)
Bases: PTransform
Read structures from the PDB (Protein Data Bank) by PDB codes.
This is a source transform that downloads PDB files and yields Structure objects with the PDB code stored as property s_seam_pdb_code.
Example usage in YAML:
    pipeline:
      - type: ReadFromPDB
        config:
          pdb_codes:
            - "1A2B"
            - "3C4D"
          preserve_caps: false
- Parameters:
pdb_codes – List of PDB codes to download.
preserve_caps – If True, preserve original capitalization.
- PDB_ID_PROPERTY = 's_seam_pdb_code'
- __init__(pdb_codes: list[str], preserve_caps: bool = False)
- expand(pbegin: PBegin)
- class schrodinger.application.transforms.chemio.ReadFromChembl(sample_size: Optional[int] = None)
Bases: PTransform
Read molecules from the ChEMBL database.
This is a source transform that fetches molecules from the ChEMBL API and yields Structure objects with the ChEMBL ID stored as property s_seam_chembl_id.
Example usage in YAML:
    pipeline:
      - type: ReadFromChembl
        config:
          sample_size: 100
- Parameters:
sample_size – Maximum number of molecules to fetch. If not provided, all available molecules will be fetched.
- CHEMBL_URL = 'https://www.ebi.ac.uk/chembl/api/data/molecule'
- DEFAULT_PAGE_SIZE = 1000
- CHEMBL_REQUEST_PARAMS = {'format': 'json', 'only': 'molecule_chembl_id,molecule_structures'}
- CHEMBL_ID_PROPERTY = 's_seam_chembl_id'
- __init__(sample_size: Optional[int] = None)
- static create_request(offset: int, limit: int) -> dict
Construct a ChEMBL API request with pagination parameters.
- Parameters:
offset – The starting index from which to fetch molecules.
limit – The maximum number of molecules to fetch in this request.
- expand(pbegin: PBegin)
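The interplay between `sample_size`, `DEFAULT_PAGE_SIZE`, and `create_request` can be sketched as below. The constants are copied from the attributes listed above, but the body of `create_request` and the `page_requests` generator are assumptions made for illustration: the documentation only states that the method returns a dict built from pagination parameters, not what that dict contains.

```python
from typing import Iterator

# Values copied from the class attributes documented above.
CHEMBL_REQUEST_PARAMS = {
    "format": "json",
    "only": "molecule_chembl_id,molecule_structures",
}
DEFAULT_PAGE_SIZE = 1000


def create_request(offset: int, limit: int) -> dict:
    """Assumed shape: the base query parameters plus offset and limit."""
    return {**CHEMBL_REQUEST_PARAMS, "offset": offset, "limit": limit}


def page_requests(sample_size: int, page_size: int = DEFAULT_PAGE_SIZE) -> Iterator[dict]:
    """Hypothetical helper: one request per page until sample_size is covered."""
    for offset in range(0, sample_size, page_size):
        # The last page is shortened so the total never exceeds sample_size.
        yield create_request(offset, min(page_size, sample_size - offset))


requests_for_2500 = list(page_requests(2500))
print([(r["offset"], r["limit"]) for r in requests_for_2500])
```

Splitting the fetch into independent page requests like this is what would let a pipeline distribute the downloads, although whether the actual transform does so is not stated in this reference.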