schrodinger.seam.io.chemio module

Transforms for reading and writing structures and molecules.

class schrodinger.seam.io.chemio.ReadStructuresFromFile(file_pattern: Union[str, pathlib.Path], **kwargs)

Bases: schrodinger.seam.transforms._resources._LocalOnlyPTransform, schrodinger.seam.io.chemio.ReadStructuresFromFile

Read a file (or files) containing a structure or a list of structures and return a PCollection of schrodinger.structure.Structure objects.

Example:

>>> from schrodinger.test import mmshare_data_file
>>> with beam.Pipeline() as p:
...     _ = (p
...          | ReadStructuresFromFile(mmshare_data_file('cookbook/stereoisomers-form-1.maegz'))
...          | beam.Map(lambda st: st.title)
...          | textio.WriteToText('titles.txt'))
>>> with open('titles.txt') as f:
...     titles = sorted(set(line.strip() for line in f))
>>> titles # doctest: +ELLIPSIS
['stereoisomers-1-form-1', 'stereoisomers-2-form-1', ...]

class schrodinger.seam.io.chemio.ReadAllStructuresFromFile(label: Optional[str] = None)

Bases: schrodinger.seam.transforms._resources._LocalOnlyPTransform, schrodinger.seam.io.chemio.ReadAllStructuresFromFile

A PTransform that takes a PCollection of structure file paths, reads each file, and returns a PCollection of schrodinger.structure.Structure objects.

Example:

>>> from schrodinger.test import mmshare_data_file
>>> with beam.Pipeline() as p:
...     _ = (p
...          | beam.Create([mmshare_data_file('cookbook/stereoisomers-form-1.maegz')])
...          | ReadAllStructuresFromFile()
...          | beam.Map(lambda st: st.title)
...          | textio.WriteToText('titles.txt'))
>>> with open('titles.txt') as f:
...     titles = sorted(set(line.strip() for line in f))
>>> titles # doctest: +ELLIPSIS
['stereoisomers-1-form-1', 'stereoisomers-2-form-1', ...]

class schrodinger.seam.io.chemio.WriteStructuresToFile(file_name: Union[str, pathlib.Path], **kwargs)

Bases: schrodinger.seam.transforms._resources._LocalOnlyPTransform, schrodinger.seam.io.chemio.WriteStructuresToFile

Write a PCollection of schrodinger.structure.Structure objects to a file.

The file format is determined by the file extension. See schrodinger.structure.StructureWriter for more details.

Example:

>>> from schrodinger import structure
>>> from pathlib import Path
>>> outfile = Path('out.maegz')
>>> outfile.unlink(missing_ok=True)
>>> with beam.Pipeline() as p:
...     sts = [structure.create_new_structure(num_atoms=i) for i in range(1, 11)]
...     _ = (p | beam.Create(sts) | WriteStructuresToFile(outfile))
>>> sts = list(structure.StructureReader(outfile))
>>> len(sts)
10

Raises

ValueError – if the file already exists
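Because the write fails with ValueError when the target already exists, a rerun has to clear or reject the old output first. The example above does this with Path.unlink(missing_ok=True). The sketch below wraps the same idea in a small helper; prepare_output_path and its overwrite flag are hypothetical names used for illustration and are not part of schrodinger.seam.

```python
from pathlib import Path


def prepare_output_path(file_name, overwrite=False):
    """Return file_name as a Path that does not exist yet.

    Hypothetical helper for illustration, not part of schrodinger.seam.
    """
    path = Path(file_name)
    if path.exists():
        if not overwrite:
            # Fail early, mirroring the ValueError documented above.
            raise ValueError(f"{path} already exists")
        # The caller explicitly asked to replace the old output.
        path.unlink()
    return path
```

Under these assumptions, the returned path could be passed to WriteStructuresToFile in place of the manual unlink call.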

class schrodinger.seam.io.chemio.ReadMolsFromFile(file_pattern: Union[str, pathlib.Path], silent=False, **kwargs)

Bases: schrodinger.seam.transforms._resources._LocalOnlyPTransform, schrodinger.seam.io.chemio.ReadMolsFromFile

Read a file containing a newline-separated list of SMILES strings and return a PCollection of RDKit molecules.

Invalid SMILES strings are skipped. A warning is printed if silent is set to False.

Example:

>>> from pathlib import Path
>>> infile = Path('test.smi')
>>> _ = infile.write_text("C\nCC\nCCC")
>>> Path('num_atoms.txt').unlink(missing_ok=True)
>>> with beam.Pipeline() as p:
...     _ = (p
...          | ReadMolsFromFile(infile)
...          | beam.Map(lambda m: m.GetNumHeavyAtoms())
...          | textio.WriteToText('num_atoms.txt'))
>>> with open('num_atoms.txt') as f:
...     num_atoms = sorted(int(line.strip()) for line in f)
>>> num_atoms
[1, 2, 3]
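The skip-and-warn behavior described above can be sketched outside a pipeline as a plain generator. This is a minimal illustration, not the transform's actual implementation: parse_valid and its parse argument are hypothetical, parse is assumed to return None for invalid input (as RDKit's Chem.MolFromSmiles does), and skipping blank lines is an extra assumption of this sketch.

```python
import warnings


def parse_valid(smiles_lines, parse, silent=False):
    """Yield parsed molecules, skipping lines that `parse` rejects.

    Hypothetical helper: `parse` is assumed to return None for an
    invalid SMILES string.
    """
    for line in smiles_lines:
        smiles = line.strip()
        if not smiles:
            # Assumption of this sketch: blank lines are ignored.
            continue
        mol = parse(smiles)
        if mol is None:
            if not silent:
                warnings.warn(f"Skipping invalid SMILES: {smiles!r}")
            continue
        yield mol
```

With RDKit installed, Chem.MolFromSmiles could be passed as parse; any callable that follows the same None-on-failure convention works.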

class schrodinger.seam.io.chemio.WriteMolsToFile(file_name: str, **kwargs)

Bases: schrodinger.seam.transforms._resources._LocalOnlyPTransform, schrodinger.seam.io.chemio.WriteMolsToFile

Write a PCollection of RDKit molecules to a file as a newline-separated list of SMILES strings.

Example:

>>> from rdkit import Chem
>>> from pathlib import Path
>>> outfile = Path('test.smi')
>>> outfile.unlink(missing_ok=True)
>>> with beam.Pipeline() as p:
...     mols = [Chem.MolFromSmiles('C' * i) for i in range(1, 4)]
...     _ = (p | beam.Create(mols) | WriteMolsToFile(outfile))
>>> with open(outfile) as f:
...     smiles = sorted(line.strip() for line in f)
>>> smiles
['C', 'CC', 'CCC']

Raises

ValueError – if the file already exists