schrodinger.seam.io.filesystems module

Utility functions for interfacing with different filesystems.

Most functionality can be found in apache_beam.io.filesystems, but this module provides some additional functionality not available in the Beam API.

schrodinger.seam.io.filesystems.is_gcs(path: str | pathlib.Path) → bool

Check if the path is a Google Cloud Storage path.

schrodinger.seam.io.filesystems.is_jobfs(path: str | pathlib.Path) → bool

Check if the path uses the JobFileSystem.

schrodinger.seam.io.filesystems.is_local_filesystem(path: str | pathlib.Path) → bool

Check if the path uses a local filesystem (not GCS or JobFS).
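The three predicates above split paths by filesystem. The following is a minimal sketch of that kind of classification, not this module's implementation. It assumes GCS paths use the gs:// scheme (as Beam does) and, purely for illustration, that JobFS paths carry a hypothetical jobfs:// prefix. The name classify_path is also hypothetical.

```python
import pathlib

GCS_PREFIX = "gs://"       # scheme Beam uses for Google Cloud Storage
JOBFS_PREFIX = "jobfs://"  # hypothetical prefix, assumed for illustration


def classify_path(path):
    """Return 'gcs', 'jobfs', or 'local' for a str or pathlib.Path."""
    text = str(path)
    if text.startswith(GCS_PREFIX):
        return "gcs"
    if text.startswith(JOBFS_PREFIX):
        return "jobfs"
    # Anything that is neither GCS nor JobFS is treated as local.
    return "local"


kinds = [classify_path(p) for p in ("gs://bucket/a.mae", "/tmp/a.mae")]
```

Note that pathlib normalizes repeated slashes, so a remote URL is safer to pass as a plain string than as a pathlib.Path.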

schrodinger.seam.io.filesystems.xcopy(source_file_names: list[str], destination_file_names: list[str], mime_type='application/octet-stream', compression_type='auto', chunk_size: int = 65536)

Copy files from source to destination.

Similar to apache_beam.io.filesystems.copy, but allows for copying between different filesystems (e.g. local to jobfs).

Parameters:
  • source_file_names – List of source file paths.

  • destination_file_names – List of destination file paths.

  • mime_type – MIME type for file operations.

  • compression_type – Compression type for file operations.

  • chunk_size – Size in bytes of each chunk read and written. Defaults to shutil.COPY_BUFSIZE (65536 in the signature above; the constant is platform-dependent).
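To illustrate the role of chunk_size and the paired source/destination lists, here is a sketch of a chunked copy using only local files and the standard library. It is not how xcopy is implemented. The function name copy_in_chunks and the equal-length check are assumptions made for the example.

```python
import os
import tempfile


def copy_in_chunks(source_file_names, destination_file_names, chunk_size=65536):
    """Copy each source to the destination at the same list position."""
    if len(source_file_names) != len(destination_file_names):
        raise ValueError("source and destination lists must have equal length")
    for src, dst in zip(source_file_names, destination_file_names):
        with open(src, "rb") as reader, open(dst, "wb") as writer:
            # Read at most chunk_size bytes at a time so a large file
            # is never held in memory all at once.
            while True:
                chunk = reader.read(chunk_size)
                if not chunk:
                    break
                writer.write(chunk)


# Usage: copy one file that spans more than one chunk.
with tempfile.TemporaryDirectory() as scratch:
    src = os.path.join(scratch, "in.bin")
    dst = os.path.join(scratch, "out.bin")
    with open(src, "wb") as fh:
        fh.write(b"x" * 100_000)
    copy_in_chunks([src], [dst], chunk_size=65536)
    copied_size = os.path.getsize(dst)
```

A larger chunk_size means fewer read/write calls at the cost of more memory per copy.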

schrodinger.seam.io.filesystems.localize(path: str) str

Given a path, return a local path to the file.

The caller is responsible for deleting the returned file after use, if necessary.
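The cleanup obligation can be met with try/finally. The sketch below uses a stand-in, localize_stub, which always copies its input to a temporary file. The real localize may behave differently (for example, it may return a local path unchanged), which is why the cleanup is guarded.

```python
import os
import shutil
import tempfile


def localize_stub(path):
    """Stand-in for localize(): copy `path` to a new temporary file."""
    suffix = os.path.splitext(path)[1]
    fd, local_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    shutil.copyfile(path, local_path)
    return local_path


def read_first_line(path):
    local_path = localize_stub(path)
    try:
        with open(local_path) as fh:
            return fh.readline().rstrip("\n")
    finally:
        # The caller owns the local copy; remove it only if it is
        # a separate file and not the original path itself.
        if local_path != path:
            os.remove(local_path)
```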

class schrodinger.seam.io.filesystems.FSStructureReader(filename, index=1)

Bases: StructureReader

A StructureReader that can read from remote filesystems.

Behaves like a normal StructureReader, but will copy GCS files to a temporary location before reading. The temporary file will be deleted after the reader is closed.

Local files and jobfs files are read directly without copying.
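The copy-then-delete lifecycle described above can be sketched with a plain text reader. TempCopyReader is a hypothetical class written for this illustration. It does not subclass StructureReader, and a local file copy stands in for the GCS download.

```python
import os
import shutil
import tempfile


class TempCopyReader:
    """Read lines from a private temporary copy of `filename`."""

    def __init__(self, filename):
        fd, self._temp_path = tempfile.mkstemp()
        os.close(fd)
        # Stand-in for downloading a GCS object to local disk.
        shutil.copyfile(filename, self._temp_path)
        self._handle = open(self._temp_path)

    def __iter__(self):
        return (line.rstrip("\n") for line in self._handle)

    def close(self):
        self._handle.close()
        # The temporary copy does not outlive the reader.
        if os.path.exists(self._temp_path):
            os.remove(self._temp_path)

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()
```

Using the reader in a with block ensures close() runs, and the temporary copy is removed, even if reading raises.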

__init__(filename, index=1)

close()

class schrodinger.seam.io.filesystems.FSStructureWriter(filename: str | pathlib.Path, overwrite=True, format=None, stereo=None, allow_empty_file=False)

Bases: StructureWriter

A StructureWriter that can write to remote filesystems.

For GCS, writes to a temporary local file first, then copies to the final destination on close. For jobfs, writes directly to the local jobfs path. For local files, writes directly.
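The GCS behavior amounts to staging output locally and publishing it on close. Below is a minimal sketch of that pattern using local files only. StagedWriter and its append method are hypothetical names, and a file copy stands in for the upload.

```python
import os
import shutil
import tempfile


class StagedWriter:
    """Buffer writes in a temp file; publish to `destination` on close."""

    def __init__(self, destination):
        self._destination = destination
        fd, self._temp_path = tempfile.mkstemp()
        self._handle = os.fdopen(fd, "w")
        self._closed = False

    def append(self, text):
        self._handle.write(text + "\n")

    def close(self):
        if self._closed:
            return
        self._handle.close()
        # Stand-in for uploading the finished file to GCS.
        shutil.copyfile(self._temp_path, self._destination)
        os.remove(self._temp_path)
        self._closed = True


with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, "out.txt")
    writer = StagedWriter(target)
    writer.append("first")
    exists_before_close = os.path.exists(target)
    writer.close()
    with open(target) as fh:
        written = fh.read()
```

With this pattern nothing appears at the destination until close() is called, so forgetting to close the writer leaves no output at all.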

__init__(filename: str | pathlib.Path, overwrite=True, format=None, stereo=None, allow_empty_file=False)

Create a structure writer class based on the format.

Parameters:
  • filename (str or pathlib.Path) – The filename to write to.

  • overwrite (bool) – If False, append to an existing file instead of overwriting it.

  • format (str) – The format of the file. Values should be specified by one of the module-level constants MAESTRO, MOL2, SD, SMILES, or SMILESCSV. If the format is not explicitly specified it will be determined from the suffix of the filename. Multi-structure PDB files are not supported.

  • stereo (enum) –

    Use of the stereo option in the constructor is pending deprecation. Please use the setOption method instead.

    See the class docstring for documentation on the stereo options.

  • allow_empty_file (bool) – Whether to create a file containing no structures if none are appended. Only valid for Maestro files.

close()

Close the file.