schrodinger.utils.fileutils module¶
A module of file utilities to deal with common file issues.
NOTE: This module is used in scripts that need to be able to run without a Schrodinger license, and therefore can’t depend on the pymmlibs.
The force_remove and force_rename functions deal with the fact that os.remove() and os.rename() don’t work on Windows if write permissions are not enabled.
Copyright Schrodinger LLC, All Rights Reserved.
- exception schrodinger.utils.fileutils.SharingViolationError¶
Bases:
PermissionError
- __init__(*args, **kwargs)¶
- args¶
- characters_written¶
- errno¶
POSIX exception code
- filename¶
exception filename
- filename2¶
second exception filename
- strerror¶
exception strerror
- with_traceback()¶
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- schrodinger.utils.fileutils.force_remove(*args)¶
Remove each file in ‘args’ in a platform-independent way. No exception is raised if a file is absent or lacks write permission.
- Parameters
args (str) – the pathnames of the files to remove
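The behavior described above can be approximated with the standard library alone. The following is a minimal sketch, not the module's implementation: the name force_remove_sketch and the policy of swallowing every remaining OSError are assumptions made for illustration.

```python
import os
import stat


def force_remove_sketch(*paths):
    """Remove each path, tolerating missing files and read-only flags.

    Hypothetical stand-in for illustration; not the library function.
    """
    for path in paths:
        try:
            os.remove(path)
        except FileNotFoundError:
            continue  # nothing to remove, and that is not an error here
        except OSError:
            # On Windows a read-only file cannot be deleted until it is
            # writable, so set the write bit and try once more.
            try:
                os.chmod(path, stat.S_IWRITE | stat.S_IREAD)
                os.remove(path)
            except OSError:
                pass  # assumed policy: never raise
```

Calling force_remove_sketch("a.log", "missing.log") removes a.log if it exists and silently skips missing.log.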
- schrodinger.utils.fileutils.force_rmtree(dirname: Union[str, pathlib.Path], ignore_errors: bool = False)¶
Remove the directory ‘dirname’, using force_remove to delete any files or sub-directories that are difficult to remove.
- Parameters
dirname – the directory to remove
ignore_errors – If True, silently ignore errors, otherwise raise OSError
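One way to picture a forced tree removal is a bottom-up walk that makes each entry writable before deleting it. This is a sketch under stated assumptions: force_rmtree_sketch is a hypothetical name, it does not call force_remove, and it does not handle symbolic links to directories.

```python
import os
import stat


def force_rmtree_sketch(dirname, ignore_errors=False):
    """Delete a directory tree, clearing read-only flags along the way.

    Hypothetical stand-in for illustration; not the library function.
    """
    try:
        # Bottom-up order guarantees a directory is empty before rmdir.
        for root, dirs, files in os.walk(dirname, topdown=False):
            os.chmod(root, stat.S_IRWXU)  # make sure we may edit the listing
            for name in files:
                path = os.path.join(root, name)
                os.chmod(path, stat.S_IWRITE | stat.S_IREAD)
                os.remove(path)
            for name in dirs:
                os.rmdir(os.path.join(root, name))
        os.rmdir(dirname)
    except OSError:
        if not ignore_errors:
            raise
```

With ignore_errors=True the sketch returns quietly even when dirname does not exist; otherwise the OSError propagates, matching the parameter description above.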
- schrodinger.utils.fileutils.force_rename(old: Union[pathlib.Path, str], new: Union[pathlib.Path, str])¶
Rename a file, even if a file at the new name exists, and even if that file doesn’t have write permission, and even if old and new are on different devices.
- Parameters
old – Path to the file source.
new – Path to the file destination.
- Note
Renaming may not be an atomic operation. If the ‘new’ file exists then it is first removed then renamed in two operations. Similarly, if old and new are not on the same device then the file is copied to ‘new’ then the ‘old’ file is removed.
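The note above explains why the operation is not atomic. A rough standard-library sketch of the same sequence follows; force_rename_sketch is a hypothetical name, and treating any OSError from os.rename as a cross-device failure is a simplifying assumption.

```python
import os
import shutil
import stat


def force_rename_sketch(old, new):
    """Rename old to new, replacing new and crossing devices if needed.

    Hypothetical stand-in for illustration; not the library function.
    """
    if os.path.exists(new):
        # Operation 1: clear a possibly read-only destination.
        os.chmod(new, stat.S_IWRITE | stat.S_IREAD)
        os.remove(new)
    try:
        # Operation 2: a plain rename, which only works within one device.
        os.rename(old, new)
    except OSError:
        # Fallback: copy to the destination, then delete the source.
        shutil.copy2(old, new)
        os.remove(old)
```

Between the removal and the rename there is a window in which neither the old content of ‘new’ nor its replacement exists at that path, which is the non-atomic behavior the note warns about.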
- schrodinger.utils.fileutils.force_copy2(*args)¶
Same as shutil.copy2 but don’t raise shutil.SameFileError.
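In standard-library terms this amounts to catching one exception. The sketch below assumes a two-argument (src, dst) call and returns dst when the two paths are the same file; both the name force_copy2_sketch and that return value are assumptions, since the documented signature only shows *args.

```python
import shutil


def force_copy2_sketch(src, dst):
    """Copy src to dst with metadata; copying a file onto itself is a no-op.

    Hypothetical stand-in for illustration; not the library function.
    """
    try:
        return shutil.copy2(src, dst)
    except shutil.SameFileError:
        # src and dst are the same file, so there is nothing to copy.
        return dst
```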
- schrodinger.utils.fileutils.splitext(p: str) Tuple[str, str] ¶
Split the extension from a pathname. Returns “(root, ext)”. Equivalent to os.path.splitext(), except that for gzip-compressed files such as *.mae.gz, *.sdf.gz, *.sd.gz, and *.mol.gz, the compound extension (for example “.mae.gz”) is split off instead of “.gz”.
- Parameters
p – a pathname
- Returns
The root filename and the file extension.
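The difference from os.path.splitext() is easiest to see side by side. The sketch below is an approximation: splitext_sketch is a hypothetical name, and it treats any inner extension before “.gz” as part of the compound extension, whereas the real function may restrict this to recognized formats.

```python
import os


def splitext_sketch(p):
    """Split off the extension, keeping compound '.xxx.gz' extensions whole.

    Hypothetical stand-in for illustration; not the library function.
    """
    root, ext = os.path.splitext(p)
    if ext.lower() == ".gz":
        inner_root, inner_ext = os.path.splitext(root)
        if inner_ext:
            # e.g. 'lig.mae.gz' -> ('lig', '.mae.gz')
            return inner_root, inner_ext + ext
    return root, ext


print(os.path.splitext("lig.mae.gz"))  # ('lig.mae', '.gz')
print(splitext_sketch("lig.mae.gz"))   # ('lig', '.mae.gz')
```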
- class schrodinger.utils.fileutils.SeqFormat(value)¶
Bases:
enum.Enum
An enumeration.
- fasta = 1¶
- swissprot = 2¶
- gcg = 3¶
- embl = 4¶
- pir = 5¶
- clustal = 6¶
- csv = 7¶
- schrodinger.utils.fileutils.get_file_extension(filename)¶
Return the file extension of the given file, including any suffixes prior to “.gz” extension.
For example:
assert get_file_extension('myfile.txt') == '.txt'
assert get_file_extension('test.mae.gz') == '.mae.gz'
- Parameters
filename – File name to detect the format
- Type
str
- Returns
The file extension, including any suffix that precedes “.gz”.
- Return type
str
- schrodinger.utils.fileutils.get_file_format(filename)¶
- schrodinger.utils.fileutils.get_structure_file_format(filename: str) Optional[str] ¶
Return the format of a structure file, based on the filename extension. None is returned if the file extension is not recognized.
- Parameters
filename – Filename to detect format
- Returns
File format or None if not recognized
- schrodinger.utils.fileutils.get_sequence_file_format(filename: str) Optional[str] ¶
Return the format of a sequence file, based on the filename extension. None is returned if the file extension is not recognized.
- Parameters
filename – Filename to detect format
- Returns
File format or None if not recognized
- schrodinger.utils.fileutils.get_name_filter(name_mapping: Dict[str, List[str]]) List[str] ¶
Create filename filters for QFileDialog
- Parameters
name_mapping – Mapping between category name and list of file types (must be keys of EXTENSIONS)
- Returns
List of filename filters
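Qt file dialogs accept name filters of the form "Label (*.ext1 *.ext2)". The sketch below shows how such strings could be assembled from a mapping like the one described; the contents of EXTENSIONS shown here, the file-type keys, and the name name_filter_sketch are all hypothetical, since the real table is not reproduced on this page.

```python
# Hypothetical extension table; the module's real EXTENSIONS may differ.
EXTENSIONS = {
    "maestro": [".mae", ".maegz", ".mae.gz"],
    "sd": [".sdf", ".sd"],
}


def name_filter_sketch(name_mapping):
    """Build Qt-style name filters such as 'Structures (*.mae *.sdf)'."""
    filters = []
    for category, file_types in name_mapping.items():
        # Expand every file type into its glob patterns, in order.
        patterns = [f"*{ext}" for ft in file_types for ext in EXTENSIONS[ft]]
        filters.append(f"{category} ({' '.join(patterns)})")
    return filters


print(name_filter_sketch({"Structures": ["maestro", "sd"]}))
```

The resulting list is in the shape that QFileDialog.setNameFilters() expects.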
- schrodinger.utils.fileutils.is_pdb_file(filename: str) bool ¶
Returns whether the specified filename represents a PDB file.
- Parameters
filename – a filename
- Returns
Whether the file is a pdb file.
- schrodinger.utils.fileutils.is_maestro_file(filename: str) bool ¶
Returns True if specified filename represents a Maestro file.
- Parameters
filename – a filename
- Returns
Is this filename a maestro file?
- schrodinger.utils.fileutils.is_sd_file(filename: str) bool ¶
Returns True if specified filename represents an SD file.
- Parameters
filename – a filename
- Returns
Is this filename an SD file?
- schrodinger.utils.fileutils.is_csv_file(filename: str) bool ¶
Returns True if specified filename represents a CSV file.
- Parameters
filename – a filename
- Returns
Is this filename a csv file?
- schrodinger.utils.fileutils.is_smiles_file(filename: str) bool ¶
Returns True if specified filename represents a Smiles file.
- Parameters
filename – a filename
- Returns
Is this filename a smiles file?
- schrodinger.utils.fileutils.is_poseviewer_file(filename: str) bool ¶
Determines whether the filename follows Pose Viewer file naming conventions.
Effectively, this checks whether the file name ends with a ‘_pv’ or ‘_epv’ followed by a Maestro file extension (‘.mae’, ‘.mae.gz’, or ‘.maegz’). Roughly equivalent to the regular expression r’_e?pv.mae(.?gz)?$’.
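The naming rule can be tried out directly with the re module. In this sketch the dots are escaped so that they match literally, which is an assumption about the intended pattern; looks_like_poseviewer is a hypothetical name, and the real function is only described as roughly equivalent to the expression.

```python
import re

# '_pv' or '_epv', then '.mae', then optionally 'gz' or '.gz', at the end.
_PV_PATTERN = re.compile(r"_e?pv\.mae(\.?gz)?$")


def looks_like_poseviewer(filename):
    """Return True if the name ends like a Pose Viewer Maestro file."""
    return _PV_PATTERN.search(filename) is not None


for name in ("dock_pv.mae", "dock_epv.maegz", "dock_pv.mae.gz", "dock.mae"):
    print(name, looks_like_poseviewer(name))
```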
- schrodinger.utils.fileutils.is_epv_file(filename: str) bool ¶
Determines whether a filename follows Extended Pose Viewer (EPV) file naming conventions.
- schrodinger.utils.fileutils.split_ext_pv(filename: str)¶
Return stem and extension, while accounting for compression and ‘_pv’ or ‘_epv’ as part of the extension.
For example:
split_ext_pv('/path/to/foo_pv.mae.gz') # -> ('foo', '_pv.mae.gz')
- Returns
A tuple with the stem and extension. The extension portion will include ‘_pv’ or ‘_epv’, if present.
- Return type
tuple[str, str]
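A standard-library approximation of this split is sketched below. It follows the example above in dropping the directory from the stem; split_ext_pv_sketch is a hypothetical name, and the handling of names without a ‘_pv’ or ‘_epv’ marker is an assumption.

```python
import os


def split_ext_pv_sketch(filename):
    """Split into (stem, extension), folding '.gz' and '_pv'/'_epv' into
    the extension. Hypothetical stand-in; not the library function.
    """
    name = os.path.basename(filename)
    stem, ext = os.path.splitext(name)
    if ext == ".gz":
        # Fold the inner extension in, e.g. '.mae' + '.gz'.
        stem, inner = os.path.splitext(stem)
        ext = inner + ext
    for marker in ("_epv", "_pv"):
        if stem.endswith(marker):
            return stem[: -len(marker)], marker + ext
    return stem, ext


print(split_ext_pv_sketch("/path/to/foo_pv.mae.gz"))  # ('foo', '_pv.mae.gz')
```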
- schrodinger.utils.fileutils.is_cms_file(filename: str) bool ¶
Returns True if specified filename represents a CMS file.
- Parameters
filename – a filename
- Returns
Is this filename a CMS file?
- schrodinger.utils.fileutils.is_hypothesis_file(filename: str) bool ¶
Returns True if specified filename represents a Phase hypothesis file. The .phypo extension corresponds to a gzipped Maestro file containing a single ct which is a Phase hypothesis.
- Parameters
filename – a filename
- Returns
Is this filename a Phase hypothesis file?
- schrodinger.utils.fileutils.strip_extension(filename: str) str ¶
Return a new file path without extension. Suffixes such as “_pv” and “_epv” are also removed.
- schrodinger.utils.fileutils.get_basename(filename: str) str ¶
Returns the final component of specified path name minus the extension. Suffixes such as “_pv” and “_epv” are also stripped.
- schrodinger.utils.fileutils.is_gzipped_structure_file(filename: str) bool ¶
Returns True if the filename represents a file that is GZipped and it has a recognized structure extension.
- Parameters
filename – a filename
- Returns
Is this filename a gzipped structure file?
- schrodinger.utils.fileutils.is_valid_jobname(jobname: str) bool ¶
Returns True if specified job name is valid, does not contain any illegal characters, and does not start with “.”.
- schrodinger.utils.fileutils.get_jobname(filename: str) str ¶
Returns a job name derived from the specified filename. Same as get_basename(), except that illegal characters are removed.
- schrodinger.utils.fileutils.get_next_filename_prefix(path: str, midfix: str, zfill_width: int = 0) str ¶
Return next filename prefix in series <root><midfix><number>.
Given a path (absolute or relative) to a filename or filename prefix, return the next prefix in the sequence implied by path and midfix. For example, with a path of /full/path/to/foo.mae, path/to/foo.mae or foo.mae, or /full/path/to/foo, path/to/foo or foo, and a midfix of ‘-’, this function will return “foo-3” if any file whose prefix is foo-2 (and no higher-numbered foo-*) is present. It will return foo-1 if no file whose prefix is foo-<number> is present. The net effect is that any file-name extension in the path argument will be ignored.
This function differs from get_next_filename() in that here, all files sharing the prefix contained in the path are searched, regardless of extension, and the next filename prefix is returned.
The search is case sensitive or not depending on the semantics of the file system. The leading directory of the path, if any, is included in the return value.
Usage note: you might use this when the filename prefix could be exhibited by many files and you don’t want to overwrite any of them. For example, you are starting up a job which will create many files with the same prefix.
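The numbering rule can be sketched with glob and re. This is an illustration under assumptions: next_prefix_sketch is a hypothetical name, the extension is stripped with os.path.splitext(), and zfill_width is assumed to zero-pad the number.

```python
import glob
import os
import re


def next_prefix_sketch(path, midfix, zfill_width=0):
    """Return <root><midfix><n+1>, where n is the highest number in use.

    Hypothetical stand-in for illustration; not the library function.
    """
    root = os.path.splitext(path)[0]  # ignore any extension in 'path'
    base = os.path.basename(root)
    numbered = re.compile(re.escape(base + midfix) + r"(\d+)")
    highest = 0
    # Every file sharing the prefix counts, whatever its extension.
    for candidate in glob.glob(glob.escape(root + midfix) + "*"):
        match = numbered.match(os.path.basename(candidate))
        if match:
            highest = max(highest, int(match.group(1)))
    return root + midfix + str(highest + 1).zfill(zfill_width)
```

With foo-1.mae and foo-2.log present in the working directory, next_prefix_sketch("foo.mae", "-") yields "foo-3".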
- schrodinger.utils.fileutils.get_next_filename(path: str, midfix: str, zfill_width: int = 0)¶
Return next filename in series <root><midfix><number>.<ext>.
Given a path (absolute or relative) to a filename, return the next filename in the sequence implied by path and midfix. For example, with a path of /full/path/to/foo.mae, path/to/foo.mae or foo.mae and a midfix of ‘-’, this function will return “foo-3.mae” if file foo-2.mae (and no higher-numbered foo-*.mae) is present. It will return foo-1.mae if no file named foo-<number>.mae is present.
This function differs from get_next_filename_prefix() in that here, only files with the specified extension are searched, and the next full filename is returned.
The search is case sensitive or not depending on the semantics of the file system. The leading directory of the path, if any, is included in the return value.
Usage note: You might use this when you are expecting to update only a single file: the one whose filename is given in the path. For example, you are exporting structures to a .mae file and you want to pick a non-conflicting name based on a user’s filename specification.
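For comparison, a sketch of the extension-aware variant is given below. The name next_filename_sketch, the use of os.path.splitext() to find the extension, and the zero-padding meaning of zfill_width are assumptions.

```python
import os
import re


def next_filename_sketch(path, midfix, zfill_width=0):
    """Return <root><midfix><n+1><ext>, counting only files with <ext>.

    Hypothetical stand-in for illustration; not the library function.
    """
    directory, name = os.path.split(path)
    root, ext = os.path.splitext(name)
    numbered = re.compile(
        re.escape(root + midfix) + r"(\d+)" + re.escape(ext) + r"$")
    highest = 0
    for entry in os.listdir(directory or "."):
        match = numbered.match(entry)
        if match:
            highest = max(highest, int(match.group(1)))
    number = str(highest + 1).zfill(zfill_width)
    # The leading directory of 'path', if any, is kept in the result.
    return os.path.join(directory, root + midfix + number + ext)
```

Here a file named foo-7.log does not affect the result for foo.mae, because only names ending in .mae are considered.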
Return the path to the local $SCHRODINGER/mmshare-*/ directory
- Returns
Path to the “mmshare” directory.
Return the path of the local $SCHRODINGER/mmshare-*/data/ directory.
- Returns
Path to the “data” directory.
Return the path of the $SCHRODINGER/mmshare-*/python/scripts/ directory.
- Returns
Path to the “scripts” directory.
Return the path of the $SCHRODINGER/mmshare-*/python/common/ directory.
- Returns
Path to the “common” directory.
- schrodinger.utils.fileutils.get_docs_dir() str ¶
Return the path to the local $SCHRODINGER/docs/ directory
- Returns
Path to the “docs” directory.
- schrodinger.utils.fileutils.get_directory_path(which_directory) str ¶
Return the Schrodinger-specific directory identified by which_directory.
If an invalid which_directory is specified, a TypeError is raised.
Valid directories are:
HOME : To get user’s home dir
APPDATA : To get the Schrodinger application shared data dir
LOCAL_APPDATA : To get the Schrodinger application local data dir
USERDATA : To get user’s data dir
TEMP : To get default temporary data dir
DESKTOP : To get user’s desktop dir
DOCUMENTS : To get user’s ‘My Documents’ dir
NETWORK : To get user’s ‘My Network places’ dir (only for Windows)
- Return type
str
- Returns
Directory path
- schrodinger.utils.fileutils.get_directory(which_directory) -> (<class 'int'>, <class 'str'>)¶
- Deprecated
Because this function behaves in a non-standard way by returning an mmlib status, get_directory_path is preferred.
- schrodinger.utils.fileutils.get_home_dir() str ¶
- Deprecated
get_directory_path should be used instead.
- schrodinger.utils.fileutils.get_appdata_dir() str ¶
- Deprecated
get_directory_path should be used instead.
- schrodinger.utils.fileutils.get_local_appdata_dir() str ¶
- Deprecated
get_directory_path should be used instead.
- schrodinger.utils.fileutils.get_desktop_dir() str ¶
- Deprecated
get_directory_path should be used instead.
- schrodinger.utils.fileutils.get_mydocuments_dir() str ¶
- Deprecated
get_directory_path should be used instead.
- schrodinger.utils.fileutils.get_mynetworkplaces_dir() str ¶
- Deprecated
get_directory_path should be used instead.
- schrodinger.utils.fileutils.get_userdata_dir() str ¶
- Deprecated
get_directory_path should be used instead.
- schrodinger.utils.fileutils.get_schrodinger_temp_dir() str ¶
- Deprecated
get_directory_path should be used instead.
- schrodinger.utils.fileutils.locate_darwin_pymol() Optional[str] ¶
Return the path to PyMOL on a macOS system. Return None if no PyMOL installations are found.
- schrodinger.utils.fileutils.locate_pymol() Optional[str] ¶
Find the executable or script we use to launch PyMOL.
- Returns
The pymol launch command or None if PyMOL was not found
- schrodinger.utils.fileutils.get_pymol_cmd(use_x11: bool = False) List[str] ¶
Get a cmd list for launching PyMOL. This may include extra platform-specific arguments.
- Parameters
use_x11 – if True causes -m to be added to the launch command on Mac
- Returns
a cmd list with the executable as first element and any other options following it.
- class schrodinger.utils.fileutils.chdir(dirname: Union[pathlib.Path, str])¶
Bases:
object
A context manager that carries out commands inside of the specified directory and restores the current directory when done.
- __init__(dirname: Union[pathlib.Path, str])¶
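The pattern behind such a context manager is small enough to sketch with contextlib. The name chdir_sketch is hypothetical, and the real class may differ in detail; the point is that the previous directory is restored even if the body raises.

```python
import contextlib
import os


@contextlib.contextmanager
def chdir_sketch(dirname):
    """Run the with-block inside dirname, then restore the old directory."""
    previous = os.getcwd()
    os.chdir(dirname)
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions alike.
        os.chdir(previous)
```

Typical use is `with chdir_sketch(job_dir): ...`, where job_dir is any existing directory.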
- schrodinger.utils.fileutils.create_hard_link(source: str, link_name: str)¶
Create a hard link pointing to source named link_name.
On Windows, uses CreateHardLinkA() and will raise RuntimeError() on failure.
On other OSes uses os.link(), and will raise OSError on failure.
- schrodinger.utils.fileutils.mkdir_p(path: str, *mode)¶
- Deprecated
Use os.makedirs(path, exist_ok=True) instead.
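The standard-library replacement creates missing parent directories and tolerates a directory that already exists. A short demonstration, using a temporary directory chosen only for this example:

```python
import os
import tempfile

base = tempfile.mkdtemp()
nested = os.path.join(base, "a", "b", "c")

os.makedirs(nested, exist_ok=True)  # creates a, a/b and a/b/c
os.makedirs(nested, exist_ok=True)  # second call is a no-op, not an error

print(os.path.isdir(nested))
```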
- class schrodinger.utils.fileutils.tempfilename(prefix='tmp', suffix='', temp_dir=None)¶
Bases:
str
- remove()¶
- __contains__(key, /)¶
Return key in self.
- __len__()¶
Return len(self).
- capitalize()¶
Return a capitalized version of the string.
More specifically, make the first character have upper case and the rest lower case.
- casefold()¶
Return a version of the string suitable for caseless comparisons.
- center(width, fillchar=' ', /)¶
Return a centered string of length width.
Padding is done using the specified fill character (default is a space).
- count(sub[, start[, end]]) int ¶
Return the number of non-overlapping occurrences of substring sub in string S[start:end]. Optional arguments start and end are interpreted as in slice notation.
- encode(encoding='utf-8', errors='strict')¶
Encode the string using the codec registered for encoding.
- encoding
The encoding in which to encode the string.
- errors
The error handling scheme to use for encoding errors. The default is ‘strict’ meaning that encoding errors raise a UnicodeEncodeError. Other possible values are ‘ignore’, ‘replace’ and ‘xmlcharrefreplace’ as well as any other name registered with codecs.register_error that can handle UnicodeEncodeErrors.
- endswith(suffix[, start[, end]]) bool ¶
Return True if S ends with the specified suffix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. suffix can also be a tuple of strings to try.
- expandtabs(tabsize=8)¶
Return a copy where all tab characters are expanded using spaces.
If tabsize is not given, a tab size of 8 characters is assumed.
- find(sub[, start[, end]]) int ¶
Return the lowest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.
Return -1 on failure.
- format(*args, **kwargs) str ¶
Return a formatted version of S, using substitutions from args and kwargs. The substitutions are identified by braces (‘{’ and ‘}’).
- format_map(mapping) str ¶
Return a formatted version of S, using substitutions from mapping. The substitutions are identified by braces (‘{’ and ‘}’).
- index(sub[, start[, end]]) int ¶
Return the lowest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.
Raises ValueError when the substring is not found.
- isalnum()¶
Return True if the string is an alpha-numeric string, False otherwise.
A string is alpha-numeric if all characters in the string are alpha-numeric and there is at least one character in the string.
- isalpha()¶
Return True if the string is an alphabetic string, False otherwise.
A string is alphabetic if all characters in the string are alphabetic and there is at least one character in the string.
- isascii()¶
Return True if all characters in the string are ASCII, False otherwise.
ASCII characters have code points in the range U+0000-U+007F. Empty string is ASCII too.
- isdecimal()¶
Return True if the string is a decimal string, False otherwise.
A string is a decimal string if all characters in the string are decimal and there is at least one character in the string.
- isdigit()¶
Return True if the string is a digit string, False otherwise.
A string is a digit string if all characters in the string are digits and there is at least one character in the string.
- isidentifier()¶
Return True if the string is a valid Python identifier, False otherwise.
Call keyword.iskeyword(s) to test whether string s is a reserved identifier, such as “def” or “class”.
- islower()¶
Return True if the string is a lowercase string, False otherwise.
A string is lowercase if all cased characters in the string are lowercase and there is at least one cased character in the string.
- isnumeric()¶
Return True if the string is a numeric string, False otherwise.
A string is numeric if all characters in the string are numeric and there is at least one character in the string.
- isprintable()¶
Return True if the string is printable, False otherwise.
A string is printable if all of its characters are considered printable in repr() or if it is empty.
- isspace()¶
Return True if the string is a whitespace string, False otherwise.
A string is whitespace if all characters in the string are whitespace and there is at least one character in the string.
- istitle()¶
Return True if the string is a title-cased string, False otherwise.
In a title-cased string, upper- and title-case characters may only follow uncased characters and lowercase characters only cased ones.
- isupper()¶
Return True if the string is an uppercase string, False otherwise.
A string is uppercase if all cased characters in the string are uppercase and there is at least one cased character in the string.
- join(iterable, /)¶
Concatenate any number of strings.
The string whose method is called is inserted in between each given string. The result is returned as a new string.
Example: '.'.join(['ab', 'pq', 'rs']) -> 'ab.pq.rs'
- ljust(width, fillchar=' ', /)¶
Return a left-justified string of length width.
Padding is done using the specified fill character (default is a space).
- lower()¶
Return a copy of the string converted to lowercase.
- lstrip(chars=None, /)¶
Return a copy of the string with leading whitespace removed.
If chars is given and not None, remove characters in chars instead.
- static maketrans()¶
Return a translation table usable for str.translate().
If there is only one argument, it must be a dictionary mapping Unicode ordinals (integers) or characters to Unicode ordinals, strings or None. Character keys will be then converted to ordinals. If there are two arguments, they must be strings of equal length, and in the resulting dictionary, each character in x will be mapped to the character at the same position in y. If there is a third argument, it must be a string, whose characters will be mapped to None in the result.
- partition(sep, /)¶
Partition the string into three parts using the given separator.
This will search for the separator in the string. If the separator is found, returns a 3-tuple containing the part before the separator, the separator itself, and the part after it.
If the separator is not found, returns a 3-tuple containing the original string and two empty strings.
- replace(old, new, count=-1, /)¶
Return a copy with all occurrences of substring old replaced by new.
- count
Maximum number of occurrences to replace. -1 (the default value) means replace all occurrences.
If the optional argument count is given, only the first count occurrences are replaced.
- rfind(sub[, start[, end]]) int ¶
Return the highest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.
Return -1 on failure.
- rindex(sub[, start[, end]]) int ¶
Return the highest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.
Raises ValueError when the substring is not found.
- rjust(width, fillchar=' ', /)¶
Return a right-justified string of length width.
Padding is done using the specified fill character (default is a space).
- rpartition(sep, /)¶
Partition the string into three parts using the given separator.
This will search for the separator in the string, starting at the end. If the separator is found, returns a 3-tuple containing the part before the separator, the separator itself, and the part after it.
If the separator is not found, returns a 3-tuple containing two empty strings and the original string.
- rsplit(sep=None, maxsplit=-1)¶
Return a list of the words in the string, using sep as the delimiter string.
- sep
The delimiter according to which to split the string. None (the default value) means split according to any whitespace, and discard empty strings from the result.
- maxsplit
Maximum number of splits to do. -1 (the default value) means no limit.
Splits are done starting at the end of the string and working to the front.
- rstrip(chars=None, /)¶
Return a copy of the string with trailing whitespace removed.
If chars is given and not None, remove characters in chars instead.
- split(sep=None, maxsplit=-1)¶
Return a list of the words in the string, using sep as the delimiter string.
- sep
The delimiter according to which to split the string. None (the default value) means split according to any whitespace, and discard empty strings from the result.
- maxsplit
Maximum number of splits to do. -1 (the default value) means no limit.
- splitlines(keepends=False)¶
Return a list of the lines in the string, breaking at line boundaries.
Line breaks are not included in the resulting list unless keepends is given and true.
- startswith(prefix[, start[, end]]) bool ¶
Return True if S starts with the specified prefix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. prefix can also be a tuple of strings to try.
- strip(chars=None, /)¶
Return a copy of the string with leading and trailing whitespace removed.
If chars is given and not None, remove characters in chars instead.
- swapcase()¶
Convert uppercase characters to lowercase and lowercase characters to uppercase.
- title()¶
Return a version of the string where each word is titlecased.
More specifically, words start with uppercased characters and all remaining cased characters have lower case.
- translate(table, /)¶
Replace each character in the string using the given translation table.
- table
Translation table, which must be a mapping of Unicode ordinals to Unicode ordinals, strings, or None.
The table must implement lookup/indexing via __getitem__, for instance a dictionary or list. If this operation raises LookupError, the character is left untouched. Characters mapped to None are deleted.
- upper()¶
Return a copy of the string converted to uppercase.
- zfill(width, /)¶
Pad a numeric string with zeros on the left, to fill a field of the given width.
The string is never truncated.
- class schrodinger.utils.fileutils.TempStructureFile(sts)¶
Bases:
schrodinger.utils.fileutils.tempfilename
- remove()¶
- replace(old, new, count=- 1, /)¶
Return a copy with all occurrences of substring old replaced by new.
- count
Maximum number of occurrences to replace. -1 (the default value) means replace all occurrences.
If the optional argument count is given, only the first count occurrences are replaced.
- rfind(sub[, start[, end]]) int ¶
Return the highest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.
Return -1 on failure.
- rindex(sub[, start[, end]]) int ¶
Return the highest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.
Raises ValueError when the substring is not found.
- rjust(width, fillchar=' ', /)¶
Return a right-justified string of length width.
Padding is done using the specified fill character (default is a space).
- rpartition(sep, /)¶
Partition the string into three parts using the given separator.
This will search for the separator in the string, starting at the end. If the separator is found, returns a 3-tuple containing the part before the separator, the separator itself, and the part after it.
If the separator is not found, returns a 3-tuple containing two empty strings and the original string.
- rsplit(sep=None, maxsplit=- 1)¶
Return a list of the words in the string, using sep as the delimiter string.
- sep
The delimiter according which to split the string. None (the default value) means split according to any whitespace, and discard empty strings from the result.
- maxsplit
Maximum number of splits to do. -1 (the default value) means no limit.
Splits are done starting at the end of the string and working to the front.
- rstrip(chars=None, /)¶
Return a copy of the string with trailing whitespace removed.
If chars is given and not None, remove characters in chars instead.
- split(sep=None, maxsplit=-1)¶
Return a list of the words in the string, using sep as the delimiter string.
- sep
The delimiter according to which the string is split. None (the default value) means split according to any whitespace, and discard empty strings from the result.
- maxsplit
Maximum number of splits to do. -1 (the default value) means no limit.
- splitlines(keepends=False)¶
Return a list of the lines in the string, breaking at line boundaries.
Line breaks are not included in the resulting list unless keepends is given and true.
- startswith(prefix[, start[, end]]) bool ¶
Return True if S starts with the specified prefix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. prefix can also be a tuple of strings to try.
- strip(chars=None, /)¶
Return a copy of the string with leading and trailing whitespace removed.
If chars is given and not None, remove characters in chars instead.
- swapcase()¶
Convert uppercase characters to lowercase and lowercase characters to uppercase.
- title()¶
Return a version of the string where each word is titlecased.
More specifically, words start with uppercased characters and all remaining cased characters have lower case.
- translate(table, /)¶
Replace each character in the string using the given translation table.
- table
Translation table, which must be a mapping of Unicode ordinals to Unicode ordinals, strings, or None.
The table must implement lookup/indexing via __getitem__, for instance a dictionary or list. If this operation raises LookupError, the character is left untouched. Characters mapped to None are deleted.
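A small example on a plain str may help here. The table below is a hypothetical dictionary showing the three kinds of mapping values described above:

```python
# Hypothetical translation table keyed by Unicode ordinals.
table = {
    ord("a"): "A",       # map to a string
    ord("b"): None,      # delete the character
    ord("c"): ord("C"),  # map to another ordinal
}

# Characters without an entry ("-" and "d") raise LookupError in the
# table lookup and are therefore left untouched.
result = "abcabc-d".translate(table)  # 'ACAC-d'
```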
- upper()¶
Return a copy of the string converted to uppercase.
- zfill(width, /)¶
Pad a numeric string with zeros on the left, to fill a field of the given width.
The string is never truncated.
- schrodinger.utils.fileutils.cat(source_filenames: List[str], dest_filename: str)¶
Concatenate the contents of the source files, writing them to a destination file. All files are specified by name. If source_filenames is an empty list, an empty file is produced.
- Parameters
source_filenames – input files
dest_filename – destination file
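The behavior described above can be approximated with the standard library alone. This is a minimal sketch, not the module's implementation; the name cat_sketch is hypothetical.

```python
import shutil
from typing import List


def cat_sketch(source_filenames: List[str], dest_filename: str) -> None:
    """Concatenate the source files into dest_filename (illustrative sketch)."""
    # The destination is opened before the loop, so an empty source list
    # still produces an empty file.
    with open(dest_filename, "wb") as dest:
        for name in source_filenames:
            with open(name, "rb") as src:
                # Stream in chunks so large files are not loaded into memory.
                shutil.copyfileobj(src, dest)
```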
- schrodinger.utils.fileutils.cat_flat_files(source_filenames: Iterable[str], dest_filename: str)¶
Combine multiple flat files (such as CSVs) into one large file.
Expects each source file to contain a header line and will write the header from the first source file into the destination file. Any file specified may be compressed.
- Parameters
source_filenames – A list of paths to source files.
dest_filename – The name of the destination file.
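The header handling can be sketched as follows. This is an assumption-laden illustration rather than the module's code: the names are hypothetical, and compression is detected only by a ".gz" suffix.

```python
import gzip
from typing import Iterable


def _open_text(filename: str, mode: str):
    # Assumption: a ".gz" suffix marks a gzip-compressed file.
    if filename.endswith(".gz"):
        return gzip.open(filename, mode + "t")
    return open(filename, mode)


def cat_flat_files_sketch(source_filenames: Iterable[str], dest_filename: str) -> None:
    """Merge flat files, keeping only the header of the first source file."""
    header_written = False
    with _open_text(dest_filename, "w") as dest:
        for name in source_filenames:
            with _open_text(name, "r") as src:
                header = src.readline()  # every source starts with a header
                if not header_written:
                    dest.write(header)
                    header_written = True
                for line in src:
                    dest.write(line)
```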
- schrodinger.utils.fileutils.tar_files(tarname: str, mode: str, files: List[str])¶
Writes files to tar archive.
- Parameters
tarname – Tar file name.
mode – File open mode.
files – Iterable over file names to be added to the archive.
- schrodinger.utils.fileutils.zip_files(zipname: str, mode: str, files: List[str])¶
Writes files to zip archive.
- Parameters
zipname – Zip file name.
mode – File open mode.
files – Iterable over file names to be added to the archive.
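For orientation, the two archive helpers can be approximated with the standard tarfile and zipfile modules. The sketch below assumes that mode is passed straight through to tarfile.open and zipfile.ZipFile (for example "w:gz" for a tar archive or "w" for a zip archive); the function names are hypothetical.

```python
import tarfile
import zipfile
from typing import List


def tar_files_sketch(tarname: str, mode: str, files: List[str]) -> None:
    # Assumption: mode is a tarfile.open mode such as "w" or "w:gz".
    with tarfile.open(tarname, mode) as archive:
        for name in files:
            archive.add(name)


def zip_files_sketch(zipname: str, mode: str, files: List[str]) -> None:
    # Assumption: mode is a zipfile.ZipFile mode such as "w" or "a".
    with zipfile.ZipFile(zipname, mode) as archive:
        for name in files:
            archive.write(name)
```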
- schrodinger.utils.fileutils.is_within_directory(directory, afile)¶
- schrodinger.utils.fileutils.safe_extractall_tar(tar, path='.', *args, **kwargs)¶
Extract all files from a tar file. See Python vulnerability CVE-2007-4559 for details on the issue with the tar.extractall() method. See the tar.extractall method description for details on args and kwargs.
- Parameters
tar (tarfile.TarFile) – TarFile object
path (str) – path of directory where tarfile will be extracted
- schrodinger.utils.fileutils.safe_extractall_zip(zip_file, path='.', *args, **kwargs)¶
Extract all files from a zip file. See Python vulnerability CVE-2007-4559 for details on the issue with the zip.extractall() method. See the zip.extractall method description for details on args and kwargs.
- Parameters
zip_file (zipfile.ZipFile) – ZipFile object
path (str) – path of directory where the zip file will be extracted
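The usual mitigation for CVE-2007-4559 is to check every archive member against the destination directory before extracting. The sketch below shows that pattern for tar files; it is an assumption about the approach, not the module's code, and the names are hypothetical. It does not inspect symlink targets.

```python
import os
import tarfile


def is_within_directory_sketch(directory: str, afile: str) -> bool:
    """Return True if afile resolves to a location inside directory."""
    abs_directory = os.path.abspath(directory)
    abs_file = os.path.abspath(afile)
    return os.path.commonpath([abs_directory, abs_file]) == abs_directory


def safe_extractall_tar_sketch(tar: tarfile.TarFile, path: str = ".", *args, **kwargs) -> None:
    # Refuse the whole archive if any member would land outside of path,
    # for example a member named "../evil.txt".
    for member in tar.getmembers():
        target = os.path.join(path, member.name)
        if not is_within_directory_sketch(path, target):
            raise RuntimeError(f"Blocked path traversal in tar member: {member.name}")
    tar.extractall(path, *args, **kwargs)
```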
- schrodinger.utils.fileutils.on_same_drive_letter(path_a: str, path_b: str) bool ¶
Returns True if path_a and path_b are on the same drive letter. On systems without drive letters, always returns True.
- schrodinger.utils.fileutils.get_files_from_folder(folder_abs_path: str) List[Tuple[str, str]] ¶
Walk through a folder, find all files inside it.
- Parameters
folder_abs_path – folder path
- Returns
each tuple contains: absolute path of a file, and a relative path that the file will be transferred to.
- schrodinger.utils.fileutils.change_working_directory(folder: Union[pathlib.Path, str])¶
A context manager to temporarily change the working directory to folder.
- Parameters
folder – the folder that becomes the working directory
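A context manager of this kind is commonly written with contextlib. The following is a minimal sketch under that assumption (the name is hypothetical), showing how the previous working directory is restored even when the block raises:

```python
import contextlib
import os
from pathlib import Path
from typing import Union


@contextlib.contextmanager
def change_working_directory_sketch(folder: Union[Path, str]):
    previous = os.getcwd()
    os.chdir(folder)
    try:
        yield
    finally:
        # Always return to the original directory, even after an exception.
        os.chdir(previous)
```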
- schrodinger.utils.fileutils.in_temporary_directory()¶
A context manager for executing a block of code in a temporary directory.
- schrodinger.utils.fileutils.mmfile_path(path: Optional[str] = None)¶
Context manager and decorator that resets the mmfile search path on exit. If the optional path is supplied, it is set on entry.
- Parameters
path – mmfile path to set while in the context
- schrodinger.utils.fileutils.count_lines(filename: str) int ¶
Count the number of newlines in a file, in a way similar to “wc -l”.
- Parameters
filename – input filename
- Returns
number of newlines in file
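Counting newlines the way "wc -l" does can be sketched by reading the file in binary chunks. The function name and the chunk_size parameter below are hypothetical:

```python
def count_lines_sketch(filename: str, chunk_size: int = 1 << 20) -> int:
    """Count newline bytes in a file without loading it all into memory."""
    count = 0
    with open(filename, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            count += chunk.count(b"\n")
    return count
```

As with "wc -l", a final line that lacks a trailing newline is not counted.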
- schrodinger.utils.fileutils.get_directory_size(dirpath)¶
Get the size of the given directory in MB
(Note: MB => 1e6 bytes)
- Parameters
dirpath (str) – The path to the directory
- Return type
float
- Returns
The size of the directory in MB
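One way to compute such a size is to walk the tree and sum file sizes, dividing by 1e6 as noted above. This sketch is illustrative only; skipping symbolic links is an assumption, and the name is hypothetical.

```python
import os


def get_directory_size_sketch(dirpath: str) -> float:
    """Return the total size of all files under dirpath in MB (1e6 bytes)."""
    total_bytes = 0
    for root, _dirs, files in os.walk(dirpath):
        for name in files:
            path = os.path.join(root, name)
            # Assumption: symbolic links are not counted.
            if not os.path.islink(path):
                total_bytes += os.path.getsize(path)
    return total_bytes / 1e6
```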
- schrodinger.utils.fileutils.get_existing_filepath(path_file: str) Optional[str] ¶
Look for the file at the given path, in the current working directory, or in the original launch directory. The first path found is returned.
This can be useful when the file has been copied from path_file to the CWD, such as when launchapi copies a file from an absolute path on the local machine into the job directory on a remote machine.
This can also be useful when large files (e.g. trajectory files) are not copied from path_file to the job launch dir for localhost jobs. The job in the current launch dir can access the files in the original launch dir.
- Returns
None if the file cannot be located
- schrodinger.utils.fileutils.xyz_to_sdf(xyz_filepath: str, out_sdf: Optional[str] = None, save_file: bool = True) str ¶
Convert an XYZ format file to an SDF file.
- Parameters
xyz_filepath – filename with path
out_sdf – the output sdf filename. If None, the name is set automatically based on the input filename.
save_file – If False, the output information is written to stdout instead of a file.
- Returns
the output sdf filename
- Raises
ValueError – input file is of wrong extension
RuntimeError – failed to convert the xyz file
- schrodinger.utils.fileutils.open_maybe_compressed(filename: str, *a, **d) io.IOBase ¶
Open a file, using the gzip module if the filename ends in gz, or the builtin open otherwise. All arguments are passed through.
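The dispatch described above can be sketched as follows (hypothetical name, standard library only). Note that gzip.open defaults to binary mode "rb" while the builtin open defaults to text mode, so passing the mode explicitly is the safer habit.

```python
import gzip


def open_maybe_compressed_sketch(filename: str, *a, **d):
    # Choose the opener from the filename suffix; all other arguments
    # are passed through unchanged.
    if filename.endswith("gz"):
        return gzip.open(filename, *a, **d)
    return open(filename, *a, **d)
```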
- schrodinger.utils.fileutils.get_csv_file_column_count(csv_file: str) int ¶
Return the number of columns in the CSV file.
- Parameters
csv_file – CSV file path.
- Returns
Number of columns in the CSV file.
- schrodinger.utils.fileutils.hash_for_file(path, algorithm=<built-in function openssl_md5>, buff_size=8388608)¶
Get file hash.
- Parameters
path (str) – File path
algorithm (method) – Algorithm to use
buff_size (int) – Buffer size
- Return type
str
- Returns
File hash
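Hashing a file in fixed-size chunks is typically done as in the sketch below. Returning the hexadecimal digest is an assumption based on the str return type; the function name is hypothetical.

```python
import hashlib


def hash_for_file_sketch(path: str, algorithm=hashlib.md5, buff_size: int = 8 * 1024 * 1024) -> str:
    """Hash a file incrementally, reading buff_size bytes at a time."""
    hasher = algorithm()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(buff_size), b""):
            hasher.update(chunk)
    # Assumption: the hash is returned as a hexadecimal string.
    return hasher.hexdigest()
```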
- schrodinger.utils.fileutils.extended_windows_path(dos_path, only_if_required=True)¶
Convert path to absolute path and prepend extended path tag to paths on Windows
- Parameters
dos_path (str) – a Windows file path, which may be longer than 256 characters and therefore invalid
only_if_required (bool) – If True, prepend the Windows extended path tag only to file paths that exceed WINDOWS_MAX_PATH in length.
- Return type
string
- Returns
A Windows extended file path which can accommodate 30000+ characters
- schrodinger.utils.fileutils.slugify(text)¶
Slugifies a filename for use in a URL or file name.
Based on the Django implementation. (https://github.com/django/django/blob/dcebc5da4831d2982b26d00a9480ad538b5c5acf/django/utils/text.py#L400)
- Parameters
text (str) – Text to slugify
- Returns
Slugified text
- Return type
str
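For readers unfamiliar with the Django approach, the general technique is to normalize to ASCII, drop characters that are not word characters, whitespace, or hyphens, and collapse runs of whitespace and hyphens into single hyphens. The sketch below follows that outline; it is not necessarily identical to this function or to the linked Django revision, and the name is hypothetical.

```python
import re
import unicodedata


def slugify_sketch(text: str) -> str:
    # Decompose accented characters, then drop anything non-ASCII.
    text = unicodedata.normalize("NFKD", str(text))
    text = text.encode("ascii", "ignore").decode("ascii")
    # Remove characters that are not word characters, whitespace, or hyphens.
    text = re.sub(r"[^\w\s-]", "", text).strip().lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", text)
```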
- schrodinger.utils.fileutils.is_subpath(path, parent_dir, strict=False)¶
Returns whether the specified path is a subdir of the specified parent directory.
- Parameters
strict (bool) – if False, the parent_dir is considered a subpath of itself. Set to True so that only actual subpaths qualify.
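The effect of strict can be illustrated with a small pathlib-based sketch. This is one possible way to express the check, not the module's implementation, and the name is hypothetical.

```python
from pathlib import Path


def is_subpath_sketch(path, parent_dir, strict=False) -> bool:
    path = Path(path).resolve()
    parent = Path(parent_dir).resolve()
    if path == parent:
        # A directory counts as its own subpath only in non-strict mode.
        return not strict
    return parent in path.parents
```

Comparing resolved Path objects rather than raw strings avoids treating "/data/run_other" as a subpath of "/data/run".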
- schrodinger.utils.fileutils.split_file_round_robin(infile, outfiles, has_header)¶
Splits a large file into smaller files by distributing the input lines across the output files in round-robin order, so that with n output files every n-th input line goes to the same output file. To be used with flat data files such as CSV or SMI files.
- Parameters
infile (str) – The input file to be split.
outfiles (list[str]) – The files to be written to. Will overwrite any file that already exists.
has_header (bool) – Whether the input file has a header line.
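Round-robin splitting can be sketched as below. Copying the header line into every output file is an assumption made for illustration, as is the function name.

```python
import contextlib
from typing import List


def split_file_round_robin_sketch(infile: str, outfiles: List[str], has_header: bool) -> None:
    with contextlib.ExitStack() as stack:
        src = stack.enter_context(open(infile))
        # Opening with "w" overwrites any existing output file.
        dests = [stack.enter_context(open(name, "w")) for name in outfiles]
        if has_header:
            header = src.readline()
            # Assumption: each output file receives a copy of the header.
            for dest in dests:
                dest.write(header)
        for i, line in enumerate(src):
            # Data line i goes to output file i modulo the number of outputs.
            dests[i % len(dests)].write(line)
```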
- class schrodinger.utils.fileutils.MultiFileReader(files, *, have_header=True)¶
Bases:
object
An iterator context manager to read in a collection of flat files. Files are logically concatenated so that all records are treated as one large file. Supports a mixture of compressed and uncompressed files.
- Variables
header (Optional[str]) – If ‘have_header’ is set, this will contain the first line of the last file that was read in. I.e., the header line.
- __init__(files, *, have_header=True)¶
- Parameters
files (Iterable[str]) – The files to be sampled.
have_header (bool) – Whether the files have a header. If so, the header will be stored as a member variable ‘header’.
- close()¶
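To make the "logically concatenated" behavior concrete, here is a compact sketch of an iterator context manager with similar semantics. It is a hypothetical stand-in written against the standard library; detecting compression by a ".gz" suffix is an assumption.

```python
import gzip


class MultiFileReaderSketch:
    def __init__(self, files, *, have_header=True):
        self._files = list(files)
        self._have_header = have_header
        self.header = None
        self._lines = self._iter_lines()

    def _iter_lines(self):
        for name in self._files:
            # Assumption: a ".gz" suffix marks a compressed file.
            opener = gzip.open if name.endswith(".gz") else open
            with opener(name, "rt") as fh:
                if self._have_header:
                    # Remember the header of the most recently opened file.
                    self.header = fh.readline()
                yield from fh

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._lines)

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()

    def close(self):
        # Closing the generator also closes whichever file is currently open.
        self._lines.close()
```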
- schrodinger.utils.fileutils.touch(path)¶
Touch a path.
- schrodinger.utils.fileutils.gzip_file(infile, outfile, remove_original=False)¶
Creates a new gzip compressed file from the input file.
- Parameters
infile (Path | str) – The input file to be compressed.
outfile (Path | str) – The destination file.
remove_original (bool) – Whether to delete the original file after compression is completed.
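A standard-library approximation of this helper is shown below. The name is hypothetical, and the sketch removes the original only after the compressed copy has been fully written.

```python
import gzip
import os
import shutil


def gzip_file_sketch(infile, outfile, remove_original=False) -> None:
    # Stream the input into a gzip-compressed output file.
    with open(infile, "rb") as src, gzip.open(outfile, "wb") as dest:
        shutil.copyfileobj(src, dest)
    if remove_original:
        os.remove(infile)
```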