schrodinger.application.scaffold_enumeration.cxsmiles module
Functions to parse “repeating units” and “position variant bonds” from CX SMILES “features” text. The parsers are deliberately simple: they are probably good enough for machine-generated CX SMILES, but may not cope with arbitrary hand-written input.
- class schrodinger.application.scaffold_enumeration.cxsmiles.MCG(atoms, center)
Bases: tuple
- atoms
List of atom indices ([int]).
- center
Central atom index (int).
- class schrodinger.application.scaffold_enumeration.cxsmiles.SRU(atoms, subscript, superscript)
Bases: tuple
- atoms
List of atom indices ([int]).
- subscript
SRU’s subscript (str).
- superscript
SRU’s superscript (str).
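Both classes derive from tuple and expose named fields, so they behave like named tuples. The sketch below uses stand-in definitions built with `collections.namedtuple` (an assumption made so the example runs without the Schrödinger package; the real classes may be defined differently), filled with values taken from the examples quoted further down.

```python
from collections import namedtuple

# Stand-ins for the documented classes (assumed equivalent for illustration).
MCG = namedtuple("MCG", ["atoms", "center"])
SRU = namedtuple("SRU", ["atoms", "subscript", "superscript"])

# "m:0:7.6.5.4.3" -> multi-center atom 0, group atoms 7, 6, 5, 4, 3
mcg = MCG(atoms=[7, 6, 5, 4, 3], center=0)

# "Sg:n:0,1,2:3-6:eu" -> atoms 0, 1, 2; subscript "3-6"; superscript "eu"
sru = SRU(atoms=[0, 1, 2], subscript="3-6", superscript="eu")

# Fields are reachable by name or by position, like any tuple.
center_by_name = mcg.center
center_by_index = mcg[1]
atoms, subscript, superscript = sru
```

Tuple semantics make the records cheap to unpack and compare, although the atoms list inside them remains mutable.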
- schrodinger.application.scaffold_enumeration.cxsmiles.parse_mcg(text, pos, accum)
Parses “multi-center SGroup” data from CX SMILES “features”.
<quote>
The multicenter atom indexes written after “m:” followed by a colon character and the indexes of the atoms which forms the given SGroup separated by “.”. The SGroups are separated by commas.
Example: “m:0:7.6.5.4.3,2:12.11.10.9.8,C:0.0,2.1”
</quote>
- Parameters
text (str) – CX SMILES “features” string.
pos (int) – Index of the character in text right after “m:”.
accum (list) – List to which the “SGroups” are to be appended.
- Returns
Index of the first unconsumed character in text.
- Return type
int
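The (text, pos, accum) calling convention can be illustrated with a minimal sketch. `parse_mcg_sketch` and the stand-in `MCG` tuple are hypothetical names introduced here, not the module's implementation; the sketch assumes each group has the form `center:a.b.c` and stops at the first comma that is not followed by another such group.

```python
import re
from collections import namedtuple

# Stand-in for the documented MCG class (assumption for illustration).
MCG = namedtuple("MCG", ["atoms", "center"])

# One group: "<center>:<atom>.<atom>..." made of decimal indices only.
_GROUP = re.compile(r"(\d+):(\d+(?:\.\d+)*)")


def parse_mcg_sketch(text, pos, accum):
    """Append MCG records found at `pos`; return the first unconsumed index."""
    while True:
        match = _GROUP.match(text, pos)
        if match is None:
            break
        center = int(match.group(1))
        atoms = [int(i) for i in match.group(2).split(".")]
        accum.append(MCG(atoms, center))
        pos = match.end()
        # Consume the comma only when another group follows it; otherwise
        # the comma belongs to the next feature and is left unconsumed.
        if text.startswith(",", pos) and _GROUP.match(text, pos + 1):
            pos += 1
        else:
            break
    return pos


features = "m:0:7.6.5.4.3,2:12.11.10.9.8,C:0.0,2.1"
groups = []
end = parse_mcg_sketch(features, 2, groups)  # 2 == index right after "m:"
rest = features[end:]  # the part a caller would parse next
```

Returning the index instead of a substring lets a caller keep scanning the same features string for the next feature without copying it.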
- schrodinger.application.scaffold_enumeration.cxsmiles.parse_sru(text, pos, accum)
Parses “SRU” data from CX SMILES “features”.
<quote>
Polymer Sgroups
Each Sgroup exported after “Sg:” in fields separated by a colon. Fields are:
1. Sgroup type keyword. Valid keywords are:
Keyword | Sgroup Type
n | SRU
… | …
2. Atom indexes separated with commas.
3. Subscript of the Sgroup. If the supscript equals the keyword of the Sgroup this field can be empty. Escaped field.
4. Superscript of the Sgroup. In the superscript only connectivity and flip information is allowed. This field can be empty. Escaped field.
5. Head crossing bond indexes. The indexes of bonds that share a common bracket in case of ladder-type polymers. This field can be empty.
6. Tail crossing bond indexes. The indexes of bonds that share a common bracket in case of ladder-type polymers. This field can be empty.
7. If the c export option is present then bracket orientation, bracket type followed by the coordinates (4 pair, separated with commas). Bracket orientation can be s or d (single or double), bracket type can be b,c,r,s for braces, chevrons, round and square, respectively. The brackets are written between parentheses and separated with semicolons.
A colon is needed after the last non-empty field.
If one needs to retain not only the chemically relevant information, but the whole structure (as drawn), then the c export option should be used.
Examples:
CCCC |Sg:gen:0,1,2:|
CCCC |Sg:n:0,1,2:3-6:eu|
*CC(*)C(*)N* |Sg:n:6,1,2,4::hh,f:6,0,:4,2,|
</quote>
In addition:
<quote>
Escaping
In some places special characters are escaped to ‘&#code’ where code is the ASCII code of the special character.
Not escaped characters in fields of Sgroups and DataSgroups: ‘a’-‘z’, ‘A’-‘Z’, ‘0’-‘9’ and ‘><”!@#$%()[]./?-+*^_~=’ and the space character.
Not escaped characters in atom property keys and values: ‘a’-‘z’, ‘A’-‘Z’, ‘0’-‘9’ and ‘><”!@#$%()[]./?-+*^_~=’ and the space character.
Not escaped characters in atom labels and atom values: ‘a’-‘z’, ‘A’-‘Z’, ‘0’-‘9’ and ‘><”!@#%()[]./?-+*^_~=,:’ and the space character.
</quote>
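The escaping rule quoted above could be undone along the following lines. This is a sketch only: `unescape_field` is a hypothetical helper, and treating a trailing semicolon after the numeric code as optional is an assumption, since the quoted text shows only the ‘&#code’ form.

```python
import re

# "&#" followed by a decimal character code; the optional ";" is an assumption.
_ESCAPE = re.compile(r"&#(\d+);?")


def unescape_field(field):
    """Replace each escape sequence with the character it encodes."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1))), field)


comma = unescape_field("a&#44;b")   # 44 is the ASCII code of ","
colon = unescape_field("x&#58;y")   # 58 is the ASCII code of ":"
plain = unescape_field("3-6")       # nothing to unescape
```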
- This subroutine recognizes only:
atoms (2), subscript (3), and superscript (4).
- Parameters
text (str) – CX SMILES “features” string.
pos (int) – Index of the character in text right after “Sg:n:”.
accum (list) – List to which the “SGroups” are to be appended.
- Returns
Index of the first unconsumed character in text.
- Return type
int
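A minimal sketch of reading the three recognized fields is given below. `parse_sru_sketch` and the stand-in `SRU` tuple are hypothetical and not the module's implementation. The sketch assumes that the subscript and superscript each run up to the next colon (or the end of the text), does not undo escaping, and does not try to detect where a following feature begins.

```python
import re
from collections import namedtuple

# Stand-in for the documented SRU class (assumption for illustration).
SRU = namedtuple("SRU", ["atoms", "subscript", "superscript"])

# Field 2: comma-separated decimal atom indices.
_ATOMS = re.compile(r"\d+(?:,\d+)*")


def parse_sru_sketch(text, pos, accum):
    """Append one SRU read at `pos`; return the first unconsumed index."""
    match = _ATOMS.match(text, pos)
    if match is None:
        return pos  # no atom list: consume nothing
    atoms = [int(i) for i in match.group().split(",")]
    pos = match.end()

    fields = []
    for _ in range(2):  # field 3 (subscript), then field 4 (superscript)
        if text.startswith(":", pos):
            end = text.find(":", pos + 1)
            if end == -1:
                end = len(text)
            fields.append(text[pos + 1:end])
            pos = end
        else:
            fields.append("")  # field absent

    accum.append(SRU(atoms, fields[0], fields[1]))
    return pos


found = []
simple = "Sg:n:0,1,2:3-6:eu"
simple_end = parse_sru_sketch(simple, 5, found)  # 5 == index right after "Sg:n:"

ladder = "Sg:n:6,1,2,4::hh,f:6,0,:4,2,"
ladder_end = parse_sru_sketch(ladder, 5, found)
ladder_rest = ladder[ladder_end:]  # crossing-bond fields are left unconsumed
```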
- schrodinger.application.scaffold_enumeration.cxsmiles.parse_cx_extensions(text)
Parses: (a) multi-center groups and (b) SRUs.
- schrodinger.application.scaffold_enumeration.cxsmiles.mol_from_cxsmiles(text, parseName=True)
Strives to instantiate rdkit.Chem.Mol from text, assuming that the latter is CX SMILES.
- Parameters
text (str) – CX SMILES string.
parseName (bool) – Parse molecule title?
- Returns
Molecule or None
- Return type
rdkit.Chem.Mol or NoneType
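To make the input layout concrete, the sketch below splits a CX SMILES line into its SMILES, “features”, and title parts using only the standard library. `split_cxsmiles` is a hypothetical helper, not part of this module, and the layout it assumes (SMILES, then an optional `|...|` block, then an optional title, separated by whitespace) is inferred from the examples quoted above rather than from the implementation of mol_from_cxsmiles.

```python
def split_cxsmiles(text):
    """Return (smiles, features, name) for one CX SMILES line."""
    smiles, _, rest = text.strip().partition(" ")
    rest = rest.lstrip()
    features = ""
    if rest.startswith("|"):
        end = rest.find("|", 1)
        if end != -1:
            # Text between the two pipes is the "features" string.
            features = rest[1:end]
            rest = rest[end + 1:]
    # Whatever remains is treated as the molecule title.
    return smiles, features, rest.strip()


with_title = split_cxsmiles("CCCC |Sg:n:0,1,2:3-6:eu| polymer")
no_features = split_cxsmiles("CCO ethanol")
ladder = split_cxsmiles("*CC(*)C(*)N* |Sg:n:6,1,2,4::hh,f:6,0,:4,2,|")
```

The features string returned here is the kind of text that parse_mcg and parse_sru are documented to accept.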