schrodinger.application.scaffold_enumeration.cxsmiles module
Functions to parse “repeating units” and “position variant bonds” from CX SMILES “features” text. The parsers are not particularly bright, but probably good enough for machine-generated CX SMILES.
- class schrodinger.application.scaffold_enumeration.cxsmiles.MCG(atoms, center)
- Bases: tuple
- atoms
- List of atom indices ([int]).
- center
- Central atom index (int).
 
- class schrodinger.application.scaffold_enumeration.cxsmiles.SRU(atoms, subscript, superscript)
- Bases: tuple
- atoms
- List of atom indices ([int]).
- subscript
- SRU’s subscript (str).
- superscript
- SRU’s superscript (str).
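Both classes derive from tuple and expose named fields, which suggests they behave like `collections.namedtuple` types. The sketch below builds local stand-ins under that assumption. The definitions are re-created here for illustration and are not imported from the Schrödinger module.

```python
from collections import namedtuple

# Hypothetical local stand-ins that mirror the documented field names.
MCG = namedtuple("MCG", ["atoms", "center"])
SRU = namedtuple("SRU", ["atoms", "subscript", "superscript"])

mcg = MCG(atoms=[7, 6, 5, 4, 3], center=0)
sru = SRU(atoms=[0, 1, 2], subscript="3-6", superscript="eu")

# Fields can be read by name or by position, and unpack like plain tuples.
atoms, center = mcg
```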
 
- schrodinger.application.scaffold_enumeration.cxsmiles.parse_mcg(text, pos, accum)
- Parses “multi-center SGroup” data from CX SMILES “features”.
- <quote>
- The multicenter atom indexes written after “m:” followed by a colon character and the indexes of the atoms which forms the given SGroup separated by “.”. The SGroups are separated by commas.
- Example: “m:0:7.6.5.4.3,2:12.11.10.9.8,C:0.0,2.1”
- </quote>
- Parameters
- text (str) – CX SMILES “features” string. 
- pos (int) – Index of the character in text right after “m:”.
- accum (list) – List to which the “SGroups” are to be appended. 
 
- Returns
- Index of the first unconsumed character in text.
- Return type
- int 
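The quoted “m:” format can be illustrated with a small regular-expression parser that follows the same (text, pos, accum) calling convention. This is only a sketch under stated assumptions: `parse_mcg_sketch` and the local `MCG` stand-in are hypothetical names, not the module's implementation, and treating a comma as a separator only when another “center:atoms” group follows is a guess.

```python
import re
from collections import namedtuple

MCG = namedtuple("MCG", ["atoms", "center"])  # local stand-in, not the real class

# One group: "<center>:<atom>.<atom>...", e.g. "0:7.6.5.4.3".
_GROUP = re.compile(r"(\d+):(\d+(?:\.\d+)*)")


def parse_mcg_sketch(text, pos, accum):
    """Append MCG tuples found at text[pos:] to accum; return the stop index."""
    match = _GROUP.match(text, pos)
    while match is not None:
        center = int(match.group(1))
        atoms = [int(a) for a in match.group(2).split(".")]
        accum.append(MCG(atoms=atoms, center=center))
        pos = match.end()
        # Assumption: a comma continues the list only if another group follows it.
        match = _GROUP.match(text, pos + 1) if text.startswith(",", pos) else None
    return pos


features = "m:0:7.6.5.4.3,2:12.11.10.9.8,C:0.0,2.1"
groups = []
end = parse_mcg_sketch(features, 2, groups)  # 2 == index right after "m:"
```

On the quoted example this collects two groups (centers 0 and 2) and stops at the comma that precedes “C:”, so the remaining features stay available to other parsers.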
 
- schrodinger.application.scaffold_enumeration.cxsmiles.parse_sru(text, pos, accum)
- Parses “SRU” data from CX SMILES “features”.
- <quote>
- Polymer Sgroups
- Each Sgroup exported after “Sg:” in fields separated by a colon. Fields are:
- 1. Sgroup type keyword. Valid keywords are:

  Keyword | Sgroup Type
  n | SRU
  … | …

- 2. Atom indexes separated with commas.
- 3. Subscript of the Sgroup. If the supscript equals the keyword of the Sgroup this field can be empty. Escaped field.
- 4. Superscript of the Sgroup. In the superscript only connectivity and flip information is allowed. This field can be empty. Escaped field.
- 5. Head crossing bond indexes. The indexes of bonds that share a common bracket in case of ladder-type polymers. This field can be empty.
- 6. Tail crossing bond indexes. The indexes of bonds that share a common bracket in case of ladder-type polymers. This field can be empty.
- 7. If the c export option is present then bracket orientation, bracket type followed by the coordinates (4 pair, separated with commas). Bracket orientation can be s or d (single or double), bracket type can be b,c,r,s for braces, chevrons, round and square, respectively. The brackets are written between parentheses and separated with semicolons.
- A colon is needed after the last non-empty field.
- If one needs to retain not only the chemically relevant information, but the whole structure (as drawn), then the c export option should be used.
- Examples:
- CCCC |Sg:gen:0,1,2:|
- CCCC |Sg:n:0,1,2:3-6:eu|
- *CC(*)C(*)N* |Sg:n:6,1,2,4::hh,f:6,0,:4,2,|
- </quote>
- In addition:
- <quote>
- Escaping
- In some places special characters are escaped to ‘&#code’ where code is the ASCII code of the special character.
- Not escaped characters in fields of Sgroups and DataSgroups: ‘a’-‘z’, ‘A’-‘Z’, ‘0’-‘9’ and ‘><”!@#$%()[]./?-+*^_~=’ and the space character.
- Not escaped characters in atom property keys and values: ‘a’-‘z’, ‘A’-‘Z’, ‘0’-‘9’ and ‘><”!@#$%()[]./?-+*^_~=’ and the space character.
- Not escaped characters in atom labels and atom values: ‘a’-‘z’, ‘A’-‘Z’, ‘0’-‘9’ and ‘><”!@#%()[]./?-+*^_~=,:’ and the space character.
- </quote>
- This subroutine recognizes only: atoms (2), subscript (3), and superscript (4).
 - Parameters
- text (str) – CX SMILES “features” string. 
- pos (int) – Index of the character in text right after “Sg:n:”.
- accum (list) – List to which the “SGroups” are to be appended. 
 
- Returns
- Index of the first unconsumed character in text.
- Return type
- int 
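The three recognized fields (atoms, subscript, superscript) and the ‘&#code’ escaping rule can be sketched as follows. The names `parse_sru_sketch` and `unescape`, and the local `SRU` stand-in, are hypothetical. The sketch also assumes that each field ends at the next colon or at the end of the text, and that an escape may carry an optional trailing semicolon; the actual subroutine may decide both points differently.

```python
import re
from collections import namedtuple

SRU = namedtuple("SRU", ["atoms", "subscript", "superscript"])  # local stand-in


def unescape(field):
    """Replace '&#code' escapes (optionally ';'-terminated) by their character."""
    return re.sub(r"&#(\d+);?", lambda m: chr(int(m.group(1))), field)


def parse_sru_sketch(text, pos, accum):
    """Read atoms, subscript and superscript starting at text[pos:]."""
    fields = []
    for _ in range(3):  # atoms (2), subscript (3), superscript (4)
        end = text.find(":", pos)
        if end == -1:  # last field runs to the end of the text
            fields.append(text[pos:])
            pos = len(text)
            break
        fields.append(text[pos:end])
        pos = end + 1
    fields += [""] * (3 - len(fields))  # missing trailing fields are empty
    atoms = [int(a) for a in fields[0].split(",") if a]
    accum.append(SRU(atoms, unescape(fields[1]), unescape(fields[2])))
    return pos


found = []
first = "Sg:n:0,1,2:3-6:eu"
second = "Sg:n:6,1,2,4::hh,f:6,0,:4,2,"
end_first = parse_sru_sketch(first, 5, found)    # 5 == index right after "Sg:n:"
end_second = parse_sru_sketch(second, 5, found)
```

For the second quoted example the sketch stops right after the superscript's closing colon, leaving the head and tail crossing bond fields unparsed, which matches the statement that only fields 2 to 4 are recognized.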
 
- schrodinger.application.scaffold_enumeration.cxsmiles.parse_cx_extensions(text)
- Parses: (a) multi-center groups and (b) SRUs. 
- schrodinger.application.scaffold_enumeration.cxsmiles.mol_from_cxsmiles(text, parseName=True)
- Strives to instantiate rdkit.Chem.Mol from text, assuming that the latter is CX SMILES.
- Parameters
- text (str) – CX SMILES string. 
- parseName (bool) – Parse molecule title? 
 
- Returns
- Molecule or None 
- Return type
- rdkit.Chem.Mol or NoneType