Active Learning Glide
System Requirements
Supported Operating Systems
Linux
- Red Hat Enterprise Linux (RHEL) 8.8, 9.2
  Please make sure the listed packages are installed:
  sudo yum/dnf install <lib>
- Rocky Linux 8.8, 9.2
  Please make sure the listed packages are installed:
  sudo yum/dnf install <lib>
- Ubuntu 20.04 LTS and 22.04 LTS
  Please make sure the listed packages are installed:
  sudo apt-get install <lib>
Timeline
We aim to provide support for new operating system versions
Support cannot be provided once an OS platform version has reached "end of life" (EOL). Check with your platform provider for EOL information.
Upcoming Changes
Beginning in 2024-3, the Schrödinger software release will use Schrödinger License Manager by default. See Schrödinger License Manager Instructions.
Existing FlexNet licensing can be used with the 2024-3 release by changing the settings. See FlexNet License Instructions for 2024-3.
To view a list of recent infrastructure changes that may require changes from your IT team click here.
Hardware Requirements
Required |
|
|
Driver |
The driver (master job) must run uninterrupted for the complete duration of the job, so the computing resource on which it runs cannot be a spot or preemptible cloud instance. Such nodes can be preempted (terminated), and if that happens the whole job is lost. The -DRIVERHOST argument determines where the driver runs; select a host entry for an on-demand (i.e. not preemptible) node type. |
If sufficient licenses and computational resources are available to run multiple Active Learning Glide jobs simultaneously, we recommend configuring the driver host entry to request an entire node. This prevents multiple drivers from sharing the same node and scratch filesystem, which would double (or more) the space requirement on that node. |
Processor (CPU) |
x86_64 compatible processor |
For large jobs, computing on a cluster with a queueing system is recommended, with the following hardware components:
|
System memory (RAM) |
64 GB memory for the entire node |
The RAM requirement does not depend on the input file size; only the disk space requirement does. |
Disk space |
|
Scratch space example
Scratch requirements for an example workflow are provided below. All parameters are consistent with our recommendations for an ultra-large screen with AL-Glide.
For larger input files, please substitute the size of the input file to obtain correct estimates for your jobs.
Example of requirements based on inputs
Inputs for example:
1 billion drug-like ligands in SMILES format (100GB)
3 iterations of active-learning (-iter 3)
Training batch size of 50,000 ligands (-train_size 50000)
The top 100M ligands are retained after each iteration
Rescoring of the top 1M ligands with Glide SP (-num_rescore_ligands 1000000)
Write output poses in Maestro format for the rescored ligands (-write_pose)
File | Disk space | Required or optional |
A single copy of the input file | 100 GB | Required |
The input ligand file split into individual sub-job input batches | 100 GB | Required |
Series of csv files containing the predictions of the top 10% of each batch (sorted by uncertainty), used to select input ligands for each iteration of training | 30 GB | Required |
Series of csv files containing the ligand_ml predictions for the ligands in all the batches | 100 GB * num_iteration | Required |
An output file for each iteration of training containing the predictions of the number of top-scoring compounds specified by the -keep command-line argument | 30 GB | Required |
If -num_rescore_ligand is specified, a single csv file containing the top compounds rescored with Glide SP, as specified by num_rescore_ligand | 200 MB | Optional |
If -write_pose is provided, a Maestro file containing the poses of the rescored ligands | 2 GB | Optional |
Total disk space required (3 iterations) | 620 GB (100 + 100 + 130*3 + 30) | Required files only |
Total disk space required (3 iterations) | 622.2 GB (100 + 100 + 130*3 + 30 + 0.2 + 2) | Required and optional files |
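The arithmetic behind the example totals can be sketched in a few lines of Python. This is only an illustration of the sums quoted above, not a Schrödinger tool: the function name and its parameters are hypothetical, and it assumes that the two 100 GB items and the per-iteration 100 GB item scale with the input file size, while the 30 GB, 200 MB, and 2 GB figures are simply carried over from the example as defaults or arguments.

```python
def estimate_scratch_gb(
    input_gb: float,
    iterations: int,
    top_predictions_gb: float = 30.0,  # example value from the text (assumed fixed)
    kept_output_gb: float = 30.0,      # example value from the text (assumed fixed)
    rescore_csv_gb: float = 0.0,       # optional: rescoring csv file
    pose_file_gb: float = 0.0,         # optional: Maestro pose file
) -> float:
    """Estimate driver scratch space in GB, following the example breakdown."""
    input_copy = input_gb      # a single copy of the input file
    split_batches = input_gb   # the input split into sub-job batches
    # Per iteration: top-10% prediction csv files plus predictions for all batches
    # (assumption: the all-batch predictions are as large as the input file).
    per_iteration = top_predictions_gb + input_gb
    required = input_copy + split_batches + per_iteration * iterations + kept_output_gb
    optional = rescore_csv_gb + pose_file_gb
    return required + optional


required_total = estimate_scratch_gb(100, 3)
with_optional = estimate_scratch_gb(100, 3, rescore_csv_gb=0.2, pose_file_gb=2.0)
print(f"{required_total:.1f} GB required, {with_optional:.1f} GB with optional outputs")
```

For a larger library, pass the actual input file size as `input_gb`. Whether the 30 GB items also grow with the input is not stated in the example, so treat the result as a lower bound in that case.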
Subjob Requirements
Requirements for memory, disk space, and recommended Google Cloud instance type are listed below.
All values are based on the example workflow described above.
Requirement | ML Training* | ML Evaluation | Glide Docking |
Scratch Space | 200 GB | 100 GB | 100 GB |
Memory | 64 GB (8 GB/CPU core) | 32 GB (4 GB/CPU core) | 32 GB (4 GB/CPU core) |
Compatible with Preemptible Nodes | No | Yes | Yes |
Recommended GCP Node Type | n1-highmem-8 | n2-standard-8 | n2-standard-8 |
* NVIDIA Tesla T4 GPUs recommended.
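As a rough illustration of how the subjob table above might be used when planning a cluster or cloud configuration, the sketch below encodes the table as data and checks a candidate node against it. The class, dictionary, and function names are hypothetical and are not part of any Schrödinger API; the numbers are copied from the table for the example workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubjobRequirement:
    scratch_gb: int
    memory_gb: int
    memory_per_core_gb: int
    preemptible_ok: bool
    gcp_node_type: str


# Values copied from the subjob requirements table (example workflow).
SUBJOB_REQUIREMENTS = {
    "ML Training": SubjobRequirement(200, 64, 8, False, "n1-highmem-8"),
    "ML Evaluation": SubjobRequirement(100, 32, 4, True, "n2-standard-8"),
    "Glide Docking": SubjobRequirement(100, 32, 4, True, "n2-standard-8"),
}


def node_is_suitable(stage: str, scratch_gb: float, memory_gb: float,
                     preemptible: bool) -> bool:
    """Return True if a node meets the table's requirements for a subjob stage."""
    req = SUBJOB_REQUIREMENTS[stage]
    if preemptible and not req.preemptible_ok:
        return False  # e.g. ML training must not run on preemptible nodes
    return scratch_gb >= req.scratch_gb and memory_gb >= req.memory_gb


# A preemptible node is rejected for training regardless of its size,
# but accepted for docking if scratch space and memory are sufficient.
print(node_is_suitable("ML Training", 500, 128, preemptible=True))
print(node_is_suitable("Glide Docking", 100, 32, preemptible=True))
```

The check covers only scratch space, memory, and preemptibility; GPU availability for ML training would need to be verified separately.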
GPGPU Requirements
(General-purpose computing on graphics processing units)
We support the following NVIDIA solutions:
Architecture | Server / HPC | Workstation |
Pascal | Tesla P40, Tesla P100 | Quadro P5000 |
Volta | Tesla V100 | |
Turing | Tesla T4 | Quadro RTX 5000 |
Ampere | Tesla A100 | RTX A4000, RTX A5000 |
Ada Lovelace | L4 | |
Hopper | H100 | |
Deprecated
- Support for the Tesla M40 and M60 cards is deprecated.
Notes
- We support only the NVIDIA 'recommended / certified / production branch' Linux drivers for these cards with minimum CUDA version 12.0.
- For information on pre-configured Schrödinger compatible GPU boxes see MD Compatible Systems and FEP+ Compatible Systems.
- Standard support does not cover consumer-level GPU cards such as GeForce GTX cards.
- In the 2024-2 release, the "Discounted Cores" of an L4 was decreased from 7424 to 5120. This adjustment was made to align with the L4's status as a preferred card due to its widespread availability on-premises and through cloud providers. Additionally, the L4's low power consumption, speed, and sufficient memory make it suitable for running the majority of workflows in a configuration with one GPU per node.
- If you already have another NVIDIA GPGPU and would like to know if we have experience with it, please contact our support at help@schrodinger.com.
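The minimum CUDA version mentioned in the notes above (12.0) can be checked against the header that `nvidia-smi` prints. The sketch below is a minimal example under stated assumptions: the function names are hypothetical, the header is assumed to contain a field of the form `CUDA Version: X.Y` (which reports the highest CUDA version the installed driver supports), and the sample header line is illustrative. It does not verify that the driver comes from the recommended, certified, or production branch.

```python
import re

MIN_CUDA = (12, 0)  # minimum CUDA version stated in the notes


def cuda_version_from_smi(header_text: str):
    """Extract (major, minor) from text containing 'CUDA Version: X.Y', or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", header_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum_cuda(header_text: str, minimum=MIN_CUDA) -> bool:
    """Return True if the reported CUDA version is at least the minimum."""
    version = cuda_version_from_smi(header_text)
    return version is not None and version >= minimum


# Illustrative header line; on a real node this text would come from `nvidia-smi`.
sample = "| NVIDIA-SMI 535.104.05   Driver Version: 535.104.05   CUDA Version: 12.2 |"
print(meets_minimum_cuda(sample))
```

Comparing versions as integer tuples avoids the pitfall of string comparison, where "9.2" would sort after "12.0".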