Configure queuing system license checking with flexible GPU licensing

In order for the queue-based license checking mechanism (see Setting Up License Checking for Queueing Systems) to reserve the correct number of licenses, each GPU host entry must be configured to submit to one specific GPU type, and must include a field specifying the number of CUDA compute cores available on a single GPU of that type. (See the tables below.)

In most cases this just means adding a single line to each GPU entry in your existing schrodinger.hosts file:

cuda_cores: [enter number here]

For example, a host entry for a Slurm queue that submits to nodes with 4 V100 GPUs each would look like this:

name: slurm-v100-gpu
host: headnode.mycluster.com
schrodinger: /opt/bin/schrodinger2022-2
queue: SLURM2.1
qargs: --partition=SLURM-partition --ntasks=%NPROC% --gres=gpu:type:%NPROC%
tmpdir: /usr/local/tmp
gpgpu: 0, Tesla V100
gpgpu: 1, Tesla V100
gpgpu: 2, Tesla V100
gpgpu: 3, Tesla V100
cuda_cores: 5120
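
For a MIG-partitioned card, the same pattern applies, using the MIG profile's cuda_cores value from the tables below. A hypothetical entry for a node exposing two A100 2g MIG slices might look like the following (the host name, partition, and gpgpu labels here are illustrative, not prescriptive):

name: slurm-a100-mig-gpu
host: headnode.mycluster.com
schrodinger: /opt/bin/schrodinger2022-2
queue: SLURM2.1
qargs: --partition=SLURM-partition --ntasks=%NPROC% --gres=gpu:type:%NPROC%
tmpdir: /usr/local/tmp
gpgpu: 0, Tesla A100 2g
gpgpu: 1, Tesla A100 2g
cuda_cores: 1792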

How to find the cuda_cores value

  1. Log into the GPU compute node.
  2. Run the following command:

    $SCHRODINGER/utilities/query_gpgpu -a

    Starting with the 2024-2 release, use the "Discounted Cores" value it reports as the cuda_cores value.

    Before the 2024-2 release, use the "Total Cores" value instead.

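As a concrete illustration of step 2, the snippet below filters a saved copy of the report for the relevant line. The sample report layout is invented for this example; only the "Total Cores"/"Discounted Cores" labels come from the steps above. On a real compute node you would pipe query_gpgpu -a straight into grep.

```shell
# Filter a query_gpgpu report for the cores line.
# NOTE: the sample report below is a mock; the real output layout may differ.
cat > /tmp/gpgpu_report.txt <<'EOF'
Device 0: Tesla V100
Total Cores: 5120
EOF

# On the compute node, replace the saved file with the live command:
#   $SCHRODINGER/utilities/query_gpgpu -a | grep -Ei 'Total Cores|Discounted Cores'
grep -Ei 'Total Cores|Discounted Cores' /tmp/gpgpu_report.txt
```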

cuda_cores values for our supported GPUs

  Nvidia GPU                  cuda_cores
  Tesla M40                   3072
  Tesla M60                   2048
  Tesla P100                  3584
  Tesla P40                   3840
  Tesla V100                  5120
  Tesla T4                    2560
  Tesla A100                  6912
  Quadro RTX 5000             3072
  Quadro RTX A5000            8192
  Quadro RTX A4000            6144
  Tesla L4 (before 2024-2)*   7424
  Tesla L4 (2024-2)*          5120

cuda_cores values for A100 and H100 MIG GPUs

  Nvidia GPU      cuda_cores
  Tesla A100 1g   896
  Tesla A100 2g   1792
  Tesla A100 3g   2688
  Tesla A100 7g   6272
  Tesla A100      6912
  Tesla H100 1g   1792
  Tesla H100 2g   3840
  Tesla H100 3g   5888
  Tesla H100 4g   7936
  Tesla H100 7g   14592
  Tesla H100      14592

* In the 2024-2 release, the "Discounted Cores" of an L4 was decreased from 7424 to 5120. This adjustment was made to align with the L4's status as a preferred card due to its widespread availability on-premises and through cloud providers. Additionally, the L4's low power consumption, speed, and sufficient memory make it suitable for running the majority of workflows in a configuration with one GPU per node.
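
If you script the generation of schrodinger.hosts entries, the tables above can be folded into a small lookup helper. The sketch below is our own, not a Schrodinger utility, and covers only a few common cards with their 2024-2 values; extend the case list as needed.

```shell
# Map a GPU model name to its cuda_cores value (2024-2 figures from the
# tables above). Unknown models are reported as an error.
cuda_cores_for() {
  case "$1" in
    "Tesla V100")    echo 5120 ;;
    "Tesla T4")      echo 2560 ;;
    "Tesla A100")    echo 6912 ;;
    "Tesla A100 2g") echo 1792 ;;
    "Tesla L4")      echo 5120 ;;
    "Tesla H100")    echo 14592 ;;
    *) echo "unknown GPU: $1" >&2; return 1 ;;
  esac
}

cuda_cores_for "Tesla V100"   # prints 5120
```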