Configure queuing system license checking with flexible GPU licensing

In order for the queue-based license checking mechanism (see Setting Up License Checking for Queueing Systems) to reserve the correct number of licenses, each GPU host entry must be configured to submit to one specific GPU type, and must include a field specifying the number of CUDA compute cores available on a single GPU of that type. (See the tables below.)

In most cases this just means adding a single line to each GPU entry in your existing schrodinger.hosts file:

cuda_cores: [enter number here]

For example, a host entry for a Slurm queue that submits to nodes with 4 V100 GPUs each would look like this:

name: slurm-v100-gpu
host: headnode.mycluster.com
schrodinger: /opt/bin/schrodinger2022-2
queue: SLURM2.1
qargs: --partition=SLURM-partition --ntasks=%NPROC% --gres=gpu:type:%NPROC%
tmpdir: /usr/local/tmp
gpgpu: 0, Tesla V100
gpgpu: 1, Tesla V100
gpgpu: 2, Tesla V100
gpgpu: 3, Tesla V100
cuda_cores: 5120
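
For a MIG-partitioned card, the same pattern applies, using the MIG profile's cuda_cores value from the tables below. A hypothetical entry for a node exposing two A100 2g MIG slices might look like the following (the host name, partition, and gpgpu labels here are illustrative, not prescriptive):

name: slurm-a100-mig-gpu
host: headnode.mycluster.com
schrodinger: /opt/bin/schrodinger2022-2
queue: SLURM2.1
qargs: --partition=SLURM-partition --ntasks=%NPROC% --gres=gpu:type:%NPROC%
tmpdir: /usr/local/tmp
gpgpu: 0, Tesla A100 2g
gpgpu: 1, Tesla A100 2g
cuda_cores: 1792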

How to find the cuda_cores value

  1. Log into the GPU compute node.
  2. Run the following command:

    $SCHRODINGER/utilities/query_gpgpu -a

    Starting with the 2024-2 release, use the "Discounted Cores" value it reports as the cuda_cores value.

    Before the 2024-2 release, use the "Total Cores" value instead.

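As a concrete illustration of step 2, the snippet below filters a saved copy of the report for the relevant line. The sample report layout is invented for this example; only the "Total Cores"/"Discounted Cores" labels come from the steps above. On a real compute node you would pipe query_gpgpu -a straight into grep.

```shell
# Filter a query_gpgpu report for the cores line.
# NOTE: the sample report below is a mock; the real output layout may differ.
cat > /tmp/gpgpu_report.txt <<'EOF'
Device 0: Tesla V100
Total Cores: 5120
EOF

# On the compute node, replace the saved file with the live command:
#   $SCHRODINGER/utilities/query_gpgpu -a | grep -Ei 'Total Cores|Discounted Cores'
grep -Ei 'Total Cores|Discounted Cores' /tmp/gpgpu_report.txt
```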

cuda_cores values for our supported GPUs

  Nvidia GPU                  cuda_cores
  Tesla M40                   3072
  Tesla M60                   2048
  Tesla P100                  3584
  Tesla P40                   3840
  Tesla V100                  5120
  Tesla T4                    2560
  Tesla A100                  6912
  Quadro RTX 5000             3072
  Quadro RTX A5000            8192
  Quadro RTX A4000            6144
  Tesla L4 (before 2024-2)*   7424
  Tesla L4 (2024-2)*          5120

cuda_cores values for A100 and H100 MIG GPUs

  Nvidia GPU      cuda_cores
  Tesla A100 1g   896
  Tesla A100 2g   1792
  Tesla A100 3g   2688
  Tesla A100 7g   6272
  Tesla A100      6912
  Tesla H100 1g   1792
  Tesla H100 2g   3840
  Tesla H100 3g   5888
  Tesla H100 4g   7936
  Tesla H100 7g   14592
  Tesla H100      14592

* In the 2024-2 release, the "Discounted Cores" of an L4 was decreased from 7424 to 5120. This adjustment was made to align with the L4's status as a preferred card due to its widespread availability on-premises and through cloud providers. Additionally, the L4's low power consumption, speed, and sufficient memory make it suitable for running the majority of workflows in a configuration with one GPU per node.
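
If you script the generation of schrodinger.hosts entries, the tables above can be folded into a small lookup helper. The sketch below is our own, not a Schrodinger utility, and covers only a few common cards with their 2024-2 values; extend the case list as needed.

```shell
# Map a GPU model name to its cuda_cores value (2024-2 figures from the
# tables above). Unknown models are reported as an error.
cuda_cores_for() {
  case "$1" in
    "Tesla V100")    echo 5120 ;;
    "Tesla T4")      echo 2560 ;;
    "Tesla A100")    echo 6912 ;;
    "Tesla A100 2g") echo 1792 ;;
    "Tesla L4")      echo 5120 ;;
    "Tesla H100")    echo 14592 ;;
    *) echo "unknown GPU: $1" >&2; return 1 ;;
  esac
}

cuda_cores_for "Tesla V100"   # prints 5120
```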