schrodinger.gpgpu module¶
schrodinger::gpgpu C++ wrappers
- class schrodinger.gpgpu.CudaDevice¶
schrodinger::gpgpu::CudaDevice Container for device-specific query information.
- __init__(*args, **kwargs)¶
- computeCapability(self) std::array< int,2 >¶
Retrieves the compute capability for this device. Compute capability is specified as (major version, minor version)
- computeMode(self) std::string¶
Retrieves the current compute mode for the device.
- cores(self) int¶
Retrieves the total number of CUDA device cores
- description(self) std::string¶
String describing the given GPGPU device.
- discountedCores(self) int¶
Retrieves the scaled-down number of CUDA device cores for customer discounting.
- Return type:
int
- Returns:
the scaled-down number of CUDA cores if the pciDeviceId (the combined 16-bit device ID and 16-bit vendor ID) and the CUDA core count match a discounted GPU card; otherwise, the actual number of CUDA device cores
- get_device_info(self) DeviceInfo¶
Retrieve general device information, such as index, name, UUID, compute mode, and capability.
- get_mig_info(self) MIGInfo¶
Retrieve MIG device information, such as index, GPU and compute instance counts, UUID, and memory size. This getter is provided for SWIG. This method must only be called after mig_enabled() has returned true; otherwise it throws.
- Raises:
std::runtime_error if the device does not support MIG or MIG is not enabled
- mig_enabled(self) bool¶
Returns whether MIG is enabled for this device.
- name(self) std::string¶
Retrieves the ASCII string identifying this device.
- number(self) int¶
Retrieves the CUDA device number.
- uuid(self) std::string¶
Retrieves the globally unique immutable UUID associated with this device.
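A minimal usage sketch for querying the devices above (the import guard is only so the snippet degrades gracefully outside a Schrodinger environment; the (major, minor) unpacking assumes SWIG maps std::array<int,2> to a Python tuple):

```python
# Enumerate CUDA devices and build a one-line summary per device.
try:
    from schrodinger import gpgpu
    devices = gpgpu.get_available_devices()
except ImportError:
    devices = []  # no Schrodinger installation available

summaries = []
for dev in devices:
    major, minor = dev.computeCapability()
    summaries.append(
        f"#{dev.number()} {dev.name()} "
        f"cc={major}.{minor} cores={dev.cores()} "
        f"mode={dev.computeMode()}"
    )

for line in summaries:
    print(line)
```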
- class schrodinger.gpgpu.DeviceInfo¶
schrodinger::gpgpu::DeviceInfo Container for general device information
- __init__(*args, **kwargs)¶
- capability¶
- compute_mode¶
- cores¶
- name¶
- number¶
- pci_device_id¶
Combination of the 16-bit device ID and the 16-bit vendor ID
- uuid¶
- class schrodinger.gpgpu.MIGInfo¶
schrodinger::gpgpu::MIGInfo Container for MIG device information
- __init__(*args, **kwargs)¶
- cpu_count¶
MIG compute instance slice count
- gpu_count¶
MIG gpu instance slice count
- memory_size_MB¶
MIG memory size in MiB
- number¶
MIG device index
- sm_count¶
Multiprocessor count
- uuid¶
MIG UUID
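Because get_mig_info() throws unless MIG is enabled, callers should check mig_enabled() first. A guarded sketch (attribute names taken from the MIGInfo fields above; the import guard is only so the snippet runs outside a Schrodinger environment):

```python
try:
    from schrodinger import gpgpu
    devices = gpgpu.get_available_devices()
except ImportError:
    devices = []  # no Schrodinger installation available

mig_reports = []
for dev in devices:
    # get_mig_info() must only be called after mig_enabled() is true;
    # otherwise it throws.
    if dev.mig_enabled():
        mig = dev.get_mig_info()
        mig_reports.append(
            f"MIG #{mig.number}: {mig.gpu_count} GPU slice(s), "
            f"{mig.cpu_count} compute slice(s), "
            f"{mig.memory_size_MB} MiB, uuid={mig.uuid}"
        )

print(f"{len(mig_reports)} MIG device(s) found")
```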
- class schrodinger.gpgpu.MpsDaemonEnvironmentVars¶
schrodinger::gpgpu::MpsDaemonEnvironmentVars Helper class for managing environment variable values surrounding MPS usage: CUDA_MPS_PIPE_DIRECTORY and CUDA_MPS_LOG_DIRECTORY. See online resources for more info about these variables.
- LOG_DIR_VARIABLE = 'CUDA_MPS_LOG_DIRECTORY'¶
- PIPE_DIR_VARIABLE = 'CUDA_MPS_PIPE_DIRECTORY'¶
- __init__(*args, **kwargs)¶
- getDaemonEnvironment(self) QProcessEnvironment¶
Produce a QProcessEnvironment that contains the normal system environment variables alongside this class’s pipe and log directory.
- getLogDirectory(self) std::string¶
Get the log directory path (errors if hasValue() is false)
- getPipeDirectory(self) std::string¶
Get the pipe directory path (errors if hasValue() is false)
- hasValue(self) bool¶
Whether getLogDirectory() and getPipeDirectory() will return valid values. A return value of true indicates that either 1) the environment variables were set externally, or 2) this class was constructed with explicit values.
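Because getPipeDirectory() and getLogDirectory() error when hasValue() is false, a guarded access pattern looks like this (a sketch; the broad exception guard is only so the snippet degrades gracefully where no MPS daemon or Schrodinger installation exists):

```python
try:
    from schrodinger import gpgpu
    env_vars = gpgpu.MpsDaemonManager.instance().getEnvironmentVars()
except Exception:
    env_vars = None  # not a Schrodinger/MPS environment

if env_vars is not None and env_vars.hasValue():
    # Safe: hasValue() is true, so these calls will not error.
    print("MPS pipe dir:", env_vars.getPipeDirectory())
    print("MPS log dir:", env_vars.getLogDirectory())
else:
    print("MPS pipe/log directories are not configured")
```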
- class schrodinger.gpgpu.MpsDaemonManager¶
schrodinger::gpgpu::MpsDaemonManager Singleton used to manage an NVIDIA MPS control daemon (see online documentation for nvidia-cuda-mps-control for more information). Specifically, initializing this singleton will create a control daemon that enables creating MPS servers for subprocesses to use. In order for subprocesses to obtain access to these MPS servers, the subprocesses must have the CUDA_MPS_PIPE_DIRECTORY/CUDA_MPS_LOG_DIRECTORY environment variables set. To enable setting these variables in subprocesses, utilize the getEnvironmentVars() helper.
This singleton will only create a control daemon if one is not already running and visible to this process. Additionally, it will not create a daemon if there are only default-mode GPUs visible to this process, as there aren’t currently (at the time of writing) Schrodinger use cases that require MPS on default-mode GPUs using this runtime-based MPS daemon approach.
Additionally, note that this functionality is also available in Python. Specifically, see (at the time of writing) gpgpu_utils.py:start_mps_if_needed().
IMPORTANT: Note that this singleton should NOT be used in processes that need to do GPU work but have not had a parent process set up MPS for them. Otherwise you will receive an error.
This class is currently only implemented for Linux, and will throw a runtime error if used on other platforms.
- __init__(*args, **kwargs)¶
- getEnvironmentVars(self) MpsDaemonEnvironmentVars¶
Obtain access to the environment variables used to manage visibility into this manager’s mps control daemon.
- static instance()¶
schrodinger::gpgpu::MpsDaemonManager::instance() -> MpsDaemonManager
- schrodinger.gpgpu.MpsDaemonManager_instance() MpsDaemonManager¶
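Propagating the MPS variables to a GPU subprocess, as the class description suggests, might look like the following sketch (gpu_worker.py is a hypothetical script; the QProcessEnvironment keys()/value() conversion assumes the usual Qt bindings, and the exception guard lets the snippet degrade gracefully off an MPS-capable Linux host):

```python
import subprocess

try:
    from schrodinger import gpgpu
    mgr = gpgpu.MpsDaemonManager.instance()
    qenv = mgr.getEnvironmentVars().getDaemonEnvironment()
    # Convert the QProcessEnvironment to a plain dict for subprocess.
    env = {key: qenv.value(key) for key in qenv.keys()}
except Exception:
    env = None  # fall back to inheriting the system environment

# gpu_worker.py is a hypothetical GPU-using script.
cmd = ["python3", "gpu_worker.py"]
# subprocess.run(cmd, env=env)  # uncomment on a real host
print("prepared environment:", "custom" if env is not None else "inherited")
```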
- schrodinger.gpgpu.are_any_exclusive_mode_gpus() bool¶
Determine whether any visible NVIDIA GPUs are configured under the EXCLUSIVE_PROCESS compute mode.
- schrodinger.gpgpu.get_available_devices() std::vector< schrodinger::gpgpu::CudaDevice,std::allocator< schrodinger::gpgpu::CudaDevice > > const &¶
Retrieves device information for the available CUDA devices. Note that the CUDA runtime examines environment variables only once.
- schrodinger.gpgpu.get_desmond_token_count(host) int¶
Gets the number of Desmond license tokens for a given host. The default value (licensing::DESMOND_GPU_TOKEN_MULTIPLIER) is returned if the host entry lacks a CUDA core count.
- Parameters:
host (string) – name of the host entry
- Return type:
int
- Returns:
number of tokens to check out for a single GPU at the requested host
- schrodinger.gpgpu.get_device_uuid(pci_bus_id) std::string¶
Retrieves the device UUID using an NVML call for the given device PCI bus ID.
- Parameters:
pci_bus_id (string) – the PCI bus ID of the GPU
- Raises:
runtime_error if NVML initialization, nvmlDeviceGetUUID, or nvmlDeviceGetHandleByPciBusId fails
- Return type:
string
- Returns:
UUID string
- schrodinger.gpgpu.get_minimum_nvml_driver() std::string¶
Get the minimum compatible NVML driver given the CUDA version.
- schrodinger.gpgpu.get_scaled_token_count(host, default_tokens) int¶
Scales the number of default tokens with respect to the number of cores on a V100 card for a given host. If the host does not have a core count defined, default_tokens is returned.
- Parameters:
host (string) – name of the host entry
default_tokens (int) – number of tokens a V100 should check out
- Return type:
int
- Returns:
number of tokens to check out for the requested host
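The scaling can be illustrated with a pure-Python stand-in (the V100's 5120 CUDA cores is a hardware fact, but the rounding policy used here is an assumption for illustration, not the library's actual formula):

```python
V100_CORES = 5120  # CUDA cores on a V100 card

def scaled_token_count(host_cores, default_tokens):
    """Illustrative stand-in for get_scaled_token_count(): scale
    default_tokens by host_cores relative to a V100. If the host
    has no core count defined, return default_tokens unchanged."""
    if host_cores is None:
        return default_tokens
    return max(1, round(default_tokens * host_cores / V100_CORES))

print(scaled_token_count(None, 4))    # host without a core count -> 4
print(scaled_token_count(5120, 4))    # a V100 itself -> 4
print(scaled_token_count(10240, 4))   # twice the cores -> 8
```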
- schrodinger.gpgpu.in_dev_env() bool¶
Return True if the process is running in a development environment, i.e. where SCHRODINGER_SRC or SCHRODINGER_DEV_DEBUG is set.
- schrodinger.gpgpu.is_any_gpu_available() bool¶
Determines if a GPGPU is available for use.
- Return type:
boolean
- Returns:
whether a GPGPU is available on this machine
- schrodinger.gpgpu.print_gpgpu_devices(verbose=True)¶
Writes a summary of available GPGPU devices to stdout.
- Parameters:
verbose (boolean, optional) – whether to print verbose GPGPU descriptions
- schrodinger.gpgpu.verify_any_gpu_available()¶
Verifies that a GPGPU is available for use.
- Raises:
std::runtime_error if no usable GPGPU is available on this machine
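Since verify_any_gpu_available() signals failure by raising, a typical caller wraps it (a sketch; mapping std::runtime_error to Python's RuntimeError is the usual SWIG behavior, and the ImportError branch covers machines without a Schrodinger installation):

```python
def gpu_is_usable():
    """Return True if a usable GPGPU is present, False otherwise."""
    try:
        from schrodinger import gpgpu
        gpgpu.verify_any_gpu_available()
        return True
    except (ImportError, RuntimeError):
        return False

print("GPU usable:", gpu_is_usable())
```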
- schrodinger.gpgpu.verify_cuda_runtime()¶
Calls verify_any_gpu_available with additional CUDA runtime debug information.
- Raises:
std::runtime_error if no usable GPGPU is available on this machine
- schrodinger.gpgpu.verify_nvml_driver_supported()¶
Verifies that the NVML library loads successfully and that the NVML driver is compatible with the CUDA version. Three cases are handled: 1. The NVML library is unavailable and an empty CudaQuery is returned: this is caught here and a runtime error is thrown. 2. The NVML library is available, but the GPU hardware is bad: a runtime error is thrown when get_query_singleton() is called. 3. The NVML library is available and the CUDA query succeeds: the driver version is compared, and a runtime error is thrown if it does not meet the minimum driver version.
- Raises:
std::runtime_error if the NVML library is unavailable, the CUDA query fails, or the NVML driver does not meet the minimum requirement