nvidia
resource_tracker.nvidia
Helpers to monitor NVIDIA GPUs.
Functions:
Name | Description |
---|---|
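`start_nvidia_smi_pmon` | Start a subprocess to monitor NVIDIA GPUs at the process level using `nvidia-smi pmon`. |
`process_nvidia_smi_pmon` | Wait for the `nvidia-smi pmon` subprocess to finish, then process its output. |
`start_nvidia_smi` | Start a subprocess to monitor NVIDIA GPUs' utilization and memory usage using `nvidia-smi`. |
`process_nvidia_smi` | Wait for the `nvidia-smi` subprocess to finish, then process its output. |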
start_nvidia_smi_pmon
Start a subprocess to monitor NVIDIA GPUs at the process level using `nvidia-smi pmon`.
Note that `nvidia-smi pmon` can monitor at most 4 GPUs.
Returns:
Type | Description |
---|---|
`Optional[Popen]` | The subprocess object, or None if `nvidia-smi` is not installed. |
Source code in resource_tracker/nvidia.py
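
The library's own implementation is not reproduced here. As a rough, independent sketch of the behaviour described above (start a background process, or return None when the binary is missing), something like the following could work. The helper name `start_monitor` is hypothetical, and the `-c 1` flag (take one sample and exit) is an assumption about how one might call `nvidia-smi pmon`, not a statement about the flags `resource_tracker` passes.

```python
import shutil
import subprocess
from typing import List, Optional


def start_monitor(command: List[str]) -> Optional[subprocess.Popen]:
    """Start `command` in the background, or return None if its binary is missing.

    Hypothetical helper for illustration; not part of resource_tracker.
    """
    # shutil.which returns None when the executable cannot be found on PATH.
    if shutil.which(command[0]) is None:
        return None
    return subprocess.Popen(
        command,
        stdout=subprocess.PIPE,     # keep the output so it can be parsed later
        stderr=subprocess.DEVNULL,  # ignore driver warnings in this sketch
        text=True,
    )


if __name__ == "__main__":
    # Assumed invocation: one pmon sample, then exit.
    proc = start_monitor(["nvidia-smi", "pmon", "-c", "1"])
    if proc is None:
        print("nvidia-smi not found; GPU monitoring skipped")
    else:
        output, _ = proc.communicate(timeout=30)
        print(output)
```

Returning None instead of raising lets callers on machines without an NVIDIA driver carry on without special-casing the missing tool.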
process_nvidia_smi_pmon
Wait for the `nvidia-smi pmon` subprocess to finish, then process its output.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`nvidia_process` | `Optional[Popen]` | The subprocess object to monitor, or None if not started. Returned by `start_nvidia_smi_pmon`. | required |
`pids` | `Optional[Set[int]]` | A set of process IDs to monitor. If None, all processes are monitored. | `None` |
Returns:
Type | Description |
---|---|
`Dict[str, Union[int, float, Set[int]]]` | A dictionary of GPU stats. |
Source code in resource_tracker/nvidia.py
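
To illustrate the kind of processing described above, here is a minimal sketch that filters `nvidia-smi pmon` rows by process ID and aggregates them. It is not the library's code: the function name `summarize_pmon`, the returned key names, and the sample text are all assumptions, and the real column set depends on the driver version and the flags used. The sketch reads column names from the `# gpu ...` header line so it does not rely on fixed positions.

```python
from typing import Dict, List, Optional, Set, Union

# Illustrative sample only; real pmon output may contain other columns.
PMON_SAMPLE = """\
# gpu        pid  type    sm   mem   enc   dec   command
# Idx          #   C/G     %     %     %     %   name
    0       1234     C    40    10     -     -   python
    0       5678     C    20     5     -     -   python
    1       4321     G     -     -     -     -   Xorg
"""


def summarize_pmon(
    output: str, pids: Optional[Set[int]] = None
) -> Dict[str, Union[int, float, Set[int]]]:
    """Aggregate pmon rows, optionally restricted to `pids` (hypothetical keys)."""
    columns: List[str] = []
    gpu_ids: Set[int] = set()
    sm_total = 0.0
    for line in output.splitlines():
        if line.startswith("# gpu"):
            # Header row: remember the column names for the data rows.
            columns = line.lstrip("# ").split()
            continue
        if not line.strip() or line.startswith("#") or not columns:
            continue
        row = dict(zip(columns, line.split()))
        pid = row.get("pid", "-")
        if not pid.isdigit():
            continue  # idle GPUs are reported with "-" instead of a PID
        if pids is not None and int(pid) not in pids:
            continue
        gpu_ids.add(int(row["gpu"]))
        sm = row.get("sm", "-")
        if sm != "-":
            sm_total += float(sm)
    return {"gpu_count": len(gpu_ids), "sm_percent": sm_total, "gpu_ids": gpu_ids}


if __name__ == "__main__":
    print(summarize_pmon(PMON_SAMPLE, pids={1234}))
```

Passing `pids=None` mirrors the documented default, where every process reported by pmon is included.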
start_nvidia_smi
Start a subprocess to monitor NVIDIA GPUs' utilization and memory usage using `nvidia-smi`.
Returns:
Type | Description |
---|---|
`Optional[Popen]` | The subprocess object, or None if `nvidia-smi` is not installed. |
Source code in resource_tracker/nvidia.py
process_nvidia_smi
Wait for the `nvidia-smi` subprocess to finish, then process its output.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`nvidia_process` | `Optional[Popen]` | The subprocess object to monitor, or None if not started. Returned by `start_nvidia_smi`. | required |
Returns:
Type | Description |
---|---|
`Dict[str, Union[int, float]]` | A dictionary of GPU stats. |
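
As a hedged illustration of turning `nvidia-smi` output into a dictionary of numeric stats, the sketch below parses CSV text such as that produced by `nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader,nounits`. Which query the library actually runs is not documented here, so that command, the function name `summarize_gpus`, the sample values, and the returned key names are all assumptions.

```python
import csv
import io
from typing import Dict, Union

# Illustrative sample: one line per GPU, "utilization %, memory used (MiB)".
QUERY_SAMPLE = "35, 2048\n65, 4096\n"


def summarize_gpus(output: str) -> Dict[str, Union[int, float]]:
    """Sum per-GPU utilization and memory usage (hypothetical key names)."""
    gpu_count = 0
    utilization_sum = 0.0
    memory_used_mib = 0.0
    # skipinitialspace drops the blank that nvidia-smi prints after each comma.
    for row in csv.reader(io.StringIO(output), skipinitialspace=True):
        if len(row) < 2:
            continue  # blank or malformed line
        try:
            utilization = float(row[0])
            memory = float(row[1])
        except ValueError:
            continue  # non-numeric placeholder such as "[N/A]"
        gpu_count += 1
        utilization_sum += utilization
        memory_used_mib += memory
    return {
        "gpu_count": gpu_count,
        "utilization_sum": utilization_sum,
        "memory_used_mib": memory_used_mib,
    }


if __name__ == "__main__":
    print(summarize_gpus(QUERY_SAMPLE))
```

Rows that cannot be parsed as numbers are skipped rather than raising, so a single GPU that does not report a metric does not discard the readings from the others.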