I am trying to run the NVIDIA Triton Inference Server through Docker. I pulled the correct Triton server image,
but when I run docker logs sample-tis-22.04 --tail 40
it shows this:
I0610 15:59:37.597914 1 server.cc:576]
+-------------+-------------------------------------------------------------------------+--------+
| Backend | Path | Config |
+-------------+-------------------------------------------------------------------------+--------+
| pytorch | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so | {} |
| tensorflow | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {} |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {} |
| openvino | /opt/tritonserver/backends/openvino_2021_4/libtriton_openvino_2021_4.so | {} |
+-------------+-------------------------------------------------------------------------+--------+
I0610 15:59:37.597933 1 server.cc:619]
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
+-------+---------+--------+
W0610 15:59:37.635981 1 metrics.cc:634] Cannot get CUDA device count, GPU metrics will not be available
I0610 15:59:37.636226 1 tritonserver.cc:2123]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                        |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                       |
| server_version                   | 2.21.0                                                                                                                                                                                       |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0]         | /models                                                                                                                                                                                      |
| model_control_mode               | MODE_NONE                                                                                                                                                                                    |
| strict_model_config              | 1                                                                                                                                                                                            |
| rate_limit                       | OFF                                                                                                                                                                                          |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                                    |
| response_cache_byte_size         | 0                                                                                                                                                                                            |
| min_supported_compute_capability | 6.0                                                                                                                                                                                          |
| strict_readiness                 | 1                                                                                                                                                                                            |
| exit_timeout                     | 30                                                                                                                                                                                           |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0610 15:59:37.638384 1 grpc_server.cc:4544] Started GRPCInferenceService at 0.0.0.0:8001
I0610 15:59:37.638908 1 http_server.cc:3242] Started HTTPService at 0.0.0.0:8000
I0610 15:59:37.680861 1 http_server.cc:180] Started Metrics Service at 0.0.0.0:8002
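For context, I start the container roughly like this (the container name matches the log above; the model-repository path and image tag are from my setup and may differ in yours):

```shell
docker run -d --name sample-tis-22.04 --gpus all \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v E:/Github/triton_server_ImageModel/models:/models \
  nvcr.io/nvidia/tritonserver:22.04-py3 \
  tritonserver --model-repository=/models
```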
(nvdiaTritonServer_env) E:\Github\triton_server_ImageModel>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:30:10_Pacific_Daylight_Time_2024
Cuda compilation tools, release 12.4, V12.4.131
Build cuda_12.4.r12.4/compiler.34097967_0
(nvdiaTritonServer_env) E:\Github\triton_server_ImageModel>nvidia-smi
Mon Jun 10 21:17:32 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.85 Driver Version: 555.85 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Driver-Model | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3060 WDDM | 00000000:05:00.0 On | N/A |
| 0% 49C P8 9W / 170W | 736MiB / 12288MiB | 1% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 1280 C+G ...__8wekyb3d8bbwe\WindowsTerminal.exe N/A |
| 0 N/A N/A 2568 C+G ...siveControlPanel\SystemSettings.exe N/A |
| 0 N/A N/A 2780 C+G ...\Docker\frontend\Docker Desktop.exe N/A |
| 0 N/A N/A 5840 C+G C:\Windows\explorer.exe N/A |
| 0 N/A N/A 6212 C+G ...al\Discord\app-1.0.9047\Discord.exe N/A |
| 0 N/A N/A 7148 C+G ...t.LockApp_cw5n1h2txyewy\LockApp.exe N/A |
| 0 N/A N/A 7824 C+G ...nt.CBS_cw5n1h2txyewy\SearchHost.exe N/A |
| 0 N/A N/A 8068 C+G ...2txyewy\StartMenuExperienceHost.exe N/A |
| 0 N/A N/A 10332 C+G ...on\125.0.2535.92\msedgewebview2.exe N/A |
| 0 N/A N/A 10972 C+G ...5n1h2txyewy\ShellExperienceHost.exe N/A |
| 0 N/A N/A 13484 C+G ...GeForce Experience\NVIDIA Share.exe N/A |
| 0 N/A N/A 13712 C+G ...CBS_cw5n1h2txyewy\TextInputHost.exe N/A |
| 0 N/A N/A 18732 C+G ....0_x64__8wekyb3d8bbwe\HxOutlook.exe N/A |
| 0 N/A N/A 19024 C+G ...7.0_x64__cv1g1gvanyjgm\WhatsApp.exe N/A |
+-----------------------------------------------------------------------------------------+
-- I am running this in an Anaconda environment. I have properly installed CUDA and cuDNN, and I verified that nvcc --version works and prints the expected output.
But the log says GPU metrics can't be collected, and the Model/Version/Status table is empty despite the model repository path being correct.
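The root cause turned out to be a version mismatch (see the answer below the logs): the host driver reports a newer CUDA version than my container stack handles. The check I effectively did by hand can be sketched like this; the "supported" value is an assumption based on what worked on my machine, not an official compatibility limit:

```shell
#!/bin/sh
# Compare the CUDA version reported by nvidia-smi against the highest
# version the container stack handled on my setup (assumed value).
host_cuda="12.5"   # from the nvidia-smi header line
supported="12.4"   # highest version that worked with Docker Desktop 4.30.0 for me
# sort -V orders version strings numerically; the newest version sorts last.
newest=$(printf '%s\n%s\n' "$host_cuda" "$supported" | sort -V | tail -n1)
if [ "$newest" = "$host_cuda" ] && [ "$host_cuda" != "$supported" ]; then
  echo "driver too new"
else
  echo "driver ok"
fi
```

With the values above this prints "driver too new", which is exactly the situation described in the answer.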
Solved this issue: My GPU is an RTX 3060, and nvidia-smi reports the current driver version:
NVIDIA-SMI 555.85 Driver Version: 555.85 CUDA Version: 12.5
My Docker version:
Current version: 4.30.0 (149282)
does not support driver version 555.85 with CUDA 12.5.
So downgrade the NVIDIA driver to 552.22 (CUDA 12.4)
by downloading the driver from www.nvidia.com/download/driverResults.aspx/224154/en-us/,
choose the "Clean Install" option,
reboot the system,
then run docker compose again; the GPU metrics and the device will be detected by Docker.
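For completeness, the compose service also needs an explicit GPU reservation, otherwise the container never sees the device even with a matching driver. A minimal sketch (image tag, ports, and the ./models path mirror my setup; adjust as needed):

```yaml
services:
  triton:
    image: nvcr.io/nvidia/tritonserver:22.04-py3
    command: tritonserver --model-repository=/models
    ports:
      - "8000:8000"
      - "8001:8001"
      - "8002:8002"
    volumes:
      - ./models:/models
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```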