pydata

Keep Looking, Don't Settle

2019-07-20 Week 29, nvidia-smi error

nvidia-smi error

The error is:

    (tensorflow_p36) ubuntu@ip-172-31-36-1:~$ nvidia-smi
    NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

This problem can be resolved by installing the latest version of the NVIDIA driver (410.104 is used below).
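Before reinstalling, it can help to confirm that the kernel module is really what is missing, since nvidia-smi cannot reach the driver when the nvidia module is not loaded. The following is only a minimal sketch: module_listed and nvidia_loaded are hypothetical helper names, and it assumes a Linux system that exposes /proc/modules.

```shell
#!/bin/bash
# Hypothetical helpers, not part of any NVIDIA tooling.

# Succeed if a module named exactly $1 appears in /proc/modules-style
# input read from stdin (the module name is the first field of each line).
module_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Succeed if the nvidia kernel module is currently loaded.
nvidia_loaded() {
    module_listed nvidia < /proc/modules
}
```

For example, `nvidia_loaded && echo "loaded" || echo "not loaded"` prints which case applies. The exact match on the first field keeps related modules such as nvidia_uvm from counting as the driver itself.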

  1. Create an install script with vim install.sh:
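#!/bin/bash
# Download and install the NVIDIA Tesla driver for the given version.
# Usage: sudo ./install.sh 410.104

set -ex  # echo each command and stop at the first failure

# Driver version, e.g. 410.79 or 410.104; abort if it is missing.
version="${1:?usage: $0 VERSION}"
installer="NVIDIA-Linux-x86_64-${version}.run"

wget "http://us.download.nvidia.com/tesla/${version}/${installer}"
sudo sh "./${installer}" --no-drm --disable-nouveau --dkms --silent --install-libglvnd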
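  2. Make the script executable and run it with the driver version: chmod +x install.sh && sudo ./install.sh 410.104

  3. Check that the module was installed: modinfo nvidia | head -7

         filename:       /lib/modules/4.4.0-1077-aws/updates/dkms/nvidia.ko
         alias:          char-major-195-
         version:        410.104
         supported:      external
         license:        NVIDIA
         srcversion:     3B812B02678A6B43A294F17
         alias:          pci:v000010DEd00000E00svsdbc04sc80i00

  4. Load the module: sudo modprobe nvidia

  5. Run nvidia-smi again to confirm that it can now reach the driver: nvidia-smi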
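The verification commands above (modinfo, modprobe, nvidia-smi) can be wrapped in a small check, sketched here under some assumptions: module_version and check_driver are hypothetical helper names, and the expected version is whatever was passed to install.sh.

```shell
#!/bin/bash
# Hypothetical helpers for checking the result of the install.

# Print the value of the "version:" field from modinfo output on stdin.
# "srcversion:" is a different first field, so it is not matched.
module_version() {
    awk '$1 == "version:" { print $2; exit }'
}

# Succeed only if the installed nvidia module reports the expected version.
check_driver() {
    local expected=$1 actual
    actual=$(modinfo nvidia 2>/dev/null | module_version)
    if [ "$actual" = "$expected" ]; then
        echo "nvidia module ${actual} is installed"
    else
        echo "expected nvidia ${expected}, found '${actual:-none}'" >&2
        return 1
    fi
}
```

A possible use is `check_driver 410.104 && sudo modprobe nvidia && nvidia-smi`, which only tries to load the module and query the GPU when the expected driver version is in place.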

Reference

  1. NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver