How to Install the CUDA Driver for TensorFlow (installing from source)

I'm trying to build TensorFlow from source and run it with GPU support. I installed the CUDA toolkit from the runfile, but installed the driver with the Additional Drivers tool, because I could not get Ubuntu to boot into text mode as described in the CUDA documentation. Running stop lightdm and start lightdm does not work either; with or without sudo, both give me:

Name com.ubuntu.Upstart does not exist

So far I have been able to build a release from the TensorFlow repository. However, when I try to run the example as specified in the how-to

bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu

the GPU apparently cannot be found:

jonas@jonas-Aspire-V5-591G:~/Documents/repos/tensoflow_fork$ bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
E tensorflow/stream_executor/cuda/cuda_driver.cc:491] failed call to cuInit: CUDA_ERROR_UNKNOWN
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:153] retrieving CUDA diagnostic information for host: jonas-Aspire-V5-591G
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:160] hostname: jonas-Aspire-V5-591G
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:185] libcuda reported version is: 352.63.0
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:356] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module  352.63  Sat Nov  7 21:25:42 PST 2015 GCC version:  gcc version
    4.9.2 (Ubuntu 4.9.2-10ubuntu13)  """
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] kernel reported version is: 352.63.0
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:293] kernel version seems to match DSO: 352.63.0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:81] No GPU devices available on machine.
F tensorflow/cc/tutorials/example_trainer.cc:125] Check failed: ::tensorflow::Status::OK() == (session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs)) (OK vs. Invalid argument: Cannot assign a device to node 'y': Could not satisfy explicit device specification '/gpu:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
     [[Node: y = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/gpu:0"](Const, x)]])
Aborted

I'm using a clean Ubuntu 15.04 installation on an Acer notebook with a GTX 950M.

Can anybody tell me how to properly install the driver?

There is 1 answer

Yaroslav Bulatov

Can you run deviceQuery (it comes with the CUDA installation)? Does the NVIDIA device show up in the output of lsmod, dmesg, lspci, and nvidia-smi?

lsmod | grep nvidia
dmesg | grep -i nvidia
lspci | grep -i nvidia
nvidia-smi
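
To avoid reading each output by hand, the checks above could be wrapped in a small script that prints one pass/fail line per check. This is only a sketch: run_check and gpu_checks are hypothetical helper names, and a passing check only means the command exited successfully, not that CUDA itself is usable.

```shell
#!/bin/sh
# Sketch: run each driver check and report whether it succeeded.
# run_check and gpu_checks are hypothetical helper names, not part of CUDA.

# run_check LABEL COMMAND [ARGS...]
# Runs COMMAND quietly and prints [ok] or [FAIL] based on its exit status.
run_check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "[ok]   $label"
    else
        echo "[FAIL] $label"
    fi
}

gpu_checks() {
    # Is the nvidia kernel module loaded?
    run_check "nvidia kernel module loaded" sh -c 'lsmod | grep -q nvidia'
    # Does the GPU show up on the PCI bus?
    run_check "NVIDIA device on PCI bus" sh -c 'lspci | grep -qi nvidia'
    # Can the user-space tool talk to the driver?
    run_check "nvidia-smi can reach driver" nvidia-smi
}

# Only run the checks when the script is started with --run.
if [ "${1:-}" = "--run" ]; then
    gpu_checks
fi
```

Saved under a name of your choice and started with --run, it prints one line per check. Any [FAIL] line points at the command worth running by hand to see the full output.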

You can reload the nvidia module and look for error messages:

sudo modprobe -r nvidia
sudo modprobe nvidia
dmesg | tail
sudo dmesg | grep NVRM
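
The reload-and-inspect step above can be sketched as two shell functions. filter_nvrm and reload_nvidia are hypothetical names, and the sketch assumes the module is called nvidia on your system. Unloading may fail while something (for example a running X server) still holds the module, in which case the function only prints a warning.

```shell
#!/bin/sh
# Sketch: reload the nvidia module and show only the driver-related log lines.
# filter_nvrm and reload_nvidia are hypothetical helper names.

# Keep only lines mentioning NVRM or nvidia (case-insensitive); reads stdin.
filter_nvrm() {
    grep -i -E 'nvrm|nvidia'
}

# Unload and load the module again, then print recent driver messages.
reload_nvidia() {
    sudo modprobe -r nvidia || echo "could not unload nvidia (still in use?)" >&2
    sudo modprobe nvidia || echo "could not load nvidia" >&2
    sudo dmesg | tail -n 50 | filter_nvrm
}
```

After defining both functions in a shell, calling reload_nvidia performs the reload; filter_nvrm can also be used on its own, as in dmesg | filter_nvrm.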

Related issue: https://github.com/tensorflow/tensorflow/issues/601