Deep Learning - Set Up a GPU for TensorFlow


This post explains how to set up a TensorFlow GPU environment on 64-bit Ubuntu 17.04 Linux.
Following the steps below, I will show how to install the Nvidia driver, the Nvidia CUDA toolkit, cuDNN, and TensorFlow.

1. Nvidia driver

Install the Nvidia driver for the GeForce GTX 1070 graphics card.

1.1. Download driver

First, choose a specific driver version and download it from the Nvidia home page.
Here I chose version 384.59.

1.2. Install dependency

Second, install the packages that the Nvidia driver installer depends on:

sudo apt-get update
sudo apt-get install dkms build-essential linux-headers-generic

Add nouveau to the blacklist:

sudo mkdir -p /etc/modprobe.d/
sudo tee -a /etc/modprobe.d/blacklist-nouveau.conf > /dev/null << EOF
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
EOF

Run the following to disable nouveau kernel mode setting and rebuild the initramfs:

echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
sudo update-initramfs -u

Then reboot Ubuntu:

sudo reboot
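
After the reboot, it can be worth confirming that nouveau is no longer loaded before running the Nvidia installer. The following is a minimal Python sketch, assuming the usual Linux /proc/modules format in which the module name is the first field of each line. The helper names loaded_modules and nouveau_is_loaded are hypothetical and not part of any tool used in this post.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # the first field is the module name
    return names


def nouveau_is_loaded(proc_modules_text):
    """True if the nouveau driver appears among the loaded modules."""
    return "nouveau" in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    try:
        with open("/proc/modules") as f:
            text = f.read()
    except OSError:
        text = ""  # not on Linux, or /proc is unavailable
    print("nouveau loaded:", nouveau_is_loaded(text))
```

If this prints True after the reboot, the blacklist has probably not taken effect yet.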

1.3. Install Nvidia drivers

Before installing, you need to stop the Ubuntu GUI (the display manager):

sudo service lightdm stop

OK, install it and reboot your machine:

sudo chmod u+x
sudo ./
sudo reboot

After Ubuntu restarts, we can test the driver:

➜  $ nvidia-smi
Tue Aug  8 22:40:47 2017
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  GeForce GTX 1070    Off  | 0000:01:00.0     Off |                  N/A |
|  0%   41C    P0    35W / 180W |      0MiB /  8112MiB |      3%      Default |

| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|  No running processes found                                                 |

OK, this result suggests the Nvidia driver was installed successfully.
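
Comparing the version that nvidia-smi reports with the version you downloaded is a quick sanity check. Here is a small Python sketch that pulls the version out of the header line. It assumes the "Driver Version: X.Y" wording shown in the output above, and the function name parse_driver_version is hypothetical.

```python
import re


def parse_driver_version(smi_output):
    """Return the driver version string found in nvidia-smi output, or None."""
    match = re.search(r"Driver Version:\s*([0-9]+(?:\.[0-9]+)*)", smi_output)
    return match.group(1) if match else None


# Header line copied from the nvidia-smi output above.
sample = "| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |"
print(parse_driver_version(sample))
```

On a machine with the driver installed, the text could instead come from running nvidia-smi through Python's subprocess module.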

2. Nvidia toolkit & cuDNN

The Nvidia CUDA toolkit is the base SDK for CUDA, and you can download it from:
In addition, we need cuDNN for deep learning with TensorFlow, which you can download from:

2.1. Download CUDA

Here I chose the local deb package cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb.

2.2. Install CUDA

Install CUDA with the following commands:

sudo dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
sudo apt-key add /var/cuda-repo-9-0-local/
sudo apt-get update
sudo apt-get install cuda -y

Update the CUDA environment variables by appending them to ~/.profile:

# Assumes CUDA_HOME is already set in this shell to the CUDA install directory.
sudo ln -vfs $CUDA_HOME /usr/local/cuda
cat >> ~/.profile << EOF
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64
EOF
source ~/.profile
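
To confirm that the new setting is visible, one option is to check that the CUDA lib64 directory is among the LD_LIBRARY_PATH entries. This is a rough Python sketch. The helper name path_contains is hypothetical, and the /usr/local/cuda fallback is an assumption based on the symlink created above.

```python
import os


def path_contains(path_var, directory):
    """True if `directory` is one of the os.pathsep-separated entries of `path_var`."""
    wanted = os.path.normpath(directory)
    entries = [entry for entry in path_var.split(os.pathsep) if entry]
    return any(os.path.normpath(entry) == wanted for entry in entries)


if __name__ == "__main__":
    # Fall back to the /usr/local/cuda symlink if CUDA_HOME is not set (assumption).
    cuda_home = os.environ.get("CUDA_HOME", "/usr/local/cuda")
    lib_dir = os.path.join(cuda_home, "lib64")
    ld_path = os.environ.get("LD_LIBRARY_PATH", "")
    print(lib_dir, "in LD_LIBRARY_PATH:", path_contains(ld_path, lib_dir))
```

Run it from a fresh login shell so that ~/.profile has actually been read.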

2.3. Testing CUDA

The CUDA package includes many test samples. Build and run one of them to check the installation; this is what my terminal returns:

➜ $ cd $CUDA_HOME/samples/1_Utilities/deviceQuery
➜ $ make 
➜ $ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1070"

OK, this result suggests the Nvidia CUDA toolkit was installed successfully.
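
If you want to script this check, the deviceQuery output can be parsed for the device count and device names. Below is a minimal Python sketch, assuming the "Detected N CUDA Capable device(s)" and 'Device N: "name"' lines shown above. The function name parse_device_query is hypothetical.

```python
import re


def parse_device_query(output):
    """Return (device_count, device_names) parsed from deviceQuery output."""
    count_match = re.search(r"Detected (\d+) CUDA Capable device", output)
    count = int(count_match.group(1)) if count_match else 0
    names = re.findall(r'^Device \d+: "([^"]+)"', output, flags=re.MULTILINE)
    return count, names


# Sample text copied from the terminal output above.
sample = '''./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1070"
'''
print(parse_device_query(sample))
```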

2.4. Download cuDNN

At first, I downloaded the latest cuDNN package, cuDNN v7.0 (August 3, 2017), for CUDA 9.0 RC,
but it didn’t work at all: TensorFlow couldn’t find my GPU.
So I tried cuDNN v5.1 (Jan 20, 2017), for CUDA 8.0 instead.
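
Because a cuDNN build targets one CUDA version, it may help to check the archive name before installing. The sketch below assumes the naming pattern of the archive used in section 2.5 (cudnn-8.0-linux-x64-v5.1.tgz); other releases may be named differently. Both helper names are hypothetical.

```python
import re


def parse_cudnn_archive(filename):
    """Split a name like 'cudnn-8.0-linux-x64-v5.1.tgz' into (cuda, cudnn) versions."""
    match = re.match(r"cudnn-(\d+\.\d+)-linux-x64-v(\d+(?:\.\d+)*)\.tgz$", filename)
    if match is None:
        raise ValueError("unrecognized cuDNN archive name: %r" % filename)
    return match.group(1), match.group(2)


def archive_matches_cuda(filename, installed_cuda):
    """True if the archive targets the given CUDA version string, e.g. '8.0'."""
    cuda, _cudnn = parse_cudnn_archive(filename)
    return cuda == installed_cuda


print(parse_cudnn_archive("cudnn-8.0-linux-x64-v5.1.tgz"))
print(archive_matches_cuda("cudnn-8.0-linux-x64-v5.1.tgz", "9.0"))
```

The second call returns False, flagging an archive built for CUDA 8.0 when CUDA 9.0 is installed.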


2.5. Install cuDNN

To install cuDNN, download cuDNN v5.1 for CUDA 8.0 from the Nvidia website and extract it into /usr/local/cuda:

tar xzvf cudnn-8.0-linux-x64-v5.1.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
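
To see which cuDNN version actually ended up in /usr/local/cuda, you can read the version macros from the copied header. This is a small Python sketch, assuming cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL as plain integer #define lines. The function name parse_cudnn_version is hypothetical.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from CUDNN_* #define lines, or None if one is missing."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+%s\s+(\d+)" % macro
        match = re.search(pattern, header_text, flags=re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    header = "/usr/local/cuda/include/cudnn.h"  # path used in the cp command above
    try:
        with open(header) as f:
            print(parse_cudnn_version(f.read()))
    except OSError:
        print("cudnn.h not found at", header)
```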

3. Tensorflow

After checking that the Nvidia driver, the CUDA toolkit, and cuDNN are installed successfully,
we can now install tensorflow-gpu, following the guide from:

3.1. Install dependency

Before installing, make sure the Python dependencies are in place on Ubuntu:

# From:

sudo apt-get install python-dev python3-dev python-numpy \
 python3-numpy python-six python3-six build-essential python-pip python3-pip -y

# Adds NVIDIA package repository.
sudo apt-key adv --fetch-keys
sudo dpkg -i cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
sudo dpkg -i nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
sudo apt-get update

# Includes optional NCCL 2.x.
sudo apt-get install cuda9.0 cuda-cublas-9-0 cuda-cufft-9-0 cuda-curand-9-0 \
  cuda-cusolver-9-0 cuda-cusparse-9-0 libcudnn7= \
   libnccl2=2.2.13-1+cuda9.0 cuda-command-line-tools-9-0

# Optionally install TensorRT runtime, must be done after above cuda install.
sudo apt-get update
sudo apt-get install libnvinfer4=4.1.2-1+cuda9.0
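
The pinned package specs above carry a CUDA suffix (for example +cuda9.0), and every pin should name the same CUDA version. Here is a hedged Python sketch that splits such specs and collects the CUDA versions they mention. The helper names parse_pin and cuda_versions are hypothetical, and the "+cuda" suffix convention is assumed from the package names in this post.

```python
def parse_pin(spec):
    """Split 'pkg=version+cudaX.Y' into (package, version, cuda); missing parts are None."""
    package, sep, version = spec.partition("=")
    if not sep or not version:
        return package, None, None  # unpinned, or the version was left empty
    base, plus, suffix = version.partition("+cuda")
    return package, base, (suffix if plus else None)


def cuda_versions(specs):
    """Return the set of CUDA versions named by the pinned specs."""
    found = set()
    for spec in specs:
        cuda = parse_pin(spec)[2]
        if cuda is not None:
            found.add(cuda)
    return found


pins = ["libnccl2=2.2.13-1+cuda9.0", "libnvinfer4=4.1.2-1+cuda9.0", "cuda-cublas-9-0"]
print(cuda_versions(pins))
```

A result with more than one element would suggest that the pins mix CUDA versions.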

3.2. Install tensorflow

sudo pip install tensorflow-gpu

3.3. Testing TensorFlow

Let’s write a sample TensorFlow program that uses the GPU:

import tensorflow as tf

# Creates a graph, pinning the constants to the first GPU.
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=True))
# Runs the op.
print(sess.run(c))

You should see output like the following (the device name will reflect your own card; this sample shows a Tesla K40c):

Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]
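
To check the placement without reading the log by eye, the "op: device" lines can be parsed. The following is a minimal Python sketch, assuming the log format shown above. The helper names parse_placements and all_on_gpu are hypothetical.

```python
import re


def parse_placements(log_text):
    """Map op name -> device string for lines of the form 'op: /job:...'."""
    placements = {}
    for line in log_text.splitlines():
        match = re.match(r"^(\w+): (/job:\S+)$", line.strip())
        if match:
            placements[match.group(1)] = match.group(2)
    return placements


def all_on_gpu(placements):
    """True if at least one op was found and every op is placed on a GPU device."""
    return bool(placements) and all("gpu" in dev.lower() for dev in placements.values())


# Lines copied from the sample output above.
log = """Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
"""
print(all_on_gpu(parse_placements(log)))
```

If any op lands on a CPU device instead, allow_soft_placement probably fell back because TensorFlow could not use the GPU.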

Example from:

4. Reference

This article comes from 夏日小草; please credit the source when reposting:

by grasses 2017.08.08
