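Preface

When setting up a deep learning environment, a common problem is that different open-source projects expect different environments, and the CUDA version in particular varies (CUDA 9.0, CUDA 10.0, CUDA 10.1, and so on), each with its own matching cuDNN version. This article adds CUDA 9.0 + cuDNN 7.3.1 to a machine that already has CUDA 10.0 + cuDNN 7.6.4 installed.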

1. Downgrading gcc

CUDA 9.0 only supports gcc 6.0 and below, while Ubuntu 18.04 ships with GCC 7.3, so gcc has to be downgraded manually. Version 4.8 is used here.

sudo apt-get install gcc-4.8 
sudo apt-get install g++-4.8

After installation, change to the /usr/bin directory and list the gcc entries:

ls -l gcc*

The output shows /usr/bin/gcc -> gcc-7, which means gcc currently links to gcc-7. Change it to link to gcc-4.8 as follows:

sudo mv gcc gcc.bak    # back up the existing link
sudo ln -s gcc-4.8 gcc    # relink gcc to gcc-4.8

Do the same for g++, so that g++ links to g++-4.8:

sudo mv g++ g++.bak 
sudo ln -s g++-4.8 g++

Then check the gcc and g++ versions:

gcc -v 
g++ -v

Both report gcc version 4.8, which confirms that gcc 4.8 is installed and active.
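As a quick sanity check for the version limit described above, the following sketch tests whether a gcc version string has a major version of 6 or lower. The function name gcc_ok_for_cuda9 is made up for this example; it is not part of gcc or CUDA.

```shell
#!/usr/bin/env bash
# gcc_ok_for_cuda9 is a hypothetical helper (not part of gcc or CUDA).
# It succeeds when the major version is 6 or lower, the limit stated above.
gcc_ok_for_cuda9() {
    local version="$1"
    local major="${version%%.*}"    # keep the text before the first dot
    [ "$major" -le 6 ] 2>/dev/null
}

# Query the real compiler only if gcc is on PATH.
if command -v gcc >/dev/null 2>&1; then
    current="$(gcc -dumpversion)"
    if gcc_ok_for_cuda9 "$current"; then
        echo "gcc $current should be usable with CUDA 9.0"
    else
        echo "gcc $current is too new for CUDA 9.0"
    fi
fi
```

Running this after the relink should print the first message; if it prints the second, the gcc link still points at the newer compiler.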

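2. Installing CUDA 9.0

First, go to the CUDA 9.0 download page on the NVIDIA website and download the runfile, selecting Ubuntu 16.04. (Note: a system running 18.04 can install the CUDA build made for 16.04.)

Once the file is downloaded, install it with the following commands: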

sudo chmod a+x cuda_9.0.176_384.81_linux.run
sudo ./cuda_9.0.176_384.81_linux.run --no-opengl-libs

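Because the GPU driver was already installed before CUDA 10.0 was set up, answer no when the installer asks whether to install the driver. Also answer no to the symbolic link prompt, as noted in the transcript below. For the remaining prompts, accept the default path or answer yes.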

----------------------------------------------------------------------------------
Do you accept the previously read EULA?
accept/decline/quit: accept    # accept the CUDA license agreement
 
You are attempting to install on an unsupported configuration. Do you wish to continue?
(y)es/(n)o [ default is no ]: y 
 
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81?
(y)es/(n)o/(q)uit: n    # the GPU driver is already installed, so choose n
 
Install the CUDA 9.0 Toolkit?
(y)es/(n)o/(q)uit: y
 
Enter Toolkit Location
 [ default is /usr/local/cuda-9.0 ]:    # toolkit install location; press Enter to accept the default
 
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: n    # symbolic link: CUDA 10.0 is already installed, so no is recommended here; answering yes would repoint /usr/local/cuda to this new version
 
Install the CUDA 9.0 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
 [ default is /home/ubuntu ]:

Installing the CUDA Toolkit in /usr/local/cuda-9.0 ...
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so
Missing recommended library: libGL.so

Installing the CUDA Samples in /home/ubuntu ...
Copying samples to /home/ubuntu/NVIDIA_CUDA-9.0_Samples now...
Finished copying samples.

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-9.0
Samples:  Installed in /home/ubuntu, but missing recommended libraries

Please make sure that
 -   PATH includes /usr/local/cuda-9.0/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-9.0/lib64, or, add /usr/local/cuda-9.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.0/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-9.0/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 9.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver

Logfile is /tmp/cuda_install_14444.log
 
# ***Installation is complete at this point***
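The warning at the end of the transcript says a driver of version 384.00 or later is required. Since the driver step was skipped, it is worth confirming that the existing driver meets that minimum. Below is a minimal sketch; driver_ok_for_cuda9 is a made-up helper name, and the check only compares the major version number.

```shell
#!/usr/bin/env bash
# driver_ok_for_cuda9 is a hypothetical helper: it succeeds when the driver's
# major version is at least 384, the minimum quoted in the installer warning.
driver_ok_for_cuda9() {
    local major="${1%%.*}"    # e.g. "410.48" -> "410"
    [ "$major" -ge 384 ] 2>/dev/null
}

# Ask the installed driver for its version when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if [ -z "$driver" ]; then
        echo "could not query the driver version"
    elif driver_ok_for_cuda9 "$driver"; then
        echo "driver $driver is new enough for CUDA 9.0"
    else
        echo "driver $driver is older than 384.00"
    fi
fi
```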

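After installation, update the environment variables. For a system-wide setting, edit /etc/profile:

sudo vim /etc/profile

For the current user only, edit ~/.bashrc instead. Add the following lines to the chosen file, then reload it (for example, source ~/.bashrc):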

export PATH=/usr/local/cuda-9.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda-9.0

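Or

[Recommended] Point directly at the symbolic link instead. This makes later CUDA version switches easier, since only the CUDA symlink has to be recreated and the paths stay unchanged: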

export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda
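One side effect of plain export lines is that every time the file is sourced again, the same directory is prepended once more, and an empty LD_LIBRARY_PATH ends up with a trailing colon (which the loader treats as the current directory). The sketch below avoids both. prepend_once is a made-up helper name, and the symlink-based paths are assumed.

```shell
#!/usr/bin/env bash
# prepend_once is a hypothetical helper: it prints LIST with DIR at the front,
# unless DIR is already one of the colon-separated entries.
prepend_once() {
    local dir="$1" list="$2"
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;         # already present, keep as is
        *) printf '%s\n' "$dir${list:+:$list}" ;;    # no trailing colon when empty
    esac
}

# CUDA_HOME names a single directory, so it is assigned rather than appended to.
export CUDA_HOME=/usr/local/cuda
export PATH="$(prepend_once "$CUDA_HOME/bin" "$PATH")"
export LD_LIBRARY_PATH="$(prepend_once "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")"
```

These lines can stand in for the three exports in ~/.bashrc; sourcing the file repeatedly then leaves PATH unchanged after the first time.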

3. Installing cuDNN

Download the matching cuDNN package, for example cudnn-9.0-linux-x64-v7.3.1.20.tgz.

Extract it:

tar -zxvf cudnn-9.0-linux-x64-v7.3.1.20.tgz
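An interrupted download can leave a truncated archive, so it may be worth checking the archive before copying anything out of it. The sketch below assumes the layout used in this article (a top-level cuda directory with include and lib64 inside); cudnn_archive_ok is a made-up helper name.

```shell
#!/usr/bin/env bash
# cudnn_archive_ok is a hypothetical helper: it succeeds when the archive can
# be listed and contains both the header and at least one libcudnn file.
cudnn_archive_ok() {
    local listing
    listing="$(tar -tzf "$1")" || return 1    # unreadable or corrupt archive
    printf '%s\n' "$listing" | grep -q 'cuda/include/cudnn\.h$' || return 1
    printf '%s\n' "$listing" | grep -q 'cuda/lib64/libcudnn' || return 1
}

archive=cudnn-9.0-linux-x64-v7.3.1.20.tgz
if [ -f "$archive" ]; then
    if cudnn_archive_ok "$archive"; then
        echo "$archive looks complete"
    else
        echo "$archive is damaged or incomplete, download it again"
    fi
fi
```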

Copy the relevant files into the CUDA 9.0 directories:

sudo cp cuda/include/cudnn.h /usr/local/cuda-9.0/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda-9.0/lib64/
sudo chmod a+r /usr/local/cuda-9.0/include/cudnn.h
sudo chmod a+r /usr/local/cuda-9.0/lib64/libcudnn*

4. Selecting the CUDA version

(1) Point the cuda symlink at the newly installed CUDA 9.0:

cd /usr/local
sudo rm -rf cuda    # remove the previously created symlink
sudo ln -s cuda-9.0 cuda    # recreate the symlink

Check the current CUDA version:

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
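For use in scripts, the release number can be pulled out of that output. A small sketch follows, assuming the output keeps the "release X.Y," wording shown above; nvcc_release is a made-up helper name.

```shell
#!/usr/bin/env bash
# nvcc_release is a hypothetical helper: it reads `nvcc -V` output on stdin
# and prints only the release number from the "release X.Y," part.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Report the active toolkit only when nvcc is on PATH.
if command -v nvcc >/dev/null 2>&1; then
    echo "active CUDA release: $(nvcc -V | nvcc_release)"
fi
```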

Check the cuDNN version information:

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
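The grep above prints the raw #define lines. To get a single version string instead, something like the following should work for cuDNN 7.x headers such as this one, where the three macros are defined directly in cudnn.h. cudnn_version is a made-up helper name.

```shell
#!/usr/bin/env bash
# cudnn_version is a hypothetical helper: it prints MAJOR.MINOR.PATCHLEVEL
# taken from the #define lines of the given cudnn.h.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "$1"
}

header=/usr/local/cuda/include/cudnn.h
if [ -f "$header" ]; then
    echo "cuDNN version: $(cudnn_version "$header")"
fi
```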


(2) Switch back to CUDA 10.0

sudo rm -rf /usr/local/cuda    # remove the previously created symlink
sudo ln -s /usr/local/cuda-10.0/ /usr/local/cuda
nvcc -V    # check the current CUDA version
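The two switching recipes differ only in the version number, so they can be folded into one function. This is a sketch under the directory layout used in this article (cuda-9.0 and cuda-10.0 side by side under /usr/local); switch_cuda is a made-up name. Unlike rm -rf, it refuses to act if cuda turns out to be a real directory rather than a symlink.

```shell
#!/usr/bin/env bash
# switch_cuda is a hypothetical helper: it repoints ROOT/cuda at ROOT/cuda-VERSION.
# ROOT defaults to /usr/local, where writing normally requires root privileges.
switch_cuda() {
    local version="$1" root="${2:-/usr/local}"
    if [ ! -d "$root/cuda-$version" ]; then
        echo "no such toolkit: $root/cuda-$version" >&2
        return 1
    fi
    # Refuse to touch a real directory; only an existing symlink is replaced.
    if [ -e "$root/cuda" ] && [ ! -L "$root/cuda" ]; then
        echo "$root/cuda is not a symlink, leaving it alone" >&2
        return 1
    fi
    # -n stops ln from descending into the directory the old link points to.
    ln -sfn "cuda-$version" "$root/cuda"
}
```

Shell functions are not visible to sudo, so for /usr/local this has to run in a root shell or live in a script that is invoked with sudo. After switch_cuda 10.0, nvcc -V should report release 10.0.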

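5. Afterword

(1) The CUDA 9.0 download page also offers 4 patch packages, which can be installed as needed. They mainly update cuBLAS. Since these are official patches, installing them is the safest choice.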


(2) The cuDNN download page for CUDA 9.0 also lists 3 additional packages, which can be installed as needed:

cuDNN v7.3.1 Runtime Library for Ubuntu16.04 (Deb)
cuDNN v7.3.1 Developer Library for Ubuntu16.04 (Deb)
cuDNN v7.3.1 Code Samples and User Guide for Ubuntu16.04 (Deb)

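(3) If importing TensorFlow fails with ImportError: /usr/local/cuda-9.0/lib64/libcudnn.so.7: file too short, the shared library links are broken. The directory is owned by root, so the commands below need sudo.
First, change to /usr/local/cuda/lib64 and run rm libcudnn.so.7 libcudnn.so.7.3.1
Then change to the lib64 subdirectory of the extracted cuDNN 7.3.1.20 package (the archive extracts to a directory named cuda) and run cp libcudnn.so.7.3.1 /usr/local/cuda/lib64
Finally, return to /usr/local/cuda/lib64 and run ln -s libcudnn.so.7.3.1 libcudnn.so.7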
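The "file too short" message generally means the loader found a file that is empty or cut off. Before and after the repair, the following sketch can help spot the simplest case, a zero-byte libcudnn file; it will not catch a file that is truncated but non-empty. empty_cudnn_files is a made-up helper name.

```shell
#!/usr/bin/env bash
# empty_cudnn_files is a hypothetical helper: it lists regular libcudnn files
# of size zero in the given directory. Symlinks are not followed.
empty_cudnn_files() {
    find "$1" -maxdepth 1 -type f -name 'libcudnn*' -size 0
}

libdir=/usr/local/cuda/lib64
if [ -d "$libdir" ]; then
    empty_cudnn_files "$libdir"
    # Show sizes and link targets so a broken chain is easy to see.
    ls -l "$libdir"/libcudnn* 2>/dev/null || true
fi
```

After the repair, the first command should print nothing, and the listing should show libcudnn.so.7 pointing at libcudnn.so.7.3.1.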
