I've just purchased a new Pascal video card, in part for deep learning applications. I wanted to be able to use the card in a VM (because this all runs on my one and only server, along with lots of other stuff including Windows games)
BUT I didn't want the card attached to my monitor (I've run out of ports on the monitor on my desk); I want to address the VM using Spice as "normal" for a VM.
Initially I tried installing from the repositories, but that just resulted in blank screens ... not very helpful, so I experimented until I could get it to work.
These two links were key for me, plus some other general investigation and experimentation:
http://docs.nvidia.com/cuda/cuda-ins...nux/index.html
https://www.pugetsystems.com/labs/hp...ascal-GPU-825/
This assumes you are able to create a VM and pass a video card through to it. In my case I created a Kubuntu VM using virt-manager (libvirt), WITHOUT passing the video card through initially.
I used the q35 motherboard and UEFI BIOS ... which meant the VM needed to be modified BEFORE the install (there's an extra check-box on the last dialog of the virt-manager create wizard).
Then, once the VM was installed, I shut it down and used virsh edit to change the XML.
Add this at the top (replacing the existing first line); it allows you to send arguments directly to qemu, which we need because Ubuntu's version of libvirt is just a little too old:
Code:
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
Then add these lines just before the </domain> tag at the end:
Code:
<qemu:commandline>
<qemu:arg value='-cpu'/>
<qemu:arg value='host,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,kvm=off,hv_vendor_id=some4321text'/>
</qemu:commandline>
Note that the "hv_" parameters are Windows-specific "enlightenments" ... I just add them for consistency when creating VMs for nvidia GPU passthrough. You only NEED kvm=off and hv_vendor_id=some4321text; the other "hv_" entries are optional.
Now you can attach the nVidia video card and restart the VM. The qemu lines above hide the hypervisor from the nvidia driver and add some Windows performance optimisations.
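For orientation, here's a minimal sketch of where the two XML edits sit relative to each other in the domain definition. Everything in between (name, memory, devices, etc.) is whatever virt-manager generated and is omitted here:

```xml
<!-- Sketch only: the real file contains many more elements between these two edits. -->
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <!-- ... everything virt-manager generated stays as-is ... -->
  <qemu:commandline>
    <qemu:arg value='-cpu'/>
    <qemu:arg value='host,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,kvm=off,hv_vendor_id=some4321text'/>
  </qemu:commandline>
</domain>
```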
Once you have a VM, the packages are up to date, and it's running with the video card attached, you'll need to blacklist the nouveau driver by creating a new file at /etc/modprobe.d/blacklist-nouveau.conf with the following contents:
Code:
blacklist nouveau
options nouveau modeset=0
Then rebuild the initramfs by executing the following:
Code:
sudo update-initramfs -u
Now reboot so the nouveau driver is not loaded, and check after reboot with (no output means nouveau is not loaded):
Code:
lsmod | grep nouveau
Download the latest nVidia driver from https://www.nvidia.com/Download/index.aspx?lang=en-us. I saved the driver "run" file to ~/Downloads/cuda. NOTE that you will want to download the "run" version, NOT the deb files!
FROM THIS POINT FORWARD IT'S BEST TO SHUT DOWN THE WINDOW MANAGER (the gui) SO THERE ARE NO CONFLICTS. If using the virt-manager gui there are menu options to send <ctrl><alt><f2> etc., so do that to get to a command line and then execute one of the following (depending on the flavour you're running, e.g. KDE uses SDDM; Unity and Mate use LightDM):
Code:
sudo service sddm stop
sudo service lightdm stop
That should stop the display manager and your gui session so that the nvidia routines have a clear path to install without conflict.
Now MANUALLY install the nVidia driver. NOTE THAT THE OPTIONS ARE CRITICAL, other than the log file name, which you may want to change:
Code:
sudo sh NVIDIA-Linux-x86_64-367.44.run --no-opengl-files --log-file-name=~/Downloads/cuda/NVIDIA-driver-install.log --dkms -a
That should work and create a new "nvidia" kernel module. There's an extra step though: for some reason the nvidia-specific entries in /dev are not always created, so you'll need the following code to run at startup (I created a new script and execute it from /etc/rc.local). The script contents (as supplied by nVidia in their cuda installation guide):
Code:
#!/bin/bash
/sbin/modprobe nvidia
if [ "$?" -eq 0 ]; then
  # Count the number of NVIDIA controllers found.
  NVDEVS=`lspci | grep -i NVIDIA`
  N3D=`echo "$NVDEVS" | grep "3D controller" | wc -l`
  NVGA=`echo "$NVDEVS" | grep "VGA compatible controller" | wc -l`
  N=`expr $N3D + $NVGA - 1`
  for i in `seq 0 $N`; do
    mknod -m 666 /dev/nvidia$i c 195 $i
  done
  mknod -m 666 /dev/nvidiactl c 195 255
else
  exit 1
fi

/sbin/modprobe nvidia-uvm
if [ "$?" -eq 0 ]; then
  # Find out the major device number used by the nvidia-uvm driver.
  D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`
  mknod -m 666 /dev/nvidia-uvm c $D 0
else
  exit 1
fi
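To illustrate the arithmetic in that script, here is the same counting logic run against a hypothetical lspci listing (one VGA function plus one 3D controller; grep -c is equivalent to the script's grep ... | wc -l):

```shell
# Hypothetical lspci output standing in for real hardware.
NVDEVS='02:00.0 VGA compatible controller: NVIDIA Corporation Device 1b81
03:00.0 3D controller: NVIDIA Corporation Device 1b81'
N3D=$(echo "$NVDEVS" | grep -c "3D controller")
NVGA=$(echo "$NVDEVS" | grep -c "VGA compatible controller")
N=$(expr $N3D + $NVGA - 1)
# The script loops from 0 to N, so here it would create two device nodes.
echo "would create /dev/nvidia0 .. /dev/nvidia$N"   # prints: would create /dev/nvidia0 .. /dev/nvidia1
```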
Now reboot and check that:
- The "nvidia" module is loaded (use lsmod | grep nvidia)
- The "nvidia" module is attached to the video card (use lspci -vn and look for "Kernel driver in use: nvidia")
- The nvidia devices have been created in /dev (use ls /dev/nv*); you should see 3 nvidia entries
- cat /proc/driver/nvidia/version reports the version of the nvidia driver loaded (it should be the one you downloaded and installed)
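If you'd rather script those checks than run them by hand, here's a sketch that prints PASS or FAIL for each one (the helper function and check names are mine, not from the guide):

```shell
#!/bin/sh
# Tiny helper: run a command quietly and report PASS or FAIL for it.
check() {
    desc=$1; shift
    if "$@" >/dev/null 2>&1; then echo "PASS: $desc"; else echo "FAIL: $desc"; fi
}
check "nvidia module loaded"        sh -c 'lsmod | grep -q nvidia'
check "/dev/nvidia* entries exist"  sh -c 'ls /dev/nvidia*'
check "driver version readable"     cat /proc/driver/nvidia/version
```

On a correctly set-up guest all three lines should print PASS.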
If that all worked OK, you're ready to install cuda itself.
Download the latest cuda from https://developer.nvidia.com/cuda-toolkit (I need the latest because I have a very recent model card). In my case I downloaded V8 beta (the latest at time of writing).
Once again it's probably best to shut the gui session down (e.g. sudo service lightdm stop) ... probably not necessary, but it's a whole lot better to be "safe" and maximise the probability of success.
Start by installing the dependencies for the nVidia components (thanks to Puget Systems for this):
Code:
sudo apt install dkms build-essential ca-certificates-java default-jre default-jre-headless fonts-dejavu-extra freeglut3 freeglut3-dev java-common libatk-wrapper-java libatk-wrapper-java-jni libdrm-dev libgl1-mesa-dev libglu1-mesa-dev libgnomevfs2-0 libgnomevfs2-common libice-dev libpthread-stubs0-dev libsctp1 libsm-dev libx11-dev libx11-doc libx11-xcb-dev libxau-dev libxcb-dri2-0-dev libxcb-dri3-dev libxcb-glx0-dev libxcb-present-dev libxcb-randr0-dev libxcb-render0-dev libxcb-shape0-dev libxcb-sync-dev libxcb-xfixes0-dev libxcb1-dev libxdamage-dev libxdmcp-dev libxext-dev libxfixes-dev libxi-dev libxmu-dev libxmu-headers libxshmfence-dev libxt-dev libxxf86vm-dev lksctp-tools mesa-common-dev x11proto-core-dev x11proto-damage-dev x11proto-dri2-dev x11proto-fixes-dev x11proto-gl-dev x11proto-input-dev x11proto-kb-dev x11proto-xext-dev x11proto-xf86vidmode-dev xorg-sgml-doctools xtrans-dev libgles2-mesa-dev
Now install the toolkit and samples (note that the toolkit must be installed as root):
Code:
sudo sh ~/Downloads/cuda/cuda_8.0.27_linux.run --silent --toolkit --toolkitpath=/usr/local/cuda-8.0 --override --no-opengl-libs
sh ~/Downloads/cuda/cuda_8.0.27_linux.run --silent --samples --samplespath=~/Downloads/cuda/cuda-8.0_samples --override --no-opengl-libs
I found it necessary to change the ownership of the samples with:
Code:
sudo chown -R <me>:<me> ~/Downloads/cuda
Then we need to add the libraries to ~/.bashrc by adding the lines:
Code:
export PATH=/usr/local/cuda-8.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
You might want to reboot again at this point, just to be sure (no big deal since it's a VM).
After rebooting, open a new terminal session and:
- check the path (echo $PATH)
- check that the toolkit installed OK by executing the following, which should display toolkit information
Code:
nvcc -V
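Beyond eyeballing echo $PATH, here's a more targeted check that the cuda bin directory actually made it onto the path (a sketch; the directory doesn't have to exist for this test to work):

```shell
# Prepend the CUDA bin dir (as the ~/.bashrc lines do) and confirm it's present.
export PATH=/usr/local/cuda-8.0/bin:$PATH
echo "$PATH" | tr ':' '\n' | grep -x '/usr/local/cuda-8.0/bin' && echo "cuda on PATH"
```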
Assuming that worked OK, we can now compile all the samples.
First we need to fix some peculiarities of the nvidia-supplied code (thanks to Puget Systems for this code):
Code:
# Fix Host config so GCC doesn't cause errors when compiling
sudo sed -i '/unsupported GNU version/ s/^/\/\//' /usr/local/cuda-8.0/include/host_config.h
# Fix hard coded driver id in Samples
sudo find /usr/local/cuda-8.0/samples -type f -exec sed -i 's/nvidia-3../nvidia-367/g' {} +
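To see what those two seds actually do, here they are applied to sample lines (the input lines are illustrative stand-ins for what appears in host_config.h and the samples' makefiles):

```shell
# 1) Comment out the GCC version guard by prepending // to the matching line.
echo '#error -- unsupported GNU version! gcc later than 5 is not supported!' \
  | sed '/unsupported GNU version/ s/^/\/\//'
# prints: //#error -- unsupported GNU version! gcc later than 5 is not supported!

# 2) Rewrite any hard-coded nvidia-3xx driver name to the installed 367 driver.
echo 'UBUNTU_PKG_NAME = "nvidia-352"' | sed 's/nvidia-3../nvidia-367/g'
# prints: UBUNTU_PKG_NAME = "nvidia-367"
```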
Compile the samples (assuming you saved the downloads to ~/Downloads/cuda and used the above commands):
Code:
cd ~/Downloads/cuda/cuda-8.0_samples/NVIDIA_CUDA-8.0_Samples/
make
Once that all completes (it takes a while, there are lots of them), execute:
Code:
~/Downloads/cuda/cuda-8.0_samples/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery/deviceQuery
You should see a summary of the card like this:
Code:
~/Downloads/cuda/cuda-8.0_samples/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery/deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 1070"
CUDA Driver Version / Runtime Version 8.0 / 8.0
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 8112 MBytes (8505589760 bytes)
(15) Multiprocessors, (128) CUDA Cores/MP: 1920 CUDA Cores
GPU Max Clock rate: 1785 MHz (1.78 GHz)
Memory Clock rate: 4004 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 2 / 6
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 1070
Result = PASS
At this point it's all done! BE AWARE that the sample routines using opengl will fail (we installed with the --no-opengl options).
With any luck Spice is still the main video driver, and the cuda routines are ready for use.