Channel: Ubuntu Forums - Virtualisation

[SOLVED] Cuda On Pascal In A KVM VM via VGA Passthrough

I've just purchased a new Pascal video card, in part for deep learning applications. I wanted to be able to use the card in a VM (because this all runs on my one and only server, along with lots of other stuff including Windows games), BUT I didn't want the card attached to my monitor (I've run out of ports on the monitor on my desk); I want to address the VM using Spice, as "normal" for a VM.

Initially I tried installing from the repositories, but that just resulted in blank screens ... not very helpful, so I experimented until I could get it to work.

These two links were key for me, plus some other general investigation and experimentation:
http://docs.nvidia.com/cuda/cuda-ins...nux/index.html
https://www.pugetsystems.com/labs/hp...ascal-GPU-825/

This assumes you are able to create a VM and pass a video card to it. In my case I created a Kubuntu VM using virt-manager (libvirt), WITHOUT passing the video card initially.
I used the q35 machine type and UEFI firmware ... which meant the VM needed to be modified BEFORE the install (extra check-box on the last dialog of the virt-manager create wizard).

Then, once the VM was installed, I shut it down and used virsh edit to go in and change the XML.
Replace the existing first line with the following; it allows you to send arguments directly to qemu, which we need because Ubuntu's version of libvirt is just a little too old
Code:

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
Then add these lines just before the </domain> tag at the end
Code:

  <qemu:commandline>
    <qemu:arg value='-cpu'/>
    <qemu:arg value='host,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,kvm=off,hv_vendor_id=some4321text'/>
  </qemu:commandline>

Note that the "hv_" parameters are Windows-specific "enlightenments" ... I just add them for consistency when creating VMs for nVidia GPU passthrough. You only NEED the "kvm=off,hv_vendor_id=some4321text" part, not the "hv_" elements.
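
As an aside, as far as I can tell the hv_vendor_id string is exposed to the guest through a 12-byte CPUID vendor field, so keep it to at most 12 characters (the example value "some4321text" is exactly 12). A quick sanity check:
Code:

```shell
# The hv_vendor_id value is reported through a 12-byte CPUID vendor field,
# so the string must be at most 12 characters long.
VENDOR_ID=some4321text
echo "length: ${#VENDOR_ID}"   # prints "length: 12"
```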

Now you can attach the nVidia video card and restart the VM. The qemu lines above hide the hypervisor from the nVidia driver and add some Windows performance optimisations (see above).

Once you have a VM, you've ensured the packages are up to date, and it's running with the video card attached, you'll need to blacklist the nouveau driver by creating a new file at /etc/modprobe.d/blacklist-nouveau.conf with the following contents
Code:

blacklist nouveau
options nouveau modeset=0

Then rebuild the initramfs by executing the following
Code:

sudo update-initramfs -u
Now reboot so the nouveau driver is not loaded, and check after the reboot that nouveau no longer appears with
Code:

lsmod | grep nouveau
Download the latest nVidia driver from here https://www.nvidia.com/Download/index.aspx?lang=en-us. I saved the driver "run" file to ~/Downloads/cuda. NOTE that you will want to download the "run" version, NOT the deb files !!

FROM THIS POINT FORWARD IT'S BEST TO SHUT DOWN THE WINDOW MANAGER (the GUI) SO THERE ARE NO CONFLICTS. If using the virt-manager GUI there are menu options to send Ctrl+Alt+F2 etc., so do that to get to a command line and then execute one of the following (depending on the flavour you're running, e.g. KDE uses SDDM; Unity and MATE use LightDM)
Code:

sudo service sddm stop
sudo service lightdm stop

That should stop the display manager and your GUI session so that the nVidia installer has a clear path to install without conflict.

Now MANUALLY install the nVidia driver. NOTE THAT THE OPTIONS ARE !!!!!CRITICAL!!!!!, other than the log-file name, which you may want to change
Code:

sudo sh NVIDIA-Linux-x86_64-367.44.run --no-opengl-files --log-file-name=~/Downloads/cuda/NVIDIA-driver-install.log --dkms -a
That should work and have created a new "nvidia" module. There's an extra step though: for some reason the nvidia-specific entries in /dev are not always created, so you'll need the following code to run at startup (I created a new script and execute it from /etc/rc.local). The script contents (as supplied by nVidia in their CUDA installation guide):
Code:

#!/bin/bash

/sbin/modprobe nvidia

if [ "$?" -eq 0 ]; then
  # Count the number of NVIDIA controllers found.
  NVDEVS=`lspci | grep -i NVIDIA`
  N3D=`echo "$NVDEVS" | grep "3D controller" | wc -l`
  NVGA=`echo "$NVDEVS" | grep "VGA compatible controller" | wc -l`

  N=`expr $N3D + $NVGA - 1`
  for i in `seq 0 $N`; do
    mknod -m 666 /dev/nvidia$i c 195 $i
  done

  mknod -m 666 /dev/nvidiactl c 195 255

else
  exit 1
fi

/sbin/modprobe nvidia-uvm

if [ "$?" -eq 0 ]; then
  # Find out the major device number used by the nvidia-uvm driver
  D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`

  mknod -m 666 /dev/nvidia-uvm c $D 0
else
  exit 1
fi
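
The device-numbering arithmetic in that script can be sanity-checked without real hardware by feeding it canned lspci output (a sketch; the sample lines below are made up to look like typical lspci output for a single-GPU card):
Code:

```shell
# Sketch: exercise the script's device-counting logic against canned
# lspci output -- one VGA controller, no 3D controllers.
NVDEVS='02:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070]
02:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller'
# "|| true" because grep -c exits non-zero when it counts zero matches
N3D=$(echo "$NVDEVS" | grep -c "3D controller" || true)
NVGA=$(echo "$NVDEVS" | grep -c "VGA compatible controller" || true)
N=$((N3D + NVGA - 1))   # same result as the script's `expr $N3D + $NVGA - 1`
echo "highest device index: $N"   # prints "highest device index: 0" -> only /dev/nvidia0
```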

Now reboot and check that
  1. The "nvidia" module is loaded (use lsmod | grep nv)
  2. The "nvidia" module is attached to the video card (use lspci -k)
  3. The nvidia devices have been created in /dev (use ls /dev/nv*); you should see 3 nvidia entries
  4. cat /proc/driver/nvidia/version reports the version of the nvidia driver you downloaded and installed


If that all worked OK, you're ready to install CUDA itself.

Download the latest CUDA (I need the latest because I have a very recent model card) from https://developer.nvidia.com/cuda-toolkit. In my case I downloaded V8 beta (the latest at the time of writing).

Once again it's probably best to shut the GUI session down (e.g. sudo service lightdm stop) ... probably not necessary, but it's a whole lot better to be "safe" and maximise the probability of success.

Start by installing the dependencies for the nVidia components (thanks to Puget systems for this)
Code:

sudo apt install dkms build-essential ca-certificates-java default-jre default-jre-headless fonts-dejavu-extra freeglut3 freeglut3-dev java-common libatk-wrapper-java libatk-wrapper-java-jni  libdrm-dev libgl1-mesa-dev libglu1-mesa-dev libgnomevfs2-0 libgnomevfs2-common libice-dev libpthread-stubs0-dev libsctp1 libsm-dev libx11-dev libx11-doc libx11-xcb-dev libxau-dev libxcb-dri2-0-dev libxcb-dri3-dev libxcb-glx0-dev libxcb-present-dev libxcb-randr0-dev libxcb-render0-dev libxcb-shape0-dev libxcb-sync-dev libxcb-xfixes0-dev libxcb1-dev libxdamage-dev libxdmcp-dev libxext-dev libxfixes-dev libxi-dev libxmu-dev libxmu-headers libxshmfence-dev libxt-dev libxxf86vm-dev lksctp-tools mesa-common-dev  x11proto-core-dev x11proto-damage-dev  x11proto-dri2-dev x11proto-fixes-dev x11proto-gl-dev x11proto-input-dev x11proto-kb-dev x11proto-xext-dev x11proto-xf86vidmode-dev xorg-sgml-doctools xtrans-dev libgles2-mesa-dev
Now install the toolkit and samples (note that the toolkit must be installed as root)
Code:

sudo sh ~/Downloads/cuda/cuda_8.0.27_linux.run --silent --toolkit --toolkitpath=/usr/local/cuda-8.0 --override  --no-opengl-libs
sh ~/Downloads/cuda/cuda_8.0.27_linux.run --silent --samples --samplespath=~/Downloads/cuda/cuda-8.0_samples --override  --no-opengl-libs

I found it necessary to change the ownership of the samples (substitute your own username for <me>) with
Code:

sudo chown -R <me>:<me> ~/Downloads/cuda
Then we need to add the binaries and libraries to the environment by adding these lines to ~/.bashrc
Code:

export PATH=/usr/local/cuda-8.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
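
What those two lines do can be checked in a throwaway shell (a sketch; the paths are the CUDA 8.0 defaults used in the install above):
Code:

```shell
# Sketch: simulate the two ~/.bashrc lines and confirm the CUDA bin
# and lib64 directories end up first on their respective paths.
PATH=/usr/local/cuda-8.0/bin:$PATH
LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
echo "${PATH%%:*}"   # prints "/usr/local/cuda-8.0/bin"
```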

You might want to reboot again at this point, just to be sure (no big deal since it's a VM).

After rebooting, open a new terminal session and
  • check the path (echo $PATH)
  • check that the toolkit installed OK by executing the following, which should display toolkit version information
    Code:

    nvcc -V


Assuming that worked OK, we can now compile all the samples.

First we need to patch around some peculiarities of the nVidia-supplied code (thanks to Puget Systems for this code)
Code:

# Fix Host config so GCC doesn't cause errors when compiling
sudo sed -i '/unsupported GNU version/ s/^/\/\//' /usr/local/cuda-8.0/include/host_config.h
# Fix hard coded driver id in Samples
sudo find /usr/local/cuda-8.0/samples -type f -exec sed -i 's/nvidia-3../nvidia-367/g' {} +
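
To see what the first sed actually does, here it is applied to a stand-in line rather than the real header (the sample text below is an approximation of the offending line in host_config.h):
Code:

```shell
# Sketch: the sed comments out (with //) any line mentioning the GCC
# version check, demonstrated on a stand-in line instead of the real file.
line='#error -- unsupported GNU version! gcc versions later than 5 are not supported!'
fixed=$(echo "$line" | sed '/unsupported GNU version/ s/^/\/\//')
echo "$fixed"
```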

Compile the samples (assuming you saved the downloads to ~/Downloads/cuda and used the above commands)
Code:

cd ~/Downloads/cuda/cuda-8.0_samples/NVIDIA_CUDA-8.0_Samples/
make

Once that all completes (it takes a while, there are lots of them), execute
Code:

~/Downloads/cuda/cuda-8.0_samples/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery/deviceQuery
You should see a summary of the card like this
Code:

~/Downloads/cuda/cuda-8.0_samples/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1070"
  CUDA Driver Version / Runtime Version          8.0 / 8.0
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                8112 MBytes (8505589760 bytes)
  (15) Multiprocessors, (128) CUDA Cores/MP:    1920 CUDA Cores
  GPU Max Clock rate:                            1785 MHz (1.78 GHz)
  Memory Clock rate:                            4004 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                2097152 bytes
  Maximum Texture Dimension Size (x,y,z)        1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:              65536 bytes
  Total amount of shared memory per block:      49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                    32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:          1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                            512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                    No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:      Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:  0 / 2 / 6
  Compute Mode:
    < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 1070
Result = PASS

At this point it's all done! BE AWARE that the sample routines using OpenGL will fail (we deliberately installed without the OpenGL files/libs so that Spice keeps the display).

With any luck Spice is still the main video driver, and the CUDA routines are ready for use.
