Proxmox GPU Passthough to Ubuntu VM
For this project I am using a repurposed old desktop, that is now a whitebox proxmox cluster node. I am using Proxmox 5.3-5 at the time of install.
Tech Specs
- MSI 970A-G46 AM3+/AM3 AMD 970
- AMD FX-6100 Zambezi 6-Core 3.3 GHz Socet AM3+
- 2 x 8GB G.Skill Ripjaws X Series 240-pin DDR3 1866 RAM
- Intel Quad NIC
- GeForce GTX 970
- SSD 240GB
Boot into the BIOS and ensure the following is supported and enabled.
- Ensure VT-d is enabled
- Interrupt remapping is supported or an option for IOMMU is enabled
- UEFI BIOS
- Ensure another video card is enabled. For me it was the integrated on board graphics card.
Configure Proxmox node
Edit GRUB to enable IOMMU.
nano /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on efifb=off"
update-grub
Verify that IOMMU is enabled.
dmesg | grep -e DMAR -e IOMMU
[ 0.000000] DMAR: IOMMU enabled
[ 1.219719] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
Add the Nvidia and Nouveau drivers to the blacklist to prevent them from being loaded by Proxmox.
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
echo "blacklist radeon" >> /etc/modprobe.d/blacklist.conf
update-initramfs -u
Verify which driver has been loaded, below it is still the nouveau driver. The point is that it is recognized by Proxmox.
lspci -v
01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1) (prog-if 00 [VGA controller])
Subsystem: NVIDIA Corporation GM204 [GeForce GTX 970]
Flags: bus master, fast devsel, latency 0, IRQ 70, NUMA node 0
Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
Memory at c0000000 (64-bit, prefetchable) [size=256M]
Memory at d0000000 (64-bit, prefetchable) [size=32M]
I/O ports at e000 [size=128]
Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Legacy Endpoint, MSI 00
Capabilities: [100] Virtual Channel
Capabilities: [250] Latency Tolerance Reporting
Capabilities: [258] L1 PM Substates
Capabilities: [128] Power Budgeting <?>
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Capabilities: [900] #19
Kernel driver in use: nouveau
Kernel modules: nvidiafb, nouveau
01:00.1 Audio device: NVIDIA Corporation GM204 High Definition Audio Controller (rev a1)
Subsystem: NVIDIA Corporation GM204 High Definition Audio Controller
Flags: bus master, fast devsel, latency 0, IRQ 68, NUMA node 0
Memory at fb080000 (32-bit, non-prefetchable) [size=16K]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Endpoint, MSI 00
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel
Add Virtual Function IO (vfio) kernel modules to load at boot time.
nano /etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
Determine the GeForce GTX 970 address and IDs. As seen above in the lspci output it was 01:00.
lspci -n -s 01:00
01:00.0 0300: 10de:13c2 (rev a1)
01:00.1 0403: 10de:0fbb (rev a1)
Assign the GeForce GTX 970 to the Virtual Function IO (vfio).
echo "options vfio-pci ids=10de:13c2, 10de:0fbb" > /etc/modprobe.d/vfio.conf
If the motherbaord does nothave an option for IOMMU interrupt remapping, you may have to enable it.
echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > /etc/modprobe.d/iommu_unsafe_interrupts.conf
Reboot the Proxmox node. Verify that the GeForce GTX 970 is loaded and using the vfio-pci Kernel dirver. Ensure that it has capabilities listed as well.
lspci -v
01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1) (prog-if 00 [VGA controller])
Subsystem: NVIDIA Corporation GM204 [GeForce GTX 970]
Flags: bus master, fast devsel, latency 0, IRQ 70, NUMA node 0
Memory at f4000000 (32-bit, non-prefetchable) [size=16M]
Memory at c0000000 (64-bit, prefetchable) [size=256M]
Memory at d0000000 (64-bit, prefetchable) [size=32M]
I/O ports at e000 [size=128]
Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Legacy Endpoint, MSI 00
Capabilities: [100] Virtual Channel
Capabilities: [250] Latency Tolerance Reporting
Capabilities: [258] L1 PM Substates
Capabilities: [128] Power Budgeting <?>
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau
01:00.1 Audio device: NVIDIA Corporation GM204 High Definition Audio Controller (rev a1)
Subsystem: NVIDIA Corporation GM204 High Definition Audio Controller
Flags: bus master, fast devsel, latency 0, IRQ 69, NUMA node 0
Memory at f5080000 (32-bit, non-prefetchable) [size=16K]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Endpoint, MSI 00
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
Create Ubuntu 18.04 VM
Create VM with UEFI bios
Start the VM without the graphics card and enable SSH. Once SSH is enabled, only use SSH to interact it with it to prevent and display issues. You will not be able to interact with it via VNC, noVNC, etc.
sudo apt update
sudo apt install openssh-server
sudo systemctl status ssh
Once the VM is installed and SSH is enabled. Power down the VM and add the graphics card to the VM. On the Proxmox host edit the vm conf file.
hostpci0: is the Proxmox host 01:00 is the addess of the GeForce GTX 970 on the Proxmox host x-vga=on enables vfio-vga device support
nano /etc/pve/qemu-server/(VMID).conf
hostpci0: 01:00,x-vga=on
Add the repository for the nvidia driver. I tried using the standard repos however, those drivers did not work.
sudo apt-add-repository ppa:xorg-edgers/ppa
sudo apt-get install nvidia-375 nvidia-libopencl1-375
Blacklist the nouvea driver to prevent it from being used for the graphics card.
sudo bash -c "echo blacklist nouveau > /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
sudo bash -c "echo options nouveau modeset=0 >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
sudo update-initramfs -u
Edit GRUB to prevent issues with booting.
sudo nano /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
# and change it to
GRUB_CMDLINE_LINUX_DEFAULT="nomodeset quiet splash"
sudo update-grub
At boot ensure the card is found the driver can communicate with the card.
sudo nano /etc/rc.local
#!/bin/bash
/sbin/modprobe nvidia
if [ "$?" -eq 0 ]; then
# Count the number of NVIDIA controllers found.
N3D=`/lspci | grep -i NVIDIA | grep "3D controller" | wc -l`
NVGA=`lspci | grep -i NVIDIA | grep "VGA compatible controller" | wc -l`
N=`expr $N3D + $NVGA - 1`
for i in `seq 0 $N`; do
mknod -m 666 /dev/nvidia$i c 195 $i;
done
mknod -m 666 /dev/nvidiactl c 195 255
else
exit 1
fi
exit 0
Now Reboot the VM. You should have a working GeForce GTX 970 in an Ubunutu 18.04 VM.