Adventures in upgrading Proxmox

Running Docker inside LXC is strange: containers on top of other containers. There was a recent AppArmor issue that prevented some functionality from working inside a Docker container, with a very cryptic error. I was trying to deploy Coolify and/or Dokploy in my homelab and ran into all kinds of weird issues. I finally found a GitHub issue against runc, and apparently the fix shipped in a newer version of the pve-lxc package. But I'm still on Proxmox 8, and that version seems to be available only in Proxmox 9.
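A quick way to check which LXC package version a node is actually running (lxc-pve is the Proxmox package name; the exact version that carries the fix is not something I'll guess here):

```shell
# Show the installed Proxmox LXC package version
dpkg -l lxc-pve

# Or list all Proxmox component versions at once and filter for LXC
pveversion -v | grep -i lxc
```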

I upgraded one node with no problems, but the other node, which runs my NVR and has the Coral TPU, gave me some grief. Because the Apex drivers are installed as DKMS modules, they failed to rebuild, disrupting the system upgrade process. I'm not sure how exactly, but after a reboot the system did not come back online. The machine is in the basement, which means I had to grab my USB KVM and go downstairs…
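When a DKMS module derails an upgrade, the first thing worth checking is which modules DKMS manages and whether they built for the new kernel. A minimal sketch (the gasket/1.0 module name is an assumption based on the gasket-dkms package that ships the Coral Apex driver; adjust to your setup):

```shell
# List all DKMS modules and their build status per kernel
dkms status

# Try rebuilding the Coral driver against the running kernel
# (gasket/1.0 is assumed from the gasket-dkms package)
dkms build gasket/1.0 -k "$(uname -r)"
```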

As an aside: because one node won't start, and there are only two nodes in my Proxmox cluster, the cluster can't reach quorum, which means I can't actually make any changes on my other node, and I can't start any of the containers that are stopped.
I had recently added another ZigBee dongle that supports Thread, and it shares the same VID:PID combo as the old dongle, so all my light switches stopped working because of how those devices were mapped into the guest OS. I had to solve the problem quickly.
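For a two-node cluster with one node down, the usual escape hatch is telling corosync to expect only one vote, which restores quorum on the surviving node (use with care, since it defeats the point of quorum):

```shell
# Check the current quorum state on the surviving node
pvecm status

# Temporarily lower the expected vote count so a single node is quorate
pvecm expected 1
```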

Thankfully I was able to get to the GRUB screen and select a previous kernel, so I could boot the machine. That was a plus, but attempting to reboot into the new kernel still caused a kernel panic.
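If a machine needs to stay on a known-good kernel while you debug, Proxmox can pin one rather than relying on picking it manually in GRUB each boot (the version string below is a placeholder; use one from the list):

```shell
# List kernels known to the boot tool
proxmox-boot-tool kernel list

# Pin a known-good kernel (version shown is a placeholder)
proxmox-boot-tool kernel pin 6.8.12-4-pve
```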

Google suggested that the "unable to mount rootfs on unknown-block(0,0)" error indicates a missing initrd, which can be regenerated with update-initramfs -u -k ${KERNEL_VERSION}. Somewhat mysteriously, the command completed successfully but printed a "no /etc/kernel/proxmox-boot-uuids found" message. After a reboot the machine kernel-panicked again, even though the /boot/initrd.img-${VERSION} files existed. I figured that message was relevant, and after another quick Google search I found a Reddit thread with steps to resolve the issue.

lsblk -o +FSTYPE | grep /boot/efi # understand which device the EFI partition is on
umount /boot/efi
proxmox-boot-tool init /dev/${DEVICE} # plug in device from step 1
mount /boot/efi
update-initramfs -u -k all
reboot

This created the necessary files and after rebooting the system was able to boot again with the new kernel.

While troubleshooting I had also uninstalled the Apex DKMS module, so now I had to reinstall it, but with the kernel change the build started failing with errors.

Apparently some kernel symbols/APIs had been removed, and I had to patch the source code. Upstream didn't seem to have a fix yet, but I found the necessary changes:

diff --git a/src/gasket_core.c b/src/gasket_core.c
index b1c2726..88bd5b2 100644
--- a/src/gasket_core.c
+++ b/src/gasket_core.c
@@ -1373,7 +1373,9 @@ static long gasket_ioctl(struct file *filp, uint cmd, ulong arg)
 /* File operations for all Gasket devices. */
 static const struct file_operations gasket_file_ops = {
        .owner = THIS_MODULE,
+#if LINUX_VERSION_CODE < KERNEL_VERSION(6,0,0)
        .llseek = no_llseek,
+#endif
        .mmap = gasket_mmap,
        .open = gasket_open,
        .release = gasket_release,
diff --git a/src/gasket_page_table.c b/src/gasket_page_table.c
index c9067cb..0c2159d 100644
--- a/src/gasket_page_table.c
+++ b/src/gasket_page_table.c
@@ -54,7 +54,7 @@
 #include 
 
 #if __has_include(<linux/dma-buf.h>)
-MODULE_IMPORT_NS(DMA_BUF);
+MODULE_IMPORT_NS("DMA_BUF");
 #endif
 
 #include "gasket_constants.h"

After making these changes and running the build process again (as described in the previous post), the driver installed and I was able to bring Frigate back up.
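For reference, applying a patch like the one above to the driver source and re-registering it with DKMS looks roughly like this (the patch filename and the gasket/1.0 module name are assumptions based on the gasket-dkms package; adjust to your setup):

```shell
# From the driver source tree, apply the patch (saved here as fix-kernel-6.x.patch)
patch -p1 < fix-kernel-6.x.patch

# Drop the old DKMS registration, then re-add and install the patched source
dkms remove gasket/1.0 --all
dkms add .
dkms install gasket/1.0
```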

Many thanks to /u/Dunadan-F for the solution.


