May 17th, 2024
Tonight, I had a Pimox node die on me. It looks like the microSD card gave up the ghost. Once I identified the issue, I assumed I would have to build a brand-new node and could not reuse the old node’s configuration. I thought that meant restoring the virtual machines from backup or reclaiming their disks from my shared storage. I was wrong.
You can Replace a Node
Once a node dies, you can remove it from the cluster and replace it with a new one. If the new node uses the same name, it inherits the old node’s VMs, and they can be booted as before. In my case, since it was the microSD card that failed and not the Pi itself, I used the same Raspberry Pi with a fresh install of Proxmox.
You will find posts saying you can’t reuse the name or the IP address or bad things will happen. This isn’t true. As long as you never power on the original node with its original install of Proxmox, you are fine. That is what the official instructions say, and they worked for me.
How to remove a node:
https://pve.proxmox.com/wiki/Cluster_Manager#_remove_a_cluster_node
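As a rough sketch of what the linked instructions boil down to, the removal can be wrapped in a small helper. The function name `remove_dead_node`, the `run` wrapper, and the example node name `pimox2` are made up for illustration; only the `pvecm` subcommands come from Proxmox. The helper prints the commands by default and only executes them when `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: run on a surviving cluster node, never on the dead one.
# DRY_RUN defaults to 1, so commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# remove_dead_node NODE_NAME
# NODE_NAME is the name shown by "pvecm nodes" (for example a hypothetical "pimox2").
remove_dead_node() {
    node="$1"
    if [ -z "$node" ]; then
        echo "usage: remove_dead_node NODE_NAME" >&2
        return 1
    fi
    run pvecm nodes            # list members to confirm the exact name
    run pvecm delnode "$node"  # drop the dead node from the cluster
    run pvecm status           # check quorum and membership afterwards
}
```

Calling `remove_dead_node pimox2` shows the three commands without touching the cluster; `DRY_RUN=0` makes it run them for real. Make sure the dead node stays powered off before doing so.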
Rebuild the node with a new install of Pimox on the same Pi. Make sure it is a fresh copy of Proxmox VE; reusing the old install will cause issues.
Add the node back:
https://pve.proxmox.com/pve-docs/pve-admin-guide.html#pvecm_join_node_to_cluster
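The join step from the linked guide can be sketched the same way. The function name `join_cluster`, the `run` wrapper, and the address `192.0.2.10` are placeholders for illustration; `pvecm add` and `pvecm status` are the real commands. As written, it only prints what it would do unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: run on the freshly installed node, not on an existing member.
# DRY_RUN defaults to 1, so commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# join_cluster EXISTING_NODE_IP
# EXISTING_NODE_IP is the address of any node that is already in the cluster.
join_cluster() {
    peer="$1"
    if [ -z "$peer" ]; then
        echo "usage: join_cluster EXISTING_NODE_IP" >&2
        return 1
    fi
    run pvecm add "$peer"   # join through a node that is already a member
    run pvecm status        # confirm the new node is listed and the cluster is quorate
}
```

For example, `DRY_RUN=0 join_cluster 192.0.2.10` would start the join and ask for the root password of the existing node.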
The replacement node will pick up all of the old node’s VMs, LXC containers, and templates. The assets will be there as if nothing had happened. However, as with any computer that is abruptly powered down, you might have issues with the VMs, because they were “turned off” without being shut down properly.
Note – don’t forget to add a network bridge on the new node, or your VMs won’t boot and will throw “error code 1”.
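One way to find out which bridge is missing is to compare the bridge names in the inherited guest configs with the interfaces that exist on the new node. This is a minimal sketch: the function name `missing_bridges` is made up, and it assumes the node’s folder under /etc/pve/nodes matches the output of `hostname` and that guest configs name their bridge as `bridge=<name>` on the `net` lines.

```shell
#!/bin/sh
# missing_bridges [NODE_DIR] [NET_DIR]
# Prints every bridge named in the guest configs under NODE_DIR that has no
# matching network interface under NET_DIR.
missing_bridges() {
    node_dir="${1:-/etc/pve/nodes/$(hostname)}"
    net_dir="${2:-/sys/class/net}"
    # Collect the unique bridge names used by VMs and containers.
    grep -ho 'bridge=[A-Za-z0-9_.-]*' \
        "$node_dir"/qemu-server/*.conf "$node_dir"/lxc/*.conf 2>/dev/null |
        cut -d= -f2 | sort -u |
        while read -r bridge; do
            # Report the bridge only if the interface does not exist.
            [ -e "$net_dir/$bridge" ] || echo "$bridge"
        done
}
```

Running `missing_bridges` on the new node lists the bridges that still need to be created, which can be done from the node’s System > Network page in the GUI. No output means every bridge the guests refer to already exists.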
If you aren’t going to replace the Node:
You can restore the VMs from backup to another node.
If you don’t have recent backups and you are using shared storage, recover the VMs by either:
Finding the dead node’s config folder in /etc/pve/nodes and moving the VM configs to another node’s folder, or
Recovering the VM by attaching its disks to a new, diskless VM:
https://forum.proxmox.com/threads/how-to-create-virtual-machine-with-existing-disk.26142
“If it doesn’t exist anymore, you need to create a new VM with the same settings and without any disks, but using the same ID as the disk has. Then run qm rescan --vmid <ID>. After that, the disk will show up as an unused disk for the VM and can be attached in the UI.”
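The first option, moving the configs, can be sketched as a small helper. The function name `move_vm_configs` and the node names in the example are hypothetical; the sketch assumes the usual layout where VM configs live in a `qemu-server` folder and container configs in an `lxc` folder under each node’s directory, and that the guests’ disks are on storage the target node can reach.

```shell
#!/bin/sh
# move_vm_configs DEAD_NODE TARGET_NODE [NODES_ROOT]
# Moves every guest config from the dead node's folder to the target node's
# folder. NODES_ROOT defaults to /etc/pve/nodes.
move_vm_configs() {
    dead="$1"
    target="$2"
    root="${3:-/etc/pve/nodes}"
    if [ -z "$dead" ] || [ -z "$target" ]; then
        echo "usage: move_vm_configs DEAD_NODE TARGET_NODE [NODES_ROOT]" >&2
        return 1
    fi
    for kind in qemu-server lxc; do
        src="$root/$dead/$kind"
        dst="$root/$target/$kind"
        [ -d "$src" ] || continue
        mkdir -p "$dst"
        for conf in "$src"/*.conf; do
            [ -e "$conf" ] || continue   # the glob matched nothing
            mv "$conf" "$dst/"
            echo "moved $kind/$(basename "$conf") to $target"
        done
    done
}
```

For example, `move_vm_configs pimox2 pimox1` would hand all of pimox2’s guests to pimox1. For the second option, the rescan from the quote is run on the node that holds the new diskless VM, for example `qm rescan --vmid 100` for a hypothetical VM ID 100; newer releases may list the same command as `qm disk rescan`.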
Remove the Node from the GUI
Finally, removing the node from the CLI will not remove it from the GUI. The configuration files for the node are still there. To remove the node from the GUI, follow these instructions:
“Later we remove the node from Proxmox GUI. For this, we login to any active node. Then we check /etc/pve/nodes folder. And we check for the folder name of the removed node and remove it. Otherwise, the GUI displays the removed node in it.”
https://bobcares.com/blog/proxmox-remove-node-from-cluster/
Also, rather than deleting the folder, you can rename the folder so you can save the configuration files.
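A minimal sketch of saving the configuration before clearing the node out of the GUI follows. Instead of renaming in place, this variant copies the folder to a location outside /etc/pve/nodes and then removes the original; that choice, the function name `archive_node_dir`, and the default backup path are assumptions of the sketch, not part of the quoted instructions.

```shell
#!/bin/sh
# archive_node_dir NODE [NODES_ROOT] [BACKUP_ROOT]
# Copies the removed node's folder to BACKUP_ROOT, then deletes the original.
archive_node_dir() {
    node="$1"
    root="${2:-/etc/pve/nodes}"
    backup="${3:-/root/pve-node-backups}"   # hypothetical backup location
    if [ -z "$node" ] || [ ! -d "$root/$node" ]; then
        echo "no such node folder: $root/$node" >&2
        return 1
    fi
    if [ -e "$backup/$node" ]; then
        echo "backup already exists: $backup/$node" >&2
        return 1
    fi
    # Delete the original only if the copy succeeded.
    mkdir -p "$backup" &&
        cp -R "$root/$node" "$backup/$node" &&
        rm -r "$root/$node"
}
```

For example, `archive_node_dir pimox2`, run on any active node, would leave a copy of pimox2’s configs in the backup folder and remove its folder from /etc/pve/nodes.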
Conclusion
These are two ways to deal with a Pimox node that has died: replace it with a rebuilt node of the same name, or remove it and carry on with other nodes. We also covered options for recovering the VMs that were on the dead node.