We should upgrade the AMD GPU device plugin to pick up recent patches such as https://github.com/ROCm/k8s-device-plugin/pull/117, and to support the MI300's native partitioning.
Information about the upgrade is in https://gerrit.wikimedia.org/g/operations/debs/amd-k8s-device-plugin/+/refs/heads/master
Bonus points: we should also consider adding the node labeller, so that helmfile deployments can target specific GPU details (amount of VRAM, model, etc.) instead of just asking for a generic GPU. This may help once the MI300X is available, since we will likely have different VRAM partitions, etc.
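
To illustrate the node labeller idea, here is a rough sketch of what targeting a specific GPU could look like in a pod spec. The label key `amd.com/gpu.vram` and its value format are assumptions and need to be checked against what the labeller actually sets on our nodes (e.g. via `kubectl get nodes --show-labels`). The pod name and image are placeholders.

```yaml
# Sketch only, not a tested config.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-example                 # hypothetical name
spec:
  nodeSelector:
    # ASSUMPTION: label key and value format as set by the node labeller;
    # verify on a labelled node before relying on it.
    amd.com/gpu.vram: "192G"
  containers:
    - name: workload
      image: example/image:latest   # placeholder image
      resources:
        limits:
          amd.com/gpu: 1            # generic GPU resource exposed by the device plugin
```

Without the labeller, only the `amd.com/gpu` limit is available, so the scheduler cannot distinguish between GPU models or VRAM sizes.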