The aim of this task is to test AMD GPUs on our stack and identify the challenges/blockers (if any) of using them.
We want to test the following:
- run some open-source LLM models (see the inference sketch after this list)
- deploy/serve an LLM that was trained on an NVIDIA GPU (see the checkpoint-loading sketch below)
- KServe:
  - deploy/serve a model using KServe (see the InferenceService sketch below)
  - investigate how to share a GPU among multiple models and what the community is doing on this topic. There are two approaches we should explore (see the multi-model sketch below):
    - share a GPU among two or more pods
    - share a GPU among multiple models in one pod
- Kubeflow: run an example of a training pipeline where the training step/pod uses a GPU. To test this, we can install Kubeflow on minikube/kind and make the AMD GPU available. Test common frameworks (PyTorch, TensorFlow) with the GPU (see the pipeline sketch below).
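
For the first item, a minimal inference sketch. It assumes a ROCm build of PyTorch, which exposes AMD GPUs through the regular `torch.cuda` API, and uses `facebook/opt-125m` as a placeholder for whichever open-source model we pick:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# ROCm builds of PyTorch report AMD GPUs via the CUDA-compatible API.
assert torch.cuda.is_available(), "no AMD GPU visible to PyTorch"

model_id = "facebook/opt-125m"  # placeholder; any open-source LLM works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

inputs = tokenizer("AMD GPUs can run", return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```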
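
For the NVIDIA-trained model item, PyTorch checkpoints are device-portable, so a state dict saved during training on an NVIDIA GPU should load onto an AMD GPU via `map_location`. A sketch, where `model_nvidia.pt` is a hypothetical checkpoint matching the placeholder architecture:

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical state-dict checkpoint produced by a training run on an NVIDIA GPU.
state_dict = torch.load("model_nvidia.pt", map_location="cuda:0")

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # matching architecture
model.load_state_dict(state_dict)
model.to("cuda:0")  # under ROCm PyTorch, "cuda:0" is the AMD GPU
model.eval()
```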
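
For the KServe item, a sketch using the `kserve` Python SDK to create a v1beta1 InferenceService whose predictor requests an AMD GPU; `amd.com/gpu` is the resource name registered by the AMD Kubernetes device plugin. The service name and storage URI are placeholders:

```python
from kubernetes import client
from kserve import (
    KServeClient,
    constants,
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1PredictorSpec,
    V1beta1TorchServeSpec,
)

isvc = V1beta1InferenceService(
    api_version=constants.KSERVE_V1BETA1,
    kind=constants.KSERVE_KIND,
    metadata=client.V1ObjectMeta(name="llm-amd-demo", namespace="default"),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            pytorch=V1beta1TorchServeSpec(
                storage_uri="gs://my-bucket/llm-model",  # placeholder
                resources=client.V1ResourceRequirements(
                    limits={"amd.com/gpu": "1"},  # AMD device plugin resource
                ),
            )
        )
    ),
)

KServeClient().create(isvc)
```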
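
For GPU sharing, the "multiple models in one pod" approach can be prototyped by loading several models into a single process on the same device; sharing one GPU across pods instead needs support at the device-plugin/scheduler level (e.g. time-slicing or partitioning), which is the part to research in the community. A single-process multi-model sketch with two placeholder models:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the single shared AMD GPU

# Two placeholder models resident on the same GPU, served from one process/pod.
model_ids = ["distilgpt2", "facebook/opt-125m"]
models = {
    mid: AutoModelForCausalLM.from_pretrained(mid, torch_dtype=torch.float16).to(device)
    for mid in model_ids
}
tokenizers = {mid: AutoTokenizer.from_pretrained(mid) for mid in model_ids}

def generate(mid: str, prompt: str) -> str:
    """Route a request to one of the co-located models."""
    inputs = tokenizers[mid](prompt, return_tensors="pt").to(device)
    out = models[mid].generate(**inputs, max_new_tokens=16)
    return tokenizers[mid].decode(out[0], skip_special_tokens=True)

print(generate("distilgpt2", "Sharing one GPU"))
```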
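
For the Kubeflow item, a minimal pipeline sketch using the KFP v1 SDK, where the training step requests an AMD GPU: `set_gpu_limit` with `vendor="amd"` puts an `amd.com/gpu` limit on the step's pod. The image and script path are placeholders:

```python
import kfp
from kfp import dsl

def train_op() -> dsl.ContainerOp:
    # Placeholder image/script; any ROCm-enabled training image would do.
    op = dsl.ContainerOp(
        name="train",
        image="rocm/pytorch:latest",
        command=["python", "/workspace/train.py"],
    )
    # Request one AMD GPU for the training pod (sets the amd.com/gpu limit).
    op.set_gpu_limit("1", vendor="amd")
    return op

@dsl.pipeline(name="amd-gpu-training", description="Training step on an AMD GPU")
def training_pipeline():
    train_op()

if __name__ == "__main__":
    kfp.compiler.Compiler().compile(training_pipeline, "training_pipeline.yaml")
```

The same check covers the "common frameworks" point: if the compiled pipeline's training container runs PyTorch or TensorFlow and sees the GPU (e.g. `torch.cuda.is_available()` returns True), the AMD GPU is being scheduled and exposed correctly.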