As an engineer on the ML team,
I want the appropriate software stack to serve LLMs efficiently from our GPUs, so that LiftWing can power product features. The goals for this quarter are to:
- Use vLLM to serve LLMs on our AMD MI210 GPUs.
- Have the stack ready to do the same with MI300X GPUs once we acquire them.
The path to do this is to use the vLLM ROCm Docker images provided upstream.
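A minimal sketch of what serving a model from one of these images could look like. The image name/tag, the model identifier, and the port are assumptions to verify against the upstream vLLM ROCm documentation; the device flags are the standard ones for exposing AMD GPUs to a container:

```shell
# Sketch: run an upstream vLLM ROCm image and serve a model over the
# OpenAI-compatible HTTP API on port 8000.
# Image name/tag and model are placeholders -- check upstream docs.

# Expose the AMD GPU device nodes (/dev/kfd, /dev/dri) to the container,
# which is how ROCm containers access the GPUs on the host.
docker run -it --rm \
  --device=/dev/kfd \
  --device=/dev/dri \
  --group-add video \
  --security-opt seccomp=unconfined \
  --ipc=host \
  -p 8000:8000 \
  rocm/vllm:latest \
  vllm serve mistralai/Mistral-7B-Instruct-v0.2
```

Once the server is up, any OpenAI-compatible client pointed at `http://<host>:8000/v1` should be able to query it, which also makes it easy to swap in an alternative serving framework behind the same interface later.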
We also want to investigate whether other frameworks (like SGLang) would be more suitable.

