
2024 Q4 Goal: A Hugging Face 7B LLM is hosted on ml-staging on Lift Wing, powered by GPU
Open, Needs Triage, Public

Event Timeline

  • The GPU order for the first 2x GPU chassis is close to complete. There are some supply issues with the chassis, so the open question is whether we want to use an upgraded chassis for the ml-staging server.

Update: Mistral-7b-instruct is now hosted on ml-staging. It currently runs on CPU and uses the pytorch base image that we created. A simple request takes approximately 30 s (we haven't run extensive tests yet).
We are currently facing issues using the GPU with this Docker image, as documented in T362984: GPU errors in hf image in ml-staging.