- Goals:
- Generate metrics from the first phase for article categorization (baseline datasets from Aitolkyn, prompting strategy from Mykola)
- Generate metrics for optimized tooling (quantized models, using ONNX and ggml/llama.cpp)
- Generate metrics for model fine-tuning (which LLMs can we fine-tune on WMF infra?), without focusing on improving model quality
- Infrastructure: the ml-lab instances, each with two AMD Instinct MI210 GPUs.
- Stretch goals:
- Explore model-parallel inference (as we have 2 GPUs per instance)
- Fine-tune a model for the article categorization task (i.e. not just for metrics; try to learn a good model)