One part: Test GPU on Hadoop
Regarding testing the GPU on Hadoop, I spoke with @fkaelin yesterday. He suggested a potentially suitable project: an end-to-end Airflow pipeline. This would include a Spark task to create the training dataset, a GPU task to train the model, and a Spark task to run batch evaluation.
The training step should work with a Hadoop GPU and, with minor code changes, with a cloud GPU as well, which the research team plans to test. In the future, we could use the same end-to-end Airflow pipeline with a GPU on ml-train to train an LLM, so it seems worth experimenting with.
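As a rough illustration of the three-stage layout described above, here is a minimal sketch in plain Python (standard library only, not actual Airflow code). The stage names, the `gpu_backend` parameter, and the "hadoop"/"cloud" labels are all hypothetical and only show how the training step could be pointed at a different GPU target while the Spark steps stay unchanged.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Stage:
    name: str
    runner: str  # hypothetical label: "spark" or "gpu"


# Hypothetical stage names mirroring the three tasks in the proposal.
STAGES = {
    "create_training_dataset": Stage("create_training_dataset", "spark"),
    "train_model": Stage("train_model", "gpu"),
    "batch_evaluate": Stage("batch_evaluate", "spark"),
}

# Maps each stage to the set of stages it depends on.
DEPENDENCIES = {
    "create_training_dataset": set(),
    "train_model": {"create_training_dataset"},
    "batch_evaluate": {"train_model"},
}


def execution_order() -> list[str]:
    """Return the stage names in an order that respects the dependencies."""
    return list(TopologicalSorter(DEPENDENCIES).static_order())


def describe_plan(gpu_backend: str = "hadoop") -> list[str]:
    """Describe where each stage would run for the chosen GPU backend.

    Only the GPU stage is affected by the backend choice (assumption:
    the Spark stages are identical for Hadoop and cloud GPU runs).
    """
    if gpu_backend not in {"hadoop", "cloud"}:
        raise ValueError(f"unknown GPU backend: {gpu_backend}")
    plan = []
    for name in execution_order():
        stage = STAGES[name]
        target = f"gpu:{gpu_backend}" if stage.runner == "gpu" else stage.runner
        plan.append(f"{name} -> {target}")
    return plan


if __name__ == "__main__":
    for line in describe_plan("hadoop"):
        print(line)
```

In a real Airflow DAG the same ordering would be expressed as task dependencies, and the backend choice would presumably become a DAG parameter or a separate operator for the training task.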
The revertrisk-multilingual model could be a good first target for this pipeline. The model is trained using the same GPU on statbox, and Muniza has already created an Airflow DAG that generates the training datasets.
Aiko to work on a spike on the GPU-on-Hadoop workflow and an end-to-end Airflow pipeline (data preparation, training, and model evaluation).