This work includes the following steps:
- Generate initial ~10K high-quality tasks offline (en, fr, ar, and pt)
- Bootstrap/initial ingestion to Cassandra and Search weighted tags (manual, one-time)
The Search Platform team can assist with the bootstrap/initial ingestion, as they have a manual script for ingesting weighted tags. Ingestion to Cassandra needs investigation.
We ended up using Lift Wing to do initial ingestion to both Search weighted tags and Cassandra. The ingestion script we use is at https://gerrit.wikimedia.org/r/c/machinelearning/liftwing/inference-services/+/1211159
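For readers who have not opened the Gerrit change, the general shape of a one-time bootstrap that writes the same tasks to two stores can be sketched as below. This is not the linked script: the task record fields, the `ingest_bootstrap` function, and both sink callables are hypothetical placeholders, and the real Cassandra and weighted-tags writes are deliberately left out.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator

# Hypothetical task record, e.g. {"wiki": "enwiki", "page_id": 1}.
Task = dict
# A sink is any callable that accepts one batch of tasks (assumption).
Sink = Callable[[list[Task]], None]


def batched(items: Iterable[Task], size: int) -> Iterator[list[Task]]:
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def ingest_bootstrap(
    tasks: Iterable[Task],
    write_cassandra: Sink,      # placeholder for the Cassandra write
    write_weighted_tags: Sink,  # placeholder for the weighted-tags write
    batch_size: int = 500,
) -> int:
    """Send every batch to both sinks and return the number of tasks sent."""
    written = 0
    for batch in batched(tasks, batch_size):
        write_cassandra(batch)
        write_weighted_tags(batch)
        written += len(batch)
    return written
```

Passing the sinks in as callables keeps the sketch runnable with simple stand-ins such as `list.append`, which is also a convenient way to dry-run a batch before touching either store.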
After initial ingestion, we'll use T408538: Create a Revise Tone Task Generator in LiftWing to update the task list.
Dependent tasks: