Request details
- What use case is the model going to support/resolve?**
Detecting bad edits on Wikidata. Similar to Wikipedia Revert Risk, the Wikidata version is an improvement on previous models (ORES) that are currently running on LiftWing.
- Do you have a '''model card'''? If you don't know what it is, please check https://meta.wikimedia.org/wiki/Machine_learning_models.**
Not yet, but we have a peer-reviewed published paper that explains the model.
- What team created/trained/etc.. the model? What tools and frameworks have you used?**
Team: research
Main Technology: BERT
- What kind of data was the model trained with, and what kind of data the model is going to need in production (for example, calls to internal/external services, special datasources for features, etc..) ?**
The model was trained using the Research content-diff data. At inference time, the model only needs to make calls to Wikibase. The approach is similar to the one used for Wikipedia Revert Risk Multilingual.
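As a rough illustration of the inference-time dependency described above, the service would fetch revision content from Wikidata through the standard MediaWiki Action API (a minimal sketch; the function name and exact parameter set are illustrative assumptions, not the model's actual code):

```python
# Sketch: building the Action API call that fetches one revision's content
# from Wikidata. Endpoint and parameters follow the standard MediaWiki
# Action API; the helper name is hypothetical.

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def build_revision_query(rev_id: int) -> dict:
    """Return the query parameters that fetch the content of one revision."""
    return {
        "action": "query",
        "prop": "revisions",
        "revids": str(rev_id),
        "rvprop": "ids|content",
        "rvslots": "main",
        "format": "json",
    }

# At inference time the service would then issue something like:
#   requests.get(WIKIDATA_API, params=build_revision_query(rev_id))
```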
- If you have a minimal codebase that you used to run the first tests with the model, could you please share it?**
- State what team will own the model and please share some main point of contacts (see more info in '''Ownership of a model''').**
Model was developed by the Research team. The productization is requested by Wikimedia Enterprise.
- What is the current latency and throughput of the model, if you have tested it?** We don't need anything precise at this stage, just some ballparks numbers to figure out how the model performs with the expected inputs. For example, does the model take ms/seconds/etc.. to respond to queries? How does it react when 1/10/20/etc.. requests in parallel are made? If you don't have these numbers don't worry, open the task and we'll figure something out while we discuss about next steps!
The architecture is the same as Revert Risk Multilingual, so similar serving times should be expected.
A "wme" tier rate limit which is 200 K requests per hour
https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/Usage#Request_a_bearer_token
- Response time for 90% of the requests should be <= 500 ms
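Following the bearer-token pattern documented on the Wikitech page linked above, a client request could be assembled as below (a sketch only: the model name "revertrisk-wikidata" is hypothetical since this model is not yet deployed, and the payload fields are assumed to mirror the existing revert-risk models):

```python
# Sketch of an authenticated LiftWing inference request. The URL pattern
# follows existing LiftWing models; "revertrisk-wikidata" is a hypothetical
# model name and "lang": "wikidata" an assumed payload field.

LIFTWING_URL = (
    "https://api.wikimedia.org/service/lw/inference/v1/models/"
    "revertrisk-wikidata:predict"
)

def build_inference_request(rev_id: int, token: str) -> tuple[dict, dict]:
    """Return (headers, json_payload) for a LiftWing prediction request."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    payload = {"lang": "wikidata", "rev_id": rev_id}
    return headers, payload

# Then: requests.post(LIFTWING_URL, headers=headers, json=payload)
```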
- Is there an expected frequency in which the model will have to be retrained with new data?** What are the resources required to train the model and what was the dataset size?
Recommended: monthly.
Critical: yearly.
- Have you checked if the output of your model is safe from a human rights point of view? **Is there any risk of it being offensive for somebody? Even if you have any slight worry or corner case, please tell us!
Model has been evaluated using
- Everything else that is relevant in your opinion.**
Timing
Target delivery by end of December. WME has a contractual latest possible date in January to release v1 of the Wikidata product, and it would be ideal to launch with Revert Risk.