Outcomes
Description
Details
- Other Assignee
- Isaac
Event Timeline
Weekly Updates
- @MunizaA has been testing the feasibility and utility of using Wikidata embeddings, both for Item Quality and Revert Risk. We have studied different implementations and are experimenting with the PyTorch BigGraph model. We have been able to train on medium-size subgraphs. While training on large graphs seems to be possible, we are still evaluating the value of such embeddings for the proposed tasks.
- We have tested specific approaches for different types of actions, e.g., one language-based model to assess the quality of descriptions and labels, and other models for claims containing triples (Q_x P_y Q_z). This is improving the performance and quality of our results.
Weekly updates
- @MunizaA has created an efficient pipeline to train HuggingFace Transformers, using the GPUs from the stat machines, and data coming from the Data Lake.
- We are experimenting with different LLMs, such as mBERT and RoBERTa, to detect vandalism in Item Descriptions.
Weekly Updates
- We have developed a meta-model. This model has two main components:
- The first one is a CatBoost-based classifier, designed to assess the Revert Risk for claims set and updates.
- The second one is a hybrid approach, designed to evaluate Revert Risk on Wikidata Item Descriptions. This model uses mBERT.
- @MunizaA has developed a methodology for creating clean training data for the mBERT model.
- @MunizaA is now working on implementing this model and the feature extraction pipeline by updating the Knowledge Integrity repo.
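The two-component structure described above can be illustrated with a minimal sketch. It only shows how a meta-model might route an edit to one of two components depending on the edit type; it is not the code in the Knowledge Integrity repo. The names `Edit`, `MetaModel`, `claim_scorer_stub` and `description_scorer_stub` are hypothetical, and the stub scorers return fixed values in place of the CatBoost classifier and the mBERT-based model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Edit:
    """A Wikidata edit, reduced to what the router needs (hypothetical shape)."""
    kind: str                      # assumed labels: "claim" or "description"
    payload: dict = field(default_factory=dict)


Scorer = Callable[[Edit], float]


def claim_scorer_stub(edit: Edit) -> float:
    # Stand-in for the CatBoost-based classifier; returns a fixed score.
    return 0.1


def description_scorer_stub(edit: Edit) -> float:
    # Stand-in for the mBERT-based description model; returns a fixed score.
    return 0.9


class MetaModel:
    """Dispatches each edit to the component registered for its kind."""

    def __init__(self, scorers: Dict[str, Scorer]) -> None:
        self.scorers = scorers

    def revert_risk(self, edit: Edit) -> float:
        if edit.kind not in self.scorers:
            raise ValueError(f"no component registered for edit kind {edit.kind!r}")
        score = self.scorers[edit.kind](edit)
        # Clamp to [0, 1] so both components report on the same scale.
        return min(1.0, max(0.0, score))


meta_model = MetaModel(
    {"claim": claim_scorer_stub, "description": description_scorer_stub}
)
```

In a sketch like this, each component can be trained and replaced independently, as long as it returns a score in the same range.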
Weekly Updates
- We are finalizing the feature extraction pipeline code and the code to serve the model on LiftWing.
Weekly Updates
- The first version of this model is ready to go to LiftWing.
- @MunizaA has submitted a merge request. Now @achou is reviewing the code.
- I'll be meeting with @Lydia_Pintscher next week to show the results and discuss next steps.
- We are planning to create and upload the model card next week.
Weekly Updates
- The model card for the Multilingual model is available here.
- We are working with Lydia to evaluate the model and update it if needed.
Weekly Updates
- @MunizaA is working on an evaluation tool that will be usable by all the Revert Risk models, including the Wikidata one as well as the LA and Multilingual models for Wikipedia.
Weekly Updates
- We have met with Lydia and community developers. We are going to share our code with them, and we have also learned about their efforts on automatic content patrolling in Wikidata.
- The evaluation tool code is ready. This week @MunizaA will upload it to a public endpoint (Toolforge or wmfcloud).
Weekly updates
- I'm currently working on the Model Card for this algorithm.
- @MunizaA please notify us in this ticket when the annotation tool app is ready.
- We are preparing the code to be shared with @Lydia_Pintscher and (through her) with volunteer developers to test the current algorithm on their own datasets.
Weekly Updates
- @MunizaA has released an alpha version of the evaluation tool. Results for the Wikidata model can be found here.
- For Wikidata Revert Risk, I'm going to upload the training and testing code, plus the model, to a public repo, and then open another task for the model's evaluation and improvements.
- Regarding the Item Quality model, I'm going to coordinate with @Isaac for the follow-ups on that project.
Weekly Updates
- The Wikidata Revert Risk model is now available for testing on this PAWS notebook.
I'm going to resolve this task and add the evaluation and improvements in a new ticket.
Reopening this task, as the research engineering and deployment component is not done yet.
No capacity for rewriting the model at the moment: the model cannot be easily adopted in its current version without creating a considerable amount of maintenance.
Will re-discuss priority along with @Lydia_Pintscher.