User Details
- User Since: Jan 23 2025, 9:03 PM (42 w, 4 d)
- Availability: Available
- LDAP User: Unknown
- MediaWiki User: SSalgaonkar-WMF
Today
@AikoChou or @BWojtowicz-WMF could you please provide a quick status update here sometime on Tuesday? It can be just a few simple bullet points that I can copy over into Asana.
Sun, Oct 26
@gkyziridis Could you please write an update here about the overall status of this work that I can use in Asana reporting? I have to submit these every Friday. I can write the sections about how many reviews we have per language right now, but I'd really appreciate your updates on the investigation and the overall ticket.
Mon, Oct 20
@Samwalton9-WMF could you please keep me in the loop about timing for switching over to the other model? I don't think you need anything from us, but I just want to make sure we're available in case of issues
Oct 9 2025
Besides Teahouse, another useful resource could be the enwiki Reference Desk: https://en.wikipedia.org/wiki/Wikipedia:Reference_desk
Oct 4 2025
@AikoChou Great question, and great thinking! I'll start a thread for this in the WE1.1 channel so we can propose this and get feedback from Kirsten, Lauren, and Peter.
Oct 3 2025
@KStoller-WMF @DMburugu just chiming in to say that the prioritization of T392012 is blocked until we confirm whether this task is a priority under WE1.1. We started to talk about this prioritization decision here and I'm happy to continue the conversation whenever it makes sense!
Sep 15 2025
I've created Annotool instances for most of the additional languages in which we want to evaluate the model. Project numbers, translations, and labels for each of these languages can be found in this doc.
Sep 2 2025
@Trizek-WMF good question! We still ideally want to have 5+ evaluators per language, like we did last time, but we don't want this to be a blocker to moving forward if we have a strong and consistent signal about the model's performance in that language. I believe that the number of samples per evaluator is the same as last time. cc @ppelberg
Jul 31 2025
This is kind of a unique case for us, since most of our hosting requests come from the Research team, so I'm sorry that we don't have a process built around this yet!
Hey @derenrich, sorry for the delayed follow-up. This is on me personally; I'm balancing a few time-sensitive items right now and a bit behind on the things that are non-blocking (as I understand this request to be - please correct me if I've misunderstood).
@Miriam yes totally, sorry for not doing that sooner! I'll do a quick pass of our board to see if there are other places where y'all are unnecessarily tagged
Jul 27 2025
We met on Friday to discuss Ilias' questions above and talk through some potential solutions. See meeting notes and meeting recording for more.
Jul 8 2025
Hi y'all! Just wanted to send a BIG thank you to everyone who participated in evaluating the Tone Check model! We really appreciate your help.
Jul 2 2025
Hey @kostajh! Thanks so much for submitting this! Totally agree with you about how the work described here would magnify the impact of the RR Filters Rollout, but I'm curious how this connects to FY2025-2026 OKRs. Can you please share more about whether this rollout impacts any KRs you're working towards - whether as a hard blocker / dependency or as a soft blocker / nice to have? I'm asking because in order to prioritize this ticket we'd like to figure out: (a) the cost of delay or non-prioritization, and (b) how we'd report on this work.
Jun 26 2025
@DMburugu thank you so much for this helpful response!! It totally makes sense that you're figuring out the details of what this experience will look like, and how you'll be using the multilingual RR data, and we'd love to reconvene in Q1 as you get more clarity. For now, I'll list this request in our Intake Tracker with a status of "Info needed" and I'll list it in our roadmap as a potential item in late Q1/early Q2.
Jun 23 2025
Thanks so much for getting back to me, and no worries at all about timing @Kgraessle! I have a few more questions - please bear with me! These are questions that I'm discussing with all teams who are submitting requests to ML for Q1+, in order to help us prioritize and plan our roadmap. These are also questions that we haven't always asked in the past; as shared in our Engagement Model doc, we are moving towards an operating model in which we partner closely with teams toward a shared definition of success, rather than fulfilling requests without actually taking on the mission/purpose of the work.
Jun 18 2025
Hi @Kgraessle! Can you please clarify the following?
May 12 2025
Hi @DMburugu! Responsibility for this request has been transferred from the Research team to the ML team, and we will indeed have a hypothesis for Add-a-Link improvements (including model quality improvements and retraining pipelines) in Q1. This quarter we are planning the work by: auditing tickets like this one, conducting diagramming sessions like the one we had today, and working with you all to establish a set of product requirements for the next iteration of Add-a-Link. By the end of this quarter, we aim to have an implementation plan that has been reviewed and agreed upon by Growth, Research, and DPE - so that we can begin implementation in early Q1. I'm not sure whether this ticket will be directly incorporated into that plan, or if my engineering partners will propose a different solution to the problems and requirements we've discussed, but we'll definitely keep you looped in since your feedback is crucial in this process.
May 9 2025
@Kgraessle I've logged the second part of your request (re: the multilingual RR model) as a request to the ML team on our intake request tracker. Based on our recent conversations, my understanding is that the Moderator Tools team is moving forward with just the language-agnostic RR model for now. As such, the ML team is deprioritizing this request for support with the multilingual model. Please feel free to re-submit this request using this template if you need the datasets for the multilingual model in the future.
Apr 29 2025
Excited for this! Can we add a requirement about the demo implementing a probability score threshold (e.g. only surfacing the peacock check when the probability score is 0.8 or higher)? This will allow us to demo something that is closer to what we expect the production experience to look like. We could also test different thresholds during these interviews if it makes sense to do so.
