For my volunteer profile, please visit: KCVelaga
User Details
- User Since
- Sep 15 2021, 11:36 AM (105 w, 1 d)
- Availability
- Available
- LDAP User
- KCVelaga (wikimf)
- MediaWiki User
- KCVelaga (WMF) [ Global Accounts ]
Yesterday
Thank you @JAllemandou!
Participants' user data gathering complete: https://github.com/wikimedia-research/cws-historical-metrics/blob/main/02-user_data_gathering.ipynb
(note, the data is not public, as it may contain potential PII)
Regarding random sampling, how balanced should the dataset be across all of the dimensions? If we want to ensure a minimum across some or all of the dimensions, I'd suggest a stratified random sample.
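To illustrate the stratified approach suggested above, here is a minimal sketch using only the Python standard library. The record fields (`wiki`, `experience`), the `stratified_sample` helper, and the toy data are all hypothetical and are not taken from the actual notebook; the idea is simply to draw up to a fixed number of participants from each combination of dimensions, so that small groups are not crowded out by large ones.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, per_stratum, seed=0):
    """Draw up to `per_stratum` records at random from each stratum.

    `key` maps a record to its stratum label (hypothetical helper).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)

    sample = []
    for label in sorted(strata):
        members = strata[label]
        # A stratum smaller than the target contributes all of its members.
        k = min(per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Toy, made-up participants: one large group and two small ones.
groups = (
    [("tewiki", "new")] * 50
    + [("tewiki", "experienced")] * 5
    + [("hiwiki", "new")] * 3
)
participants = [
    {"user": f"user{i}", "wiki": wiki, "experience": exp}
    for i, (wiki, exp) in enumerate(groups)
]

sample = stratified_sample(
    participants,
    key=lambda r: (r["wiki"], r["experience"]),
    per_stratum=4,
)
```

With a simple random sample of the same size, the two small groups could easily be missed entirely; here each group contributes up to four participants.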
Wed, Sep 20
@Pginer-WMF the Meta-Wiki page is getting quite long, with a lot of tables. I would suggest moving tables older than the preceding four quarters to an archive sub-page and linking it from the main one. I am happy to do that if it sounds good to you.
Mon, Sep 18
Sun, Sep 17
Sat, Sep 16
Thu, Sep 14
@mpopov Yes, as of now, the main use case for this data will be the Quarterly Learning Sessions.
Sun, Sep 3
Mon, Aug 28
I was looking a bit further into the percentage of reverts made by ClueBot NG that were themselves reverted. Although not all of these are necessarily false positives, it's interesting that this percentage has roughly halved over time (from ~16% at the beginning to ~8% in June 2023). I am wondering if you have any insight into what's causing this (the model may be re-learning consistently, the decrease in the overall revert rate might be a factor, or it could be something else).
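For clarity on the figures above: the drop from ~16% to ~8% is a relative decrease of about 50%, which corresponds to 8 percentage points in absolute terms. A small sketch of the arithmetic, using the approximate rates quoted in the comment (the function name is illustrative only):

```python
def relative_change(old, new):
    """Relative change from `old` to `new`, as a fraction of `old`."""
    return (new - old) / old


start_rate = 0.16  # approximate share of bot reverts reverted back, early on
end_rate = 0.08    # approximate share in June 2023

absolute_drop_pp = (start_rate - end_rate) * 100         # in percentage points
relative_drop = -relative_change(start_rate, end_rate)   # as a fraction of the start
```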
Fri, Aug 25
@Pginer-WMF the list and the criteria used look good to me. If the languages have high usage of cx despite having no MT support, enabling MT will benefit the editors.
Aug 19 2023
- The bot usually monitors content namespaces, and for every revert it makes, a message is posted to the talk page of the user whose edit was reverted.
- While there were edits in content namespaces that are not reverts, and reverts in non-content namespaces, their frequency is insignificant, and they don't add any value to the analysis.
- non-content reverts: 0.029% & content edits (excluding reverts): 0.009% of all-time edits made by the bot
- From 2010 to early 2013, the bot made between 1500-2500 reverts on average per day.
- From 2013 to 2018, the average daily reverts were between 500-1000.
- From 2019 to mid-2020, the average daily reverts were between 150-250.
- Starting July 2020, the average daily reverts increased to ~500, a trend which continued until the end of 2021.
- Daily reverts dropped sharply from Jan 2022 and continued to fall until August 2022, when the bot made 65 reverts per day on average.
- From Sep 2022 to Jun 2023 (end of data), the average daily reverts were between 150-200.
Note: As the bot posts a talk page message for each revert (in most cases), the actual edit count would be approximately double the average reverts mentioned above.
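As a rough sketch of how figures like these can be derived, the snippet below computes average daily reverts per calendar month from a list of revert dates and then applies the "roughly double" rule from the note to approximate total edits. The input data is made up for illustration, and the helper name is hypothetical; it is not the code used in the analysis.

```python
import calendar
from collections import Counter
from datetime import date


def average_daily_reverts(revert_dates):
    """Average reverts per day for each (year, month) present in the data."""
    per_month = Counter((d.year, d.month) for d in revert_dates)
    return {
        (year, month): count / calendar.monthrange(year, month)[1]
        for (year, month), count in sorted(per_month.items())
    }


# Made-up data: two reverts on every day of August 2022.
revert_dates = [date(2022, 8, day) for day in range(1, 32) for _ in range(2)]

averages = average_daily_reverts(revert_dates)

# Each revert is usually followed by one talk page message, so total
# edits are approximately twice the number of reverts.
approx_daily_edits = {month: 2 * avg for month, avg in averages.items()}
```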
Aug 17 2023
Here is the summary of the analysis:
Aug 16 2023
Here's the summary of further analysis.
Aug 15 2023
@nshahquinn-wmf thanks for checking on this.
Aug 14 2023
@BTullis I checked the files. I made a backup of a few data files, and the rest that are required are on GitHub/GitLab, so everything can be removed.
@Hghani (re)joined WMF on the newly formed Movement Insights team, and his access has been reinstated (T322145#8862574).
Aug 9 2023
- Weekly data is available at https://tr.wikipedia.org/wiki/%C3%96zel:ContentTranslationStats
- Usernames are no longer required
Aug 8 2023
Aug 2 2023
Jul 31 2023
@Samwalton9 I have updated the task description to reflect our discussion. Please add if I missed anything, or change as needed.
@Samwalton9 and I discussed the initial results (as mentioned below) and decided it would be best to expand the scope of this task to investigate further.
Jul 28 2023
@ppelberg Here is the summary of the analysis
Jul 26 2023
Jul 24 2023
Yes, I agree that there is substantial usage of Nuke on unregistered users.
@Samwalton9 Sorry, this took a bit longer than expected. Although I gathered all-time logs, given the scope of the task (i.e. to analyse Nuke usage to inform changes required due to IP masking), only recent records (last 3 years) were considered for the analysis, which included ~240000+ nuke actions across wikis. My reasoning is that if a wiki has not used the feature at all during the last three years, it won't be a helpful candidate when identifying wikis to talk to; also, 3 years is a reasonable amount of time to understand the distributions.
Jul 21 2023
@Samwalton9 I have a clarifying question about the second question:
What is this as a percentage of all reverts made within 24 hours of an edit occurring?
I couldn't quite understand what "24 hours of an edit occurring" meant.
Jul 19 2023
1.) Because of an issue with EditAttemptStep's sanitization process, the analysis Megan and Mikhail are reviewing includes data from the previous 90 days as opposed to the past year
- Yes, they are reviewing the analysis that considers 90 days of event data, instead of the past year.
Jul 18 2023
Hi, @mforns! Given the recent team changes, we are not exactly sure which team(s) to tag for review. Megan suggested you might be able to help or redirect if needed. Thank you.