Fri, Sep 13
this looks great -- thanks @mforns for writing this up so clearly!
Thu, Sep 12
I'm currently exploring how to expand the ORES drafttopic model to languages besides English. My approach is to build a model that predicts the ~40 categories used by the ORES drafttopic model.
Wed, Sep 11
Regarding the sampling rate for Polish Wikipedia, its page views are generally about one third of those to Russian Wikipedia (https://tools.wmflabs.org/siteviews/?platform=all-access&source=pageviews&agent=user&range=this-year&sites=ru.wikipedia.org|pl.wikipedia.org) and its unique devices are also a bit under one third (https://stats.wikimedia.org/v2/#/pl.wikipedia.org/reading/unique-devices/normal|line|2-year|~total|monthly). Based on this, I will recommend setting the sampling rate to three times that of Russian Wikipedia, so that both wikis yield a comparable number of samples.
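The arithmetic behind that recommendation can be sketched as follows. This is only an illustration: the function name and the Russian Wikipedia rate of 0.05 are hypothetical, since the actual rate is not stated here. The idea is that a wiki with one third of the traffic needs three times the sampling rate to produce roughly the same expected number of samples, capped at 1 because a rate is a probability.

```python
def scaled_sampling_rate(reference_rate: float, traffic_ratio: float) -> float:
    """Scale a reference wiki's sampling rate for a wiki with different traffic.

    traffic_ratio is (target wiki traffic) / (reference wiki traffic),
    which is roughly 1/3 for Polish relative to Russian Wikipedia.
    """
    if not 0 < reference_rate <= 1:
        raise ValueError("reference_rate must be in (0, 1]")
    if traffic_ratio <= 0:
        raise ValueError("traffic_ratio must be positive")
    # A sampling rate is a probability, so it cannot exceed 1.
    return min(1.0, reference_rate / traffic_ratio)


# Hypothetical Russian Wikipedia rate; the real value is not given above.
ru_rate = 0.05
pl_rate = scaled_sampling_rate(ru_rate, traffic_ratio=1 / 3)
print(f"Polish sampling rate: {pl_rate:.3f}")
```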
Tue, Sep 10
Mon, Sep 9
Wed, Sep 4
Tue, Sep 3
Thanks for the ping @Elitre -- I'm comfortable with wrapping up this task if @Trizek-WMF is. There shouldn't be any further community relations support until we start sharing out results, but we can open a new task then if it is deemed necessary.
Mon, Aug 26
My first attempt at building a language-independent means of representing article topic -- i.e. grouping article page views into categories regardless of which Wikipedia language edition that article was read in -- was to map each page view to its Wikidata item and then represent the item based on its instance-of / subclass-of properties. An example of how this might work is below. The goal is to build a set of higher-level categories to which any Wikidata item with an instance-of property can be mapped. This is similar to the 14 categories in the Wikidata Concepts Monitor but ideally with no overlap and full coverage -- i.e. all items map deterministically to a single category.
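A minimal sketch of this kind of mapping, under stated assumptions, might look like the following. It is not the original example: the tiny class graph, the category names ("People", "Places", "Media") and the choice of root classes are all hypothetical, and the edges only stand in for instance-of (P31) and subclass-of (P279) statements. The walk is breadth-first with a fixed visiting order, so each item resolves to at most one category.

```python
from collections import deque

# Toy slice of a class graph: item or class -> parent classes.
# Illustrative edges only, not a dump of real Wikidata statements.
PARENTS = {
    "Q42": ["Q5"],         # Douglas Adams -> human
    "Q64": ["Q515"],       # Berlin -> city
    "Q515": ["Q486972"],   # city -> human settlement
}

# Hypothetical top-level categories, keyed by the class that defines them.
CATEGORY_ROOTS = {
    "Q5": "People",
    "Q486972": "Places",
    "Q11424": "Media",     # film
}


def categorize(qid, parents=PARENTS, roots=CATEGORY_ROOTS):
    """Return the category of the nearest root class above qid, or None."""
    seen = {qid}
    queue = deque([qid])
    while queue:
        current = queue.popleft()
        if current in roots:
            return roots[current]
        # Sorting makes the visiting order, and so the result, deterministic.
        for parent in sorted(parents.get(current, [])):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return None  # no instance-of path to any known root


for item in ("Q42", "Q64", "Q999999999"):
    print(item, categorize(item))
```

Items that return None are exactly the coverage gaps the goal above is meant to eliminate.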
I presented preliminary results at Wikimania: https://wikimania.wikimedia.org/wiki/2019:Research/Characterizing_Reader_Behavior_on_Wikipedia
Fri, Aug 23
- See reader behavior features under T228285 for features that were used in debiasing.
- It was determined that a GradientBoostingClassifier performed best at making the average features (e.g., average pages viewed per session) for the survey respondents match those of the general population for the wiki, though LogisticRegression also worked quite well in many cases.
- Wikidata instance-of ended up being relatively uninformative so I might revisit that with drafttopic categories.
- As part of this work, a few changes were required:
  - African and Worldwide surveys (English/French) were separated because I realized that weights from debiasing would not be comparable if they came from two separate models (if a single model was used for English or French, country was a very strong predictor of whether someone took the survey or not).
  - I trimmed the control sessions to exactly match the survey session timespan because the survey was launched / ended mid-day, which meant that, without careful control, day of week became a strong predictor of whether someone took the survey or not.
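One common way to turn such a classifier into debiasing weights is inverse propensity weighting. The notes above do not say this exact scheme was used, so treat the following as a sketch under that assumption. The propensities here are made-up numbers; in practice they would come from a fitted model, for example scikit-learn's `GradientBoostingClassifier.predict_proba`, trained to separate survey sessions from control sessions.

```python
def inverse_propensity_weights(propensities, eps=1e-6):
    """Weight each respondent by the odds of NOT being in the survey group.

    propensities[i] is the estimated probability that session i is a
    survey session rather than a control session.
    """
    weights = []
    for p in propensities:
        p = min(max(p, eps), 1 - eps)  # clip to avoid division by zero
        weights.append((1 - p) / p)
    return weights


def weighted_mean(values, weights):
    """Mean of values where each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical respondents: pages viewed per session and propensity scores.
pages_per_session = [2, 8]
propensities = [0.2, 0.8]

weights = inverse_propensity_weights(propensities)
raw_mean = sum(pages_per_session) / len(pages_per_session)
debiased_mean = weighted_mean(pages_per_session, weights)
print(f"raw mean: {raw_mean:.2f}, debiased mean: {debiased_mean:.2f}")
```

Respondents who look like typical survey takers (high propensity) are down-weighted, which pulls the weighted averages toward those of the control population. This also shows why weights from two separately trained models are not directly comparable: each model's propensities are calibrated against its own control group.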
Thu, Aug 22
Aug 18 2019
Aug 13 2019
Surveys completed -- thanks @pmiazga (and any respondents)!
Yes, I've been compiling the feedback and will make sure to include some recommendations regarding the functionality of QuickSurveys for this type of survey. In particular:
- Ability to remove survey without responding explicitly. This should be a relatively straightforward UI fix but would require someone to do the work obviously.
- Ability to opt out completely from surveys like this. This would likely be more complicated as it would have to be a change to the user settings, databases, etc. and we would have to determine if it only applied to QuickSurveys or other extensions as well.
- Confusion around survey re-appearing in new browsers (or the same browser if cookies are deleted). This is how the sampling works but this is not at all evident to respondents, so there can be confusion when someone responds and then sees the survey again. I don't believe we can change the sampling strategy to be aware of whether a user account has already taken the survey in a different browser without compromising privacy, but we should consider having a better explanation of this within the survey to head off concerns for any future deployments.
- General frustration/anger with the presence of the survey (in some part tied to the difficulty of opting out from it).
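The re-appearance issue follows from sampling per browser rather than per account. The sketch below illustrates that general idea only; it is not the QuickSurveys implementation, and the token scheme and function names are assumptions made for the example. Each browser holds a random token (as if stored in a cookie), and the browser is in the sample when the token falls below the coverage value.

```python
import random


def new_browser_token(rng):
    """Return a random token in [0, 1), as if stored in the browser."""
    return rng.random()


def in_sample(token, coverage):
    """A browser is surveyed when its token falls below the coverage."""
    return token < coverage


rng = random.Random(0)
coverage = 0.1  # 1 in 10

# Same browser, same token: the decision is stable across page views.
token = new_browser_token(rng)
first_view = in_sample(token, coverage)
later_view = in_sample(token, coverage)

# A new browser (or cleared cookies) draws a fresh token, so the same
# person can land in the sample again without the system knowing.
tokens = [new_browser_token(rng) for _ in range(100_000)]
fraction = sum(in_sample(t, coverage) for t in tokens) / len(tokens)
print(f"stable: {first_view == later_view}, sampled fraction: {fraction:.3f}")
```

Because the token is the only state, nothing links two browsers to one account, which is the privacy property mentioned above and also the reason a respondent can see the survey twice.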
Aug 9 2019
Sufficient responses have been reached -- plan is to undeploy these surveys in first available SWAT deployment by Tuesday (Aug 13).
Aug 8 2019
Thanks for the notification @Jc86035
Aug 7 2019
Sounds good -- FYI I'm responding to some comments on the meta page as I see them: https://meta.wikimedia.org/wiki/Research_talk:Surveys_on_the_gender_of_editors
Aug 6 2019
Per IRC conversation, leaving this open until we un-deploy the surveys and then I will sign off. Thanks!
Aug 5 2019
Thanks - hopefully launching in one hour!
Aug 2 2019
Per conversation, I have updated the local pages with the translations and will monitor them just to make sure that when the official versions are pushed late next week, nothing breaks. See T227793#5388113 for links.
Missing Norwegian translations have been completed in translatewiki:
Translations are supposed to go by the train. If you want to start earlier, you can override the local messages:
Wow - thanks, quite quick! Much appreciated! These will hopefully go out Monday now
Jul 31 2019
Thanks @Trizek-WMF : one issue came up during launch today. It turns out we're missing Bokmål translations for two buttons in the survey (sorry I wasn't aware until now):
Jul 30 2019
@pmiazga thanks for reaching out about the differences -- responses below:
Jul 29 2019
Just for completeness' sake, here's the post to English Village Pump as well: https://en.wikipedia.org/w/index.php?title=Wikipedia:Village_pump_(miscellaneous)&diff=908411293&oldid=908409651
The deletion proposal has been closed as "keep" and awaits archiving. You've handled it in a great way!
Ahh yes excellent news -- it was closed right after I posted that message :) Thanks!
Jul 26 2019
Sounds good to me -- thanks!
what are the next steps? :)
Jul 25 2019
Great -- I requested July 31st for the deployment and I think it's safe to say that's the day it will be deployed (perhaps +- 1 day but that still gives at least three days). So if that sounds good to you, let's fill in that date and move forward with posting. Thanks!
Jul 24 2019
Thanks @Trizek-WMF ! I've confirmed that Bokmål is complete now too. At this point, I'm going to reach out to the Readers team (who will be supporting the deployment of this survey) and check whether they have availability next week. Do you think that would be sufficient time if we post these notices on the respective Village Pumps and Space in the next day?
Jul 23 2019
Jul 22 2019
My assessment of the files that were recommended for release from mtizzoni's home directory:
Jul 19 2019
Jul 18 2019
In the task description, concerning "Obtaining translations for village pump conversations", what does it mean precisely? Do you need to get the translations of the message (included in the request to translators) or do you need help to translate the replies? I may be lost in translations here. :)
@Trizek-WMF : yeah, we're looking for translations of the message included in the translation docs (The Wikimedia Foundation [[mw:Wikimedia_Research|Research]] and ...) so that it can be posted in the editing community's language. For any replies, we might need support, but I suspect that Google Translate etc. will be good enough for getting the gist if the responses are in Norwegian / Arabic.
@elukey all good thanks -- there had been a hold-up while we were trying to figure out which files were the ones that were being proposed to release publicly, but that's been sorted out now so privacy review should move quickly on our side now. Thanks for checking in!
Jul 17 2019
I will focus on the specific analyses after data processing, debiasing, and feature generation are completed.
The users can also dismiss it via "Prefer not to say", and that choice will be stored locally in the browser so that they no longer see the survey.
You mean that it is the only option to dismiss it? Will it be displayed on every page until an action is taken? If so, could a "skip" button or a close icon be added?
Which Norwegian are you targeting? Bokmål, Nynorsk or both? I haven't identified the language from the existing translations.
Good question: Bokmål given that it's the larger community so better supports the sample size that we would need.
Jul 16 2019
Please remember to not distribute figures about gender diversity without correcting for selection bias and other demographic distortions, cf. Hill & Shaw 2013 https://meta.wikimedia.org/wiki/Research:Gender_gap
Jul 15 2019
"arwiki" => [
    "enabled" => true,
    "type" => "internal",
    "name" => "editor-gender-1-ar",
    "question" => "Editor-gender-1-message",
    "description" => "Editor-gender-1-description",
    "answers" => [
        "Editor-gender-1-answer-man",
        "Editor-gender-1-answer-woman",
        "Editor-gender-1-answer-decline",
    ],
    "freeformTextLabel" => "Editor-gender-1-free-form-text-label",
    "privacyPolicy" => "Editor-gender-1-privacy",
    "coverage" => 0.5, // 1 in 2
    "audience" => [
        "anons" => false,
    ],
    "platforms" => [
        "desktop" => ["stable"],
        "mobile" => ["stable"],
    ],
]
"nowiki" => [
    "enabled" => true,
    "type" => "internal",
    "name" => "editor-gender-1-no",
    "question" => "Editor-gender-1-message",
    "description" => "Editor-gender-1-description",
    "answers" => [
        "Editor-gender-1-answer-man",
        "Editor-gender-1-answer-woman",
        "Editor-gender-1-answer-decline",
    ],
    "freeformTextLabel" => "Editor-gender-1-free-form-text-label",
    "privacyPolicy" => "Editor-gender-1-privacy",
    "coverage" => 1, // all signed-in users
    "audience" => [
        "anons" => false,
    ],
    "platforms" => [
        "desktop" => ["stable"],
        "mobile" => ["stable"],
    ],
]
"enwiki" => [
    "enabled" => true,
    "type" => "internal",
    "name" => "editor-gender-1-en",
    "question" => "Editor-gender-1-message",
    "description" => "Editor-gender-1-description",
    "answers" => [
        "Editor-gender-1-answer-man",
        "Editor-gender-1-answer-woman",
        "Editor-gender-1-answer-decline",
    ],
    "freeformTextLabel" => "Editor-gender-1-free-form-text-label",
    "privacyPolicy" => "Editor-gender-1-privacy",
    "coverage" => 0.1, // 1 in 10
    "audience" => [
        "anons" => false,
    ],
    "platforms" => [
        "desktop" => ["stable"],
        "mobile" => ["stable"],
    ],
]
Jul 11 2019
While this is still desired and we may move forward with it (specifically around language switching), it's being put on hold for the foreseeable future while other tasks are worked out.
Jul 9 2019
Closing this task as I move to the analysis component (T212448). See the configuration subtask (T226273) for final count of responses from each language. See the meta talk page for more details on future survey rounds.
Surveys finished deployment. Description updated with final counts of survey responses for each language edition. Note that these counts include responses from respondents under 18, who made up between 10% (German) and 35% (Hebrew) of all survey responses. Respondents under 18 did not answer any other questions, though, so they will not be included in any other analyses.
Jul 3 2019
Project information is maintained on https://research.wikimedia.org/ as well as in brief in the meta FAQs: https://meta.wikimedia.org/wiki/Research:FAQ#What_does_the_Wikimedia_Research_team_do?_Can_it_support_my_team%E2%80%99s_data_analysis_needs?
I replaced the nutshell template at the top of the page, which indicated it was a legacy page, with the historical template to make it clearer that this page is obsolete.
Jul 1 2019
Thanks @Aklapper -- I'll take this.