Fri, Oct 16
This statistic was mentioned in the Technology Department's Quarter in Review for Q4 of FY 19/20. Looking further, I found out that it comes from the Understanding Engagement with Images in Wikipedia research project. More detailed statistics can be found on the First Round of Analysis page, which I'll dig into further. Looks like T250154 is the parent task for this work.
Created subtasks for all five points, changing this to an epic and moving it to the Epics column on the Product Analytics board.
There's the MediaViewer schema, and there's data from it in the Data Lake. An investigation would be needed to understand what data is actually logged and whether that can answer this.
As far as I know, there is not any live instrumentation that would allow us to measure this. The SearchSatisfaction schema measures dwell time, but requires the user to reach a page through an on-wiki search, and we know that's not representative of how visitors reach us.
Based on my conversations with @cchen and @mpopov, it looks like this will not be straightforward to do any time soon. If we're interested in understanding this based on existing edits, we'll need to extract and process diffs between revisions.
I've previously discussed something similar with @jwang in relation to T247417. We can do this on a monthly basis using the sqooped tables in wmf_raw in the Data Lake. We'll left join mediawiki_imagelinks twice against the mediawiki_page table: first against the local wiki's pages to identify local files, then against the Commons pages to identify files used from Commons. If a file isn't found in either of those, it should be a redlink, and we can mark it as such.
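The double left join described above can be sketched in miniature with pandas standing in for Hive. The table and column shapes here are simplified illustrations (il_to as the linked file title, page_title as the existing page), not the full MediaWiki schema:

```python
import pandas as pd

# Hypothetical miniature versions of the sqooped tables.
imagelinks = pd.DataFrame({"il_to": ["Local.jpg", "Commons.jpg", "Missing.jpg"]})
local_pages = pd.DataFrame({"page_title": ["Local.jpg"]})      # local wiki files
commons_pages = pd.DataFrame({"page_title": ["Commons.jpg"]})  # Commons files

# Left join against the local wiki's page table, then against Commons'.
merged = (
    imagelinks
    .merge(local_pages.rename(columns={"page_title": "local_title"}),
           left_on="il_to", right_on="local_title", how="left")
    .merge(commons_pages.rename(columns={"page_title": "commons_title"}),
           left_on="il_to", right_on="commons_title", how="left")
)

# A file found in neither table is a redlink.
merged["status"] = "redlink"
merged.loc[merged["commons_title"].notna(), "status"] = "commons"
merged.loc[merged["local_title"].notna(), "status"] = "local"
```

In the real Hive query the same logic would be two LEFT JOINs plus a CASE expression over the NULL-ness of the joined columns.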
I agree with @MNeisler that using the VisualEditorFeatureUse schema makes sense since we're asking questions about user behaviour around features in VE specifically.
Also, I think storing the previous and current state of the filters is a great way to do it, particularly if we switch to a map type for storing additional action parameters/values. The only other alternative I was going to suggest was a combination of value and is_default fields (similar to how PrefUpdate does it), where is_default is true if the value is set back to whatever the default is, and false otherwise. Looking at it again, I think storing the previous and current state is the better option.
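For concreteness, the two alternatives weighed above could look like this. The field names ("action_params", "previous", "current", "value", "is_default") and filter values are illustrative, not the actual schema:

```python
# Option A: log the previous and current state in a map-typed params field.
event_state = {
    "action": "filter-change",
    "action_params": {"previous": "unreviewed", "current": "all"},
}

# Option B: log the new value plus an is_default flag, PrefUpdate-style.
DEFAULT_FILTER = "unreviewed"  # assumed default, for illustration only
new_value = "all"
event_default = {
    "action": "filter-change",
    "action_params": {"value": new_value,
                      "is_default": new_value == DEFAULT_FILTER},
}
```

Option A lets an analyst reconstruct the full sequence of states from the events alone, whereas Option B requires knowing the default out of band, which is one reason to prefer it.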
@egardner : Thanks for the updates and work so far. Thanks also for your patience while I work on getting feedback to you on this. I met with @mpopov last week to discuss a lot of things around this schema, and I should have relayed that information to you sooner, sorry!
Tue, Oct 13
Fri, Oct 9
@Milimetric : It looks like there's no data in event_sanitized.prefupdate for 2020-09-19 through 2020-09-21, and there's only partial data for 2020-09-22. Would it be possible to re-sanitize that date range, or will we need to wait for the scheduled re-sanitization job to get to it?
BTW, I came back to this because of T252391, and noticed when looking at the two-year registration rate on Vietnamese Wikipedia that the time period where we ran our Welcome Survey A/B test had substantially higher registration rates than expected. If we decide to run another experiment, we should consider fitting a time-series model to the data and using it to predict the number of registrations, in order to understand whether registrations are outside what's expected.
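Even before fitting a proper time-series model, the idea can be sketched with a much cruder baseline check. This is a minimal illustration with made-up weekly counts (not real Vietnamese Wikipedia data), flagging a week that falls outside two standard deviations of the baseline window:

```python
from statistics import mean, stdev

# Toy weekly registration counts for a baseline period (illustrative numbers).
baseline = [510, 495, 530, 520, 505, 515, 500, 525]
observed = 640  # hypothetical count for the week during the A/B test

# Flag the observation if it falls outside mean +/- 2 standard deviations.
m, s = mean(baseline), stdev(baseline)
is_outlier = abs(observed - m) > 2 * s
```

A real analysis would want a model that accounts for trend and weekly/annual seasonality (which this check ignores), but the flagging logic at the end would look much the same: compare the observed count against the model's prediction interval.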
@kostajh : Thanks for picking this up and pinging me about it. I think we should switch off EditorJourney since we're not actively using the data in any ongoing analysis.
@Milimetric : Not a problem, definitely understand that this would be a non-standard request! I've reached out to the PA team and will report back, probably some time on Tuesday.
@Milimetric : I inspected the sanitized data by looking at the event structs of random partitions and aggregating some random months across various years from 2017 onwards, and in all cases the sanitized data looks correct to me.
Wed, Oct 7
@mpopov : Thanks for your patience while I juggle tasks and find time to come back to this. I've discussed the schemas with the SD team and we found that the MultimediaViewer and UploadWizard schemas could be marked for deprecation. As I didn't have edit permission on the Google Doc, I left a couple of comments to that effect. I think this concludes everything, so I'm handing it to you for sign-off!
Tue, Oct 6
Mon, Oct 5
I've dug into this a bit to get an understanding of what data is available through the VisualEditorFeatureUse schema. I also met with @MNeisler on the Product Analytics team to get a check on whether my understanding of the data was correct, and it appears to be.
Thu, Oct 1
With these new upgrades happening, I wanted to move my Jupyter notebooks from stat1008 to stat1006, as stat1008 has been very busy lately. After rsync'ing my files, I started reinstalling my R libraries, and one of them errored out because it wasn't available for R v3.3. That surprised me, because Debian Buster ships with R v3.5 (as can be found on stat1005 and stat1008).
This is awesome work so far! I've read through this task, its parent task, and the proposed patch and updated the measurement specification to reflect the set of questions mentioned by @CBogen in T263875#6495409. From what I can tell, the proposed schema allows us to answer our current set of questions.
Tue, Sep 29
@mpopov : Ah, feel free to reopen this if you want me to ping the SD team and have them come back to me with a list of schemas.
A huge thanks to @mpopov for doing a lot of work on this, improving the data processing code and figuring out ways to massage the data from SearchSatisfaction to pull out the insights!
I've gone through the spreadsheet and added information for all known Growth-related schemas. Looks like the Multimedia team already went through and marked theirs as well. Don't think this needs any peer review, so closing it as resolved.
Thu, Sep 24
We're unsure if the finding is trustworthy. I'm moving this back to "Doing" to dig further into this.
Wed, Sep 23
The analysis has been done and can be found in this Jupyter/R notebook. We find a slight preference for the control condition (legacy search) over Media Search.
@cchen : Thanks for the ping, I've confirmed none of the teams I work with use pageview data in reports and updated the task description to reflect this.
Closing this as resolved, but feel free to reopen this if I misinterpreted something and there are still unanswered questions.
Commenting here because I'd be curious to know if we have other sources we'd use for this: For #1, using the edit counts for all Wikipedias on wikistats, it looks like we're pretty consistently in the 15–16M range now (I see March–May as COVID-related outliers). So I'd suggest August 2020 is a reasonable estimate for activity. With 15,927,411 edits in 31 days in August, and assuming the normal 86400 seconds/day, we get 5.95 edits per second to all Wikipedias.
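The back-of-the-envelope calculation above is easy to reproduce:

```python
# Edit-rate estimate for all Wikipedias, August 2020 (figures from wikistats).
edits = 15_927_411      # edits in August
seconds = 31 * 86_400   # days in August x seconds per day

edits_per_second = edits / seconds
print(round(edits_per_second, 2))  # 5.95
```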
Mon, Sep 21
I've verified that this change has been deployed. The NewcomerTask schema is available in a sanitized version, and the changes to the HelpPanel schema are also in the sanitized data for that schema. I've not verified that the tokens are no longer hashed, but we can do that when we update the reporting notebook.
Sep 14 2020
Hi @Rmaung! We'll review and triage this request at our next board refinement meeting on Tuesday, September 15.
Sep 10 2020
Sep 9 2020
Moving this out of "Doing" as we've discovered that the data gathering had a bug leading to us being unable to determine which algorithm produced a clicked result when interleaving occurred. Will pick up the analysis again once the second iteration of the test has been completed. And yes, we'll QA the data after relaunch to make sure it's working correctly.
Sep 8 2020
@Milimetric : I've gone through the various subtasks and changes we made before the tracking list was implemented and not found any cause for concern. We've got the properties we want to track listed and that covers all current analysis needs. So applying the list to old data should be fine.
Aug 25 2020
Aug 24 2020
@CBogen : the number of full-text searches does not include any autocomplete searches. It does include full-text searches that originate from an autocomplete search (e.g. the user clicks on the "containing …" part, or hits enter), because identifying and separating those out is tricky.
Patch uploaded, moving to the review column for @mpopov to review.
Aug 21 2020
One question came up when discussing this task in the Growth team: are we seeing this pattern for user preferences from other extensions besides GrowthExperiments?
Aug 20 2020
Aug 18 2020
Adding my support for having this tool available! As I've been working with the Structured Data team to determine what metrics we want to use to measure the impact of upcoming tests, it's become more and more clear to me that what we're doing is largely what the Discovery team were doing a couple of years ago. The hewiki report created by the tool Mikhail linked in the description contains a lot of the metrics we've been discussing with the SD team (as well as a lot of additional ones). The tool also analyzes interleaved A/B tests, something the team are planning on doing. Having all of that readily available to enable iterating on experiments with streamlined analysis would take a lot of the work out of it!
Aug 17 2020
A first pass on calculating these baselines has now been done. The numbers and the calculations can be found in this Jupyter notebook. It uses the past 7 days as the source of the data and was run on 2020-08-16, meaning it reflects the week from 2020-08-09 through 2020-08-15. It does make some assumptions and shortcuts, and I'm happy to discuss those and modify the code as we see fit.
We've not seen any indication that there are data quality issues after our deployment, and we do not have the capacity to prioritize this work to dig in further. Closing this task as resolved since the Help Panel has been deployed.
We've not seen any indication that there are data quality issues after our deployment, and we do not have the capacity to prioritize this work to dig in further. Closing this task as resolved since the Homepage has been deployed.
We've not seen any indication that there are data quality issues after our deployment on Basque Wikipedia, and we do not have the capacity to prioritize this work to dig in further. Closing this task as resolved since the Homepage has been deployed.
We've not seen any indication that there are data quality issues after our deployment on Basque Wikipedia, and we do not have the capacity to prioritize this work to dig in further. Closing this task as resolved since the Help Panel has been deployed.
Aug 13 2020
Yeah, I think the descriptions and clarifications on the wikitech page are great, nice work!
Aug 12 2020
As mentioned, I also left a comment in Gerrit. I see that the descriptions on Wikitech provide more information than the ones in the repository, which makes sense to me. Moving this to the review column.
Initial analysis done, awaiting review and possibly follow-up questions from Legal.
Aug 10 2020
I ran SHOW PARTITIONS event_sanitized.homepagemodule on 2020-08-07 and again today (2020-08-10). After the first run, I compared the available partitions to the date specified in T244312 (2019-11-05) and noticed that the first available partition on 2020-08-07 was 2019-11-11, indicating that data was starting to be purged.
Thanks for taking care of this so quickly, @Ottomata, very much appreciated!
Aug 7 2020
I don't have any objections to removing the question. As @RHo points out, we have the mentor module. Users in the control group have it too, if they turn the Homepage on. We've also analyzed the answers to this question a few times and gotten a good sense of how popular that choice is, so I'm not sure we can learn that much more.
EventLogging on the beta cluster appears to be broken enough that it's nearly impossible to verify events there, unfortunately. From what I could tell by checking the JS console in my browser, link-click events trigger an event on desktop, but I didn't see any of those on mobile.