mpopov (Mikhail Popov)
Data Analyst

User Details

User Since
Jul 27 2015, 4:15 PM (199 w, 2 d)
Availability
Available
IRC Nick
bearloga
LDAP User
Bearloga
MediaWiki User
MPopov (WMF)

Data Analyst in Reading (formerly of Discovery) | User:MPopov (WMF) | Highlighted Works

Recent Activity

Fri, May 17

mpopov updated subscribers of T223496: Requesting access to machines [stat1004, stat1005 (now stat1007), and stat1006] and groups for iflorez.

We need manager sign-off

Fri, May 17, 4:55 PM · Patch-For-Review, Operations, SRE-Access-Requests

Tue, May 14

mpopov updated the task description for T220542: Update R from 3.3.3 to 3.5.3 on stat and notebook machines.
Tue, May 14, 5:46 PM · Analytics, Product-Analytics
mpopov added a comment to T222933: Upgrade R in SWAP notebooks to 3.4+.

@Groceryheist: https://meta.wikimedia.org/wiki/User:MPopov_(WMF)/Notes/RStan

Tue, May 14, 5:30 PM · Analytics-SWAP, Analytics
mpopov added a comment to T222933: Upgrade R in SWAP notebooks to 3.4+.

By the way, there's already a task for this: T220542

Tue, May 14, 4:20 PM · Analytics-SWAP, Analytics

Fri, May 10

mpopov closed T222897: Add revert rate to Suggested Edits report as Resolved.

Done

Fri, May 10, 1:33 PM · Product-Analytics

Thu, May 9

mpopov moved T209891: Analyze results of sameAs A/B test from Backlog to Doing on the Product-Analytics board.

I am currently doing an analysis of how sameAs impacted other projects.

Thu, May 9, 3:14 PM · Product-Analytics, SEO
mpopov moved T222897: Add revert rate to Suggested Edits report from Triage to Doing on the Product-Analytics board.
Thu, May 9, 3:14 PM · Product-Analytics
mpopov created T222897: Add revert rate to Suggested Edits report.
Thu, May 9, 3:13 PM · Product-Analytics
mpopov created T222895: "TSocket read 0 bytes" error in Hue when querying.
Thu, May 9, 3:05 PM · Analytics

Fri, May 3

mpopov awarded T221890: Add wikidata ids to data lake tables a Like token.
Fri, May 3, 2:34 PM · User-Elukey, Epic, Analytics, Product-Analytics

Apr 15 2019

mpopov added a comment to T220853: VMs on cloudvirt1015 crashing - bad mainboard/memory.

I just deleted product-analytics-test and product-analytics-bayes so y'all don't need to worry about those instances :)

Apr 15 2019, 3:21 PM · Operations, ops-eqiad, DC-Ops, User-Zppix, cloud-services-team (Kanban)

Apr 10 2019

mpopov moved T220628: Add "unique clicks" counter in Suggested Edits analytics instrumentation from Triage to Tracking on the Product-Analytics board.
Apr 10 2019, 4:58 PM · Wikipedia-Android-App-Backlog (Android-app-release-v2.7.28x-M-Mochi), Product-Analytics
mpopov created T220628: Add "unique clicks" counter in Suggested Edits analytics instrumentation.
Apr 10 2019, 4:23 PM · Wikipedia-Android-App-Backlog (Android-app-release-v2.7.28x-M-Mochi), Product-Analytics

Apr 9 2019

mpopov moved T220542: Update R from 3.3.3 to 3.5.3 on stat and notebook machines from Triage to Tracking on the Product-Analytics board.
Apr 9 2019, 7:45 PM · Analytics, Product-Analytics
mpopov updated the task description for T220542: Update R from 3.3.3 to 3.5.3 on stat and notebook machines.
Apr 9 2019, 7:35 PM · Analytics, Product-Analytics
mpopov created T220542: Update R from 3.3.3 to 3.5.3 on stat and notebook machines.
Apr 9 2019, 7:35 PM · Analytics, Product-Analytics

Apr 3 2019

mpopov committed R1821:72c2e393525a: Add options to run hive through nice and/or ionice (authored by mpopov).
Add options to run hive through nice and/or ionice
Apr 3 2019, 10:10 PM
mpopov committed R1821:c4c25ab9babd: Add options to run hive through nice and/or ionice (authored by mpopov).
Add options to run hive through nice and/or ionice
Apr 3 2019, 10:10 PM
mpopov moved T216055: Move backend for current search dashboard to pull data from Hadoop from in progress to Needs review on the Discovery-Search (Current work) board.

@chelsyx: Okay, I think that's all the reports now. https://gerrit.wikimedia.org/r/#/c/wikimedia/discovery/golden/+/499939/-1..2 awaits your CR

Apr 3 2019, 9:44 PM · Discovery-Search (Current work), Patch-For-Review, Product-Analytics, Epic

Apr 1 2019

Niedzielski awarded T203498: Upgrade Hive to ≥ 2.0 a Love token.
Apr 1 2019, 3:52 PM · Product-Analytics, Analytics-Cluster, Analytics

Mar 28 2019

mpopov moved T216055: Move backend for current search dashboard to pull data from Hadoop from Next Up to Doing on the Product-Analytics board.
Mar 28 2019, 10:46 PM · Discovery-Search (Current work), Patch-For-Review, Product-Analytics, Epic
mpopov closed T213458: Analyse Android notifications release as Resolved.

It would appear we may have on average? Report up at: https://www.mediawiki.org/wiki/User:MPopov_(WMF)/Android/Notifications

Mar 28 2019, 2:44 PM · Product-Analytics

Mar 25 2019

mpopov moved T216055: Move backend for current search dashboard to pull data from Hadoop from Backlog to Next Up on the Product-Analytics board.

I'm currently finishing up an important analysis for the Android team (T213458) and once that's done (today or tomorrow) I will resume work on this. I've already started but have only done a little bit. Based on that, I expect to be done with this by the end of the week.

Mar 25 2019, 5:34 PM · Discovery-Search (Current work), Patch-For-Review, Product-Analytics, Epic

Mar 19 2019

mpopov updated subscribers of T218703: Update Android account creation analytics.

Question for @Dbrant: would the app be able to store data in user_properties table? Might need to be a separate API call after an account is created since we probably can't include that data in the account creation API call. Would rather do that than store a user_id in event data.

Mar 19 2019, 4:47 PM · Wikipedia-Android-App-Backlog, Product-Analytics
mpopov created T218703: Update Android account creation analytics.
Mar 19 2019, 4:39 PM · Wikipedia-Android-App-Backlog, Product-Analytics
mpopov updated subscribers of T218594: Instrument MobileWikiAppSuggestedEdits.

@Dbrant: let's finally get rid of 1:100 sampling in the sessions funnel?

Mar 19 2019, 2:58 PM · Wikipedia-Android-App-Backlog (Android-app-release-v2.7.28x-M-Mochi), Patch-For-Review, Product-Analytics
mpopov updated the task description for T218594: Instrument MobileWikiAppSuggestedEdits.
Mar 19 2019, 2:57 PM · Wikipedia-Android-App-Backlog (Android-app-release-v2.7.28x-M-Mochi), Patch-For-Review, Product-Analytics
mpopov updated the task description for T218594: Instrument MobileWikiAppSuggestedEdits.
Mar 19 2019, 2:46 PM · Wikipedia-Android-App-Backlog (Android-app-release-v2.7.28x-M-Mochi), Patch-For-Review, Product-Analytics
mpopov updated the task description for T218594: Instrument MobileWikiAppSuggestedEdits.
Mar 19 2019, 2:38 PM · Wikipedia-Android-App-Backlog (Android-app-release-v2.7.28x-M-Mochi), Patch-For-Review, Product-Analytics

Mar 18 2019

mpopov raised the priority of T218595: Mark descriptions made through Suggested Edits from High to Needs Triage.
Mar 18 2019, 4:27 PM · Wikipedia-Android-App-Backlog (Android-app-release-v2.7.27x-L-Lamington), Product-Analytics
mpopov updated the task description for T218594: Instrument MobileWikiAppSuggestedEdits.
Mar 18 2019, 4:26 PM · Wikipedia-Android-App-Backlog (Android-app-release-v2.7.28x-M-Mochi), Patch-For-Review, Product-Analytics
mpopov updated the task description for T218595: Mark descriptions made through Suggested Edits.
Mar 18 2019, 4:18 PM · Wikipedia-Android-App-Backlog (Android-app-release-v2.7.27x-L-Lamington), Product-Analytics
mpopov updated the task description for T218595: Mark descriptions made through Suggested Edits.
Mar 18 2019, 4:17 PM · Wikipedia-Android-App-Backlog (Android-app-release-v2.7.27x-L-Lamington), Product-Analytics
mpopov created T218595: Mark descriptions made through Suggested Edits.
Mar 18 2019, 4:11 PM · Wikipedia-Android-App-Backlog (Android-app-release-v2.7.27x-L-Lamington), Product-Analytics
mpopov updated the task description for T218594: Instrument MobileWikiAppSuggestedEdits.
Mar 18 2019, 4:07 PM · Wikipedia-Android-App-Backlog (Android-app-release-v2.7.28x-M-Mochi), Patch-For-Review, Product-Analytics
mpopov moved T218594: Instrument MobileWikiAppSuggestedEdits from Triage to Tracking on the Product-Analytics board.
Mar 18 2019, 4:06 PM · Wikipedia-Android-App-Backlog (Android-app-release-v2.7.28x-M-Mochi), Patch-For-Review, Product-Analytics
mpopov created T218594: Instrument MobileWikiAppSuggestedEdits.
Mar 18 2019, 4:04 PM · Wikipedia-Android-App-Backlog (Android-app-release-v2.7.28x-M-Mochi), Patch-For-Review, Product-Analytics
mpopov added a comment to T213460: Analytics schema for edit action feed.

After talking with Chelsy and thinking even more about this over the weekend I'm gonna simplify the specs. Once Robin returns I'll talk with him about data he'd be interested in with respect to user experience flow to aid future redesigns/adjustments.

Mar 18 2019, 2:08 PM · Product-Analytics

Mar 14 2019

mpopov closed T197896: Make various auth libraries available on stat* machines as Resolved.

Managed with others! :) I'll reopen if needed. Thanks for checking in!

Mar 14 2019, 5:42 PM · Patch-For-Review, Product-Analytics, Analytics, SEO
mpopov closed T197896: Make various auth libraries available on stat* machines, a subtask of T172581: [EPIC] Set up mechanism for archiving Google Search Console data, as Resolved.
Mar 14 2019, 5:42 PM · Epic, Product-Analytics, SEO

Mar 13 2019

mpopov added a comment to T213460: Analytics schema for edit action feed.

Measure the length of sessions and volume of edits made within the different app editing tasks. (We want to know how many cards deep into the feed people look vs. how many edits they actually make, to see whether the stuff we're presenting is compelling to people. Also interested if people do this for long at a time to see how compelling they find the stuff.)

Mar 13 2019, 5:47 PM · Product-Analytics

Mar 11 2019

mpopov updated subscribers of T213458: Analyse Android notifications release.

Thank you, @Catrope!!!

Mar 11 2019, 2:26 PM · Product-Analytics

Mar 7 2019

mpopov added a comment to T209891: Analyze results of sameAs A/B test.

Draft posted at: https://www.mediawiki.org/wiki/User:MPopov_(WMF)/SEO/sameAs_test

Mar 7 2019, 4:57 PM · Product-Analytics, SEO

Mar 1 2019

mpopov updated subscribers of T213458: Analyse Android notifications release.
Mar 1 2019, 9:57 PM · Product-Analytics
mpopov added a comment to T213458: Analyse Android notifications release.

Okay, Roan confirmed for me that's the correct interpretation.

Mar 1 2019, 9:50 PM · Product-Analytics

Feb 28 2019

mpopov closed T215819: App impressions for 2017 and 2018 as Resolved.

Hi, @jrobell! On Android: 40711 impressions in Dec 2017, 35882 impressions in Dec 2018 – a decrease of 11.8%

Feb 28 2019, 10:13 PM · Product-Analytics
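
The year-over-year figure in the comment above can be checked with a short calculation. This is only a sketch: `percent_change` is a hypothetical helper name, and the two counts are the Android impression numbers quoted in the comment.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (negative means a decrease)."""
    return (new - old) / old * 100.0


# Android impression counts quoted in the comment above.
dec_2017 = 40711
dec_2018 = 35882

change = percent_change(dec_2017, dec_2018)
print(f"{change:.2f}%")  # prints -11.86%
```

The unrounded decrease is about 11.86%, which matches the 11.8% in the comment when truncated to one decimal place (it would round to 11.9%).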
mpopov added a comment to T209891: Analyze results of sameAs A/B test.

I've identified a few potential issues with the query I've written for the past check-ins so I'm working on resolving that to make sure the analysis is performed on vetted, correct data. (Gotta love those joins of partitioned tables in Hive.)

Feb 28 2019, 4:15 PM · Product-Analytics, SEO

Feb 27 2019

mpopov updated subscribers of T213458: Analyse Android notifications release.

I've come across a potential block and I would like some clarification. For context:

Feb 27 2019, 6:48 PM · Product-Analytics

Feb 20 2019

mpopov added a comment to T215819: App impressions for 2017 and 2018.

If this is about the fundraising banner impressions in the app, I think @JoeWalsh has done some calculations for Dec 2018. But I don't think we have any data back to Dec 2017, @mpopov do you know?

Feb 20 2019, 8:15 PM · Product-Analytics
mpopov added a comment to T212386: Provide tools for querying MediaWiki replica databases without having to specify the shard.

Tried using analytics-mysql on stat1007 and got "permission denied". Follow-up question: will it be made available on SWAP?

Feb 20 2019, 2:15 PM · Product-Analytics, Patch-For-Review, Analytics, WMDE-Analytics-Engineering, User-Addshore, User-Elukey, Research

Feb 19 2019

mpopov updated the image for Product-Analytics from F24163962: profile to F28247801: profile.
Feb 19 2019, 10:04 PM
mpopov moved T213458: Analyse Android notifications release from Backlog to Doing on the Product-Analytics board.

Working on acquiring data for this and it turned out to be much, much harder and more involved than I anticipated. The analytics data we get from the app just has the notification ID, so I'm in the process of getting the Echo extension tables into Hive (that's the part that's causing problems & delays) so that I can get editing activity for users who got the notifications on Android.

Feb 19 2019, 9:11 PM · Product-Analytics
mpopov added a comment to T211366: Android navigation refresh - understand impact on user engagement metrics.

@Charlotte: thanks for the ping! Right now my priorities are: the notifications analysis, SEO sameAs analysis, some Search query migration, and then this. I'm working on acquiring data for T213458 and it turned out to be much, much harder than I anticipated so that's creating some delays.

Feb 19 2019, 9:08 PM · Product-Analytics
mpopov added a comment to T213460: Analytics schema for edit action feed.

@Charlotte: thanks for the ping! I'm working on acquiring data for T213458 and it turned out to be much, much harder than I anticipated.

Feb 19 2019, 9:04 PM · Product-Analytics
mpopov committed R1821:153bd282525d: Add support for querying x1 (authored by mpopov).
Add support for querying x1
Feb 19 2019, 7:16 PM
mpopov added a comment to T172410: Replace the current multisource analytics-store setup.

I just noticed that the tables related to the Echo extension are (surprisingly) not yet available in the enwiki shard (s1-analytics-replica.eqiad.wmnet), but are in analytics-store.eqiad.wmnet. Is there a page we can refer to to check on parity/status of data availability?

The echo tables are on x1. x1 is a separate instance in production, and hence on the new dbstore model.
They are available on the old analytics because it has all the instances mixed.

Feb 19 2019, 5:18 PM · Product-Analytics, Analytics, WMDE-Analytics-Engineering, User-Addshore, User-Elukey, Research
mpopov added a comment to T172410: Replace the current multisource analytics-store setup.

I just noticed that the tables related to the Echo extension are (surprisingly) not yet available in the enwiki shard (s1-analytics-replica.eqiad.wmnet), but are in analytics-store.eqiad.wmnet. Is there a page we can refer to to check on parity/status of data availability?

Feb 19 2019, 4:27 PM · Product-Analytics, Analytics, WMDE-Analytics-Engineering, User-Addshore, User-Elukey, Research
mpopov added a comment to T212386: Provide tools for querying MediaWiki replica databases without having to specify the shard.

Sure, we can definitely work on a shared sqoop wrapper

I don't even mean a sqoop wrapper! I just think the script should be able to output the proper hostname:port, maybe in a 'dry-run' mode, rather than always connecting via the mysql CLI.

analytics-mysql --output (or --dry-run?) enwiki

Would just output the hostname:port for easy use with other tools.

Didn't read it carefully, this seems a great idea, going to work on it asap!

Feb 19 2019, 4:03 PM · Product-Analytics, Patch-For-Review, Analytics, WMDE-Analytics-Engineering, User-Addshore, User-Elukey, Research
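
The "dry-run" idea discussed above, resolving a wiki database name to a hostname:port string without opening a connection, might look roughly like the sketch below. Everything here is an assumption made for illustration: the `resolve` function, the contents of the lookup tables, and the port numbers are placeholders, and this is not how `analytics-mysql` itself is implemented. Only the `s1-analytics-replica.eqiad.wmnet` host pattern, the enwiki-on-s1 pairing, and s3 being the default section come from the comments in this feed.

```python
# Hypothetical lookup tables. In practice the mapping would be derived from a
# canonical source such as the sectionsByDB array mentioned elsewhere in this feed.
SECTION_BY_DB = {"enwiki": "s1"}          # illustrative subset only
DEFAULT_SECTION = "s3"                    # section used for wikis not listed above
PORT_BY_SECTION = {"s1": 3311, "s3": 3313}  # placeholder port numbers


def resolve(dbname: str) -> str:
    """Return 'hostname:port' for a wiki database without connecting to it."""
    section = SECTION_BY_DB.get(dbname, DEFAULT_SECTION)
    host = f"{section}-analytics-replica.eqiad.wmnet"
    return f"{host}:{PORT_BY_SECTION[section]}"


if __name__ == "__main__":
    print(resolve("enwiki"))
```

Printing the address instead of connecting lets the same lookup feed other tools, such as a sqoop job, which is the use case raised in the thread.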

Feb 15 2019

mpopov committed R1821:fc9da5faba11: Update README.md (authored by mpopov).
Update README.md
Feb 15 2019, 4:38 PM
mpopov added a comment to T212386: Provide tools for querying MediaWiki replica databases without having to specify the shard.

@elukey: is there a recommendation for how to sqoop with the shards? since a shell command would look like:

Feb 15 2019, 3:48 PM · Product-Analytics, Patch-For-Review, Analytics, WMDE-Analytics-Engineering, User-Addshore, User-Elukey, Research

Feb 14 2019

mpopov committed R1821:a85102e48e68: Update for sharding setup (authored by mpopov).
Update for sharding setup
Feb 14 2019, 8:25 PM
mpopov moved T216055: Move backend for current search dashboard to pull data from Hadoop from Triage to Backlog on the Product-Analytics board.
Feb 14 2019, 7:27 PM · Discovery-Search (Current work), Patch-For-Review, Product-Analytics, Epic
mpopov committed R1821:fa6def7b9934: Update for sharding setup (authored by mpopov).
Update for sharding setup
Feb 14 2019, 3:42 PM

Feb 13 2019

mpopov added a comment to T212386: Provide tools for querying MediaWiki replica databases without having to specify the shard.

@jcrespo: is it safe to assume that the current config of s3 (default) will stay that way? and if not, can I assume that the shard which is designated as the default one will have "(default)" in the comment?

Feb 13 2019, 4:31 PM · Product-Analytics, Patch-For-Review, Analytics, WMDE-Analytics-Engineering, User-Addshore, User-Elukey, Research
mpopov added a comment to T212386: Provide tools for querying MediaWiki replica databases without having to specify the shard.

Important: dblists are not the canonical source for database distribution (although they try to stay in sync). The canonical source is the sectionsByDB array:

https://phabricator.wikimedia.org/source/mediawiki-config/browse/master/wmf-config/db-eqiad.php;c178d2ccf57d28434c8acf06b5b793125ff25e1b$28?as=source&blame=off

Feb 13 2019, 3:34 PM · Product-Analytics, Patch-For-Review, Analytics, WMDE-Analytics-Engineering, User-Addshore, User-Elukey, Research
mpopov reopened T212386: Provide tools for querying MediaWiki replica databases without having to specify the shard as "Open".

Actually, I would like to request that https://github.com/wikimedia/operations-mediawiki-config/tree/master/dblists have a single file I can download that contains the mapping.

Feb 13 2019, 2:59 PM · Product-Analytics, Patch-For-Review, Analytics, WMDE-Analytics-Engineering, User-Addshore, User-Elukey, Research
mpopov reopened T212386: Provide tools for querying MediaWiki replica databases without having to specify the shard, a subtask of T172410: Replace the current multisource analytics-store setup, as Open.
Feb 13 2019, 2:59 PM · Product-Analytics, Analytics, WMDE-Analytics-Engineering, User-Addshore, User-Elukey, Research

Feb 11 2019

mpopov added a comment to T215803: Messed up layout.

Oh, cool, thanks!

Feb 11 2019, 3:16 PM · Android-app-Bugs, Wikipedia-Android-App-Backlog
mpopov created T215803: Messed up layout.
Feb 11 2019, 3:05 PM · Android-app-Bugs, Wikipedia-Android-App-Backlog

Feb 8 2019

mpopov added a comment to T214666: [BUG] Offline articles get deleted after a few days.

More reports: https://ticket.wikimedia.org/otrs/index.pl?Action=AgentTicketZoom;TicketID=10982794

Feb 8 2019, 10:12 PM · Wikipedia-Android-App-Backlog, Android-app-Bugs

Feb 7 2019

mpopov added a comment to T209720: Determine impact of sitemaps on search traffic to Indonesian, Portuguese, Punjabi, Dutch, and Korean Wikipedias.

Query & scripts: https://github.com/wikimedia-research/SEO-Experiment-Sitemaps

Feb 7 2019, 8:14 PM · Product-Analytics, SEO
mpopov moved T209720: Determine impact of sitemaps on search traffic to Indonesian, Portuguese, Punjabi, Dutch, and Korean Wikipedias from Next Up to Doing on the Product-Analytics board.

Just like with sameAs (T211191#4885005), there is no visible change in traffic due to the intervention:

Feb 7 2019, 4:29 PM · Product-Analytics, SEO

Feb 6 2019

mpopov committed R1821:95fb6e329c17: Add YMD component extraction (authored by mpopov).
Add YMD component extraction
Feb 6 2019, 6:36 PM
mpopov triaged T215174: Check instrumentation for Android account creation as Normal priority.
Feb 6 2019, 4:21 PM · Product-Analytics
mpopov claimed T215174: Check instrumentation for Android account creation.
  1. Can we see whether people have made accounts through the app?
Feb 6 2019, 4:20 PM · Product-Analytics

Jan 25 2019

mpopov moved T214129: Provide Product Analytics input on Modern Event Platform schema conventions from Triage to Doing on the Product-Analytics board.
Jan 25 2019, 6:14 PM · Product-Analytics
mpopov claimed T214129: Provide Product Analytics input on Modern Event Platform schema conventions.
Jan 25 2019, 6:12 PM · Product-Analytics
mpopov added a comment to T213702: Check home leftovers of user imarlier (Ian Marlier).

The folks who (may) still work on sitemaps might be interested in that data, though. @ovasileva @mpopov ?

Jan 25 2019, 3:10 PM · Analytics-Kanban, Analytics

Jan 24 2019

mpopov added a comment to T211366: Android navigation refresh - understand impact on user engagement metrics.

@Charlotte @mpopov

  1. I saw in the ticket about the bug that hundreds of duplicate events were being sent.
    1. If we can detect them is there a way to correct for them? and
    2. if there are hundreds, what % of total events is that? The data doesn't have to be perfect, so if it's less than 5% of events of any given event-type, then I wouldn't let it get in the way of analysis.
Jan 24 2019, 9:06 PM · Product-Analytics
mpopov added a comment to T211841: Update Audiences page and Key Product Metrics deck with December 2018 Readers data.

Done

Jan 24 2019, 4:25 PM · Product-Analytics
mpopov updated the task description for T211841: Update Audiences page and Key Product Metrics deck with December 2018 Readers data.
Jan 24 2019, 4:25 PM · Product-Analytics

Jan 23 2019

mpopov claimed T202664: [EPIC] Count unique iOS & Android users precisely and in a privacy conscious manner that does not require opt in to send data.
Jan 23 2019, 9:39 PM · Epic, Wikipedia-Android-App-Backlog, Wikipedia-iOS-App-Backlog, Product-Analytics
mpopov updated subscribers of T202664: [EPIC] Count unique iOS & Android users precisely and in a privacy conscious manner that does not require opt in to send data.

@JMinor @Charlotte: after speaking with @kzimmerman we decided I should manage this project and that it has priority (at least on our team). I'll follow up with you on next steps.

Jan 23 2019, 9:38 PM · Epic, Wikipedia-Android-App-Backlog, Wikipedia-iOS-App-Backlog, Product-Analytics
mpopov renamed T202664: [EPIC] Count unique iOS & Android users precisely and in a privacy conscious manner that does not require opt in to send data from Calculate precisely number of unqiue users for IOS and Android in a privacy conscious manner that does not require opt in to send data to [EPIC] Count unique iOS & Android users precisely and in a privacy conscious manner that does not require opt in to send data.
Jan 23 2019, 9:35 PM · Epic, Wikipedia-Android-App-Backlog, Wikipedia-iOS-App-Backlog, Product-Analytics
mpopov added a comment to T210687: Bug: can't make a YoY time series chart in Superset.

Cool! Thank you!

Jan 23 2019, 7:54 PM · Analytics-Kanban, Product-Analytics, Analytics
mpopov added a comment to T211840: Update Audiences page and Key Product Metrics with November 2018 Readers data.

Done

Jan 23 2019, 5:13 PM · Product-Analytics
mpopov updated the task description for T211840: Update Audiences page and Key Product Metrics with November 2018 Readers data.
Jan 23 2019, 5:13 PM · Product-Analytics
mpopov moved T214490: page_creation_timestamp not always correct in mediawiki_history from Triage to Tracking on the Product-Analytics board.
Jan 23 2019, 4:52 PM · Analytics-Kanban, Product-Analytics, Analytics-Data-Quality, Analytics
mpopov created T214490: page_creation_timestamp not always correct in mediawiki_history.
Jan 23 2019, 4:52 PM · Analytics-Kanban, Product-Analytics, Analytics-Data-Quality, Analytics
mpopov added a comment to T213597: [REQUEST] Baselines for structured data on Commons.

@Abit @Ramsey-WMF in addition to T213597#4900741, here's the history of that metric with a 7-day rolling average to smooth the daily data a bit:

Jan 23 2019, 3:11 PM · SDC General, Product-Analytics
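
For readers unfamiliar with the smoothing mentioned above, a 7-day rolling average replaces each daily value with the mean of that day and the six days before it. The sketch below is a generic illustration, not the code used for the chart in the task; `rolling_mean` is a hypothetical helper and the sample values are made up.

```python
from collections import deque
from typing import List, Optional, Sequence


def rolling_mean(values: Sequence[float], window: int = 7) -> List[Optional[float]]:
    """Trailing rolling mean; positions without a full window are None."""
    buf = deque(maxlen=window)
    total = 0.0
    out: List[Optional[float]] = []
    for v in values:
        if len(buf) == window:
            total -= buf[0]  # this value is evicted by the append below
        buf.append(v)
        total += v
        out.append(total / window if len(buf) == window else None)
    return out


daily = [1, 2, 3, 4, 5, 6, 7, 8]  # made-up daily counts
smoothed = rolling_mean(daily)
print(smoothed)
```

The first six positions are left empty because a full week of data is not yet available, a common convention that avoids averaging over a shorter, noisier window.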
mpopov added a comment to T213597: [REQUEST] Baselines for structured data on Commons.

True, but its revisions do have revision_is_deleted set, so you've already filtered them out of your query.

Jan 23 2019, 2:54 PM · SDC General, Product-Analytics

Jan 22 2019

mpopov added a comment to T213597: [REQUEST] Baselines for structured data on Commons.

Okay, here are the numbers which were calculated with the following conditions:

Jan 22 2019, 10:35 PM · SDC General, Product-Analytics
mpopov added a comment to T213597: [REQUEST] Baselines for structured data on Commons.

I noticed one big thing: it seems like your counts of file page edits (n_edits_total, n_additions_total, etc.) include the initial edit that creates the pages, so in the end you're getting the proportion of files which have metadata added in the first 2 months, including during the initial upload.

I tried excluding those initial creations (event_timestamp != page_creation_timestamp), and it looks like the proportion goes from 99% to 50%.

Jan 22 2019, 5:27 PM · SDC General, Product-Analytics
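
The effect described above, where counting the page-creating edit inflates the share of files with later additions, can be illustrated on toy data. This is a sketch under stated assumptions: the events and the `share_of_pages_edited` helper are hypothetical, and only the comparison `event_timestamp != page_creation_timestamp` comes from the comment.

```python
from typing import Dict, List

# Made-up events: page 1 is created and edited later; page 2 is only created.
events: List[Dict[str, object]] = [
    {"page_id": 1, "event_timestamp": "2018-01-01T00:00:00",
     "page_creation_timestamp": "2018-01-01T00:00:00"},
    {"page_id": 1, "event_timestamp": "2018-01-05T12:00:00",
     "page_creation_timestamp": "2018-01-01T00:00:00"},
    {"page_id": 2, "event_timestamp": "2018-02-01T00:00:00",
     "page_creation_timestamp": "2018-02-01T00:00:00"},
]


def share_of_pages_edited(rows: List[Dict[str, object]], exclude_creation: bool) -> float:
    """Share of pages with at least one counted edit."""
    all_pages = {r["page_id"] for r in rows}
    counted = {
        r["page_id"]
        for r in rows
        # Skip the page-creating event when exclude_creation is set.
        if not (exclude_creation
                and r["event_timestamp"] == r["page_creation_timestamp"])
    }
    return len(counted) / len(all_pages)


with_creation = share_of_pages_edited(events, exclude_creation=False)
without_creation = share_of_pages_edited(events, exclude_creation=True)
print(with_creation, without_creation)
```

With the creation event counted, every page trivially has an edit; excluding it leaves only pages that were actually touched after upload, which is the distinction behind the drop reported in the comment.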

Jan 18 2019

mpopov updated subscribers of T213597: [REQUEST] Baselines for structured data on Commons.
Jan 18 2019, 10:17 PM · SDC General, Product-Analytics

Jan 17 2019

mpopov added a comment to T213597: [REQUEST] Baselines for structured data on Commons.

Thanks for clarifying! Okay, one more question for @Abit & @Ramsey-WMF just so everyone is on the same page. The statistic you want is: the % of all uploaded files which have had additions to their pages in the first 2 months after upload.

Jan 17 2019, 5:38 PM · SDC General, Product-Analytics

Jan 16 2019

mpopov closed T211191: Check in sameAs A/B test results as Resolved.

Just updated my database of sampled pages (using December 2018 snapshot) and recounted pageviews from 2018-11-01 to 2019-01-15 (code & data over at GitHub). There has not been any change, up or down since the rollout:

Jan 16 2019, 4:40 PM · Readers-Web-Backlog (Tracking), Product-Analytics, SEO
mpopov closed T211191: Check in sameAs A/B test results, a subtask of T209891: Analyze results of sameAs A/B test, as Resolved.
Jan 16 2019, 4:40 PM · Product-Analytics, SEO
mpopov moved T211191: Check in sameAs A/B test results from Next Up to Doing on the Product-Analytics board.
Jan 16 2019, 4:36 PM · Readers-Web-Backlog (Tracking), Product-Analytics, SEO
mpopov added a comment to T213597: [REQUEST] Baselines for structured data on Commons.

@Ramsey-WMF @Abit: hi, I would like to clarify what "metadata" includes. Here's my initial list:

Jan 16 2019, 3:22 PM · SDC General, Product-Analytics

Jan 15 2019

mpopov moved T211191: Check in sameAs A/B test results from Backlog to Next Up on the Product-Analytics board.
Jan 15 2019, 7:16 PM · Readers-Web-Backlog (Tracking), Product-Analytics, SEO