Tbayer (Tilman Bayer)
User

User Details

User Since
Oct 20 2014, 11:21 PM (225 w, 6 d)
Availability
Available
IRC Nick
HaeB
LDAP User
Unknown
MediaWiki User
Tbayer (WMF)

Recent Activity

Today

Tbayer added a comment to T201123: What % of pages feature issues?.

To wrap this up, I extended the above queries for all Wikipedias (using a PAWS notebook).

Mon, Feb 18, 4:43 AM · Product-Analytics, Readers-Web-Backlog (Tracking), Reading-analysis, Page-Issue-Warnings
Tbayer added a comment to T193051: Remove all page previews instrumentation code.

Thanks @Tbayer! I had a look now and I am wondering if the data for page previews is still being tracked

  • The Popups schema is no longer collecting data, although you should be able to reactivate it with a simple configuration change, as we haven't yet removed the instrumentation code (this task).
  • The less detailed VirtualPageView schema is still sending data. Its main purpose is to provide aggregated content consumption numbers (how often a given page has been previewed for >1sec) that are stored in the Virtualpageview_hourly table - rather than answering product questions about how the previews feature is being used per se; we used the Popups schema for that.

/ if we still have data from previous tracking endeavours that I could look at concerning the questions outlined in T214493?

Yes, there is still data in the usual places where EventLogging data is stored, e.g. the event.popups Hive table. (I guess you may already have looked at the results published at https://www.mediawiki.org/wiki/Page_Previews/2017-18_A/B_Tests and perhaps the further details in the Phab task(s) linked from there.)

Mon, Feb 18, 2:15 AM · Page-Previews, Readers-Web-Backlog

Yesterday

Tbayer awarded T195030: Develop availability metrics for PAWS a Like token.
Sun, Feb 17, 3:04 AM · cloud-services-team, Patch-For-Review, PAWS (JupyterHub 0.9)

Sat, Feb 16

Tbayer closed T216257: Most visited domains (pageviews) across all Wikipedia/Wikimedia as Resolved.

Yes, that should be a separate task (and may require involvement from other teams).

Sat, Feb 16, 12:15 AM · Product-Analytics

Fri, Feb 15

Tbayer moved T216257: Most visited domains (pageviews) across all Wikipedia/Wikimedia from Triage to Doing on the Product-Analytics board.
Fri, Feb 15, 11:52 PM · Product-Analytics
Tbayer moved T215976: Data Dictionary for Core Metrics from Triage to Doing on the Product-Analytics board.
Fri, Feb 15, 11:52 PM · Product-Analytics, Better Use Of Data
Tbayer added a comment to T216257: Most visited domains (pageviews) across all Wikipedia/Wikimedia.

Here is a first result: the top 15 by pageviews for January 2019, with known bots/spiders excluded. (To get the domain, combine project and access method - e.g. "it.wikipedia" "mobile web" means it.m.wikipedia.org, "en.wikipedia" "desktop" means en.wikipedia.org.)
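
The rule for combining project and access method into a domain can be sketched as follows. This is a minimal illustration, not the query used for the result: the function name and the mapping table are hypothetical, and only the two access-method labels mentioned in the comment are covered.

```python
# Hypothetical mapping from access-method label to the host infix.
# Only the two labels named in the comment above are included.
ACCESS_INFIX = {
    "desktop": "",
    "mobile web": "m",
}


def domain_for(project: str, access_method: str) -> str:
    """Build a hostname such as 'it.m.wikipedia.org' from project and access method."""
    infix = ACCESS_INFIX[access_method]  # KeyError for any other label
    language, _, family = project.partition(".")
    if not family:
        # Projects without a language prefix follow a different pattern
        # that this sketch does not attempt to handle.
        raise ValueError(f"expected '<language>.<family>', got {project!r}")
    parts = [language, infix, family, "org"]
    return ".".join(part for part in parts if part)
```

For example, `domain_for("it.wikipedia", "mobile web")` yields `it.m.wikipedia.org`, and `domain_for("en.wikipedia", "desktop")` yields `en.wikipedia.org`.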

Fri, Feb 15, 11:45 PM · Product-Analytics
Tbayer added a comment to T216257: Most visited domains (pageviews) across all Wikipedia/Wikimedia.

However, it seems to be missing a few domains, like when I query for blog.wikimedia.org (or phabricator.wikimedia.org)

As mentioned (admittedly somewhat obliquely) on the documentation page linked in my email, the pageview data is limited to "production sites", which currently does not include blog.wikimedia.org and phabricator.wikimedia.org. There is some traffic data for both domains in other places, but we can be pretty certain already that neither of them is in the top 15 domains by pageviews, so it's probably not worth retrieving numbers for them for this purpose.

Fri, Feb 15, 11:36 PM · Product-Analytics
Tbayer added a project to T216208: ToolsDB overload and cleanup: PAWS.
Fri, Feb 15, 1:05 AM · Patch-For-Review, TCB-Team, Phragile, Data-Services, PAWS, cloud-services-team (Kanban)

Thu, Feb 14

Tbayer added a comment to T211827: Request: Top articles of 2018 on all Wikipedias.

I guess this task can be closed now?

Thu, Feb 14, 10:41 PM · Product-Analytics, Reading-analysis
Tbayer updated subscribers of T216096: Whitelist sample flags and page/rev ID fields for ReadingDepth schema.

Great, thanks a lot! The sample fields were introduced in September, so no need to go further back. (CC @Groceryheist)

Thu, Feb 14, 8:34 PM · Readers-Web-Backlog (Tracking), Product-Analytics, Analytics
Tbayer moved T212961: Add X-Analytics tag for AMC webrequests from Triage to Tracking on the Product-Analytics board.
Thu, Feb 14, 7:40 PM · Product-Analytics, Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), Advanced Mobile Contributions
Tbayer added a project to T212961: Add X-Analytics tag for AMC webrequests: Product-Analytics.
Thu, Feb 14, 7:39 PM · Product-Analytics, Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), Advanced Mobile Contributions
Tbayer claimed T214935: Examine clickthrough ratios for different page elements of action=history pages.
Thu, Feb 14, 7:33 PM · Readers-Web-Backlog, Product-Analytics
Tbayer moved T214935: Examine clickthrough ratios for different page elements of action=history pages from Triage to Doing on the Product-Analytics board.
Thu, Feb 14, 7:32 PM · Readers-Web-Backlog, Product-Analytics
Tbayer updated subscribers of T216096: Whitelist sample flags and page/rev ID fields for ReadingDepth schema.

Blocked on code review and an answer to T216096#4953210 from someone familiar with the whole EL pipeline and the purging mechanism (@mforns?).

Thu, Feb 14, 7:31 PM · Readers-Web-Backlog (Tracking), Product-Analytics, Analytics
Tbayer moved T216096: Whitelist sample flags and page/rev ID fields for ReadingDepth schema from Triage to Blocked on the Product-Analytics board.
Thu, Feb 14, 7:29 PM · Readers-Web-Backlog (Tracking), Product-Analytics, Analytics
Tbayer updated subscribers of T216096: Whitelist sample flags and page/rev ID fields for ReadingDepth schema.

@Jdrewniak points out that in https://github.com/wikimedia/mediawiki-skins-MinervaNeue/blob/f07985c6dee5106da8f381a47214e7349fcd147e/resources/skins.minerva.scripts/pageIssuesLogger.js#L65 the spelling is still page-issues-b_sample / page-issues-a_sample (i.e. like on the schema page, not like in Hive).

Thu, Feb 14, 9:06 AM · Readers-Web-Backlog (Tracking), Product-Analytics, Analytics
Tbayer added a comment to T216096: Whitelist sample flags and page/rev ID fields for ReadingDepth schema.

NB: The names of these sample fields are spelled with underscores in Hive (e.g. page_issues_b_sample, see below) but with dashes on the schema page (e.g. page-issues-b_sample). Which version does the whitelist require?
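
To make the spelling difference concrete, here is a small sketch that converts a schema-page field name to the Hive spelling. It assumes that replacing dashes with underscores is the only difference, which is inferred from the single example above; the function name is hypothetical.

```python
def to_hive_name(schema_field: str) -> str:
    """Convert a schema-page field name to the spelling seen in Hive.

    Assumption: dashes become underscores and nothing else changes.
    """
    return schema_field.replace("-", "_")


def same_field(schema_field: str, hive_field: str) -> bool:
    """Check whether two spellings refer to the same field under that assumption."""
    return to_hive_name(schema_field) == hive_field
```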

Thu, Feb 14, 4:10 AM · Readers-Web-Backlog (Tracking), Product-Analytics, Analytics
Tbayer added a comment to T208795: Measure Google Translate Pageview Impact.

See now also T215093: Aggregate and save the Google Translate Pageview count temporarily

Thu, Feb 14, 3:06 AM · Patch-For-Review, Reading-Admin
Tbayer added a comment to T215093: Aggregate and save the Google Translate Pageview count temporarily.

Will it be possible later to backfill/update/extend either virtualpageview_hourly or pageview_hourly with data derived from this table? (cf. T212414#4864672)

Thu, Feb 14, 3:05 AM · Product-Analytics
Tbayer updated subscribers of T216096: Whitelist sample flags and page/rev ID fields for ReadingDepth schema.

PS: patch is at https://gerrit.wikimedia.org/r/490514 (seems @gerritbot is lagging a bit currently)

Thu, Feb 14, 2:43 AM · Readers-Web-Backlog (Tracking), Product-Analytics, Analytics
Tbayer added a comment to T209051: ReadingDepth schema is whitelisting both session ids and page ids.

It looks like we had forgotten to whitelist the actual pageID field in addition to the page title, probably because it was only introduced shortly after this task was created (it's in the current version of the schema page but not yet deployed). I should have caught that before +2ing Nuria's patch. I submitted a fix as part of 209051, also for the related revision ID field.

Thu, Feb 14, 1:17 AM · Patch-For-Review, Analytics
Tbayer added a comment to T209087: [EventLogging Sanitization] Update EL sanitization white-list for field renames in EL schemas.

Found (and fixed) an oversight regarding ReadingDepth: T216096

Thu, Feb 14, 1:08 AM · Patch-For-Review, Product-Analytics, Reading-analysis, Analytics
Tbayer created T216096: Whitelist sample flags and page/rev ID fields for ReadingDepth schema.
Thu, Feb 14, 12:52 AM · Readers-Web-Backlog (Tracking), Product-Analytics, Analytics

Wed, Feb 13

Tbayer added a comment to T216063: [Bug] Many ReadingDepth validation errors logged.

In case it's useful, keep in mind that it's possible to query the webrequest table for more detail on the event in question:

Wed, Feb 13, 10:49 PM · Analytics, Readers-Web-Backlog
Tbayer updated subscribers of T131280: Make aggregate data on editors per country per wiki publicly available.
Wed, Feb 13, 12:53 AM · Product-Analytics, Analytics-Kanban
Tbayer updated the task description for T215976: Data Dictionary for Core Metrics.
Wed, Feb 13, 12:31 AM · Product-Analytics, Better Use Of Data

Tue, Feb 12

Tbayer added a watcher for Better Use Of Data: Tbayer.
Tue, Feb 12, 11:14 PM
Tbayer added a comment to T214721: Create #pageviews-anomaly tag.

Please let us know in case any additional information is needed; otherwise it would be great to be able to use this tag soon.

Tue, Feb 12, 10:54 PM · Product-Analytics, Project-Admins
Tbayer added a comment to T214444: Update ReadingDepth instrumentation to avoid deprecated schema module (blocks loads event).

I understand this shouldn't have any impact on the logged events and their data (except maybe as consequence of the performance improvement in general), but please flag it in case that assumption turns out to be wrong.

Tue, Feb 12, 8:36 PM · MW-1.33-notes (1.33.0-wmf.17; 2019-02-12), Patch-For-Review, Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), Performance-Team (Radar)
Tbayer added a comment to T215379: Restoring the daily traffic anomaly reports.

No problem! Thanks for clarifying :)

Even given that the report was just based on two users' cronjobs though, I would be interested in understanding what caused the disruption here, and how it could have been avoided - especially considering that one of the two jobs continued to run without problems after the migration and the other one failed.

I think that I know the answer. When we (analytics) migrated users from stat1005 to stat1007, I sent several emails to alert about the move, especially to users holding an entry in their crontab. I sent an email with subject "Crons running on stat1005 likely to be moved to stat1007" but I didn't verify that all recipients answered me. I believe that @Slaporte didn't move his crontab to stat1007, but @Jdcc-berkman did. This would explain the mystery :)

I see; that sounds like a reasonable way to notify cronjob owners. IIRC @Slaporte was also reaching out to @Jdcc-berkman back in December about the missing reports, but I guess there were too many moving parts.

Tue, Feb 12, 6:28 PM · User-Elukey, Analytics
Tbayer added a comment to T18691: RFC: Section headings should have a clickable anchor.

@osorio-juan-microsoft and I work on a wiki that has a lot of traffic coming directly into sections instead of pages, so we are working on a design that will mitigate the issues mentioned in T18691#1098570.

[..]
I'm not a designer, but personally I think this looks like a promising approach. Thanks for sharing!
Out of professional curiosity: Are you planning to measure the impact of this change on users' reading/linking behaviour in any way?

Tue, Feb 12, 6:33 AM · Readers-Web-Backlog (Design), TechCom-RFC, Design, MediaWiki-Interface
Tbayer added a comment to T215616: Improve interlingual links across wikis through Wikidata IDs.

The page_props table contains wikibase_item values for a given page ID. See e.g. T209891#4798717 for a query that uses this.
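
A lookup of this kind might look like the sketch below. It is not the query from the linked comment. It uses an in-memory SQLite table as a stand-in for a wiki database, with the standard page_props columns (pp_page, pp_propname, pp_value); the helper names and the demo rows are made up for illustration.

```python
import sqlite3


def wikidata_ids(conn, page_ids):
    """Map page IDs to their wikibase_item values (pages without one are omitted)."""
    page_ids = list(page_ids)
    if not page_ids:
        return {}
    placeholders = ",".join("?" for _ in page_ids)
    query = (
        "SELECT pp_page, pp_value FROM page_props "
        "WHERE pp_propname = 'wikibase_item' "
        f"AND pp_page IN ({placeholders})"
    )
    return dict(conn.execute(query, page_ids).fetchall())


def demo_connection():
    """In-memory stand-in for a wiki database, filled with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page_props (pp_page INTEGER, pp_propname TEXT, pp_value TEXT)"
    )
    conn.executemany(
        "INSERT INTO page_props VALUES (?, ?, ?)",
        [
            (1, "wikibase_item", "Q100"),
            (2, "wikibase_item", "Q200"),
            (2, "displaytitle", "Example"),
        ],
    )
    return conn
```

Joining two wikis on the returned item IDs would then give candidate interlingual links.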

Tue, Feb 12, 12:11 AM · MediaWiki-Database, Wikidata, DBA, Analytics, Research

Sun, Feb 10

Tbayer created T215744: Grafana shows zero EventLogging events for around 44 hours around January 15.
Sun, Feb 10, 11:11 PM · monitoring, Analytics

Sat, Feb 9

Tbayer closed T215379: Restoring the daily traffic anomaly reports as Resolved.

Closing this task now as the first regular daily report has arrived again and looks fine.

Sat, Feb 9, 12:58 AM · User-Elukey, Analytics
Tbayer updated subscribers of T215379: Restoring the daily traffic anomaly reports.

I did some tests swapping the recipients with me and Francisco, we have both received the email. Not sure if @Slaporte added the cron yesterday or not,

Yes, that is what happened - yesterday @Slaporte and I sat down a bit to look into this. He managed to retrieve the instructions he had received from @ZhouZ way back when this was handed over, including the exact crontab data with which we restored the job. @Slaporte was going to post an update here but I think wanted to wait until after the first successful run (which has now occurred). Sorry that we ended up duplicating some work on this.

Sat, Feb 9, 12:51 AM · User-Elukey, Analytics

Thu, Feb 7

Tbayer added a comment to T215477: Tag Thanks actions with AMC tag.

(Recording a note from another conversation today:)
We may want to keep in mind the recent changes that were made as part of T60485: [Epic] Allow thanks of log entry, depending on how/whether we add the corresponding log pages into the AMC interface.

Thu, Feb 7, 11:30 PM · Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), Advanced Mobile Contributions, Thanks, Growth-Team
Tbayer updated the task description for T212959: Create AMC edit tag.
Thu, Feb 7, 11:10 PM · MW-1.33-notes (1.33.0-wmf.14; 2019-01-22), Patch-For-Review, Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), Advanced Mobile Contributions
Tbayer updated the task description for T212959: Create AMC edit tag.
Thu, Feb 7, 10:44 PM · MW-1.33-notes (1.33.0-wmf.14; 2019-01-22), Patch-For-Review, Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), Advanced Mobile Contributions
Tbayer added a comment to T215379: Restoring the daily traffic anomaly reports.

This is jdcc's crontab (I thought it was not an active user anymore, I was wrong):

Me too (based on T183291) ...

0 15 * * * USER=jdcc /home/jdcc/anaconda3/bin/python /home/jdcc/project_monitoring/scripts/check_projects.py > /home/jdcc/project_monitoring/reports/`date -Idate`.html 2>> /home/jdcc/errors.log
0 10 * * * USER=jdcc /home/jdcc/anaconda3/bin/python /home/jdcc/article_monitoring/scripts/check_articles.py > /home/jdcc/article_monitoring/reports/`date -Idate`.html 2>> /home/jdcc/errors.log

In both scripts I don't see any mail command used, so maybe it was configured in a different way on stat1005?

Thu, Feb 7, 9:23 PM · User-Elukey, Analytics
Tbayer added a comment to T215379: Restoring the daily traffic anomaly reports.

I think that the first step, if this job is important, could be to restore the cronjob under a user's crontab that will actively maintain it. The second step is to discuss how important this job is in the longer term, and if a different approach is worth investigating or not (like having data not sent via email but ending up on a SuperSet dashboard for example, etc..).

Agreed, but that second step should probably be a separate ticket. For now we're just focusing on restoring the status quo from before the server migration, while limiting the effort required from everyone involved.

@Tbayer IIRC you had the crontab commands that were executed on stat1005, can you post them in here?

No, I don't have them, and I did not seem to have the necessary user rights to view them when checking on stat1005 earlier. I was assuming that this would be easy for someone with root? (using crontab -u jdcc -l or such)

Thu, Feb 7, 3:31 AM · User-Elukey, Analytics
Tbayer added a comment to T215477: Tag Thanks actions with AMC tag.

@ovasileva @Tbayer clarification question so we are all on the same page. When should we tag "thanks" action with advanced mobile edit? I see two possibilities - when the user who performs the action has amc mode enabled (when performing the action), or do we want to tag the thanks action when the revision the thanks is sent for was made with amc mode enabled?

Thu, Feb 7, 1:54 AM · Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), Advanced Mobile Contributions, Thanks, Growth-Team

Wed, Feb 6

Tbayer added a comment to T212959: Create AMC edit tag.

@ovasileva FYI: current implementation of AMC revision tag is identical to the mobile web edit tag. If the AMC tag doesn't appear on some moderation actions - the mobile web edit tag will not be present.

If we find that the AMC tag is not present for all moderation actions, we should fix both AMC and mobile web edit.

Wed, Feb 6, 8:24 PM · MW-1.33-notes (1.33.0-wmf.14; 2019-01-22), Patch-For-Review, Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), Advanced Mobile Contributions
Tbayer added a comment to T212959: Create AMC edit tag.

@Edtadros - leaving here for confirmation for the edit tag. After that we can move to signoff and create the task for QAing in production

A takeaway from yesterday's conversations (in standup and Slack) seems to be that we should be more specific about what we mean by confirming the edit tag. By now we can safely assume it exists in a technical sense (it shows up at https://readingwebstaging.wmflabs.org/wiki/Special:Tags ); in QA we should verify it is attached to the specific actions listed (including generic edits, as @Edtadros already did for one edit on January 22).

Wed, Feb 6, 8:18 PM · MW-1.33-notes (1.33.0-wmf.14; 2019-01-22), Patch-For-Review, Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), Advanced Mobile Contributions
Tbayer moved T211842: Update Audiences page and Key Product Metrics deck with January 2019 Readers data from Blocked to Doing on the Product-Analytics board.
Wed, Feb 6, 6:23 PM · Product-Analytics
Tbayer created T215379: Restoring the daily traffic anomaly reports.
Wed, Feb 6, 4:09 AM · User-Elukey, Analytics
Tbayer added a comment to T215360: Allow video embeds in formats other than OGV (e.g. WEBM).

Compare also T116515: Enable embedding of media from Wikimedia Commons

Wed, Feb 6, 3:51 AM · Phabricator
Tbayer added a comment to T204143: ReadingDepth events are not being sent in browsers where navigator.sendBeacon should be supported but in practice isn't.

It will also be wise to exclude events that occur (for one entity) at too high a rate, even if marked as user; those probably indicate automated traffic. You can set a high threshold: for example, events from one entity with more than, say, 30 requests per minute are probably automated. User agents in those cases mean very little, and that type of data is just going to add more noise.

Thanks for the suggestion! I'm not going to incorporate it into the data recommendation outcomes from this task for now, considering that this potential issue sounds more like something that would affect EventLogging in general, or at least multiple schemas. (The primary purpose of this task was to determine whether this particular schema shows widespread unexpected behaviour for entire browser families or (ranges of) browser versions.)
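
For reference, the rate-based exclusion suggested in the quoted comment could be sketched roughly as follows. This is only an illustration of the idea and was not adopted in this task: the input format (pairs of entity ID and timestamp), the function name, and the use of calendar-minute buckets are all assumptions; only the threshold of 30 per minute comes from the suggestion.

```python
from collections import Counter
from datetime import datetime

THRESHOLD_PER_MINUTE = 30  # example value from the quoted suggestion


def flag_automated(events, threshold=THRESHOLD_PER_MINUTE):
    """Return the entity IDs with more than `threshold` events in any one calendar minute.

    `events` is an iterable of (entity_id, datetime) pairs (an assumed format).
    """
    per_minute = Counter(
        (entity, ts.replace(second=0, microsecond=0)) for entity, ts in events
    )
    return {entity for (entity, _minute), n in per_minute.items() if n > threshold}
```

Bucketing by calendar minute is the simplest choice; a sliding window would also catch bursts that straddle a minute boundary, at the cost of more bookkeeping.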

Wed, Feb 6, 12:15 AM · User-Ryasmeen, Readers-Web-Backlog (Tracking), Product-Analytics, Reading-analysis

Tue, Feb 5

Tbayer moved T212963: One-time pageview peaks from Triage to Backlog on the Product-Analytics board.

Just so as to not leave this without reply for longer: WMF analysts usually don't have capacity to prioritize thorough investigations of such isolated incidents, in particular if it looks unlikely that they are indicative of a more widespread underlying technical problem. Also, the two peaks mentioned in the task (from August 2018 and March/April 2018, respectively) are outside the range for which we would still have more detailed webrequest data to investigate. On the other hand, the public pageview data (e.g. in the first case above, also note that this was concentrated on mobile web only) already makes it almost certain that this was artificial traffic.

Tue, Feb 5, 10:58 PM · Product-Analytics, Russian-Sites, Reading-analysis
Tbayer added a comment to T193051: Remove all page previews instrumentation code.

@Lea_WMDE Yes, happy to talk about the data side - feel free to post questions here or set up a quick call in case that seems preferable.

Tue, Feb 5, 9:26 PM · Page-Previews, Readers-Web-Backlog
Tbayer closed T214136: event_pageissues Turnilo view contains no valid data from before January 5 as Resolved.
Tue, Feb 5, 9:06 PM · Analytics-Kanban, Page-Issue-Warnings, Analytics
Tbayer updated the task description for T212959: Create AMC edit tag.
Tue, Feb 5, 7:28 PM · MW-1.33-notes (1.33.0-wmf.14; 2019-01-22), Patch-For-Review, Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), Advanced Mobile Contributions
Tbayer added a comment to T212961: Add X-Analytics tag for AMC webrequests.

For the record, to track progress: We received followup questions from Legal on January 24, and replied on the same day. (Last week was taken up by the WMF All Hands meeting.)

Tue, Feb 5, 6:32 PM · Product-Analytics, Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), Advanced Mobile Contributions

Mon, Feb 4

Tbayer created T215201: Document support for MariaDB EL data access in wmfdata.
Mon, Feb 4, 9:04 PM · Product-Analytics
Tbayer added a comment to T202594: wmfdata package can be installed but not imported.

I just encountered the same problem again when executing import wmfdata in a notebook that had already been running for a while. But after restarting the kernel, it worked fine. Perhaps that should be included as a tip in the installation instructions at https://github.com/neilpquinn/wmfdata ?

Mon, Feb 4, 8:44 PM · Contributors-Analysis, Product-Analytics

Wed, Jan 30

Tbayer added a comment to T214935: Examine clickthrough ratios for different page elements of action=history pages.

The analogous result for logged-in users:

  • diff links: ca. 43%
  • All non-special pages (including e.g. user pages): ca. 23%
  • user contributions: 13%
  • old revisions: ca. 10%
  • other action=history views: ca. 6%
  • user pages: 4%
  • user talk pages: 3%
Wed, Jan 30, 1:33 AM · Readers-Web-Backlog, Product-Analytics
Tbayer added a comment to T214935: Examine clickthrough ratios for different page elements of action=history pages.

Augmenting the above result:

  • user pages: 4%
  • user talk pages: 2%
  • user contributions: 11%
Wed, Jan 30, 1:03 AM · Readers-Web-Backlog, Product-Analytics
Tbayer updated the task description for T214935: Examine clickthrough ratios for different page elements of action=history pages.
Wed, Jan 30, 12:26 AM · Readers-Web-Backlog, Product-Analytics

Tue, Jan 29

Tbayer updated the task description for T214935: Examine clickthrough ratios for different page elements of action=history pages.
Tue, Jan 29, 8:14 PM · Readers-Web-Backlog, Product-Analytics
Tbayer updated the task description for T214935: Examine clickthrough ratios for different page elements of action=history pages.
Tue, Jan 29, 7:34 PM · Readers-Web-Backlog, Product-Analytics
Tbayer added a comment to T214935: Examine clickthrough ratios for different page elements of action=history pages.

A first quick-and-dirty result from one day of data:

  • diff links: ca. 32%
  • All non-special pages: ca. 25%
  • old revisions: ca. 16%
  • other action=history views: ca. 13%
Tue, Jan 29, 7:05 PM · Readers-Web-Backlog, Product-Analytics
Tbayer created T214935: Examine clickthrough ratios for different page elements of action=history pages.
Tue, Jan 29, 6:54 PM · Readers-Web-Backlog, Product-Analytics

Mon, Jan 28

Tbayer updated subscribers of T200794: Analyze results of page issues A/B test.

I have been working on the "time spent on each page" question (which, again, has not been one of the success metrics here - rather, this is basically the first test drive of the new Reading Time metrics now that this data has been vetted and explored recently in the research project @Groceryheist has been working on; see https://meta.wikimedia.org/wiki/Research:Reading_time/Draft_Report ).

Mon, Jan 28, 4:46 PM · Readers-Web-Backlog (Tracking), Product-Analytics, Reading-analysis
Tbayer added a comment to T200794: Analyze results of page issues A/B test.
Mon, Jan 28, 4:29 PM · Readers-Web-Backlog (Tracking), Product-Analytics, Reading-analysis

Sat, Jan 26

MusikAnimal awarded T214721: Create #pageviews-anomaly tag a Like token.
Sat, Jan 26, 7:47 PM · Product-Analytics, Project-Admins

Fri, Jan 25

Tbayer added a comment to T214136: event_pageissues Turnilo view contains no valid data from before January 5.

@Tbayer
This is finished! Please check that pageissues and readingdepth contain the data that you expect.
Thanks for spotting this issue!

Pageissues looks great on Turnilo now, thank you!

Fri, Jan 25, 10:52 PM · Analytics-Kanban, Page-Issue-Warnings, Analytics
Tbayer moved T214721: Create #pageviews-anomaly tag from Triage to Tracking on the Product-Analytics board.
Fri, Jan 25, 7:36 PM · Product-Analytics, Project-Admins
Tbayer created T214721: Create #pageviews-anomaly tag.
Fri, Jan 25, 7:36 PM · Product-Analytics, Project-Admins
Tbayer updated the task description for T141385: Investigate adding OpenSearch to Pageviews Analysis.
Fri, Jan 25, 7:01 PM · Tool-Pageviews
Tbayer moved T214449: Estimate the rate of pageviews made using Tor from Triage to Backlog on the Product-Analytics board.
Fri, Jan 25, 6:11 PM · Tor, Product-Analytics

Thu, Jan 24

Tbayer updated subscribers of T214449: Estimate the rate of pageviews made using Tor.

@faidon pointed out that one might also be able to use the data MaxMind offers about this (assuming it is among the datasets we are obtaining from them).

Thu, Jan 24, 5:12 AM · Tor, Product-Analytics

Wed, Jan 23

Tbayer added a comment to T211841: Update Audiences page and Key Product Metrics deck with December 2018 Readers data.

The web data has caught up... over to @chelsyx and @mpopov .

Wed, Jan 23, 10:39 PM · Product-Analytics
Tbayer updated the task description for T211841: Update Audiences page and Key Product Metrics deck with December 2018 Readers data.
Wed, Jan 23, 10:37 PM · Product-Analytics
Tbayer created T214524: LDAP login advice on https://superset.wikimedia.org/ specifies wrong kind of login name.
Wed, Jan 23, 9:09 PM · User-Elukey, Analytics-Kanban, Analytics
Tbayer triaged T214449: Estimate the rate of pageviews made using Tor as Low priority.
Wed, Jan 23, 1:16 AM · Tor, Product-Analytics
Tbayer created T214449: Estimate the rate of pageviews made using Tor.
Wed, Jan 23, 1:16 AM · Tor, Product-Analytics

Tue, Jan 22

Tbayer changed the edit policy for T211197: Build AMC opt-in toggle.
Tue, Jan 22, 11:13 PM · MW-1.33-notes (1.33.0-wmf.16; 2019-02-05), Patch-For-Review, Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), Advanced Mobile Contributions
Tbayer updated the task description for T211197: Build AMC opt-in toggle.
Tue, Jan 22, 11:13 PM · MW-1.33-notes (1.33.0-wmf.16; 2019-02-05), Patch-For-Review, Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), Advanced Mobile Contributions
Tbayer added a comment to T211840: Update Audiences page and Key Product Metrics with November 2018 Readers data.

Over to you, @mpopov & @chelsyx ;)

Tue, Jan 22, 9:43 PM · Product-Analytics
Tbayer updated the task description for T211840: Update Audiences page and Key Product Metrics with November 2018 Readers data.
Tue, Jan 22, 9:42 PM · Product-Analytics
Tbayer updated subscribers of T214136: event_pageissues Turnilo view contains no valid data from before January 5.

[...]

Thanks! Could we include a slightly longer timespan? This is basically data from a one-time experiment that ran from September to November. (The data that is currently still coming in is doing so only at a very low rate, and could actually be discarded if necessary.)

Yes, I configured automatic deletion in Turnilo after 3 months as a default for EventLogging schemas loaded into Druid, for privacy reasons. However, I see that both event_pageissues and event_readingdepth are not privacy-sensitive, so we can keep them for longer.

@Tbayer So, you think we can stop the Druid/Turnilo ingestion of event_pageissues then? And leave the data of the September-November experiment? If so, I will backfill since September and remove the ingestion job.

Yes, that's correct. Thank you!

Tue, Jan 22, 7:57 PM · Analytics-Kanban, Page-Issue-Warnings, Analytics

Mon, Jan 21

Tbayer updated the task description for T213461: Define moderation actions.
Mon, Jan 21, 5:28 PM · Advanced Mobile Contributions
Tbayer updated the task description for T213461: Define moderation actions.
Mon, Jan 21, 5:21 PM · Advanced Mobile Contributions
Tbayer updated the task description for T204143: ReadingDepth events are not being sent in browsers where navigator.sendBeacon should be supported but in practice isn't.
Mon, Jan 21, 6:54 AM · User-Ryasmeen, Readers-Web-Backlog (Tracking), Product-Analytics, Reading-analysis
Tbayer closed T204143: ReadingDepth events are not being sent in browsers where navigator.sendBeacon should be supported but in practice isn't as Resolved.

Thanks again to everyone who had weighed in with various insights, enabling us to launch the page issues A/B test without much further delay back in October!

Mon, Jan 21, 6:04 AM · User-Ryasmeen, Readers-Web-Backlog (Tracking), Product-Analytics, Reading-analysis
Tbayer closed T204143: ReadingDepth events are not being sent in browsers where navigator.sendBeacon should be supported but in practice isn't, a subtask of T200792: [EPIC] Run A/B test on page issues (Farsi, Japanese, Russian, English), as Resolved.
Mon, Jan 21, 6:04 AM · Epic, Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q2), Patch-For-Review, Page-Issue-Warnings, User-notice, Wikimedia-Site-requests
Tbayer added a comment to T204143: ReadingDepth events are not being sent in browsers where navigator.sendBeacon should be supported but in practice isn't.

I'm about to post the more detailed summary of the finding and data analysis recommendations that resulted from the above discussion, and then close this task, but just to follow up on some interesting remarks by @Krinkle:

...

Mobile Safari added support for Navigation Timing in iOS 9.0, not 10.x or 11.x. It was previously available on iOS 8.0, but Apple removed it in 8.1 due to problems with their implementation. It was back in 9.0 and has been available since.

This is a discrepancy we weren't able to resolve. (@Krinkle was referring here to the fact that based on our data in T204143#4653278, almost all iOS devices with version 11.3 and newer are sending ReadingDepth events, and almost all with older versions don't.)
We have been circumventing this issue by conservatively excluding data from all versions prior to 11.3; see T204143#4661935.
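
A version cutoff like this one could be applied along the following lines. This is a hedged sketch, not the filter from the linked comment: it assumes the OS version is available as a dotted string such as "11.2.6", and the function names are hypothetical.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '11.2.6' into a tuple of ints.

    Raises ValueError if any component is not numeric.
    """
    return tuple(int(part) for part in version.split("."))


def keep_ios_event(os_version: str, minimum=(11, 3)) -> bool:
    """Keep only events from iOS versions at or above the cutoff (11.3 by default)."""
    return parse_version(os_version) >= minimum
```

Comparing tuples of integers avoids the pitfall of string comparison, where "11.10" would sort before "11.3".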

Mon, Jan 21, 5:38 AM · User-Ryasmeen, Readers-Web-Backlog (Tracking), Product-Analytics, Reading-analysis
Tbayer added a comment to P7948 Shell script to launch Hive/Beeline query and send an email with the result.

To use this (on e.g. stat1004), save it as a bash script file (e.g. using nano myhivequery.bash), make the file executable, then launch it in a screen session: screen ./myhivequery.bash

Mon, Jan 21, 4:32 AM

Jan 18 2019

Tbayer added a comment to T214136: event_pageissues Turnilo view contains no valid data from before January 5.

@Tbayer Thanks for the heads up. I think I know what happened.

On January 5th, I deployed a new feature of HiveToDruid, the job that loads data from EventLogging to Druid, and Turnilo. And together with it, I deployed another fix that was asked by a couple people: unifying the 2 confusing metrics that the job would generate ("Count" and "Event Count") into one single metric.

Yes, I was among those people and greatly appreciate that fix ;)

To do that, I had to rename "Event Count" to "Count", to override the latter, which is added by default by Turnilo. When I finished the tests and the deployment, both metrics were still visible for the time period prior to Jan 5th, and thus all data was queryable. However, I believe after some days have passed, Turnilo is not looking at the old data any more when applying introspection, and has dropped the old "Event Count" metric. Without it, the old data, even if it's still there, can not be queried from Turnilo...

I see - does that also explain the "No data was returned" error in Superset?

I will launch backfill jobs to load the last 3 months of data with the new code. This will fix the problem, but might take a couple days.

Thanks! Could we include a slightly longer timespan? This is basically data from a one-time experiment that ran from September to November. (The data that is currently still coming in is doing so only at a very low rate, and could actually be discarded if necessary.)

The 2 previous metrics were confusing, and only the latter was used. The only way that I found to remove the non-useful "Count" metric, which is added by default by Turnilo, was overriding it with the EventCount value

Jan 18 2019, 9:55 PM · Analytics-Kanban, Page-Issue-Warnings, Analytics
Tbayer added a comment to T200794: Analyze results of page issues A/B test.

Regarding the question "where do people go after the modal", here are the clickthrough rates for four kinds of links shown in the modal. Unsurprisingly, the "X" to close the modal is the most frequently used one. Internal links (e.g. to https://en.m.wikipedia.org/wiki/Wikipedia:Neutral_point_of_view from the POV template modal on enwiki) are fairly popular too, with large variation between the five wikis in the test.

Jan 18 2019, 5:16 PM · Readers-Web-Backlog (Tracking), Product-Analytics, Reading-analysis
Tbayer added a comment to T200794: Analyze results of page issues A/B test.

Here is some data on the question "Does clickthrough depend on the severity of each issue?", for page-level issues in the new design on enwiki (recall that the severity level data is assumed to be only valid on enwiki, and that the old design doesn't reveal the severity level before clicking on the issue notice).

Jan 18 2019, 8:00 AM · Readers-Web-Backlog (Tracking), Product-Analytics, Reading-analysis
Tbayer created T214136: event_pageissues Turnilo view contains no valid data from before January 5.
Jan 18 2019, 2:45 AM · Analytics-Kanban, Page-Issue-Warnings, Analytics
Tbayer added a comment to T210687: Bug: can't make a YoY time series chart in Superset.

OK, that works, thanks (here is the fixed link for others who might like to try it out too).

Jan 18 2019, 1:04 AM · Analytics-Kanban, Product-Analytics, Analytics

Jan 17 2019

Tbayer updated subscribers of T208795: Measure Google Translate Pageview Impact.

To get a sense of proportion, it seems there were 460k of those Google-translated views overall on January 9 - quite a considerable number.

Jan 17 2019, 10:26 PM · Patch-For-Review, Reading-Admin
Tbayer added a comment to T210687: Bug: can't make a YoY time series chart in Superset.

For some reason, that link only shows me data up to March - even after changing the metric to SUM(view_count) (which I guess was intended) and hitting "Run query" again (new link):

Jan 17 2019, 7:41 PM · Analytics-Kanban, Product-Analytics, Analytics
Tbayer added a comment to T212516: WikimediaEvents do not track logged in beta users on Special:MobileOptions.
Jan 17 2019, 4:41 PM · Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), MW-1.33-notes (1.33.0-wmf.12; 2019-01-08), Advanced Mobile Contributions, MobileFrontend, MediaWiki-extensions-WikimediaEvents
Tbayer added a comment to T212959: Create AMC edit tag.

@ovasileva @Tbayer - just out of curiosity - both advanced mobile edit and mobile web edit are redundant. Do we want to keep both? (Keeping both is much easier; showing only one - mobile web edit or advanced mobile edit - is a bit more complex.) I'm just thinking aloud. The current implementation will add the advanced mobile edit tag every time AMC mode is on. That edit will be marked with both advanced mobile edit and mobile web edit. Is that OK?

Yes, this obviously adds redundancy, but the alternative would create a lot of complications, e.g. for metrics that are based on the existing tags. (Similarly, when we added OS-specific edit tags for the apps last year in T197175, we left the existing tags in place, leading to redundancy such as one edit being tagged with Mobile edit, Mobile app edit, iOS app edit.)

Also, do we want to tag desktop edits with advanced mobile edit? Most probably not. But if we decide to tag every edit when AMC mode is on, then we will be able to find out how many desktop edits AMC users make (those edits will be tagged with advanced mobile edit but not with the mobile web edit tag).

No, only edits made using the actual AMC edit interface should be tagged as such.

Jan 17 2019, 2:24 PM · MW-1.33-notes (1.33.0-wmf.14; 2019-01-22), Patch-For-Review, Readers-Web-Backlog (Readers-Web-Kanbanana-Board-2018-19-Q3), Advanced Mobile Contributions

Jan 15 2019

Tbayer added a comment to T213602: virtualpageview_hourly lacks data from December 17 on.

Great, thank you @Ottomata and everyone else for solving this so quickly!

Jan 15 2019, 1:08 PM · Analytics-Kanban, Analytics

Jan 14 2019

Tbayer added a comment to T212172: Provide feature parity between the wiki replicas and the Analytics Data Lake.

It seems we have collected enough use cases already to facilitate the present discussion, but to briefly sketch another current example:

Jan 14 2019, 4:01 PM · User-Elukey, Epic, Analytics, Contributors-Analysis, Product-Analytics