Page MenuHomePhabricator

Double-check language switching instrumentation
Open, HighPublic5 Estimated Story Points

Description

Background

In parallel to beginning to iterate on the language button based on the results from T289200: Analyze behavior change on logged-out users before/after turn-on new language switch, we would also like to double-check our instrumentation to confirm that events are being logged as expected

Acceptance criteria

  • Review language instrument and ensure events are logged correctly for both clicks to the sidebar as well as the new language button
  • Review SQL queries and check the expectations around the things they are testing match the code

Event Timeline

ovasileva created this task.
Jdlrobson updated the task description. (Show Details)
Jdlrobson added a subscriber: cjming.

@Jdlrobson here are the queries I used to pull data. Please review whether the queries matches with the schema code. If there is any mismatch, we can schedule a meeting next week to discuss if your time allowed.

Query for clicks on language links after clicking new button

query_language_clicks_new <- 
"
-- sessions where new lang button was selected
WITH new_button AS (
SELECT
    MIN(TO_DATE(dt)) as button_date,
    event.web_session_id as session_id,
    event.context as open_context,
    wiki as wiki
FROM event.universallanguageselector
WHERE
    year = 2021 and month in (6,7,8) 
    AND (CONCAT(year,LPAD(month,2,'0'),LPAD(day,2,'0')) between '20210622' and '20210831')
    AND wiki in ('frwiktionary', 'hewiki', 'ptwikiversity', 'frwiki', 
    'euwiki', 'fawiki', 'ptwiki', 'kowiki', 'trwiki', 'srwiki', 'bnwiki', 'dewikivoyage', 'vecwiki')
    AND useragent.is_bot = false
-- only anonymous users
    AND event.isanon = true
    AND event.action = 'compact-language-links-open'
    AND event.context = 'header'
    AND event.skinVersion = 'latest'
GROUP BY 
    event.web_session_id,
    event.context,
    event.isanon,
    wiki
),

lang_switches AS (
  SELECT
    TO_DATE(dt) as switch_date,
    event.web_session_id as session_id,
    event.context as switch_context,
    wiki as wiki
FROM event.universallanguageselector
WHERE
    year = 2021 and month in (6,7,8) 
    AND (CONCAT(year,LPAD(month,2,'0'),LPAD(day,2,'0')) between '20210622' and '20210831')
    AND wiki in ('frwiktionary', 'hewiki', 'ptwikiversity', 'frwiki', 
    'euwiki', 'fawiki', 'ptwiki', 'kowiki', 'trwiki', 'srwiki', 'bnwiki', 'dewikivoyage', 'vecwiki')
    AND useragent.is_bot = false
-- only anonymous users
    AND event.isanon =true
    AND event.action = 'language-change'
    AND event.context='content-language-switcher'
    AND event.skinVersion = 'latest'
)

SELECT
    new_button.button_date,
    lang_switches.switch_date,
    new_button.session_id,
    new_button.wiki,
    new_button.open_context,
-- sessions with lang switch that occured after button clicks
    IF(lang_switches.session_id IS NOT NULL AND switch_date >= button_date, 1, 0) AS language_switch,
    lang_switches.switch_context
FROM new_button
LEFT JOIN lang_switches ON
    new_button.session_id = lang_switches.session_id AND
    new_button.wiki = lang_switches.wiki 
"

Query for clicks on language links on the sidebar.

query_old_language_link <- 
"
SELECT
    TO_DATE(dt) AS `date`,
    wiki,
    event.web_session_id,
    event.usereditbucket,
    event.timetochangelanguage,
    event.interfacelanguage,
    event.contentlanguage,
    event.selectedinterfacelanguage,
    Count(*) AS n_events
FROM event.universallanguageselector
WHERE
    year = 2021 and month in (6,7,8) 
    AND (CONCAT(year,LPAD(month,2,'0'),LPAD(day,2,'0')) between '20210601' and '20210628')
    AND event.context = 'languages-list' and event.action = 'language-change' 
    AND event.skinVersion = 'latest' 
    AND wiki in ('frwiktionary', 'hewiki', 'ptwikiversity', 'frwiki', 
    'euwiki', 'fawiki', 'ptwiki', 'kowiki', 'trwiki', 'srwiki', 'bnwiki', 'dewikivoyage', 'vecwiki')
    AND useragent.is_bot = false 
-- only anonymous users
    AND event.isanon =true
GROUP BY
    TO_DATE(dt),
    wiki,
    event.web_session_id,
    event.usereditbucket,
    event.timetochangelanguage,
    event.interfacelanguage,
    event.contentlanguage,
    event.selectedinterfacelanguage
"

Query for clicks on language links after N-More window shows

query_n_more_lang_switch <- 
"
-- sessions where N-more lang button was selected
WITH button AS (
SELECT
    MIN(TO_DATE(dt)) as button_date,
    event.web_session_id as session_id,
    event.context as open_context,
    wiki as wiki
FROM event.universallanguageselector
WHERE
    year = 2021 and month in (6,7,8) 
    AND (CONCAT(year,LPAD(month,2,'0'),LPAD(day,2,'0')) between '20210601' and '20210628')
    AND wiki in ('frwiktionary', 'hewiki', 'ptwikiversity', 'frwiki', 
    'euwiki', 'fawiki', 'ptwiki', 'kowiki', 'trwiki', 'srwiki', 'bnwiki', 'dewikivoyage', 'vecwiki')
    AND useragent.is_bot = false
-- only anonymous users
    AND event.isanon =true
    AND event.action = 'compact-language-links-open'
    AND event.context = 'other'
    AND event.skinVersion = 'latest'
GROUP BY 
    event.web_session_id,
    event.context,
    wiki
),

lang_switches AS (
  SELECT
    TO_DATE(dt) as switch_date,
    event.web_session_id as session_id,
    event.isanon,
    event.context as switch_context,
    wiki as wiki
FROM event.universallanguageselector
WHERE
    year = 2021 and month in (6,7,8) 
    AND (CONCAT(year,LPAD(month,2,'0'),LPAD(day,2,'0')) between '20210601' and '20210628')
    AND wiki in ('frwiktionary', 'hewiki', 'ptwikiversity', 'frwiki', 
    'euwiki', 'fawiki', 'ptwiki', 'kowiki', 'trwiki', 'srwiki', 'bnwiki', 'dewikivoyage', 'vecwiki')
    AND useragent.is_bot = false
-- only anonymous users
    AND event.isanon =true
    AND event.action = 'language-change'
    AND event.context='content-language-switcher'
    AND event.skinVersion = 'latest'
)

SELECT
    button.button_date,
    lang_switches.switch_date,
    button.session_id,
    button.wiki,
    button.open_context,
-- sessions with lang switch that occured after button clicks
    IF(lang_switches.session_id IS NOT NULL AND switch_date >= button_date , 1, 0) AS language_switch,
    lang_switches.switch_context
FROM button
LEFT JOIN lang_switches ON
    button.session_id = lang_switches.session_id AND
    button.wiki = lang_switches.wiki
"
Aklapper renamed this task from Double-check language switching intrumentation to Double-check language switching instrumentation.Tue, Sep 28, 4:28 PM

Posted below are the event stream logs from my local eventlogging server to see what event data is captured for logged-in (even/odd user ids) and anonymous users clicking on either the language button in the header or a language link in the sidebar.

Missing data is due to running things locally (i.e. selectedInterfaceLanguage). I followed the QA steps in T280825.

Some observations between @jwang's queries and the event log json I'm seeing locally:

  • 1st query query_language_clicks_new:
    • In the case of the language button in header (when event.action = 'compact-language-links-open'), event.context is logged as "other" while in the query event.context = 'header'.
      • Does the new_button query need to be updated with the condition AND event.context = 'other'? Fwiw, button in query_n_more_lang_switch looks correct.
    • lang_switches select lgtm
  • 2nd query query_old_language_link seems accurate.
  • 3rd query query_n_more_lang_switch seems good too.

I'll keep digging but this is what I've encountered so far.

Logged in, even user id, clicked language in header button (default)
event: {
  "event": {
    "version": 3,
    "token": "",
    "contentLanguage": "en",
    "interfaceLanguage": "en",
    "web_session_id": "5e8644cdbfa039eda26f",
    "isAnon": false,
    "skin": "vector",
    "skinVersion": "latest",
    "action": "compact-language-links-open",
    "context": "other",
    "userEditBucket": "0 edits"
  },
  "schema": "UniversalLanguageSelector",
  "webHost": "localhost",
  "wiki": "my_wiki",
  "$schema": "/analytics/legacy/universallanguageselector/1.4.0",
  "client_dt": "2021-10-07T19:34:07.313Z",
  "meta": {
    "stream": "eventlogging_UniversalLanguageSelector",
    "domain": "localhost",
    "id": "1381b4ef-4442-4a81-b984-540e4325b52c",
    "dt": "2021-10-07T19:34:08.589Z",
    "request_id": "8a318130-27a5-11ec-a93e-8936a01f4b42"
  },
  "dt": "2021-10-07T19:34:08.589Z",
  "http": {
    "request_headers": {
      "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.54 Safari/537.36"
    }
  }
}
Logged in, even user id, clicked language in header button (default) and clicked language link (user preference set to use compact language list)
event: {
  "event": {
    "version": 3,
    "token": "",
    "contentLanguage": "en",
    "interfaceLanguage": "en",
    "web_session_id": "0560fbfbf12184203f65",
    "isAnon": false,
    "skin": "vector",
    "skinVersion": "latest",
    "action": "language-change",
    "context": "content-language-switcher",
    "selectedInterfaceLanguage": "mg",
    "timeToChangeLanguage": 24846.300048828125,
    "userEditBucket": "0 edits"
  },
  "schema": "UniversalLanguageSelector",
  "webHost": "localhost",
  "wiki": "my_wiki",
  "$schema": "/analytics/legacy/universallanguageselector/1.4.0",
  "client_dt": "2021-10-07T22:49:18.637Z",
  "meta": {
    "stream": "eventlogging_UniversalLanguageSelector",
    "domain": "localhost",
    "id": "1c1c5bcd-82dd-4272-a9e8-1d1ed4c825f4",
    "dt": "2021-10-07T22:49:19.035Z",
    "request_id": "ce5231a0-27c0-11ec-911c-8ff49b5ee1ee"
  },
  "dt": "2021-10-07T22:49:19.035Z",
  "http": {
    "request_headers": {
      "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.54 Safari/537.36"
    }
  }
}
Logged in, even user id, clicked on language in sidebar (using query param ?languageinheader=0)
event: {
  "event": {
    "version": 3,
    "token": "",
    "contentLanguage": "en",
    "interfaceLanguage": "en",
    "web_session_id": "5e8644cdbfa039eda26f",
    "isAnon": false,
    "skin": "vector",
    "skinVersion": "latest",
    "action": "language-change",
    "context": "languages-list",
    "selectedInterfaceLanguage": "",
    "timeToChangeLanguage": 45491.10009765625,
    "userEditBucket": "0 edits"
  },
  "schema": "UniversalLanguageSelector",
  "webHost": "localhost",
  "wiki": "my_wiki",
  "$schema": "/analytics/legacy/universallanguageselector/1.4.0",
  "client_dt": "2021-10-07T19:36:35.740Z",
  "meta": {
    "stream": "eventlogging_UniversalLanguageSelector",
    "domain": "localhost",
    "id": "530aad71-a613-4805-bc2e-ffa6a53a26fa",
    "dt": "2021-10-07T19:36:36.749Z",
    "request_id": "e2a95cc0-27a5-11ec-a93e-8936a01f4b42"
  },
  "dt": "2021-10-07T19:36:36.749Z",
  "http": {
    "request_headers": {
      "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.54 Safari/537.36"
    }
  }
}
Logged in, odd user id, clicked on language in header (using query param ?languageinheader=1)
event: {
  "event": {
    "version": 3,
    "token": "",
    "contentLanguage": "en",
    "interfaceLanguage": "en",
    "web_session_id": "5e8644cdbfa039eda26f",
    "isAnon": false,
    "skin": "vector",
    "skinVersion": "latest",
    "action": "compact-language-links-open",
    "context": "other",
    "userEditBucket": "0 edits"
  },
  "schema": "UniversalLanguageSelector",
  "webHost": "localhost",
  "wiki": "my_wiki",
  "$schema": "/analytics/legacy/universallanguageselector/1.4.0",
  "client_dt": "2021-10-07T20:37:02.959Z",
  "meta": {
    "stream": "eventlogging_UniversalLanguageSelector",
    "domain": "localhost",
    "id": "6a170446-e337-44e3-918c-415a200709b9",
    "dt": "2021-10-07T20:37:03.966Z",
    "request_id": "54a6c0d0-27ae-11ec-a93e-8936a01f4b42"
  },
  "dt": "2021-10-07T20:37:03.966Z",
  "http": {
    "request_headers": {
      "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.54 Safari/537.36"
    }
  }
}
Logged in, odd user id, clicked on language in sidebar (default)
event: {
  "event": {
    "version": 3,
    "token": "",
    "contentLanguage": "en",
    "interfaceLanguage": "en",
    "web_session_id": "5e8644cdbfa039eda26f",
    "isAnon": false,
    "skin": "vector",
    "skinVersion": "latest",
    "action": "language-change",
    "context": "languages-list",
    "selectedInterfaceLanguage": "",
    "timeToChangeLanguage": 5677.400146484375,
    "userEditBucket": "0 edits"
  },
  "schema": "UniversalLanguageSelector",
  "webHost": "localhost",
  "wiki": "my_wiki",
  "$schema": "/analytics/legacy/universallanguageselector/1.4.0",
  "client_dt": "2021-10-07T19:39:51.520Z",
  "meta": {
    "stream": "eventlogging_UniversalLanguageSelector",
    "domain": "localhost",
    "id": "b52f3fd7-bd9f-47d5-8bf1-1fa6f20e619a",
    "dt": "2021-10-07T19:39:52.524Z",
    "request_id": "575a41b0-27a6-11ec-a93e-8936a01f4b42"
  },
  "dt": "2021-10-07T19:39:52.524Z",
  "http": {
    "request_headers": {
      "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.54 Safari/537.36"
    }
  }
}
Anonymous, clicked on language in sidebar
event: {
  "event": {
    "version": 3,
    "token": "",
    "contentLanguage": "en",
    "interfaceLanguage": "en",
    "web_session_id": "5e8644cdbfa039eda26f",
    "isAnon": true,
    "skin": "vector",
    "skinVersion": "latest",
    "action": "language-change",
    "context": "languages-list",
    "selectedInterfaceLanguage": "",
    "timeToChangeLanguage": 56165.60009765625
  },
  "schema": "UniversalLanguageSelector",
  "webHost": "localhost",
  "wiki": "my_wiki",
  "$schema": "/analytics/legacy/universallanguageselector/1.4.0",
  "client_dt": "2021-10-07T19:42:13.846Z",
  "meta": {
    "stream": "eventlogging_UniversalLanguageSelector",
    "domain": "localhost",
    "id": "4d274db5-5dd1-41a6-93be-517e1edd3d2c",
    "dt": "2021-10-07T19:42:14.852Z",
    "request_id": "ac2fc930-27a6-11ec-a93e-8936a01f4b42"
  },
  "dt": "2021-10-07T19:42:14.852Z",
  "http": {
    "request_headers": {
      "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.54 Safari/537.36"
    }
  }
}

Hi @cjming

According to the Instrumentation Spec, new button clicks are identified by event.context = 'header'. event.context = 'other' is the value for the old link clicks on the sidebar.

According to the Instrumentation Spec, new button clicks are identified by event.context = 'header'. event.context = 'other' is the value for the old link clicks on the sidebar.

hi @jwang - looks like we need to investigate this more. You're right that the spec indicates event.context should be header for your queries on the new language button but my experience in testing what event data is actually getting logged repeatedly returns other when clicking on that link. I'll do some more digging to see how that value is set and report back.

Change 728662 had a related patch set uploaded (by Clare Ming; author: Clare Ming):

[mediawiki/extensions/WikimediaEvents@master] Update ULS compact language link log trigger id

https://gerrit.wikimedia.org/r/728662

So I think what's happening here has to do with when the Use a compact language list is set in preferences. Clicking on that compact language link , regardless of whether it's in the header or in the sidebar, sends the context as other in eventlogging.

Compact language link in sidebar:

Screen Shot 2021-10-08 at 2.40.03 PM.png (1×4 px, 2 MB)

Compact language list in header:

Screen Shot 2021-10-08 at 2.41.08 PM.png (2×4 px, 1 MB)

Seems like the ULS js in WikimediaEvents isn't catching the right trigger:

function ulsCompactLanguageLinksOpen( $trigger ) {
	log( {
		action: 'compact-language-links-open',
		context: skin === 'vector' && $trigger.is( '#p-lang-btn-label' ) ? 'header' : 'other'
	} );
}

Note that the trigger for the log function is the id for the h3 label rather than the id of the input element which has an id of #p-lang-btn-checkbox. So context will always return as 'other' instead of 'header'.

Screen Shot 2021-10-08 at 3.14.03 PM.png (2×5 px, 2 MB)

When I replace the id so that

context: skin === 'vector' && $trigger.is( '#p-lang-btn-checkbox' ) ? 'header' : 'other'

event.context gets set correctly as 'header':

Screen Shot 2021-10-08 at 3.28.34 PM.png (1×5 px, 1 MB)

I submitted a patch for above change.

As for next steps, do we continue gathering data (assuming above change is approved) ?

Nice sleuthing @cjming ! If I'm understanding correctly, wouldn't this bug mean that there would be no context: header events?

It looks like we moved the mw-interlanguage-selector class from the button to the checkbox in https://gerrit.wikimedia.org/r/c/mediawiki/skins/Vector/+/713008/9/includes/SkinVector.php (merged September 3rd), and the A/B test finished on July 20th. Are there similar problems with the logging with the code prior to that change?

Did we look at the events in ui.dropdown-p-lang-btn DesktopClickTracking? Those should have been logged for all users at 20% sampling. We could look at the logged-in events there, to see if they support the hypothesis being built here (ie. is there more events to that button in DesktopClickTracking than click events in UniversalLanguageSelector).

Change 728662 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] Update ULS compact language link log trigger id

https://gerrit.wikimedia.org/r/728662

If I'm understanding correctly, wouldn't this bug mean that there would be no context: header events?

that's my understanding - per your comment since the time that the class change occurred (9/3)

It looks like we moved the mw-interlanguage-selector class from the button to the checkbox in https://gerrit.wikimedia.org/r/c/mediawiki/skins/Vector/+/713008/9/includes/SkinVector.php (merged September 3rd), and the A/B test finished on July 20th. Are there similar problems with the logging with the code prior to that change?

not sure - I'll keep digging to give you a definitive answer

Did we look at the events in ui.dropdown-p-lang-btn DesktopClickTracking? Those should have been logged for all users at 20% sampling. We could look at the logged-in events there, to see if they support the hypothesis being built here (ie. is there more events to that button in DesktopClickTracking than click events in UniversalLanguageSelector).

I didn't notice anything getting logged for that but I'll confirm

Did we look at the events in ui.dropdown-p-lang-btn DesktopClickTracking? Those should have been logged for all users at 20% sampling. We could look at the logged-in events there, to see if they support the hypothesis being built here (ie. is there more events to that button in DesktopClickTracking than click events in UniversalLanguageSelector).

When I turned on DesktopWebUIActionsTracking locally, the only other event I see getting logged besides ULS would not have shown up in the queries:

Screen Shot 2021-10-15 at 4.04.32 PM.png (2×4 px, 2 MB)

Notably, no event is triggered by DesktopWebUIActionsTracking when I click on the compact language link button in the sidebar.

It looks like we moved the mw-interlanguage-selector class from the button to the checkbox in https://gerrit.wikimedia.org/r/c/mediawiki/skins/Vector/+/713008/9/includes/SkinVector.php (merged September 3rd), and the A/B test finished on July 20th. Are there similar problems with the logging with the code prior to that change?

I delved into the git history of the ULS js in WikimediaEvents. It seems like the context for the compact language link in the header was added in early June (merged 6/1, backported 6/8). This is well before the A/B test was deployed (6/22)

Also went through git history of SkinVector.php around that same time - I haven't come across a commit (though there are a lot of them and I very well could be overlooking something) that might account for any possible explanation of the unanticipated data results.

Not sure if there are other places to look. I can spend more time digging but it feels a bit like looking for a needle in a haystack. If there are no objections, I'm inclined to send this to sign off.