
MinT for Readers: Implement instrumentation for key events
Open, In Progress, MediumPublic

Description

As part of the work on MinT for Wikipedia Readers MVP (T359072), a schema for instrumentation has been defined using the new data platform. This ticket proposes implementing the instrumentation for a set of key events. You can check the documentation for more details on how to use the new platform in the implementation.

From the designed schema (T341185), the key events are the following:

  • session initiation
  • user searches for a topic
  • user selects an article
  • user clicks to view the automatic translation
  • user clicks to view human-generated content
  • user closes the automatic translation view

Derived Requirement

Implement the instrumentation for key user events related to the MinT MVP for Wikipedia Readers using the new data platform. The key events to be tracked are:

  • Session initiation
  • User searches for a topic
  • User selects an article
  • User clicks to view automatic translation
  • User clicks to view human-generated content
  • User closes the automatic translation view

All events should log the required properties to ensure proper tracking and analytics.

BDD

Feature: MinT MVP for Wikipedia Readers instrumentation

Scenario: Session initiation
  Given a user visits the Wikipedia site
  When the user lands on the automatic translation page
  Then the session initiation event should be logged

Scenario: User searches for a topic
  Given the user is on the automatic translation page
  When the user searches for a topic
  Then the "search_topic" event should be logged

Scenario: User selects an article
  Given the user has searched for a topic
  When the user selects an article from the search results
  Then the "select_article" event should be logged

Scenario: User clicks to view automatic translation
  Given the user has selected an article
  When the user clicks to view the automatic translation
  Then the "view_automatic_translation" event should be logged

Scenario: User clicks to view human-generated content
  Given the user is on the automatic translation page
  When the user clicks to view the human-generated content
  Then the "view_human_content" event should be logged

Test Steps

Pre-requisite:

- Ensure debug logs are enabled by executing the following code in the browser developer console:

mw.loader.using( 'mediawiki.api' ).then( function () {
  new mw.Api().saveOption( 'eventlogging-display-web', '1' );
});

Test Case 1: Session initiation

Visit a Wikipedia site (e.g., https://es.wikipedia.org).
Open the Automatic Translation special page.
Expected Result:
✅❓❌⬜ AC1: Confirm that a "session initiation" event is logged in the console, containing the relevant session properties.

Test Case 2: User searches for a topic

On the Automatic Translation page, search for a topic.
Expected Result:
✅❓❌⬜ AC2: Confirm that a "search_topic" event is logged in the console, including properties like action: "search", action_source: "topic_search".

Test Case 3: User selects an article

After searching for a topic, select an article from the search results.
Expected Result:
✅❓❌⬜ AC3: Confirm that a "select_article" event is logged, with properties like action: "click", action_source: "article_selection".

Test Case 4: User clicks to view automatic translation

After selecting an article, click to view the automatic translation.
Expected Result:
✅❓❌⬜ AC4: Confirm that a "view_automatic_translation" event is logged, including properties such as action: "click", action_source: "automatic_translation_card".

Test Case 5: User clicks to view human-generated content

Click on the option to view human-generated content (if available).
Expected Result:
✅❓❌⬜ AC5: Confirm that a "view_human_content" event is logged, with properties like action: "click", action_source: "human_content".

QA Results - ES Automatic Translation PROD

Event Timeline


Hey @KCVelaga! I have some questions regarding the instrumentation of MinT for Wikipedia Readers MVP:

  1. could you please provide me the stream name for these events?
  2. given that the action_context should be a string, I suppose that the auto_translation_card case should be a string in this format: sourceLanguage;targetLanguage, where sourceLanguage and targetLanguage are the values of the source and target language codes, separated by a semi-colon. Is my assumption correct?
  3. The "users searches for a topic" event should be a "click" action according to the spec. My assumption would be that this event will be triggered when the user actually types a query inside the search input. Am I missing something here?

Change #1029238 had a related patch set uploaded (by Nik Gkountas; author: Nik Gkountas):

[mediawiki/extensions/ContentTranslation@master] MinT MVP: Add basic instrumentation

https://gerrit.wikimedia.org/r/1029238

ngkountas changed the task status from Open to In Progress. May 9 2024, 8:17 AM
ngkountas claimed this task.

Hi @ngkountas

  1. Stream name: mediawiki.mint_for_readers
  2. Regarding action_context for auto_translation_card: yes, you are right, it should be language codes separated by a semi-colon. In the spec document, @phuedx proposed using a page object (a new schema fragment?), which can also capture page_id if required, as something like, source_page: {lang: 'en', 'id': 1234}. Sam: can you explain more about that here?
  3. For the user searching for a topic: you are right, this should be triggered when the user types something in the search input. Thinking about this again, click doesn't make sense here; instead, we can restructure it as action: search
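The action_context format agreed on in point 2 (source and target language codes separated by a semi-colon) can be sketched as a pair of small helpers. This is only an illustration: the function names buildActionContext and parseActionContext are hypothetical and are not taken from the ContentTranslation code.

```javascript
// Hypothetical helpers; names are illustrative and not part of ContentTranslation.
const SEPARATOR = ';';

// Build the action_context string, e.g. "en;es".
function buildActionContext( sourceLanguage, targetLanguage ) {
	if ( !sourceLanguage || !targetLanguage ) {
		throw new Error( 'Both language codes are required' );
	}
	return sourceLanguage + SEPARATOR + targetLanguage;
}

// Split an action_context string back into its two language codes.
function parseActionContext( actionContext ) {
	const parts = actionContext.split( SEPARATOR );
	if ( parts.length !== 2 ) {
		throw new Error( 'Unexpected action_context: ' + actionContext );
	}
	return { sourceLanguage: parts[ 0 ], targetLanguage: parts[ 1 ] };
}
```

Having to parse the string back at analysis time is the cost of packing two values into one field.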

Sure.

The MP Web base schema has a top-level page object, which has a number of auto-fillable properties about the current page. My question is/was: Are we only talking about a target language or are we actually talking about a target page? If it is the latter, perhaps we could create a schema fragment with a single property, target_page, and perhaps create an API for filling out that property cleanly? For example:

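namespace mw {
  interface eventLog {
    // Proposed: build the page properties for the given title.
    createPage( title: mw.Title ): Partial<Page>;
    // Proposed: same, with caller-supplied additional properties.
    createPage( title: mw.Title, additionalProperties: Partial<Page> ): Partial<Page>;
  }
}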

Thanks @phuedx !

Thinking beyond just this event, this is a use case that will pop up across various schemas related to the Language team. The ability to capture both language code and page id for both source and target will be beneficial in the longer term. The approach you suggested sounds good. It will also make analysis easier, compared to extracting values from a string in action_context.

Thanks for your input @phuedx! To clarify, for the auto_translation_card action_source it makes sense to use a source_page object for the source language and the source page id. About the target page, I don't believe that the target page id offers a lot of value to be logged, as any translation can be uniquely identified by the source language, the target language and the source page id (or source page title). The same is true for translations in Content and Section Translation applications that are developed/maintained by the Language Team. It's also worth noting that for some events inside Content/Section Translation app (e.g. dashboard_open event) only the source-target language pair is needed, and no page id is used.

Finally, could you provide an example of how a page object can be used with mw.eventLog.submitInteraction or mw.eventLog.submitClick?

@KCVelaga_WMF thank you for your answers. About the "users searches for a topic", if we change the action to search we will also need a schemaId to be used with the mw.eventLog.submitInteraction method. Do you have any idea what value should be used as schemaId?

@ngkountas I am not sure about schemaId, I will check and get back to you.
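To make the schemaId question concrete, here is a minimal sketch of a wrapper that routes click actions to submitClick and everything else to submitInteraction. It assumes the signatures submitInteraction( streamName, schemaId, action, interactionData ) and submitClick( streamName, interactionData ), which should be checked against the Metrics Platform documentation. The wrapper name logMintEvent is hypothetical, and eventLogStub is a stand-in for mw.eventLog so the sketch runs outside MediaWiki. The schema ID is the one that appears later in this thread; the stream name is the one given above.

```javascript
// Stand-in for mw.eventLog so the sketch runs outside MediaWiki; it only records calls.
const recordedCalls = [];
const eventLogStub = {
	submitInteraction: function ( streamName, schemaId, action, interactionData ) {
		recordedCalls.push( { method: 'submitInteraction', streamName, schemaId, action, interactionData } );
	},
	submitClick: function ( streamName, interactionData ) {
		recordedCalls.push( { method: 'submitClick', streamName, interactionData } );
	}
};

// Hypothetical wrapper: only non-click actions need a schema ID passed explicitly.
function logMintEvent( eventLog, streamName, schemaId, action, interactionData ) {
	if ( action === 'click' ) {
		eventLog.submitClick( streamName, interactionData );
	} else {
		eventLog.submitInteraction( streamName, schemaId, action, interactionData );
	}
}

const STREAM = 'mediawiki.mint_for_readers';
const SCHEMA_ID = '/analytics/product_metrics/web/base/1.2.0';

// A search event goes through submitInteraction and therefore needs the schema ID.
logMintEvent( eventLogStub, STREAM, SCHEMA_ID, 'search', {} );
// A click on the automatic translation card goes through submitClick.
logMintEvent( eventLogStub, STREAM, SCHEMA_ID, 'click', {
	action_source: 'auto_translation_card',
	action_context: 'en;es'
} );
```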

Change #1029238 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] MinT MVP: Add basic instrumentation

https://gerrit.wikimedia.org/r/1029238

Update: @ngkountas and I met with the Metrics Platform team (thank you @Sfaci for the walkthrough). Here is a summary:

  • For consistency with other streams, stream name should be mediawiki.product_metrics.mint_for_readers
  • Stream configuration and registration should be added to https://gerrit.wikimedia.org/r/plugins/gitiles/operations/mediawiki-config/%2B/master/wmf-config/ext-EventStreamConfig.php
  • As the feature will be a Special: page, Nik rightly pointed out that capturing page_id and other page_-related information will not be of much use, since it will be the same for all events. We are already capturing the actual article in the interaction data. In addition to the core fields, the following fields will be helpful (to be added to provide_values in the configuration):
    • agent
      • client_platform
      • client_platform_family
    • performer
      • is_logged_in
      • performer_name
      • session_id
      • groups
      • is_bot
      • registration_dt
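As a rough sketch, the nested field list above can be flattened into the names that would go into provide_values. This assumes the underscore-joined naming (for example agent_client_platform), which should be verified against the stream configuration documentation before use. The helper toProvideValues is hypothetical.

```javascript
// Fields requested in the summary above, grouped by their parent object.
const requestedFields = {
	agent: [ 'client_platform', 'client_platform_family' ],
	performer: [ 'is_logged_in', 'performer_name', 'session_id', 'groups', 'is_bot', 'registration_dt' ]
};

// Hypothetical helper: join parent and field with an underscore, unless the
// field is already written with its parent prefix (as performer_name is above).
function toProvideValues( fieldsByParent ) {
	const names = [];
	Object.keys( fieldsByParent ).forEach( function ( parent ) {
		const prefix = parent + '_';
		fieldsByParent[ parent ].forEach( function ( field ) {
			names.push( field.startsWith( prefix ) ? field : prefix + field );
		} );
	} );
	return names;
}

const provideValues = toProvideValues( requestedFields );
```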

I will create a separate sub-task for stream configuration and registration.

@ngkountas I have submitted a patch for stream configuration and registration. As per the discussion with MP team yesterday, please change the stream name in instrumentation to mediawiki.product_metrics.mint_for_readers

Change #1048400 had a related patch set uploaded (by Nik Gkountas; author: Nik Gkountas):

[mediawiki/extensions/ContentTranslation@master] AX instrumentation: Update stream name

https://gerrit.wikimedia.org/r/1048400

Thank you @KCVelaga_WMF! I just submitted the patch for updating the stream name. Do you need me to review your patch, too?

@ngkountas Yes, I have added you and engineers from the Metrics Platform team as reviewers for that.

Change #1048400 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] AX instrumentation: Update stream name

https://gerrit.wikimedia.org/r/1048400

Based on the QA comments (copied below) there are a couple of aspects to polish for the instrumentation to work as expected. Moving the ticket back to in development:

  • Approximately 560 events have had validation errors so far. The error is the same for all of them: '.performer' should NOT have additional properties
    • When I checked the raw events, the failing events were using a different schema version than the rest, i.e. /analytics/product_metrics/web/base/1.0.0 instead of /analytics/product_metrics/web/base/1.2.0
  • Also, it seems that all the errored events were users clicking the automatic translation card. If a user selected the automatic translation of an article, there should be a preceding article selection event (which is one of the key events), and that seems to be missing. Can you please check whether this is being logged, and also the human translation selection?
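The schema version check described above can be sketched as follows, assuming each raw event carries its schema URI in a $schema property. The sample events are made up for illustration and the helper names are hypothetical.

```javascript
const EXPECTED_SCHEMA = '/analytics/product_metrics/web/base/1.2.0';

// Count events per schema URI to spot a version split at a glance.
function countBySchema( events ) {
	const counts = {};
	events.forEach( function ( event ) {
		const schema = event.$schema || '(missing)';
		counts[ schema ] = ( counts[ schema ] || 0 ) + 1;
	} );
	return counts;
}

// Return the events that do not use the expected schema version.
function findSchemaMismatches( events, expectedSchema ) {
	return events.filter( function ( event ) {
		return event.$schema !== expectedSchema;
	} );
}

// Made-up sample, not real data.
const sampleEvents = [
	{ $schema: '/analytics/product_metrics/web/base/1.2.0', action: 'search' },
	{ $schema: '/analytics/product_metrics/web/base/1.0.0', action: 'click' },
	{ $schema: '/analytics/product_metrics/web/base/1.2.0', action: 'session_init' }
];

const schemaCounts = countBySchema( sampleEvents );
const mismatches = findSchemaMismatches( sampleEvents, EXPECTED_SCHEMA );
```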

These validation errors are not caused by our implementation, but by a bug in the mw.eventLog.submitClick method of the Metrics Platform repository. As far as I can tell, all non-click events (e.g. search, session_init) are logged properly. A pull request has been submitted to fix the issue with that method and, according to the plan, it will be backported on Monday.

I'm moving this task to the "Waiting for deployment" column. @KCVelaga_WMF do you think you could QA this task once the PR is merged and deployed?


The change has been deployed. I'm monitoring https://logstash.wikimedia.org/goto/504f7423ac1d4643e285e8cf4b80cb33.

Thank you @phuedx. I don't see any more validation errors related to the schema version. The data also has click events for automatic and human translation selections.

@ngkountas Can you double-check whether the event for the user selecting an article from search (the 3rd event in the list of key events), which should precede the selection of the automatic or human translation, is being logged?

I can confirm that click events for the user selecting an article are being logged. We have 30+ events so far.

However, we have 1000+ events for users clicking the auto_translation_card, so I am wondering whether users can get to the automatic translation card without selecting an article.


I am not sure about the instrumentation implementation details, but in terms of the user workflow there are multiple possible paths.
The general workflow consists of the following steps: Home -> Search -> Confirm -> Translation View
Users can directly access any of them through the URL. Entry points direct them to the Confirm step, where the article and language pair have already been selected based on the context the user comes from.
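One way to check how often this happens in the data is to look for sessions with an auto_translation_card click but no earlier search_result click. The sketch below assumes events are in chronological order and carry action, action_source and performer.session_id. The function name and the sample events are made up for illustration.

```javascript
// Return session IDs where the automatic translation card was clicked
// without an earlier search_result click in the same session.
function cardClicksWithoutSelection( events ) {
	const sessionsWithSelection = new Set();
	const flaggedSessions = new Set();
	events.forEach( function ( event ) {
		if ( event.action !== 'click' ) {
			return;
		}
		const sessionId = event.performer.session_id;
		if ( event.action_source === 'search_result' ) {
			sessionsWithSelection.add( sessionId );
		} else if (
			event.action_source === 'auto_translation_card' &&
			!sessionsWithSelection.has( sessionId )
		) {
			flaggedSessions.add( sessionId );
		}
	} );
	return Array.from( flaggedSessions );
}

// Made-up sample: session "a" went through search, session "b" arrived via an entry point.
const workflowEvents = [
	{ action: 'search', performer: { session_id: 'a' } },
	{ action: 'click', action_source: 'search_result', performer: { session_id: 'a' } },
	{ action: 'click', action_source: 'auto_translation_card', performer: { session_id: 'a' } },
	{ action: 'click', action_source: 'auto_translation_card', performer: { session_id: 'b' } }
];

const directEntrySessions = cardClicksWithoutSelection( workflowEvents );
```

Sessions flagged this way are consistent with users landing directly on the Confirm step, so a large gap between the two event counts is not necessarily an instrumentation bug.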

QA Results

The following events have been verified to be properly logged:

session_init (session initiation):

search (user searches for a topic):

"search_result" click (user selects an article):

"auto_translation_card" click (user clicks to view the automatic translation):

"human_translation_card" click (user clicks to view human-generated content):

Missing piece: The only event not yet supported for this task is the "close" click event (user closes the automatic translation view). Moving this task back to priority log until this is also fixed.

PWaigi-WMF renamed this task from MinT MVP: Implement instrumentation for key events to MinT for Readers: Implement instrumentation for key events. Aug 20 2024, 8:57 AM

Change #1064047 had a related patch set uploaded (by Nik Gkountas; author: Nik Gkountas):

[mediawiki/extensions/ContentTranslation@master] MinT for Readers: Log "view" event after content has been loaded

https://gerrit.wikimedia.org/r/1064047

Change #1064048 had a related patch set uploaded (by Nik Gkountas; author: Nik Gkountas):

[mediawiki/extensions/ContentTranslation@master] MinT for Readers: Instrument "close automatic translation" event

https://gerrit.wikimedia.org/r/1064048

Change #1064047 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] MinT for Readers: Log "view" event after content has been loaded

https://gerrit.wikimedia.org/r/1064047

Change #1064048 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] MinT for Readers: Instrument "close automatic translation" event

https://gerrit.wikimedia.org/r/1064048

@ngkountas Is this QA testable and, if so, can you please provide me with the QA steps? Thanks!

@GMikesell-WMF yes, this task is QA testable. This is a task about instrumentation, so the easiest way to test it is to enable debug logs, which show each logged event in your browser developer console and also as a popup on your screen.

Step 1:
To enable these logs, visit a production Wikipedia site (e.g. https://es.wikipedia.org), open your browser developer console, and execute this small piece of code:

mw.loader.using( 'mediawiki.api' ).then( function () {
    new mw.Api().saveOption( 'eventlogging-display-web', '1' );
} );

After refreshing the page, the debug logs will be enabled, and you'll see a popup message on your screen and a console message (with a JSON object) for every instrumentation event logged on the site.

Step 2:
Visit the Automatic Translation special page (aka MinT MVP for Readers) in any wiki (preferably not English wiki). Search for any article to get the automatic translation, and continue all the way to the final screen, where the user can read the automatic translation of the article. You can see my screencasts above for some examples of how this tool works. For instance, in the screencast for the test scenario "auto_translation_card" click (user clicks to view the automatic translation) above, you can see the "View Automatic Translation" page, where the user can read the automatic translation of the article.

Step 3:
Inside this "View Automatic Translation" screen, click on the close button (next to the "Automatic Translation" header text). A new instrumentation event should be logged. You can check your browser dev console to find the logged event object. That object should contain an action property equal to "click", an action_source property equal to "automatic_translation_header" and an action_subtype property equal to "close".

If this is the case, it means that the last remaining event is also instrumented properly and this task can be closed as done.
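The check in Step 3 can also be done programmatically by comparing the logged event object against the expected properties. This is a small sketch: matchesExpected is a hypothetical helper, and loggedEvent is a made-up example of what the console object might contain, with extra properties ignored.

```javascript
// Expected properties for the "close automatic translation" event, from Step 3.
const expectedCloseEvent = {
	action: 'click',
	action_source: 'automatic_translation_header',
	action_subtype: 'close'
};

// True if every expected property is present on the logged event with the same value.
function matchesExpected( loggedEvent, expected ) {
	return Object.keys( expected ).every( function ( key ) {
		return loggedEvent[ key ] === expected[ key ];
	} );
}

// Made-up example of an object copied from the console.
const loggedEvent = {
	action: 'click',
	action_source: 'automatic_translation_header',
	action_subtype: 'close',
	action_context: 'en;es'
};

const closeEventLogged = matchesExpected( loggedEvent, expectedCloseEvent );
```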

@ngkountas Thanks for the QA steps! As seen in AC5 gif, when I clicked on the human article, it did not generate. The rest logs and looks fine though.

Test Result - ES AutomaticTranslation PROD

Status: ❌ FAIL AC5
Environment: ES AutomaticTranslation PROD
OS: macOS Sonoma 15.0
Browser: Chrome 129
Device: MBA
Emulated Device: NA

Test Artifact(s):
https://es.wikipedia.org/wiki/Especial:AutomaticTranslation

Test Steps

Pre-requisite:

- Ensure debug logs are enabled by executing the following code in the browser developer console:

mw.loader.using( 'mediawiki.api' ).then( function () {
  new mw.Api().saveOption( 'eventlogging-display-web', '1' );
});

Test Case 1: Session initiation

  1. Visit a Wikipedia site (e.g., https://es.wikipedia.org).
  2. Open the Automatic Translation special page.
  3. ✅ AC1: Confirm that a "session initiation" event is logged in the console, containing the relevant session properties.

session_init (session initiation):

2024-09-27_13-15-43.mp4.gif (1×1 px, 1 MB)

Test Case 2: User searches for a topic

  1. On the Automatic Translation page, search for a topic.
  2. ✅ AC2: Confirm that a "search_topic" event is logged in the console, including properties like action: "search", action_source: "topic_search".

search (user searches for a topic):

2024-09-27_13-16-21.mp4.gif (1×1 px, 1 MB)

Test Case 3: User selects an article

  1. After searching for a topic, select an article from the search results.
  2. ✅ AC3: Confirm that a "select_article" event is logged, with properties like action: "click", action_source: "article_selection".

"search_result" click (user selects an article):

2024-09-27_13-16-47.mp4.gif (1×1 px, 1 MB)

Test Case 4: User clicks to view automatic translation

  1. After selecting an article, click to view the automatic translation.
  2. ✅ AC4: Confirm that a "view_automatic_translation" event is logged, including properties such as action: "click", action_source: "automatic_translation_card".

"auto_translation_card" click (user clicks to view the automatic translation):

2024-09-27_13-17-21.mp4.gif (1×1 px, 2 MB)

Test Case 5: User clicks to view human-generated content

  1. Click on the option to view human-generated content (if available).
  2. ❌ AC5: Confirm that a "view_human_content" event is logged, with properties like action: "click", action_source: "human_content".

Human translation card did not display as seen in the gif. Is it because it opens a new tab?

"human_translation_card" click (user clicks to view human-generated content):

2024-09-27_13-17-56.mp4.gif (1×1 px, 2 MB)

Change #1076739 had a related patch set uploaded (by Nik Gkountas; author: Nik Gkountas):

[mediawiki/extensions/ContentTranslation@master] AX MVP: Instrument "open target article" inside view translation

https://gerrit.wikimedia.org/r/1076739