Expand CX event schema for Community-defined Translation Collections & Custom suggestion features
Closed, Resolved (Public)

Description

Schema: https://gerrit.wikimedia.org/r/plugins/gitiles/schemas/event/secondary/+/refs/heads/master/jsonschema/analytics/mediawiki/content_translation_event/

The goal of this task is to work out how the events for two Content Translation features currently in development, "Community-defined translation lists" and "Custom translation suggestions", should be modelled using the current CX event schema, and to update the schema accordingly.

The actual instrumentation tasks are

During the process, also consider the instrumentation needed for T373431: Measure the effects of limiting fast unreviewed translations

Event Timeline

@Pginer-WMF @PWaigi-WMF I created this task so that I can model the events for all the upcoming features in a single schema update, rather than through multiple separate updates. The actual tasks can be kept for the engineering work related to the instrumentation.

KCVelaga_WMF changed the task status from Open to In Progress. Sep 16 2024, 10:54 AM
KCVelaga_WMF moved this task from Next 2 weeks to Doing on the Product-Analytics (Kanban) board.

Based on the designs, I started drafting the changes/additions to the schema on this spreadsheet.

KCVelaga_WMF added a subscriber: SGautam_WMF.

@Pginer-WMF @PWaigi-WMF @SGautam_WMF

I have finalized the schema expansion from my side for all the features that are in development or planned for the near future. Please review it and check whether things make sense, and especially whether I am missing any steps in the user flow.

PWaigi-WMF renamed this task from Expand CX event schema for community-defined lists & custom suggestion features to Expand CX event schema for Community-defined Translation Collections & Custom suggestion features. Nov 5 2024, 11:45 AM

Based on my previous comment at T376691#10298411 and a conversation with @ngkountas, I propose capturing the following new events:

| screen | related to | user action | event_type | event_subtype | event_source | event_context (str) |
| --- | --- | --- | --- | --- | --- | --- |
| dashboard home | custom suggestions | user selects a quick alternative to the selected filter: "For you" or "Popular" | dashboard_suggestion_filters_quick_select | - | yes | |
| dashboard home | custom suggestions | user selects to view "---More" filters for translation suggestions | dashboard_suggestion_filters_view_more | - | - | |
| Adjust suggestions | custom suggestions | user selects a suggestions filter | suggestion_filters_select | suggestion_filters_single_select, suggestion_filters_multi_select | yes | name of the topic area or the collection selected (as a string) |
| Adjust suggestions | custom suggestions | user de-selects a suggestions filter | suggestion_filters_deselect | - | yes | name of the topic area or the collection de-selected (as a string) |
| Adjust suggestions | custom suggestions | user confirms the selected topic (i.e. user clicks the "Done" button) | suggestion_filters_confirm | suggestion_filters_single_select_confirm, suggestion_filters_multi_select_confirm | - | for single-select, this will be just a string; for multi-select, this will be a string containing the names of the selected topic areas, collections etc. (each topic area or collection name should be separated by a semi-colon) |

  • The single-select and multi-select subtypes are determined by whichever mode the user is in.
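To make the single-select versus multi-select distinction concrete, here is a minimal sketch of how the suggestion_filters_confirm event could be assembled. The function name, the argument shape, and the exact "; " separator (taken from the example strings given for dashboard_open in this task) are assumptions for illustration; none of this comes from the CX codebase.

```javascript
// Hypothetical helper, not part of the CX codebase.
// Assumption: names are joined with "; " as in the example strings in this task.
const CONTEXT_SEPARATOR = "; ";

function buildFiltersConfirmEvent(selectedFilterNames, isMultiSelect) {
  if (!Array.isArray(selectedFilterNames) || selectedFilterNames.length === 0) {
    throw new Error("At least one selected filter is required");
  }
  return {
    event_type: "suggestion_filters_confirm",
    // The subtype follows whichever mode the user is in.
    event_subtype: isMultiSelect
      ? "suggestion_filters_multi_select_confirm"
      : "suggestion_filters_single_select_confirm",
    // Single-select: just the one name. Multi-select: all names in one string.
    event_context: isMultiSelect
      ? selectedFilterNames.join(CONTEXT_SEPARATOR)
      : selectedFilterNames[0],
  };
}

console.log(buildFiltersConfirmEvent(["art", "india"], true).event_context);
// prints "art; india"
```

Keeping event_context a plain string in both modes means the schema field type does not need to change between single-select and multi-select.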

Exhaustive list of high-level suggestion filter groups (to be captured as event source):

  • suggestion_filter_previous_edits
  • suggestion_filter_topic_area
  • suggestion_filter_collections (T378958)
  • suggestion_filter_vital_articles (T374597)
  • suggestion_filter_search_result_seed (T369595)

Let me know if there are any other high-level suggestion filter groups that are not listed above.

The following adjustments are to be made to the existing events:

| screen | related to | user action | event_type | event_subtype | event_source(s) | event_context (str) | contextual fields | notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| dashboard home -> translation start | general workflow | user selects an article from the suggestions and proceeds to translate | dashboard_translation_start | - | corresponding source to be captured (from the list above) | for single-select, this will be just a string; for multi-select, this will be a string containing the names of the selected topic areas, collections etc. (each topic area or collection name should be separated by a semi-colon) | translation_type | |
| dashboard home | general workflow | user opens the dashboard by directly accessing a URL with pre-selected filters | dashboard_open | - | suggestion_filter_direct_preselect (T369012) | filters active at the time of opening the dashboard, recorded as a semi-colon separated string | | examples: "related_edits; nearby_topics; art; india", "all_lists; fashion; food and drink", "related_edits; Women in science week" |
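On the analysis side, the semi-colon separated event_context has to be split back into individual filter names. The sketch below shows one way to do that; the function name is hypothetical, and it assumes that filter names never contain a semi-colon themselves, which this task does not guarantee.

```javascript
// Hypothetical helper for consumers of the event data, not part of any
// existing codebase. Assumption: filter names never contain ";".
function parseEventContext(eventContext) {
  if (typeof eventContext !== "string" || eventContext.trim() === "") {
    return [];
  }
  return eventContext
    .split(";")
    .map((name) => name.trim()) // tolerate "a;b" as well as "a; b"
    .filter((name) => name !== "");
}

console.log(parseEventContext("all_lists; fashion; food and drink"));
// prints [ 'all_lists', 'fashion', 'food and drink' ]
```

If collection names can contain semi-colons, a different delimiter or an escaping rule would be needed for this round trip to stay lossless.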

I will start drafting an MR, and we can iterate based on the review.

Exhaustive list of high-level suggestion filter groups (to be captured as event source):

  • suggestion_filter_previous_edits
  • suggestion_filter_topic_area
  • suggestion_filter_collections
  • suggestion_filter_vital_articles
  • suggestion_filter_search_result_seed

Let me know if there are any other high-level suggestion filter groups that are not listed above.

Do we need a filter group for the 'popular' options? Maybe something like suggestion_filter_popular_topics. Or is one from the list above meant to cover the popular option? I'm assuming suggestion_filter_topic_area is for any one specific topic selected from all the options presented.

If there will be a popular topics group filter, then we should add that too - is that confirmed?

suggestion_filter_topic_area is for topic areas such as engineering, arts, Asia etc.

Yes, there is a popular filter available. @ngkountas or @SBisson, can you please confirm whether a specific event source for the popular filter makes sense?

This is taken directly from the source code in useDashboardSuggestionEventSource.js. The comment sheds light on the current status, and I can confirm that suggestion_featured is already in the schema:

...else if (id === POPULAR_SUGGESTION_PROVIDER) {
      // we don't have a proper event source for most popular suggestions,
      // let's use 'suggestion_featured' for now
      // TODO: Add a new event source or rename 'suggestion_featured' for most popular suggestions
      return "suggestion_featured";
    }

I agree about adding a suggestion_filter_popular event source for the dashboard_suggestion_filters_* and suggestion_filters_* events. The code you pasted above, however, is related to the dashboard_translation_start event. We should also update the list of event sources for that event to include the selected suggestion filters (e.g. suggestion_collections, suggestion_popular, suggestion_search_seed etc). We should probably also remove the suggestion_featured event source, as "featured" articles is a concept that exists only in design screenshots and never materialized.
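As a rough illustration of the proposal, the filter groups could be mapped to event sources through a single lookup instead of a per-provider branch. This is a sketch only: the map keys and the function name are hypothetical, and suggestion_filter_popular is the value proposed in this thread, not a confirmed schema entry.

```javascript
// Hypothetical lookup, not the actual CX implementation.
// Keys are illustrative identifiers; values are the event sources named in this task.
const FILTER_GROUP_EVENT_SOURCES = new Map([
  ["previous_edits", "suggestion_filter_previous_edits"],
  ["topic_area", "suggestion_filter_topic_area"],
  ["collections", "suggestion_filter_collections"],
  ["vital_articles", "suggestion_filter_vital_articles"],
  ["search_result_seed", "suggestion_filter_search_result_seed"],
  // Proposed in this thread; assumed here, not yet confirmed in the schema.
  ["popular", "suggestion_filter_popular"],
]);

function getFilterEventSource(filterGroup) {
  const eventSource = FILTER_GROUP_EVENT_SOURCES.get(filterGroup);
  if (eventSource === undefined) {
    // Fail loudly rather than silently falling back to an unrelated source.
    throw new Error(`No event source defined for filter group "${filterGroup}"`);
  }
  return eventSource;
}

console.log(getFilterEventSource("popular"));
// prints "suggestion_filter_popular"
```

Throwing on an unknown group makes a missing event source visible during development, whereas the fallback to suggestion_featured in the pasted snippet hides it.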

I agree about adding a suggestion_filter_popular event source.
We should also update the list of event sources for that event too, to include the selected suggestion filters (e.g. suggestion_collections, suggestion_popular, suggestion_search_seed etc).

I will add that.

We should also probably remove the suggestion_featured event source, as "featured" articles is a concept that exists only in design screenshots, and was never materialized.

Unfortunately, this won't be a backwards-compatible change. But it raises a good point: there are several sources and subtypes that we will probably never use, and they need a cleanup that avoids a breaking change. We can do that separately.
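One simplified way to see why removing suggestion_featured is a breaking change: any value that was allowed before but is missing afterwards would make previously valid events fail validation. The sketch below assumes that "backwards compatible" means every previously allowed value stays allowed; the schema repository's actual compatibility rules may be stricter. The function name and the value lists are illustrative and are not the real schema enum.

```javascript
// Hypothetical check, not part of the schema tooling.
// Returns the values that the new list drops relative to the old one.
function findRemovedEnumValues(oldValues, newValues) {
  const stillAllowed = new Set(newValues);
  return oldValues.filter((value) => !stillAllowed.has(value));
}

// Illustrative lists only, not the actual event_source enum.
const before = ["suggestion_featured", "suggestion_filter_topic_area"];
const after = ["suggestion_filter_topic_area", "suggestion_filter_popular"];

console.log(findRemovedEnumValues(before, after));
// prints [ 'suggestion_featured' ]
```

An empty result means the change only adds values, which is the kind of change discussed above as safe to make now, leaving removals for a separate cleanup.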

I'm not sure how schema changes end up in production. Does it need to be done explicitly by someone?

That tag is for database schema changes. Here we're talking about the YAML schema for event logging.

Change #1109039 had a related patch set uploaded (by KCVelaga; author: KCVelaga):

[analytics/refinery@master] Bug: T373785

https://gerrit.wikimedia.org/r/1109039

Change #1109039 had a related patch set uploaded (by Nik Gkountas; author: KCVelaga):

[analytics/refinery@master] Add event_context to sanitization allowlist for Content Translation events.

https://gerrit.wikimedia.org/r/1109039

Change #1109039 merged by Mforns:

[analytics/refinery@master] Add event_context to sanitization allowlist for Content Translation events.

https://gerrit.wikimedia.org/r/1109039