Identify metrics to be presented on the unified CX dashboard
Closed, ResolvedPublic

Description

The v1 of the planned unified CX dashboard should, at a minimum, consolidate the metrics from the sources listed in the table below.


| Metric | Visualization Type | Data Source | Current Update Frequency | For Unified CX Dashboard | Required Update Frequency |
| --- | --- | --- | --- | --- | --- |
| CX key metrics | | | | | |
| Total translations since deployment | Big number | edit_hourly | monthly | keep | daily |
| Total translations for the selected time range | Big number | edit_hourly | monthly | keep | daily |
| Translations last calendar month | Big number with trendline | edit_hourly | monthly | remove [1] | - |
| Monthly translations year-over-year | Line chart (+ time comparison yoy) | edit_hourly | monthly | keep | monthly |
| User last calendar month | Big number with trendline | edit_hourly | monthly | remove [2] | - |
| Monthly translations by user edit count | Line chart (dimension: edit bucket) | edit_hourly | monthly | keep | monthly |
| Total translations deleted for selected time range | Big number | edit_hourly | monthly | keep | daily |
| Total translations reverted for selected time range | Big number | edit_hourly | monthly | keep | daily |
| Monthly rate of deleted translations | Line chart | edit_hourly | monthly | keep | monthly |
| Monthly rate of translations published high unmodified machine-translated text | Line chart (dimension: edit status) | edit_hourly | monthly | keep | monthly |
| Monthly translations by wiki (top 10) | Line chart (dimension: wiki_db) | edit_hourly | monthly | keep | monthly |
| Average new translator retention for selected time range | Big number | mediawiki_history | monthly | keep | monthly |
| Total number of new users that published a translation 30 days after registration | Big number | mediawiki_history | monthly | keep | monthly |
| New translator retention last month | Line chart (+ time comparison yoy) | mediawiki_history | monthly | keep | monthly |
| Monthly new translator retention year over year | Big number with trendline | mediawiki_history | monthly | keep | monthly |
| Special:CXStats | | | | | |
| Total published translations all time (for a given wiki) | Big number | cx_translations | real-time | remove [3] | daily |
| Total published translations last week (for a given wiki) | Big number | cx_translations | real-time | remove [4] | daily |
| Translation to all languages | Line chart (dimension: published/in progress) | cx_translations | real-time | remove [5] | daily |
| Translation to given wiki | Line chart (dimension: published/in progress/deleted) | cx_translations | real-time | remove [6] | daily |
| Published translations (from source to targets) | Stacked bar chart (dimension: target) | cx_translations | real-time | keep | daily |
| Published translations (from targets to source) | Stacked bar chart (dimension: source) | cx_translations | real-time | keep | daily |
| Translations in progress (from source to targets) | Stacked bar chart (dimension: target) | cx_translations | real-time | keep | daily |
| Translations in progress (from targets to source) | Stacked bar chart (dimension: source) | cx_translations | real-time | keep | daily |
| Number of translators (from source to targets) | Stacked bar chart (dimension: target) | cx_translators | real-time | modify [7] | daily |
| Number of translators (from targets to source) | Stacked bar chart (dimension: source) | cx_translators | real-time | modify [8] | daily |
| CX deletion stats | | | | | |
| Overall CX articles deletion ratio | - | mediawiki_history | quarterly | keep | tbd |
| Overall non-CX articles deletion ratio | - | mediawiki_history | quarterly | keep | tbd |
| Number of wikis with higher cx article deletion ratios | - | mediawiki_history | quarterly | keep | tbd |
| Wiki with the highest deletion ratio difference | - | mediawiki_history | quarterly | keep | tbd |
| Breakdown of deletion ratios by wiki (with higher cx deletion ratio) | Table | mediawiki_history | quarterly | keep | tbd |

  • [1][2]: Both these numbers can be gathered from the adjacent charts of monthly translations yoy.
  • [3]: Already covered under key metrics: Total translations since deployment
  • [4]: Instead of defaulting to a wiki, "Total translations for the selected time range" will let users select whatever period they're interested in.
  • [5][6]: Displaying all wikis irrespective of the filter might not be useful. "Monthly translations year-over-year" shows the information, but works based on the filter.
  • [7][8]: Combine these with translator metrics from the key metrics. It is better to have monthly unique translators with a year-over-year time window for comparison.
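
The CX deletion stats rows in the table compare deletion ratios of CX-created and non-CX articles per wiki. Below is a minimal sketch of that comparison. It assumes the ratio is defined as deleted articles divided by created articles, and that per-wiki counts have already been aggregated from mediawiki_history. The function names, the input layout, and the sample numbers are illustrative and not taken from the existing report.

```python
from typing import Dict, List, Tuple

# (created, deleted) article counts per wiki; hypothetical layout for illustration.
Counts = Tuple[int, int]


def deletion_ratio(created: int, deleted: int) -> float:
    """Share of created articles that were later deleted (assumed definition)."""
    return deleted / created if created else 0.0


def compare_wikis(cx: Dict[str, Counts], non_cx: Dict[str, Counts]) -> List[dict]:
    """Per-wiki CX vs non-CX deletion ratios, for wikis present in both inputs."""
    rows = []
    for wiki in sorted(cx.keys() & non_cx.keys()):
        cx_ratio = deletion_ratio(*cx[wiki])
        other_ratio = deletion_ratio(*non_cx[wiki])
        rows.append({
            "wiki": wiki,
            "cx_ratio": cx_ratio,
            "non_cx_ratio": other_ratio,
            "difference": cx_ratio - other_ratio,
        })
    return rows


def wikis_with_higher_cx_ratio(rows: List[dict]) -> List[dict]:
    """Rows where CX articles are deleted more often, largest difference first."""
    higher = [r for r in rows if r["difference"] > 0]
    return sorted(higher, key=lambda r: r["difference"], reverse=True)


# Made-up sample counts, only to show the shape of the output.
sample_cx = {"wiki_a": (200, 30), "wiki_b": (100, 5)}
sample_non_cx = {"wiki_a": (1000, 100), "wiki_b": (1000, 80)}
higher = wikis_with_higher_cx_ratio(compare_wikis(sample_cx, sample_non_cx))
print(len(higher), [r["wiki"] for r in higher])
```

With this shape, "Number of wikis with higher cx article deletion ratios" would be the length of the filtered list, and "Wiki with the highest deletion ratio difference" would be its first entry.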

List of other open tasks that can potentially be addressed as part of this development

| Task | Notes | Can it be addressed with unified CX dashboard v1? |
| --- | --- | --- |
| T100029: Show weekly and monthly translations on Special:ContentTranslationStats | planned for a selected time range based on the filter, but we can also show unfiltered monthly / weekly numbers if needed. | yes |
| T94806: Special:ContentTranslationStats: clarify the meaning of the dates | for the dashboard and data aggregations, we will follow the modelling guidelines - it should address this task | yes |
| T100033: Show article creation tool comparison in Special:ContentTranslationStats | this can be a useful metric to understand adoption CX among other tools | yes |
| T100034: Show weekly and monthly user overview in Special:ContentTranslationStats | can be part of "Translators" tab of the dashboard | yes |
| T100035: Show users grouped by number of translations on Special:ContentTranslationStats | we currently have this by user edit bucket | no |
| T103983: Special:ContentTranslationStats at Simple English Wikipedia shows data for English | should be fixed with proper filters and scoping | yes |
| T117855: Rethink the display of deleted vs created pages in Special:ContentTranslationStats and monthly reports | internally this is already addressed with having "Translation status" filter | yes |
| T113337: Red deleted translations bar in Special:ContentTranslationStats looks like negative when the value is zero | - | yes |
| T137700: Provide pan and zoom options on Special:ContentTranslationStats | Superset Echarts provides decent interactivity; not sure of pan/zoom | maybe |
| T123898: Add scrolling to the graphs and charts on Special:ContentTranslationStats | - | maybe |
| T123859: There is no convenient way to see a list of articles translated from a certain language using Special:ContentTranslationStats | internally this is already addressed with having "Translation status" filter | yes |
| T88281: Content Translation dashboard does not reflect real state of published article | - | - |
| T113336: Measure abandoned translations | dependent on funnel analysis using CX event | no |
| T228152: Monitor translations to identify relevant activity on specific wikis | key metrics address this(?) | maybe |
| T185110: Inconsistent background color on Special:CX and Special:CXStats | - | maybe |
| T94385: Improve visualisations of Content Translation statistics | - | yes |

Event Timeline

@Pginer-WMF I have consolidated all the metrics that we currently track across dashboards / reports and that are good to have for the first version. Additionally, from this search, I have listed the open tickets and whether we can address them with the first version of the dashboard. Please review and share your thoughts.

KCVelaga_WMF changed the task status from Open to In Progress. May 29 2024, 7:19 AM
KCVelaga_WMF triaged this task as Medium priority.

Listing all the metrics for future reference (priorities can be discussed and changed).


Machine translation service usage report (source: cx_corpora, cx_translations, & mediawiki_history; frequency: ad-hoc)

| Metric | Visualization Type | Priority for Unified CX Dashboard |
| --- | --- | --- |
| Published translations by machine translations service across all language pairs | Table & Bar Graph | v2 |
| ^ + year-over-year comparison | Tables | v2 |
| Daily published translations by machine translation service (last six months) | Line Chart (by MT service) | v2 |
| Median number of daily translations, by service | Tables | v2 |
| ^ + year-over-year comparison | | v2 |
| Usage-by-language pair | | v2 |
| Language pairs where an optional service was used more or close to the default | Tables | v2 |
| MT usage at each target language: breakdown of machine translation services use by target language and which are being most used | 100% Stacked Bar Chart (by target language) | v3 |
| Percentage of content modified by MT service (published translations) | 100% Stacked Bar Chart (by MT service) | v3 |
| Percent of articles that are created with each MT service and deleted | Tables | v3 |
| Deletion rates at MinT supported languages, by wiki | Bar Graphs (by language) | v2 |
| Year over year service usage comparison (by each service) | Tables | v3 |
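
As an illustration of the "Median number of daily translations, by service" row, here is a minimal sketch. It assumes one record per published translation carrying a publication date and an MT service name. The record shape, the function name, and the placeholder service name `service_b` are assumptions for illustration, and the real report reads from cx_corpora and cx_translations rather than an in-memory list.

```python
from collections import defaultdict
from datetime import date
from statistics import median
from typing import Dict, Iterable, Tuple

# One record per published translation: (publication date, MT service name).
# Hypothetical input shape, not the actual table schema.
Record = Tuple[date, str]


def median_daily_translations(records: Iterable[Record]) -> Dict[str, float]:
    """Median of per-day published translation counts for each MT service.

    Days on which a service has no translations are skipped rather than
    counted as zero; a real report may want to handle that differently.
    """
    daily = defaultdict(lambda: defaultdict(int))
    for day, service in records:
        daily[service][day] += 1
    return {svc: median(counts.values()) for svc, counts in daily.items()}


# Made-up sample records, only to show the shape of the output.
sample = (
    [(date(2024, 5, 1), "MinT")] * 3
    + [(date(2024, 5, 2), "MinT")] * 1
    + [(date(2024, 5, 3), "MinT")] * 2
    + [(date(2024, 5, 1), "service_b")] * 1
)
result = median_daily_translations(sample)
print(result)
```

The year-over-year variant would run the same aggregation over two date ranges and compare the resulting medians per service.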

CX Abuse Filter events (source: wmf_product.cx_abuse_filter_daily)

| Metric | Visualization Type | Priority for Unified CX Dashboard |
| --- | --- | --- |
| Monthly CX abuse filter counts (+YoY) | Line Chart | v2 |
| Top five wikis by monthly AbuseFilter trigger counts | Line Chart | v2 |
| Top 10 abuse filters that were triggered the most | Table | v3 |

CX funnel analysis (v1 - entry points) (frequency: ad-hoc)
Note: This depends on the instrumentation of CX events; it is better to build the entire funnel analysis after all the key events are instrumented.

| Metric | Visualization Type | Priority for Unified CX Dashboard |
| --- | --- | --- |
| Overall Distribution Of Entry Points That Users Navigate From | Table | v3 |
| Usage of Entry Points by User Global Edit Bucket | 100% Stacked Bar Chart (by edit bucket) | v4 |
| Usage of entry points by Comparative Wikipedia Size of the Target Language | 100% Stacked Bar Chart (by edit bucket) | v4 |
| Frequency of sources for dashboard translation start | Table | v4 |
| User flow funnel through key CX events | Sankey Diagram | v4 or later |
| ^ overall | Sankey Diagram + Table | v4 or later |
| ^ by user edit bucket | Sankey Diagram + Table | v4 or later |
| ^ by entry point | Sankey Diagram + Table | v4 or later |
| Transitions to Various Stages of Translation Funnel by Source of Entry to the Dashboard | Table | v4 or later |

Thanks for the detailed evaluation. The adjustments proposed for merging the multiple dashboards/reports available make perfect sense.
Regarding T228152, I think the current approach in the Key metrics dashboard will cover the main needs. In particular, the "Monthly translations by wiki" graph has been useful to identify which wikis caused a spike of activity (e.g., a campaign in Bangla Wikipedia).

Another report that would be relevant to consider integrating is the Topic Diversity of Published Translations. As we provide support for topic-based suggestions (T113257), understanding the topics associated with the published translations can be useful in this context.

Definitely not a blocker for the initial version. Sharing it since other reports were mentioned for future versions in T366044#9897587.

I wanted to share a couple of proposed aspects to measure that have surfaced in several past conversations as the impact of translations has been analyzed:

  • Distribution of translated content across translators. We want to understand how translations are distributed across the users making them, and how much content they translate. This will help to identify different groups of our audience and how representative of the whole audience they are. For example: a group of users creating many short translations, another making a few translations that cover most of the original article, etc.
  • Impact on readers and other editors. For content created as a translation, it would be useful to know how many views it generates and how often other editors build on top of the translation to continue editing (i.e., increasing the coverage of the topic by continuing where the translation left off).
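
As a rough illustration of the first point, here is a minimal sketch that groups translators by how many translations they published and reports each group's share of all translations. The bucket bounds, function names, and sample data are assumptions for illustration. It covers only translation counts, not how much of the original article each translation covers.

```python
from collections import Counter
from typing import Dict, Iterable


def bucket_label(n: int) -> str:
    """Map a per-user translation count to a bucket (bounds are assumptions)."""
    if n == 1:
        return "1"
    if n < 5:
        return "2-4"
    if n < 10:
        return "5-9"
    return "10+"


def translator_distribution(user_ids: Iterable[str]) -> Dict[str, dict]:
    """Summarise translators and translations per bucket.

    `user_ids` holds one entry per published translation (the translator's id).
    """
    per_user = Counter(user_ids)
    total = sum(per_user.values())
    buckets: Dict[str, dict] = {}
    for n in per_user.values():
        entry = buckets.setdefault(
            bucket_label(n), {"translators": 0, "translations": 0}
        )
        entry["translators"] += 1
        entry["translations"] += n
    for entry in buckets.values():
        entry["share_of_translations"] = entry["translations"] / total
    return buckets


# Made-up sample: one very active translator and a few occasional ones.
sample_ids = ["a"] * 12 + ["b"] * 3 + ["c", "d"]
dist = translator_distribution(sample_ids)
for label, entry in sorted(dist.items()):
    print(label, entry)
```

Run once, this gives the one-time view; tracking how the shares change month over month would be the dashboard variant.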

@Pginer-WMF these are definitely good points. But given the scope of building the first version of the public dashboard, let's plan these as expansions during the upcoming quarters.

Distribution of translated content across translators. We want to understand how translations are distributed across the users making them, and how much content they translate. This will help to identify different groups of our audience and how representative of the whole audience they are. For example: a group of users creating many short translations, another making a few translations that cover most of the original article, etc.

I am wondering whether this would have much value as a continuously monitored dashboard metric, versus a one-time analysis.

Good question. I'd say that it depends on whether the team's plans are focused on affecting the aspect we measure. For example, if the goal is to encourage people to make higher quality articles, it may be useful to see how the distribution of translated content per user evolves. Obviously, this also depends on the difference in effort required for the measurements under each approach.

Maybe we can consider something along these lines when measuring the impact of future hypotheses, if any are related.