
[SPIKE] Investigate data engineering needs for dashboarding VE editing funnel
Open, Medium, Public

Description
In T367130, we are working to create a dashboard of VE-related metrics so that we can monitor its overall performance/efficiency on an ongoing basis.

This task is about evaluating the possible approaches to creating this dashboard in Superset and identifying any dependencies on data platform engineering or other teams based on the selected approach.

Requirements

  • Summarise possible approaches to creating a dashboard reflecting key metrics of the VE editing funnel.
  • Clarify the benefits and tradeoffs of each approach.
  • Confirm any dependencies on data platform engineering or other teams.

Event Timeline

Here are some notes from an initial investigation of options:

  1. Superset dashboard that directly queries editing event tables: In this approach, we would create a virtual dataset in Superset by directly querying the relevant editing tables (e.g. EditAttemptStep, mediawiki_history, VisualEditorFeatureUse) using Presto.
    1. Benefits: This would not require the creation of any data pipelines or resources from data engineering. We can also add new metrics as needed and easily adjust how the metrics are defined by directly editing the query at any point.
    2. Tradeoffs: We would have to significantly limit either the number of metrics tracked on the dashboard or the amount of data scanned by the query to avoid timeout issues. Superset times out queries that do not return data within about 60 seconds, which would happen frequently if we tried to monitor several different metrics across all Wikipedia edit attempts. Adding breakdowns for more advanced filtering (e.g. by visual editor feature used) would increase the query run time further and make it even more likely that the dashboard could not load before the timeout.
  2. Instrument metrics using data pipeline: In this approach, we would create a data pipeline (orchestrated with Airflow) that would run the identified queries daily (or at specified cadence) and then output aggregate data to a table. This new table would be used to create all the charts and filters needed on a Superset dashboard.
    1. Benefits:
      1. The creation and use of an aggregate data table would help avoid any timeout issues compared to directly querying the entire editing event tables. This would allow us to track many of the metrics currently being considered in T367130 within a single VE health metric dashboard.
      2. This has been done effectively for several existing dashboards. For example, an Airflow data pipeline was used to generate the Automoderator dashboard (T369488). (Note: that one is available on the WMCS public instance of Superset.)
    2. Tradeoffs: This approach would require some additional planning and resources, but it can be completed by Product Analytics with review support from data engineering. There is also somewhat more effort involved in changing a data pipeline once it is set up, so we would need to spend more time upfront clearly defining the dashboard requirements.
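To make the pipeline approach concrete, here is a minimal sketch of the daily aggregation step such a pipeline would perform. In production this logic would be expressed as a scheduled SQL query over EditAttemptStep (orchestrated with Airflow) rather than Python; the field names (`dt`, `wiki`, `editor_interface`) and the output schema are hypothetical illustrations, not the actual table definitions.

```python
from collections import Counter

def aggregate_edit_attempts(events):
    """Roll up raw edit-attempt events into daily per-wiki VE counts.

    `events` is an iterable of dicts with hypothetical keys 'dt',
    'wiki', and 'editor_interface'. The real pipeline would run an
    equivalent aggregation query against the event tables and write
    the result to a small pre-aggregated table for Superset to read.
    """
    counts = Counter()
    for event in events:
        # Keep only VisualEditor attempts (illustrative filter).
        if event.get("editor_interface") == "visualeditor":
            counts[(event["dt"], event["wiki"])] += 1
    # Rows shaped like the aggregate table the dashboard would query.
    return [
        {"dt": dt, "wiki": wiki, "ve_edit_attempts": n}
        for (dt, wiki), n in sorted(counts.items())
    ]
```

Because the dashboard then queries this small aggregate table instead of the full event tables, chart loads stay well under Superset's query timeout regardless of how many wikis or days are included.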

The approach selected will depend on the requirements defined in T367130. Approach 1 may be feasible if we are able to scope the dashboard down to a couple of key metrics over a specific subset of data. Approach 2 is the better long-term approach if we are looking to monitor VE health across all wikis and edit attempts.

MNeisler triaged this task as Medium priority.
MNeisler moved this task from Next 2 weeks to Doing on the Product-Analytics (Kanban) board.

Thank you for bringing this all together (T399134#11014908), @MNeisler.

Per what we discussed offline last week, we would prefer to move forward with Approach #2: Instrument metrics using data pipeline.

The reason being: we are confident the metrics we've specified are likely to both A) remain unchanged and B) be useful now and into the future.

With all of the above said, before prioritizing the work "Approach #2" requires, we'll need to understand how/if xLab could be useful in this context.