
Analyze and publish results of metrics testing
Closed, Resolved · Public

Description

Work planned for January 2025:

  • Analyse the test data we generated in T379431
  • Evaluate whether the doc characteristics we measured are useful as indicators of the metrics we care about
  • Publish data and results of assessment

Event Timeline

TBurmeister changed the task status from Open to In Progress. Jan 15 2025, 8:20 PM
TBurmeister triaged this task as Medium priority.

Weekly status update:

  • Cleaned up and formatted the test dataset.
  • Identified and created benchmarks for some data elements, and revisited how to implement weights and scoring of individual elements for metrics computation.
  • Completed 80% of analysis and report generation for aggregate statistics (covering the entire test dataset).

Next week: finish the analysis/report for the aggregate dataset and test metrics computation. Start on collection-level analysis and report generation.
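As an illustration of the weights-and-scoring idea mentioned above, here is a minimal sketch of one way per-element weights could be combined into a page score. The element names, weight values, and normalization are assumptions for the example, not the team's actual implementation.

```python
# Per-element weights (illustrative values only, not the real scheme).
WEIGHTS = {
    "has_code_sample": 2.0,
    "has_toc": 1.0,
    "freshness": 1.5,
}

def score_page(elements: dict) -> float:
    """Combine per-element scores in [0, 1] into a weighted average."""
    total = sum(WEIGHTS[name] * value for name, value in elements.items())
    return total / sum(WEIGHTS[name] for name in elements)

print(score_page({"has_code_sample": 1.0, "has_toc": 0.0, "freshness": 0.5}))
```

Normalizing by the sum of the weights actually present keeps the score comparable across pages even when some elements were not measured for a given page.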

Weekly status update:

  • Created basic Python scripts to import test data from CSV, calculate metrics from the data, and output scores.
  • Implemented the first metrics calculation function for processing test data, and confirmed that the basic logic is working. Next week I will be able to more quickly implement functions for all the other calculations now that I have a working example.
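A minimal sketch of the CSV-import-and-calculate pipeline described above. The column names and the "completeness" metric are hypothetical stand-ins; the real scripts live in the doc-metrics-testing repository.

```python
import csv
import io

def load_test_data(source):
    """Read test-data CSV rows (from a file object) into a list of dicts."""
    return list(csv.DictReader(source))

def completeness_metric(row):
    """Hypothetical metric: fraction of checked attributes marked present."""
    attrs = ["has_intro", "has_examples", "has_links"]
    return sum(row.get(a) == "yes" for a in attrs) / len(attrs)

# Inline sample standing in for the real test dataset file.
sample = io.StringIO("page,has_intro,has_examples,has_links\nAPI:Main,yes,yes,no\n")
for row in load_test_data(sample):
    print(row["page"], round(completeness_metric(row), 2))  # API:Main 0.67
```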

Weekly status update:

  • Finished implementation of Python scripts to output calculations for all test metrics.
  • Generated metrics output for the test dataset: raw output is computed from 30 doc attributes across 140 pages, for 7 metrics categories. (Raw output data for the test set of pages is attached to this task as a CSV.)
  • Added doc collections to raw output to enable analysis of metrics scores for each of the 5 collections in our test set.
  • Began analysis of the test metrics output!

Next week: Finish initial analysis and discuss outcomes with the Tech Docs team. {F58325103} Stretch: create a plan for gathering feedback from community and stakeholders.
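The collection-level analysis mentioned above amounts to grouping per-page scores by collection and summarizing each group. A hedged sketch with stand-in data (the column names and values are illustrative, not rows from the attached CSV):

```python
from collections import defaultdict
from statistics import mean

# Stand-in rows for the raw metrics output; real data has 140 pages,
# 30 attributes, and 7 metrics categories across 5 collections.
raw_output = [
    {"page": "API:Edit", "collection": "API docs", "score": 0.8},
    {"page": "API:Query", "collection": "API docs", "score": 0.6},
    {"page": "Manual:Pywikibot", "collection": "Tools", "score": 0.9},
]

# Group scores by collection, then report a per-collection average.
by_collection = defaultdict(list)
for row in raw_output:
    by_collection[row["collection"]].append(row["score"])

for collection, scores in sorted(by_collection.items()):
    print(collection, round(mean(scores), 2))
```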

A version of the metrics test output with an additional field indicating each page's collection membership is now available at https://gitlab.wikimedia.org/repos/technical-documentation/doc-metrics-testing/-/blob/develop/data/full_output%20with%20collections.csv
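For readers curious how such a column can be added, here is one possible approach: join a page-to-collection mapping onto the metrics output CSV. The file contents, field names, and mapping below are illustrative assumptions, not the repository's actual script.

```python
import csv
import io

# Stand-in for the metrics output CSV and the page -> collection mapping.
metrics_csv = io.StringIO("page,score\nAPI:Edit,0.8\nManual:Skins,0.5\n")
collections = {"API:Edit": "API docs", "Manual:Skins": "Skinning"}

out = io.StringIO()
reader = csv.DictReader(metrics_csv)
writer = csv.DictWriter(out, fieldnames=reader.fieldnames + ["collection"])
writer.writeheader()
for row in reader:
    # Pages missing from the mapping get an empty collection field.
    row["collection"] = collections.get(row["page"], "")
    writer.writerow(row)

print(out.getvalue())
```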

I have now published comprehensive reference documentation covering the fields in our test dataset (the metrics input), and the fields in the metrics output: https://www.mediawiki.org/wiki/Wikimedia_Technical_Documentation_Team/Doc_metrics/Field_reference

Next week I will be working with the other technical writers to try out some persona-based testing of the metrics output, and writing a short-and-sweet user guide (see: draft). When that is done, we'll be ready to share all this work more widely and formally request feedback from the larger community and stakeholders.

Outcomes of the test and analysis of its results are published at: https://www.mediawiki.org/wiki/Wikimedia_Technical_Documentation_Team/Doc_metrics/v0.
Moving on to the testing and feedback phase, tracked in T386254.