Priority metric improvements: Programs metrics by geo
Closed, Resolved, Public

Assigned To

Authored By

	JAnstee_WMF
	Mar 29 2022, 9:20 PM

Related Objects

Status	Assigned	Task
Resolved	KCVelaga_WMF	T304998 Priority metric improvements: Programs metrics by geo
Resolved	KCVelaga_WMF	T310471 Program data processing notebook
Resolved	KCVelaga_WMF	T313479 Finalize approach for geo location for users having more than one country logged

Event Timeline

JAnstee_WMF created this task. Mar 29 2022, 9:20 PM

Restricted Application added a subscriber: Aklapper. Mar 29 2022, 9:20 PM

JAnstee_WMF assigned this task to KCVelaga_WMF. May 16 2022, 4:30 PM

JAnstee_WMF moved this task from Backlog to In Development on the Equity-Landscape board.

JAnstee_WMF moved this task from In Development to Blocked or Needs Discussion on the Equity-Landscape board. May 31 2022, 2:09 PM

After some initial exploration of data from WikiEd and Event Metrics, here are the outstanding questions/issues that need discussion:

1. Organizers and participants of events with different start and end years
   - currently, the start year is used
2. Users who logged locations in multiple countries
   - currently, only one (the latest) is considered
3. How can we account for unavailability of data in final outputs (if at all)?
4. Event Metrics usage seems to extend well beyond events/programs
5. 90-day availability of geo data will be helpful!
6. To improve the match %, we will need to explore storing individual editors' geo-data for a longer period.

Sharing my thoughts briefly here:

On points 1 & 2, my recommendation is to be inclusive: count usernames in each country and/or year in which they appear. Similarly, if a username is a campaign organizer as well as a participant, course organizer, etc., I would include it for each role, in all relevant bins.
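
To make the inclusive recommendation concrete, here is a minimal sketch of the binning. The record layout, usernames, roles and country codes are all hypothetical and only illustrate the idea. They are not the actual WikiEd or Event Metrics schema.

```python
from collections import defaultdict

# Hypothetical records: (username, role, country, start_year, end_year).
# Field names and values are illustrative, not taken from the real data.
records = [
    ("UserA", "organizer", "IN", 2021, 2022),
    ("UserA", "participant", "IN", 2021, 2021),
    ("UserA", "participant", "NP", 2021, 2021),
    ("UserB", "participant", "KE", 2022, 2022),
]


def inclusive_bins(records):
    """Map each (country, year, role) bin to the set of usernames in it."""
    bins = defaultdict(set)
    for username, role, country, start_year, end_year in records:
        # Count the user in every year the event spans, not only the start year.
        for year in range(start_year, end_year + 1):
            bins[(country, year, role)].add(username)
    return bins


bins = inclusive_bins(records)
# Unique users per bin; a user can appear in several bins by design.
counts = {key: len(users) for key, users in bins.items()}
```

Using a set per bin keeps each username counted once within a bin, while still allowing the same username to appear in several countries, years or roles.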

Regarding Q3: until we better understand the data through repeat measures, we should continue to triangulate these metrics with other signals as we work to understand and develop the new data pipeline.

Regarding point 4 and the very low hit rate from Event Metrics usage data: this seems too varied a use space, partly because the tool offers a non-public way of tracking cohorts. Without the public transparency that the dashboards offer, it contains a wide variety of "events", many of which are not events at all but data mapping efforts related to the tool's functionality. Further, with much lower hit rates, it does not appear to offer a representative pulse point. I recommend we focus on the P&E dashboards for our metric use case instead.

Regarding point 5: yes, agreed. We also discussed that, in the meantime, the hit rates for the P&E dashboard usage data are quite high. We should explore the existing data to understand timing and source differences in hit rates, and consider potential avenues for capturing more robust signal metrics within the current data pipelines. These include pulling more recent data and querying repeatedly within a year to calculate an average hit rate for the year, to limit the influence of timing and seasonality on representativeness.
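
As a rough illustration of the yearly average hit rate, the sketch below assumes a hit rate is the share of users matched to a country in one retrieval, and that four retrievals are made in a year. The snapshot labels and numbers are made up for the example.

```python
from statistics import mean

# Hypothetical snapshots: (label, users matched to a country, total users).
snapshots = [
    ("Q1", 420, 500),
    ("Q2", 300, 400),
    ("Q3", 450, 600),
    ("Q4", 380, 400),
]


def hit_rate(matched, total):
    """Share of users with a geo match; 0.0 when there are no users."""
    return matched / total if total else 0.0


def average_hit_rate(snapshots):
    """Unweighted mean of the per-snapshot hit rates."""
    return mean(hit_rate(matched, total) for _, matched, total in snapshots)


yearly_rate = average_hit_rate(snapshots)
```

The unweighted mean gives each retrieval equal weight, which is one way to dampen seasonality. Pooling all matched and total users instead would weight busier periods more heavily, so the choice between the two is a design decision rather than a given.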

Regarding point 6: yes, and I can think of at least two potential routes to consider, each at a completely different scale. Maybe we can have a brainstorming session about this after we explore the crosstabs of the hits further?

Update on distributions:
https://docs.google.com/document/d/1u3smPLf_pWC7xnFfJ8oyNMKtmwd8I6ihCgXZgpn4COk/edit#heading=h.2svir5qd1ff0 (Internal only)

KCVelaga_WMF moved this task from Blocked or Needs Discussion to In Development on the Equity-Landscape board. Jun 13 2022, 9:14 AM

KCVelaga_WMF closed subtask T310471: Program data processing notebook as Resolved. Jun 15 2022, 2:52 PM

KCVelaga_WMF moved this task from In Development to Blocked or Needs Discussion on the Equity-Landscape board.

Retaining the data pulled from WikiEd
Frequency of retrieval and averaging the percent ranks
Eliminating erroneous data from geo matches
- 2 countries: 13.95%
- 3 countries: 7.57%
- 4 countries: 4.96%
- 5 countries: 3.37%
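
On averaging the percent ranks across retrievals, one possible sketch follows. It assumes a percent rank of (rank - 1) / (n - 1) with ties sharing the lowest rank, as in SQL's PERCENT_RANK. The thread does not specify the definition used, and the country codes and values are placeholders.

```python
def percent_ranks(values):
    """Percent rank per key: (rank - 1) / (n - 1); ties share the lowest rank."""
    n = len(values)
    if n < 2:
        return {key: 0.0 for key in values}
    ordered = sorted(values.values())
    # list.index returns the first position, so tied values get the same rank.
    return {key: ordered.index(value) / (n - 1) for key, value in values.items()}


def average_percent_ranks(retrievals):
    """Average each key's percent rank over the retrievals it appears in."""
    totals, seen = {}, {}
    for values in retrievals:
        for key, rank in percent_ranks(values).items():
            totals[key] = totals.get(key, 0.0) + rank
            seen[key] = seen.get(key, 0) + 1
    return {key: totals[key] / seen[key] for key in totals}


# Hypothetical metric values per country for two retrievals.
retrievals = [
    {"AA": 10, "BB": 40, "CC": 25},
    {"AA": 30, "BB": 35, "CC": 5},
]
averaged = average_percent_ranks(retrievals)
```

Ranking within each retrieval before averaging keeps retrievals of different sizes comparable, at the cost of discarding the absolute magnitudes.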

JAnstee_WMF changed the task status from Open to In Progress. Jun 23 2022, 12:50 AM

JAnstee_WMF triaged this task as Medium priority.

KCVelaga_WMF moved this task from Blocked or Needs Discussion to In Development on the Equity-Landscape board. Jul 21 2022, 9:27 AM

KCVelaga_WMF closed subtask T313479: Finalize approach for geo location for users having more than one country logged as Resolved. Jul 26 2022, 3:05 PM

We have finalized the considerations and metrics to be developed for Build 1.
