Superset: extend DiscussionTools dashboard to include topic subscription metrics
Closed, Resolved (Public)

Description

This task covers extending the existing DiscussionTools Superset dashboard that @MNeisler created [i] to include high-level usage metrics for topic subscriptions (T263820 + T263819).

Objectives

The metrics described in the "Requirements" section below are intended to help the Editing Team identify unexpected trends in the usage of Manual (T263820) and Automatic (T263819) Topic Subscriptions that would signal that further investigation is needed.

Requirements

  • ADD the following two charts to the existing DiscussionTools Superset dashboard [i]:
    • Chart 1: a graph showing the total number of topic subscriptions people have initiated to date, filterable by wiki and how the topic subscription was initiated (manually or automatically).
    • Chart 2: a graph showing the total number of topic subscriptions that are currently ACTIVE, filterable by wiki and how the topic subscription was initiated (manually or automatically).
  • "Chart 1" and "Chart 2" should be implemented such that they are populated with real-time data.
    • Note: if this requirement surpasses Superset's data limits, we will adjust the charts such that they are updated at a slower cadence (e.g. weekly, bi-weekly, monthly, etc.).

Done

  • The DiscussionTools Superset dashboard [i] is updated to include the two charts described in the "Requirements" section above

i. https://superset.wikimedia.org/r/632

Event Timeline

MNeisler triaged this task as Medium priority. Jul 22 2021, 3:07 PM
MNeisler moved this task from Triage to Upcoming Quarter on the Product-Analytics board.

I'd like to know how many sections have been subscribed to at enwiki (not necessarily counting mine, because I'm using it a lot). We had an active discussion about this starting last Friday (27 August 2021), so I think that 26 August is the baseline state. I'm curious whether this discussion will result in much wider use on other pages.

@ppelberg I've updated the DiscussionTools Dashboard (currently renamed to Talk Pages Project Dashboard) to include the requested charts on topic subscription data. The data reflects a snapshot of topic subscriptions as of 31 August 2021.

I'd like to do a little more QA of the data, so I'm keeping this assigned to me for now, but please let me know if you have any questions or suggested changes to the proposed charts. See details about the charts below:

  • Chart Descriptions: For both initiated and active topic subscriptions, I've currently included charts that show the overall totals as of 31 August 2021 and daily totals by the date the subscription was created. Per @Whatamidoing-WMF's suggestion in T287126#7319535, I also included a chart to show the number of distinct sections/topics that have been subscribed to.
  • Filters: All charts are filterable by wiki using the provided Topic Subscriptions Filter on the dashboard. There is also an 'initiation type' filter to view data by how the topic subscription was initiated (manually or automatically). However, data currently only reflects manual topic subscriptions pending implementation of automatic topic subscriptions.
  • Frequency of Updates: These charts cannot be updated in real time because their data source is a query that currently must be run manually and then uploaded into Superset. This process does not take long, but because it is not automated it will require some planning and discussion of how often updates are needed. If frequent updates are needed, a job scheduler can eventually be set up to automate this process.
  • Data comes from the discussiontools_subscription table, which stores a creation timestamp and the state of each subscription but does not track changes in user preferences over time (for example, if a user is currently unsubscribed from a topic, we will not be able to determine if that user was subscribed manually or automatically). Implications for these charts: Only the active topic subscriptions charts are filterable by initiation type.
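
As a rough illustration of how the totals described above could be derived from a single snapshot, here is a minimal Python sketch. The `Subscription` record, its field names, and the state labels are hypothetical stand-ins and do not reflect the actual discussiontools_subscription schema or the query behind the dashboard. It also assumes that "distinct topics" counts every topic with at least one row, whether or not that row is still active.

```python
from dataclasses import dataclass
from datetime import date


# Hypothetical snapshot row; field names and state labels are illustrative only.
@dataclass(frozen=True)
class Subscription:
    wiki: str
    user: str
    topic: str
    created: date
    state: str  # "manual", "automatic", or "unsubscribed"


def summarize(rows, wiki=None):
    """Return headline totals for one wiki, or for all wikis when wiki is None."""
    selected = [r for r in rows if wiki is None or r.wiki == wiki]
    active = [r for r in selected if r.state != "unsubscribed"]
    return {
        # One row per (user, topic) pair, so this counts first-time subscriptions.
        "initiated": len(selected),
        "active": len(active),
        # Initiation type is only known while a subscription is still active.
        "active_manual": sum(r.state == "manual" for r in active),
        "active_automatic": sum(r.state == "automatic" for r in active),
        # Assumption: a topic counts even if its only row is now unsubscribed.
        "distinct_topics": len({(r.wiki, r.topic) for r in selected}),
    }


# Made-up sample data, not real dashboard values.
sample = [
    Subscription("enwiki", "A", "Topic 1", date(2021, 8, 30), "manual"),
    Subscription("enwiki", "B", "Topic 1", date(2021, 8, 30), "unsubscribed"),
    Subscription("enwiki", "A", "Topic 2", date(2021, 8, 31), "manual"),
    Subscription("frwiki", "C", "Topic 1", date(2021, 8, 31), "manual"),
]
enwiki_totals = summarize(sample, wiki="enwiki")
all_totals = summarize(sample)
```

Because an unsubscribed row no longer says how it was initiated, the sketch can only split the active count by initiation type, which mirrors the limitation noted in the last bullet.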

@ppelberg - I completed QA and confirmed that the data in Topic Subscriptions charts is correct. Moving to you for review and final sign-off. Please let me know if you have any questions or suggested changes.

@ppelberg I've updated the DiscussionTools Dashboard (currently renamed to Talk Pages Project Dashboard) to include the requested charts on topic subscription data. The data reflects a snapshot of topic subscriptions as of 31 August 2021.

  • Chart Descriptions: For both initiated and active topic subscriptions, I've currently included charts that show the overall totals as of 31 August 2021 and daily totals by the date the subscription was created. Per @Whatamidoing-WMF's suggestion in T287126#7319535, I also included a chart to show the number of distinct sections/topics that have been subscribed to. All charts are filterable by wiki using the provided Topic Subscriptions Filter on the dashboard.

This looks GREAT, @MNeisler!

To be doubly sure I am understanding these charts accurately, can you please let me know what – if anything – about the below is inaccurate?

  • Initiated topic subscriptions (total): a count of the total number of times people have subscribed to a topic. For right now, this chart is limited to manual topic subscriptions. Eventually, it will also include automatic topic subscriptions.
    • E.g. if Person A and Person B both subscribe to Topic 1 on 1-September, the chart will show a count of "2 initiated topic subscriptions" on this day.
  • Active topic subscriptions by date created: the total number of times people have subscribed to a topic minus the total number of times people have unsubscribed from topics.
    • E.g. if Person A and Person B both subscribe to Topic 1 on 1-September, and then Person A unsubscribes from Topic 1, the chart will show a count of "1 active topic subscriptions" on this day.
  • Distinct topics subscribed to (total): a count of the distinct conversations at least one person has subscribed to.
    • E.g. if Person A and Person B both subscribe to Topic 1 on 1-September, the chart will show a count of "1 distinct topics subscribed to" on this day.
  • Filters: All charts are filterable by wiki using the provided Topic Subscriptions Filter on the dashboard. There is also an 'initiation type' filter to view data by how the topic subscription was initiated (manually or automatically). However, data currently only reflects manual topic subscriptions pending implementation of automatic topic subscriptions.

Noted.

  • Frequency of Updates: These charts cannot be updated in real time because their data source is a query that currently must be run manually and then uploaded into Superset. This process does not take long, but because it is not automated it will require some planning and discussion of how often updates are needed. If frequent updates are needed, a job scheduler can eventually be set up to automate this process.

Understood. The follow-up work needed to ensure the topic subscription data updates in real time is now being tracked in this newly filed ticket: T290516.

  • Data comes from the discussiontools_subscription table, which stores a creation timestamp and the state of each subscription but does not track changes in user preferences over time (for example, if a user is currently unsubscribed from a topic, we will not be able to determine if that user was subscribed manually or automatically). Implications for these charts: Only the active topic subscriptions charts are filterable by initiation type.

Understood.

@ppelberg

To be doubly sure I am understanding these charts accurately, can you please let me know what – if anything – about the below is inaccurate?

  • Initiated topic subscriptions (total): a count of the total number of times people have subscribed to a topic. For right now, this chart is limited to manual topic subscriptions. Eventually, it will also include automatic topic subscriptions.
    • E.g. if Person A and Person B both subscribe to Topic 1 on 1-September, the chart will show a count of "2 initiated topic subscriptions" on this day.
  • Active topic subscriptions by date created: the total number of times people have subscribed to a topic minus the total number of times people have unsubscribed from topics.
    • E.g. if Person A and Person B both subscribe to Topic 1 on 1-September, and then Person A unsubscribes from Topic 1, the chart will show a count of "1 active topic subscriptions" on this day.

For the two statements above, it's not completely accurate to say "the total number of times people have subscribed or unsubscribed from a topic". We currently only keep data on the current state of each person's subscription and do not know whether that person changed their subscription status for that topic multiple times. For example, if Person A unsubscribes from Topic 1 and then decides to resubscribe to Topic 1 later, while Person B stays subscribed the whole time, the data would show "2 active subscriptions". Also, in this scenario, Person A subscribing to Topic 1 would be counted as just 1 initiated topic subscription even though they technically subscribed to the same topic multiple times.

It would be more accurate to say the following:
Initiated topic subscriptions (total): "the total number of times people first subscribed to a topic" or "the total number of times people initiated a topic subscription"
Active topic subscriptions by date created: "the total number of times people first subscribed to a topic minus the number of topic subscriptions that are currently set as unsubscribed"

  • Distinct topics subscribed to (total): a count of the distinct conversations at least one person has subscribed to.
    • E.g. if Person A and Person B both subscribe to Topic 1 on 1-September, the chart will show a count of "1 distinct topics subscribed to" on this day.

Correct.
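
To make the Person A / Person B scenario above concrete, here is a minimal Python sketch that collapses a sequence of subscribe and unsubscribe actions into a current-state snapshot. The event list and the `apply_events` helper are hypothetical and only model the behaviour described in this comment (one stored state per person and topic, with no history); they are not how DiscussionTools actually stores data.

```python
def apply_events(events):
    """Collapse chronological (user, topic, action) events into a snapshot.

    The snapshot keeps one current state per (user, topic) pair, so
    repeated subscribe/unsubscribe cycles leave no trace.
    """
    snapshot = {}
    for user, topic, action in events:
        key = (user, topic)
        if action == "subscribe":
            snapshot[key] = "subscribed"
        elif key in snapshot:
            snapshot[key] = "unsubscribed"
    return snapshot


# Person A subscribes, unsubscribes, then resubscribes; Person B stays subscribed.
events = [
    ("Person A", "Topic 1", "subscribe"),
    ("Person B", "Topic 1", "subscribe"),
    ("Person A", "Topic 1", "unsubscribe"),
    ("Person A", "Topic 1", "subscribe"),
]

snapshot = apply_events(events)
initiated = len(snapshot)  # one row per person and topic
active = sum(state == "subscribed" for state in snapshot.values())
distinct_topics = len({topic for _, topic in snapshot})
```

Although three subscribe actions occurred, the snapshot holds only two rows, which is why the charts count first-time subscriptions and not every subscribe action.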

How to interpret the Line Graphs
Also, a quick note on how to read the dates in the line charts. For all line charts, the dates on the x-axis reflect the date the subscription was first created/initiated, since the data is based on a snapshot and we don't currently record changes to an initiated topic subscription over time. This can make the "Active topic subscriptions by date created" chart slightly more confusing to read. (If it is more confusing than beneficial, we might want to consider removing this chart and keeping just the "Active topic subscriptions (total)" chart.)

I've provided examples of how to read each chart below for the date of July 10th:

"Initiated topic subscriptions by date created": There were 25 topic subscriptions created on July 10th.
"Active topic subscriptions by date created": 24 of the topic subscriptions created on July 10th are still active as of 31 August 2021.
"Distinct topics subscribed to by date created": There were 25 distinct topics that were subscribed to on July 10th.
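
The same "by date created" reading can be sketched in Python. The rows below are made-up illustrative data, not the dashboard's real values, and the tuple layout (date created, topic, currently active) is a hypothetical simplification of the snapshot.

```python
from collections import Counter
from datetime import date

# Hypothetical snapshot rows: (date created, topic, is the subscription active now?)
rows = [
    (date(2021, 7, 10), "Topic 1", True),
    (date(2021, 7, 10), "Topic 2", True),
    (date(2021, 7, 10), "Topic 3", False),
    (date(2021, 7, 11), "Topic 1", True),
]

# Every series is keyed by the creation date, never by the date of a later change.
initiated_by_date = Counter(created for created, _, _ in rows)
active_by_date = Counter(created for created, _, is_active in rows if is_active)
topics_by_date = {
    day: len({topic for created, topic, _ in rows if created == day})
    for day in initiated_by_date
}

july_10 = date(2021, 7, 10)
```

With this sample, 3 subscriptions were created on July 10th and 2 of them are still active at snapshot time; the subscription that was later cancelled still belongs to July 10th because only its creation date is recorded.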

Per the meeting @MNeisler and I had on 9 September, all that's left to be done on this task is for Megan to update the Talk Pages Project > Topic Notifications dashboard with the clarifying language she shared in T287126#7342081.

@ppelberg
This is complete. I also went ahead and updated the topic subscriptions dashboard so it reflects data as of 16 September 2021.
Please let me know if you have any questions or additional modifications to complete as part of this task.