Measure the number of different wikis from which users have unread notifications
Closed, Declined (Public)

Description

When thinking about supporting cross-wiki notifications, we can imagine different kinds of cross-wiki patterns (e.g., a user mainly participating in one project across a small group of languages, or participating in many different projects in the same language). Currently, we don't know how often those patterns occur.

To get a better picture, we can measure the number of users with unread notifications from different wikis, in the form "X users with unread notifications from Y different wikis" for different values of Y (example scale: 1 wiki, 2 wikis, 3-5 wikis, 6-10 wikis, 11-20 wikis, 21+ wikis).

We can adjust the queries to group wikis by project (e.g., all "Wikipedias"), or limit them to notifications generated in the last couple of months.
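
As a rough illustration of the measurement described above, here is a minimal Python sketch. It assumes the data has already been reduced to rows of `(user, wiki, is_read)` with a user identifier that is comparable across wikis; that row shape, the function names, and the non-overlapping bucket boundaries are all assumptions made for the example, not part of any existing query.

```python
from collections import Counter, defaultdict

# Hypothetical buckets: a non-overlapping reading of the example scale.
# Each entry is (lowest count, highest count or None for open-ended, label).
BUCKETS = [
    (1, 1, "1 wiki"),
    (2, 2, "2 wikis"),
    (3, 5, "3-5 wikis"),
    (6, 10, "6-10 wikis"),
    (11, 20, "11-20 wikis"),
    (21, None, "21+ wikis"),
]


def bucket_label(wiki_count):
    """Return the bucket label for a number of distinct wikis (>= 1)."""
    for low, high, label in BUCKETS:
        if wiki_count >= low and (high is None or wiki_count <= high):
            return label
    raise ValueError(f"no bucket for wiki_count={wiki_count}")


def unread_wiki_distribution(rows):
    """Count users per bucket of distinct wikis with unread notifications.

    rows: iterable of (user, wiki, is_read) tuples (assumed shape).
    Users with no unread notifications are not counted at all.
    """
    wikis_by_user = defaultdict(set)
    for user, wiki, is_read in rows:
        if not is_read:
            wikis_by_user[user].add(wiki)
    return Counter(bucket_label(len(wikis)) for wikis in wikis_by_user.values())
```

Grouping by project or limiting to recent notifications would amount to mapping or filtering the rows before they reach `unread_wiki_distribution`.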

Event Timeline

Pginer-WMF raised the priority of this task from to Needs Triage.
Pginer-WMF updated the task description. (Show Details)
Pginer-WMF added subscribers: Etonkovidova, DannyH, Jay8g and 4 others.
Catrope triaged this task as High priority. Sep 30 2015, 11:44 PM
Catrope set Security to None.
Catrope added a subscriber: Catrope.
DannyH removed a subscriber: DannyH. Oct 5 2015, 10:48 PM
Neil_P._Quinn_WMF added a comment. Edited Oct 8 2015, 12:39 AM

@Pginer-WMF, how important is the "unread" part of this? It should be possible to get numbers of unclicked notifications (although not over the past couple weeks due to T114833), but I think it would be much easier to get "X users with at least two notifications generated in the last month on each of Y different wikis" or something like that.

@Neil_P._Quinn_WMF: I think the echo_event table might have some of the relevant data for that?

The idea behind "unread" was to get a sense of how many notifications users get from different wikis "at the same time". I wanted to focus on the notifications simultaneously present in the notifications panel (those unread) so that our results don't just become a total count of all the wikis the user has ever visited. In short, what I need to know is how much information, and from how many wikis, we need to optimally support in the notification panel.

Limiting them by the time they are generated, as you proposed, also seems a valid approach to me. The one thing we would miss is that a user getting notifications from 10 different wikis in a month may or may not be checking them quickly enough to never have more than 2 in the notification panel at the same time. But I think it could be a good enough approximation if it is much easier to get.
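
To make the time-windowed alternative concrete, here is a small Python sketch of "X users with at least two notifications generated in the last month on each of Y different wikis". The input row shape `(user, wiki, created)`, the 30-day window, and the function name are assumptions chosen for illustration. As noted above, this counts notifications generated within the window, not notifications present in the panel at the same time.

```python
from collections import Counter
from datetime import datetime, timedelta


def windowed_wiki_counts(rows, now, window=timedelta(days=30), min_per_wiki=2):
    """Return a Counter mapping Y (number of wikis) to X (number of users).

    rows: iterable of (user, wiki, created) with created as a datetime
          (assumed shape; user must be comparable across wikis).
    A wiki counts for a user only if the user received at least
    min_per_wiki notifications there since now - window.
    """
    cutoff = now - window

    # Number of recent notifications per (user, wiki) pair.
    per_user_wiki = Counter()
    for user, wiki, created in rows:
        if created >= cutoff:
            per_user_wiki[(user, wiki)] += 1

    # Number of qualifying wikis per user.
    wikis_per_user = Counter()
    for (user, _wiki), count in per_user_wiki.items():
        if count >= min_per_wiki:
            wikis_per_user[user] += 1

    # Distribution: how many users have exactly Y qualifying wikis.
    return Counter(wikis_per_user.values())
```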

Thanks @Catrope! It looks like I can get read status pretty easily from the echo_notification tables. Unfortunately, to combine users across wikis I'd need usernames, but that table only stores user IDs, and the user tables aren't even on X1 analytics-slave, so I don't think I can join for that either.

Is there a machine I can access that stores both? If not, I may just have to use the event logs, which would limit me to seeing clicks on notifications with links.
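
For illustration, the join that is missing here could look like the following Python sketch, once both sets of rows are available in one place. It assumes notification rows shaped as `(wiki, local_user_id, is_read)` and a lookup from `(wiki, local_user_id)` to username built from each wiki's user table; these shapes and the function name are hypothetical, and the sketch further assumes that the same username on different wikis identifies the same person.

```python
def attach_usernames(notifications, local_users):
    """Yield (username, wiki, is_read) for each notification row.

    notifications: iterable of (wiki, local_user_id, is_read) (assumed shape).
    local_users:   dict mapping (wiki, local_user_id) -> username,
                   built from each wiki's user table (assumed to be available).
    Rows whose user cannot be resolved are skipped.
    """
    for wiki, local_user_id, is_read in notifications:
        username = local_users.get((wiki, local_user_id))
        if username is None:
            continue  # no matching user row; leave it out of the counts
        yield username, wiki, is_read
```

The output rows use the username as the cross-wiki key, so they can be grouped per user regardless of the wiki-local IDs.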

I don't think so :S but when I last talked to someone from analytics they said they believed analytics-store was supposed to have all tables and all DBs, for easy joining. It sounded like flowdb not being on analytics-store may have been an oversight, and perhaps the Echo tables not being there (in short, everything that's on x1 not being on analytics-store) may be an oversight as well.

In terms of different languages, there could be some correlation with the research on multi-language editing patterns in Wikipedia. I mentioned this research in a recent meeting; sharing it here in case it is helpful.

jcrespo changed the status of subtask T115275: Replicate Echo tables to analytics-store from Open to Stalled. Feb 3 2016, 9:44 AM

@jmatazzoni Would this data still be useful to the Global Collaboration team? I don't have space to work on it right now, but I'd like to know whether it should stay on the backlog.

Neil_P._Quinn_WMF lowered the priority of this task from High to Normal. Oct 20 2017, 9:17 PM
Deskana lowered the priority of this task from Normal to Low. Oct 31 2017, 2:54 PM
Neil_P._Quinn_WMF closed this task as Declined. Apr 20 2018, 5:48 PM

It sounds like the answer is no :)