Measure the number of different wikis from which users have unread notifications
Closed, Declined; Public

Description

When thinking about supporting cross-wiki notifications, we can imagine different kinds of cross-wiki patterns (e.g., a user mainly participating in one project across a small group of languages, or participating in many different projects in the same language). Currently, we don't know how often those patterns occur.

To get a better picture, we can measure the number of users with unread notifications from different wikis, in the form of "X users with unread notifications from Y different wikis" for different values of Y (example scale: 1 wiki, 2 wikis, 3-5 wikis, 5-10 wikis, 10-20 wikis, 20+ wikis).
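
As a rough illustration of the "X users with unread notifications from Y different wikis" breakdown, here is a minimal Python sketch. It assumes the unread notifications have already been exported as (username, wiki) pairs; the function names and the input shape are hypothetical, not part of Echo. The bucket boundaries are also an assumption: they follow the example scale but are made non-overlapping (3-5, 6-10, 11-20, 21+) so that each user lands in exactly one bucket.

```python
from collections import Counter, defaultdict

# Hypothetical, non-overlapping buckets as (low, high, label); high=None means open-ended.
BUCKETS = [
    (1, 1, "1 wiki"),
    (2, 2, "2 wikis"),
    (3, 5, "3-5 wikis"),
    (6, 10, "6-10 wikis"),
    (11, 20, "11-20 wikis"),
    (21, None, "21+ wikis"),
]


def bucket_label(wiki_count):
    """Return the label of the bucket that contains wiki_count."""
    for low, high, label in BUCKETS:
        if wiki_count >= low and (high is None or wiki_count <= high):
            return label
    raise ValueError(f"no bucket for {wiki_count!r}")


def users_per_wiki_bucket(unread_rows):
    """Count users per bucket, given (username, wiki) pairs for unread notifications."""
    wikis_by_user = defaultdict(set)
    for username, wiki in unread_rows:
        wikis_by_user[username].add(wiki)  # a set ignores repeated wikis
    return Counter(bucket_label(len(wikis)) for wikis in wikis_by_user.values())
```

A user with several unread notifications on the same wiki counts once for that wiki, since only the number of distinct wikis matters here.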

We can adjust the queries to group wikis by project (e.g., all "Wikipedias"), or limit them to notifications generated in the last couple of months.
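
Grouping wikis by project could be sketched roughly as below. This assumes wikis are identified by database names such as "enwiki" or "frwiktionary" and that the project can be guessed from the name's suffix; the mapping tables and the `project_family` helper are hypothetical and deliberately incomplete, so a real analysis would want an authoritative list of wikis instead.

```python
# Hypothetical special cases: names that end in "wiki" but are not Wikipedias.
SPECIAL_CASES = {
    "commonswiki": "commons",
    "wikidatawiki": "wikidata",
    "metawiki": "meta",
    "mediawikiwiki": "mediawiki",
}

# Suffix -> project family. The generic "wiki" suffix is checked last.
SUFFIXES = [
    ("wiktionary", "wiktionary"),
    ("wikibooks", "wikibooks"),
    ("wikinews", "wikinews"),
    ("wikiquote", "wikiquote"),
    ("wikisource", "wikisource"),
    ("wikiversity", "wikiversity"),
    ("wikivoyage", "wikivoyage"),
    ("wiki", "wikipedia"),
]


def project_family(dbname):
    """Guess the project family for a wiki database name; 'other' if unknown."""
    if dbname in SPECIAL_CASES:
        return SPECIAL_CASES[dbname]
    for suffix, family in SUFFIXES:
        if dbname.endswith(suffix):
            return family
    return "other"
```

Counting distinct values of `project_family(wiki)` per user, instead of distinct wikis, would give the per-project version of the same breakdown.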

Event Timeline

Pginer-WMF raised the priority of this task from to Needs Triage.
Pginer-WMF updated the task description.
Pginer-WMF added subscribers: Etonkovidova, DannyH, Jay8g and 4 others.
Catrope set Security to None.
Catrope subscribed.

@Pginer-WMF, how important is the "unread" part of this? It should be possible to get numbers of unclicked notifications (although not over the past couple weeks due to T114833), but I think it would be much easier to get "X users with at least two notifications generated in the last month on each of Y different wikis" or something like that.
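
For reference, the alternative metric ("X users with at least two notifications generated in the last month on each of Y different wikis") could be computed along these lines. This is only a sketch: it assumes rows of (username, wiki, timestamp) are available in one place, treats "last month" as a 30-day window, and uses a hypothetical function name.

```python
from collections import Counter
from datetime import datetime, timedelta


def users_by_active_wiki_count(rows, now, window=timedelta(days=30), min_per_wiki=2):
    """Map Y -> number of users with >= min_per_wiki recent notifications on each of Y wikis.

    rows: iterable of (username, wiki, timestamp) tuples, timestamps as datetime.
    Users with no qualifying wiki are left out of the result.
    """
    cutoff = now - window

    # Count recent notifications per (user, wiki) pair.
    per_user_wiki = Counter()
    for username, wiki, timestamp in rows:
        if cutoff <= timestamp <= now:
            per_user_wiki[(username, wiki)] += 1

    # Count, per user, the wikis that reach the threshold.
    wikis_per_user = Counter()
    for (username, _wiki), count in per_user_wiki.items():
        if count >= min_per_wiki:
            wikis_per_user[username] += 1

    # Histogram: Y (number of qualifying wikis) -> X (number of users).
    return Counter(wikis_per_user.values())
```

Unlike the "unread" version, this says nothing about how many notifications sit in the panel at the same time; it only bounds how many wikis are recently active for a user.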

@Neil_P._Quinn_WMF: I think the echo_event table might have some of the relevant data for that?

In T113626#1711340, @Neil_P._Quinn_WMF wrote:

@Pginer-WMF, how important is the "unread" part of this? It should be possible to get numbers of unclicked notifications (although not over the past couple weeks due to T114833), but I think it would be much easier to get "X users with at least two notifications generated in the last month on each of Y different wikis" or something like that.

The idea behind "unread" was to get a sense of how many notifications users get from different wikis "at the same time". I wanted to focus on the notifications simultaneously present in the notifications panel (those that are unread) so that our results don't simply become a total count of all the wikis the user has ever visited. In short, what I need to know is how much information, and from how many wikis, we need to support optimally in the notification panel.

Limiting them by the time they are generated, as you proposed, also seems a valid approach to me. The one thing we would miss is that a user getting notifications from 10 different wikis in a month may check them quickly enough that no more than 2 are ever in the notification panel at the same time, or may not. But I think it could be a good enough approximation if it is much easier to get.

Thanks @Catrope! It looks like I can get read status pretty easily from the echo_notification tables. Unfortunately, to combine users across wikis I'd need usernames, but that table only stores user IDs, and the user tables aren't even on X1 analytics-slave, so I don't think I can join for that either.

Is there a machine I can access that stores both? If not, I may just have to use the event logs, which would limit me to seeing clicks on notifications with links.
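
If the two data sets could be exported separately, the join could in principle be done outside the database. The sketch below assumes two hypothetical exports, (wiki, user_id) rows for unread notifications and (wiki, user_id, user_name) rows from each wiki's user table, and further assumes that the same username on different wikis belongs to the same person.

```python
def attach_usernames(notification_rows, user_rows):
    """Turn (wiki, user_id) rows into (user_name, wiki) rows.

    notification_rows: iterable of (wiki, user_id) for unread notifications.
    user_rows: iterable of (wiki, user_id, user_name) from each wiki's user table.
    Rows whose user cannot be resolved are dropped.
    """
    # User IDs are local to a wiki, so the lookup key must include the wiki.
    names = {(wiki, user_id): user_name for wiki, user_id, user_name in user_rows}

    joined = []
    for wiki, user_id in notification_rows:
        user_name = names.get((wiki, user_id))
        if user_name is not None:
            joined.append((user_name, wiki))
    return joined
```

The (user_name, wiki) pairs it returns are the shape needed to count distinct wikis per user.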

In T113626#1713332, @Neil_P._Quinn_WMF wrote:

Thanks @Catrope! It looks like I can get read status pretty easily from the echo_notification tables. Unfortunately, to combine users across wikis I'd need usernames, but that table only stores user IDs, and the user tables aren't even on X1 analytics-slave, so I don't think I can join for that either.

Is there a machine I can access that stores both? If not, I may just have to use the event logs, which would limit me to seeing clicks on notifications with links.

I don't think so :S but when I last talked to someone from analytics they said they believed analytics-store was supposed to have all tables and all DBs, for easy joining. It sounded like flowdb not being on analytics-store may have been an oversight, and perhaps the Echo tables not being there (in short, everything that's on x1 not being on analytics-store) may be an oversight as well.

In terms of different languages, there could be some correlation with the research about multi-language editing patterns in Wikipedia. I mentioned this research in a recent meeting; sharing it in case it is helpful.

@jmatazzoni Would this data still be useful to the Global Collaboration team? I don't have space to work on it right now, but I'd like to know whether it should stay on the backlog.

nshahquinn-wmf lowered the priority of this task from High to Medium. Oct 20 2017, 9:17 PM
Deskana lowered the priority of this task from Medium to Low. Oct 31 2017, 2:54 PM
In T113626#3700720, @Neil_P._Quinn_WMF wrote:

@jmatazzoni Would this data still be useful to the Global Collaboration team? I don't have space to work on it right now, but I'd like to know whether it should stay on the backlog.

It sounds like the answer is no :)