
Display Number of Pageviews in New Pages Feed
Open, Needs Triage · Public · 5 Estimated Story Points

Description

As a Page Curation user, I want to see the median number of article page views, so that I can better understand the popularity of a given article.

Acceptance Criteria:

  • Display the median number of page views (per day) for an article within the article record in the New Pages Feed
  • The median should be calculated within a 30 day range. If the article was created less than 30 days ago, it should be calculated from the date of creation
  • The number should be displayed next to "Categories" in the article record in the following way: "[X] views per day"
  • The font, style, and size should be the same as other text in the section (such as number of edits or bytes)
  • The number should be updated every 24 hours
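
The calculation described in the criteria above could be sketched roughly as follows. This is only an illustration, not the Page Curation implementation: the function names, the `views_by_day` mapping, and the choices to count today as part of the window and to treat missing days inside the window as 0 are all assumptions.

```python
from datetime import date, timedelta
from statistics import median


def median_daily_views(views_by_day, created, today, window_days=30):
    """Median daily views over the last `window_days` days,
    or since creation if the article is newer than that (hypothetical helper)."""
    # Assumption: the window includes today and never starts before creation.
    window_start = max(created, today - timedelta(days=window_days - 1))
    num_days = (today - window_start).days + 1
    # Assumption: a day inside the window with no recorded views counts as 0.
    counts = [
        views_by_day.get(window_start + timedelta(days=i), 0)
        for i in range(num_days)
    ]
    return median(counts)


def format_views(value):
    """Render the label in the form requested above: "[X] views per day"."""
    return f"{round(value)} views per day"


# An article created 5 days ago: only those 5 days enter the median.
today = date(2019, 8, 27)
created = date(2019, 8, 23)
views = {
    date(2019, 8, 23): 3,
    date(2019, 8, 24): 10,
    date(2019, 8, 25): 4,
    date(2019, 8, 26): 100,
    date(2019, 8, 27): 6,
}
print(format_views(median_daily_views(views, created, today)))  # 6 views per day
```

The 24-hour refresh would sit outside this function, for example by caching the result per article and recomputing it once a day.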

Visual Examples:

npp_pageviewsexample.png (216×2 px, 94 KB)

Event Timeline

Restricted Application added a subscriber: Aklapper.
ifried set the point value for this task to 5. Aug 15 2019, 6:31 PM
ifried renamed this task from "Placeholder: Display the number of pageviews in the article record" to "Display the number of pageviews in the article record". Aug 22 2019, 10:44 PM
ifried updated the task description.
ifried added a subscriber: Prtksxna.

@Prtksxna: We're considering doing this work. Where do you think it would be appropriate to display the number of page views (which will most likely be average or median per day) within the article record of the New Pages Feed? Thanks!

ifried renamed this task from "Display the number of pageviews in the article record" to "Display Number of Pageviews in New Pages Feed". Aug 22 2019, 10:56 PM

Will the average (or median) be calculated since the page was created? Or will it be the average of that week or something?

If it is since the beginning I think it'd make sense to put it with the number of edits (since that already has the beginning date). Otherwise we could put it next to the categories.

@Prtksxna We would do the average or median in the past 30 days. If the article was created less than 30 days ago, we would calculate the average/median based on the number of days since the article creation. In terms of placement, I also think that it could come next to "Categories" (perhaps right afterward) in the following manner: "X page views per day (average)" or "X page views per day (median)," depending on what we choose. What do you think?

Also, I'm asking around on this question, so I'm curious to hear your opinion too: Do you have any thoughts on using the average vs. the median (i.e. which one may be better)?

@Prtksxna We would do the average or median in the past 30 days. If the article was created less than 30 days ago, we would calculate the average/median based on the number of days since the article creation. In terms of placement, I also think that it could come next to "Categories" (perhaps right afterward) in the following manner: "X page views per day (average)" or "X page views per day (median)," depending on what we choose. What do you think?

Yep, if it is not the total since the creation of the page then it being next to categories makes sense:

Screenshot 2019-08-27 at 10.14.18 AM.png (216×2 px, 94 KB)

Also, I'm asking around on this question, so I'm curious to hear your opinion too: Do you have any thoughts on using the average vs. the median (i.e. which one may be better)?

Looking at the data for some new pages it seems that the distribution is quite skewed - https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&range=this-month&pages=Taska|Sevi|List_of_Lebanese_Australians|Qincheng|KYAR|Slonina|Manikganj

Screenshot 2019-08-27 at 10.28.00 AM.png (1×2 px, 1 MB)

Where the mean is 27, the median is 18. In such a case I think the median is a better measure for us. This was just a random test with a few pages, we could look into this more closely too 🤔
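
To illustrate why a skewed distribution pulls the mean above the median, here is a small sketch. The numbers are made up for illustration (they are not the data from the linked Pageviews query) and were chosen only so that the result matches the mean and median quoted above.

```python
from statistics import mean, median

# Hypothetical daily view counts: mostly quiet days plus one spike.
daily_views = [12, 15, 18, 18, 20, 22, 84]

print(mean(daily_views))    # 27
print(median(daily_views))  # 18
```

The single spike of 84 raises the mean by several views per day, while the median stays at the typical value.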

ifried updated the task description.

@Prtksxna I'm also inclined to think that the median is the more useful number, while the mean/average is more likely to be misleading. Thanks for the analysis. I'll reach out to some community members below to see what they think as well.
@Insertcleverphrasehere We discussed this work on Meta-Wiki, but I would love for you to take a look at the requirements (listed in this ticket) and see if they look good to you. With this change, the user would not be able to filter or sort by page views, but they would see the number of page views in the article record. Would this be helpful to reviewers? And do you think the median is more useful than the mean/average? Thanks!
@Barkeep49 I know you mentioned on Meta-Wiki that you weren't sure if this work was useful. However, since you participated in the discussion, I would love to hear your feedback as well (now that we have ironed out more requirements and can present a basic mockup). Do you think that this listing would be useful, in its current proposed form? And, if so, do you think the median is a more useful number than the mean/average? Thanks!

ifried updated the task description.

I think it is useful; I just don't know if it is going to be useful enough to justify the time it takes to do. However, some chunk of that seems like it's already been done. But beyond all that, the median seems like the right gauge to me. I assume the number will be calculated correctly, so that if an article is 5 days old, the other 25 days are null, not 0.

@Barkeep49 Thanks for the feedback! We're proposing this alternative because we don't think it will take a very long time, especially compared to the original request. And, yes, the calculation of the median would depend on when the article was created. If the article was created less than 30 days ago, the calculation would be based only on the days since creation rather than on the full 30 days.
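
The difference between treating pre-creation days as null and treating them as 0 can be seen in a minimal sketch. The view counts here are hypothetical, and padding with zeros is shown only as the behaviour to avoid.

```python
from statistics import median

# Hypothetical views for an article that is 5 days old.
observed = [4, 9, 6, 7, 5]

# Wrong: pad the 25 days before creation with zeros.
padded = [0] * 25 + observed

print(median(observed))  # 6
print(median(padded))    # 0.0
```

With zero padding, any article younger than about 15 days would show a median of 0 regardless of its real traffic, which is why the days before creation need to be excluded.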

I'm not sure that this is useful at all. It's not something that will help me with my patrolling. Maybe I'm missing something, but a new page, unpatrolled and thus non-indexed, shouldn't have any page views except by its creator and/or reviewers.

I am also concerned that every new piece of page meta information we now start adding to the entries in the feed will introduce clutter and add to a patroller's bewilderment. I'm thinking here in terms of banner blindness, a phenomenon well researched by information scientists, well before the advent of the Internet.

The addition of many more snippets of information may also lead to a slowing down of the loading/rendering of the feed.

This proposal failed to reach consensus, so we'll leave it as an open ticket in case things change at a later date and another team would like to take it on. I'm removing the Community Tech tag from this ticket, as we've now wrapped up the Page Curation Improvements project. More details on the project and its final outcomes can be found on the Page Curation Improvements project page. Thanks!