UserInfoCard: Number of "New articles" incorrect
Open, Low | Public | 1 Estimated Story Points | BUG REPORT

Description

What is the problem?

I am seeing cases where the number of "New articles" UIC reports is different from what I can see in the user's contributions.

Steps to reproduce problem
  1. Open the UIC for DWalden (WMF) (https://test.wikipedia.org/wiki/Special:ListUsers?username=DWalden+(WMF)&group=&wpsubmit=&wpFormIdentifier=mw-listusers-form&limit=1)
  2. Compare the number in "New articles" (21 the last time I looked) with the number of rows in https://test.wikipedia.org/w/index.php?title=Special:Contributions&target=DWalden%20(WMF)&namespace=all&newOnly=1 (~156)
Environment

Wiki(s): https://test.wikipedia.org, CheckUser 2.5 (d42e516), 19:01, 7 July 2025.

Event Timeline

kostajh set the point value for this task to 1. (Jul 23 2025, 10:21 AM)

This figure refers to the number of pages created in the main namespace. In the case of DWalden (WMF), there are exactly 21 pages created in the main namespace (+1 due to a move) and 21 log entries in the create log, which means that the figure is correct. Here it's important that the message says "new articles" and not "new pages". The former are only the pages in the main namespace (plus some more conditions, which are more expensive to check for historical versions of pages).

However, page creations only started to be logged in June 2018, which means that a lot of long-standing users will have their number of articles underestimated. Counting creations based on the log was introduced two months ago, after it was found that counting edits takes too long for some accounts.

Additionally, due to the logic in ComputedUserImpactLookup::getCreatedArticleCount, this number is inherently approximate – it skips all articles that aren't within the last 10,000 pages created by the user across all namespaces. This doesn't seem to be a serious problem, as currently there are 190 users on enwiki who created more than 10k pages since June 2018.
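To illustrate why such a cap makes the figure approximate, here is a minimal Python sketch. It is not the GrowthExperiments implementation: the function name, the input shape (one namespace ID per create-log entry, newest first) and the tiny limits used in the example are all assumptions made for illustration.

```python
from typing import Sequence

NS_MAIN = 0  # ID of the main (article) namespace in MediaWiki


def count_created_articles(creation_namespaces: Sequence[int], limit: int = 10_000) -> int:
    """Count main-namespace creations among the `limit` most recent creations.

    `creation_namespaces` is a hypothetical stand-in for the create-log query:
    one namespace ID per logged page creation, ordered newest first.
    """
    recent = creation_namespaces[:limit]  # anything older than the window is ignored
    return sum(1 for ns in recent if ns == NS_MAIN)


# Three recent user-space creations (namespace 2) followed by two older articles.
log = [2, 2, 2, 0, 0]
print(count_created_articles(log, limit=4))   # the window misses the oldest article
print(count_created_articles(log, limit=10))  # the window covers every creation
```

With a window of 4 the oldest article falls outside the inspected rows, so the count is lower than the true number. The same effect would apply to any user with more creations, across all namespaces, than the limit.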

To me, it seems that we cannot do better than the current approach.

Thanks for your analysis, @mszwarc. I agree with what you've written.

@KColeman-WMF @Niharika what are your thoughts? Could we consider a design affordance to indicate that the "New articles" count is approximate in some cases? Or maybe we can just add some clarifications to https://www.mediawiki.org/wiki/Product_Safety_and_Integrity/Anti-abuse_signals/User_Info about the data points?

Perhaps we can add something to the UI, like New articles (approx)?

image.png (442×448 px, 62 KB)

Hmm. I am not sure about using an abbreviation here, both because the number is often accurate, and also because translating the abbreviation would be difficult.

Could we use a tooltip for the icon or the number that explains more about the method used for calculating? Or can we add a dedicated "info" icon somewhere in the card that, via a tooltip or a click, has an explanation of all the data points and how they're calculated?

I think it should also indicate the year (New articles from 2018), because I was very surprised to see that I created 4 articles :)

image.png (544×504 px, 47 KB)

If this is not specified, users will always ask why the counter is exactly like this.

This figure refers to the number of pages created in the main namespace. In the case of DWalden (WMF), there are exactly 21 pages created in the main namespace (+1 due to a move) and 21 log entries in the create log, which means that the figure is correct.

The UserInfoCard should use a different link for new articles then, it currently uses namespace=all (which obviously shows a much larger number of page creations) while namespace=0 is what users expect when clicking on that link.

@OKryva-WMF this task was marked as stalled, are we OK to work on it again? AFAICT, the change is to update namespace=all to namespace={NS_MAIN}
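As a rough illustration of the proposed link change, the sketch below builds a Special:Contributions URL restricted to the main namespace. It is not the CheckUser patch itself (the real link is built in the extension's front end). The helper name `new_articles_url` and the `wiki_base` parameter are hypothetical. Only the query parameters `target`, `namespace` and `newOnly` come from the URLs quoted in this task.

```python
from urllib.parse import urlencode

NS_MAIN = 0  # main namespace ID, replacing the previous namespace=all


def new_articles_url(wiki_base: str, username: str) -> str:
    """Return a contributions URL listing only page creations in the main namespace.

    `wiki_base` is assumed to look like "https://test.wikipedia.org".
    """
    params = {
        "title": "Special:Contributions",
        "target": username,
        "namespace": NS_MAIN,
        "newOnly": 1,
    }
    return f"{wiki_base}/w/index.php?{urlencode(params)}"


print(new_articles_url("https://test.wikipedia.org", "DWalden (WMF)"))
```

Linking to `namespace=0` keeps the list the user lands on consistent with what the "New articles" figure is meant to count.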

OKryva-WMF changed the task status from Stalled to In Progress. (Sep 1 2025, 5:40 PM)

Change #1184045 had a related patch set uploaded (by Harroyo-wmf; author: Harroyo-wmf):

[mediawiki/extensions/CheckUser@master] userinfocard: Make "New articles" links point to the main namespace

https://gerrit.wikimedia.org/r/1184045

Change #1184045 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] userinfocard: Make "New articles" links point to the main namespace

https://gerrit.wikimedia.org/r/1184045

Given the length of the QA column, I think we can skip QA on this.

I'm sorry, but reopen. Still wrong count :(

Could you provide a recent example of this?

I do see that DWalden has a count of 21 in the UserInfoCard but seems to have created 22 articles. Do you have other examples to help us narrow this problem down?

Hi, sure :)

image.png (548×410 px, 48 KB)

https://ru.wikipedia.org/w/index.php?title=Служебная:Вклад&namespace=0&newOnly=1&target=Iniquity

The relevant code is here: https://github.com/wikimedia/mediawiki-extensions-GrowthExperiments/blob/b84f22b6ed3411d23ed9d128d511e6923e8c4c5e/includes/UserImpact/ComputedUserImpactLookup.php#L359

We are limited by the number of rows we can query. For users with many log entries, the count will be off. In the short term, I think the best we can do in this scenario is to hide the "New articles" number if the user is known to have more than 10,000 log entries. (There's a similar issue with the reverted edit count and line chart, also due to issues with the maximum number of edits we can work with.)

I think for this one, we need to do something like:

  • Check the count of log entries (essentially duplicating this query but without the order by or limit)
  • If the count is > 10,000, add a data point in the API response for userHasMoreThanMaxLogEntries or something like that

In the Vue app, if userHasMoreThanMaxLogEntries is true, then don't show the New articles data point.

@mszwarc also points out that page creations were not logged before July 2018, so we should also hide the "New articles" count if the user account was created before that date.
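The two hiding conditions discussed above could be combined roughly as follows. This is a sketch under stated assumptions, not the merged patch: the function name, the parameter names and the handling of a missing registration date are hypothetical, and the exact cutoff date is an assumption because the thread mentions both June and July 2018.

```python
from datetime import date
from typing import Optional

MAX_LOG_ENTRIES = 10_000  # row limit mentioned in the thread
# Assumed cutoff for when page creations started being logged (see lead-in).
CREATE_LOG_START = date(2018, 7, 1)


def should_show_new_articles(
    create_log_entries: int,
    registered: Optional[date],
    max_entries: int = MAX_LOG_ENTRIES,
    log_start: date = CREATE_LOG_START,
) -> bool:
    """Return False when the "New articles" count is likely to be inaccurate."""
    # More log entries than we can query: the count would be truncated.
    if create_log_entries > max_entries:
        return False
    # Account predates the create log: earlier creations were never logged.
    # Assumption: an unknown registration date is treated as "too old".
    if registered is None or registered < log_start:
        return False
    return True


print(should_show_new_articles(21, date(2020, 1, 1)))      # recent account, few entries
print(should_show_new_articles(10_001, date(2020, 1, 1)))  # over the row limit
print(should_show_new_articles(4, date(2010, 1, 1)))       # account older than the log
```

In a setup like the one proposed, the server side would compute this flag (or the `userHasMoreThanMaxLogEntries` data point) and the front end would only decide whether to render the data point.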

kostajh claimed this task.
kostajh added a subscriber: hector.arroyo.

On itwiki it was also pointed out that redirects in ns0 would be counted.

Change #1192554 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/CheckUser@master] UserInfoCard: Hide new articles count when likely to be inaccurate

https://gerrit.wikimedia.org/r/1192554

The hack (hiding the data point) has been merged, but we'll keep this task open for tracking a better fix.

Change #1192554 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] UserInfoCard: Hide new articles count when likely to be inaccurate

https://gerrit.wikimedia.org/r/1192554

Change #1193700 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/CheckUser@wmf/1.45.0-wmf.21] UserInfoCard: Hide new articles count when likely to be inaccurate

https://gerrit.wikimedia.org/r/1193700

Change #1193700 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@wmf/1.45.0-wmf.21] UserInfoCard: Hide new articles count when likely to be inaccurate

https://gerrit.wikimedia.org/r/1193700

Mentioned in SAL (#wikimedia-operations) [2025-10-06T07:20:41Z] <kharlan@deploy2002> Started scap sync-world: Backport for [[gerrit:1193188|Implement AuthPreserveQueryParams for Metrics Platform mpo param (T404622)]], [[gerrit:1193700|UserInfoCard: Hide new articles count when likely to be inaccurate (T399096)]]

Mentioned in SAL (#wikimedia-operations) [2025-10-06T07:26:53Z] <kharlan@deploy2002> kharlan: Backport for [[gerrit:1193188|Implement AuthPreserveQueryParams for Metrics Platform mpo param (T404622)]], [[gerrit:1193700|UserInfoCard: Hide new articles count when likely to be inaccurate (T399096)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-10-06T07:34:45Z] <kharlan@deploy2002> Finished scap sync-world: Backport for [[gerrit:1193188|Implement AuthPreserveQueryParams for Metrics Platform mpo param (T404622)]], [[gerrit:1193700|UserInfoCard: Hide new articles count when likely to be inaccurate (T399096)]] (duration: 14m 04s)