I want to report an ongoing problem with the Untagged Uncategorized Articles list (http://tools.wmflabs.org/dplbot/cat/untagged_uncats.php) frequently detecting and listing pages which it should not. I've tried to report this several times at en's technical help desk, but it's never been resolved and it's recently been suggested to me that the problem may be at the Toollabs end instead -- data corruption issues in the Toollabs replication of the database, rather than on en itself -- so I'm reporting it here this time.
There are several different versions of this problem:
(1) Pages which are properly categorized, but the last edit in the history is a revert of page-blanking vandalism -- but only if that revert was performed by a bot instead of a human editor. Example: [[Black Power]]. These can usually be cleared with a null edit, but it would still be preferable if they did not show up at all.
(2) Pages which are properly categorized, but the last edit in the history is a page move to a new title. Examples: [[Brain vital signs]], [[Bingham Road railway station]], [[Andreas Nödl]]. These can sometimes be cleared with a null edit, but other times that fails and I have to go all the way to temporarily deleting and then restoring the article to actually clear it from the list.
(3) Random regurgitated clusters of pages that have already been //deleted//, sometimes //months// earlier; the common element being that at any given time, the pages of this type which appear on the list were always deleted right around the same time as each other. Examples: [[Axiom Landbase Pvt. Ltd.]], [[BatissForever]], [[Bhaiyato The Hobbit]]. These end up being //entirely// impossible to clear from the list -- restoring doesn't work, redeleting doesn't work, creating a placeholder page filed in the "Temporary maintenance holdings" category doesn't work -- and I end up having to just work around them as permanent speed bumps on the list until they somehow decide to clear on their own. Occasionally, I've even had to ask JaGa to hardcode such titles directly into the bot programming as specific exclusions -- but he hasn't been around much lately, so that can't be the permanent answer to this.
(4) Random clusters of recently created articles; again, the common element being that at any given time, the pages of this type which appear on the list were always created right around the same time. This error also has an extremely odd tendency to hit soccer/football players and plant or animal species far more often than any other type of article. Example: [[Antaeotricha ogmosaris]]. As in #2, these vary in whether a null edit will clear them, or whether I have to go to a full-on delete/restore.
(5) Random clusters of former articles which have been converted into redirects. Examples: [[Magic Lamp]], [[Magic lamp]], [[Magical lamp]]. Sometimes, but not always, a delete-restore will clear them; nothing else will.
(6) Soft redirects to Wiktionary, where an editor has tried to convert the redirect into a DICDEF article but then another editor has reverted it back to a soft redirect to Wiktionary again: for some reason, the uncats list loses the ability to bypass them as it normally does with soft redirects, and instead treats them as full articles. The only solution that has ever worked in this case was to add them to the "Temporary maintenance holdings" category. I haven't seen any examples of this yet in the current batch, although most of the current contents of "Temporary maintenance holdings" are prior examples of it.
(7) Random clusters of longstanding articles where I can't figure out any discernible reason at all for the error; the common element in this case is that when this happens, the articles involved are all in the same category as each other. Examples: [[Zorlovići]], [[Čardak, Pljevlja]], [[Čavanj]], [[Čerjenci]], [[Čestin, Montenegro]], [[Đuli]], [[Đurđevića Tara]] and [[Ljuće]], all of which are and have always been properly filed in the category "Populated places in Pljevlja Municipality" (which also contains many other pages that aren't being detected as uncategorized, so the category itself isn't the problem). It's worth noting that the last time I can recall seeing this, it also involved populated places in a Slavic-language country (Russia that time rather than Montenegro), although it has at times hit other categories as well.
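For anyone trying to reproduce the null-edit workaround from cases 1, 2 and 4 in bulk, here is a rough sketch of how the request could be built. It uses the standard MediaWiki `action=purge` API with `forcelinkupdate`, which is roughly the API counterpart of a null edit; the helper name `build_purge_params` is my own, and I am only assuming (not claiming) that a forced link update has the same effect on the list as a manual null edit.

```python
from urllib.parse import urlencode

# English Wikipedia API endpoint.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_purge_params(titles):
    """Build parameters for a purge that also refreshes the links tables.

    Hypothetical helper: the name and shape are illustrative only.
    The request must be sent as a POST to API_URL.
    """
    return {
        "action": "purge",
        "titles": "|".join(titles),  # the API takes pipe-separated titles
        "forcelinkupdate": "1",      # re-run the categorylinks update
        "format": "json",
    }


params = build_purge_params(["Black Power", "Brain vital signs"])
post_body = urlencode(params)  # form-encoded body for the POST
```

This would not help with case 3, where the pages no longer exist at all.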
It is really frustrating to constantly have to deal with all of these, because they end up sucking up unnecessary amounts of time and energy. I //should// be able to power through a tagging batch in 20-30 minutes at most, but these issues invariably turn it into a two-to-three-//hour// job because I have to stop and investigate and null-edit or delete-restore pages I shouldn't even be seeing on the list at all. And, in fact, I //should// be able to just let a bot loose on the list and not actually have to devote //any// of my own time and energy to tedious tasks like this at all -- but as long as errors like these are polluting the list, I can't.
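To illustrate the kind of sanity check a bot would currently need before trusting the list, here is a minimal sketch that classifies one page entry from the live API. It assumes a `action=query` request with `prop=info|categories`, `clshow=!hidden` and `formatversion=2`; the function name and the four labels are hypothetical, and I don't know what definition of "uncategorized" the tool itself uses (this sketch does not detect soft redirects, for instance).

```python
def classify_listed_page(page):
    """Classify one entry of response["query"]["pages"].

    Assumes formatversion=2 output for prop=info|categories with
    clshow=!hidden. Hypothetical helper for illustration only.
    """
    if page.get("missing"):
        return "deleted"        # case 3: page no longer exists
    if page.get("redirect"):
        return "redirect"       # case 5: converted into a redirect
    if page.get("categories"):
        return "categorized"    # has at least one non-hidden category
    return "uncategorized"      # the only kind that belongs on the list


sample = {
    "title": "Zorlovići",
    "categories": [
        {"ns": 14, "title": "Category:Populated places in Pljevlja Municipality"}
    ],
}
result = classify_listed_page(sample)
```

Only entries that come back as "uncategorized" would be safe to tag; everything else is one of the false positives described above.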
I'd really appreciate it if somebody could actually figure out how to fix this finally. Thanks.