Missing file (red link) tracking category support
Closed, Resolved. Public.

Description

I was chatting in #mediawiki about the best method for adding a hidden tracking category to pages, in order to assist in identifying and tracking pages that embed missing/deleted files. http://en.wikipedia.org/w/index.php?title=Counties_of_Uganda&oldid=340942829 is one example of a page which was badly in need of cleanup. Adding built-in support for this type of tracking would make maintaining content a lot easier. Platonides suggested implementing this as part of the parser, which makes perfect sense since it's used in every purge/refresh/save process.


Version: unspecified
Severity: enhancement

bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz23816.
Betacommand created this task. Via Legacy, Jun 6 2010, 9:00 PM
bzimport added a comment. Via Conduit, Jun 6 2010, 10:58 PM

conrad.irwin wrote:

There's [[Special:WantedFiles]] but it seems to be horribly broken at the moment - I imagine that's due to the difficulty of keeping the information up to date automatically with remote file repos

Betacommand added a comment. Via Conduit, Jun 7 2010, 12:26 AM

That uses a database query which does not scale well on large wikis, and it tracks files, not the articles where they are used. It's a lot easier to work from a list of pages to fix than it is to track down each file's usage.

Nemo_bis added a comment. Via Conduit, Apr 19 2011, 9:00 PM

(In reply to comment #1)

There's [[Special:WantedFiles]] but it seems to be horribly broken at the
moment - I imagine that's due to the difficulty of keeping the information up
to date automatically with remote file repos

It's bug 6220.

MarkAHershberger added a comment. Via Conduit, Apr 19 2011, 9:14 PM

I think bug #16112 would take care of this, but I'm not sure. Putting it as a blocker on this.

Betacommand added a comment. Via Conduit, Apr 19 2011, 9:18 PM

Not really. Fixing bug 16112 would make the missing files special page easier to work with, but this request serves a different purpose. I find that these types of tracking categories (similar to what happens with Special:Cite errors) make this kind of problem easier to fix.

Bawolff added a comment. Via Conduit, Apr 20 2011, 5:05 AM

Created attachment 8432
patch to add a tracking category for broken images.

This is fairly trivial to do if we really want to add a tracking category for this.

So do we really want to add a tracking category for this sort of thing?

Attached: broken-image.diff

Bawolff added a comment. Via Conduit, Apr 20 2011, 5:13 AM

(In reply to comment #6)

Created attachment 8432 [details]
patch to add a tracking category for broken images.

This is fairly trivial to do if we really want to add a tracking category for
this.

So do we really want to add a tracking category for this sort of thing?

btw, just to clarify - I can commit said patch, I'm just not sure if this is a bug we want to fix, or if it's a wontfix type thing due to duplication with the (rather borked) Special:WantedFiles.


Peachey88 added a comment. Via Conduit, Apr 20 2011, 5:31 AM

(In reply to comment #7)

btw, just to clarify - I can commit said patch, I'm just not sure if this is a
bug we want to fix, or if its a wontfix type thing due to duplication (with the
rather borked) special:wantedfiles.

Perhaps we could do this, and then also have wantedpages display the results of the category (basically fixing two things in one then.... because a category would be easier if any bots wanted to access it I would assume).

MarkAHershberger added a comment. Via Conduit, Apr 20 2011, 12:13 PM

(In reply to comment #7)

btw, just to clarify - I can commit said patch, I'm just not sure if this is a
bug we want to fix, or if its a wontfix type thing due to duplication (with the
rather borked) special:wantedfiles.

Would this have less brokenness than special:wantedfiles? Or would it be similar?

If I understand this (no guarantee that I do), then a hidden category is going to have more consistency than the clean-up that WantedFiles relies on.

Peachey88 added a comment. Via Conduit, Apr 20 2011, 12:17 PM

(In reply to comment #9)

Would this have less brokenness than special:wantedfiles? Or would it be
similar?

(Without looking at how the patch works) Well if a remote file existed it shouldn't produce a red link so it shouldn't add it to the category.

MarkAHershberger added a comment. Via Conduit, Apr 20 2011, 2:46 PM

(In reply to comment #10)

Well if a remote file existed it shouldn't produce a red link so it shouldn't
add it to the category.

So, then, that means "less brokenness", right?

Betacommand added a comment. Via Conduit, Apr 20 2011, 3:11 PM

Not being able to read PHP, I would assume that this modifies the parser or a similar component so that, during parsing of the page, it checks whether each file being used exists (both locally and remotely) and, if it does not exist, inserts a tracking category. This would solve the brokenness of WantedFiles because it does not use a single query (which causes too much stress on large projects) and is constantly updated whenever a page is re-parsed. (I assume that this is being added via a MediaWiki namespace message, which would give finer-grained control of which pages get added to it, so that different namespaces could be given different categories.)
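To make the assumed behaviour concrete, here is a minimal sketch in Python of the check described above. It is a toy model, not the actual patch or any MediaWiki code: `ParserOutput`, `file_exists` and `parse_page` are hypothetical names, and the local and remote repos are modelled as plain sets of file names.

```python
from dataclasses import dataclass, field

# Category name taken from the mediawiki.org example in this thread.
TRACKING_CATEGORY = "Pages with broken file links"


@dataclass
class ParserOutput:
    """Hypothetical stand-in for whatever the parser collects per page."""
    categories: set = field(default_factory=set)


def file_exists(name, local_files, remote_files):
    """A file counts as existing if the local OR a remote repo has it."""
    return name in local_files or name in remote_files


def parse_page(embedded_files, local_files, remote_files):
    """Add the tracking category if any embedded file is missing."""
    output = ParserOutput()
    for name in embedded_files:
        if not file_exists(name, local_files, remote_files):
            output.categories.add(TRACKING_CATEGORY)
            break  # one missing file is enough to tag the page
    return output
```

Because the check runs each time the page is parsed, the category membership would follow the page's current state rather than depending on a separate site-wide query.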

Bawolff added a comment. Via Conduit, Apr 20 2011, 7:45 PM

ok, fixed in r86534.

Note this is slightly different from Special:WantedFiles, since WantedFiles is ordered by how many broken links there are to the file.

Betacommand added a comment. Via Conduit, Jul 18 2011, 2:51 PM

Fix was reverted in trunk

Reedy added a comment. Via Conduit, Jul 18 2011, 2:52 PM

(In reply to comment #14)

Fix was reverted in trunk

It's been marked fixme, not reverted...

Bawolff added a comment. Via Conduit, Aug 25 2011, 1:49 AM

(In reply to comment #15)

(In reply to comment #14)
> Fix was reverted in trunk

It's been marked fixme, not reverted...

And the issues have now been resolved, so remarking this bug as fixed.

cheers.

G.Hagedorn added a comment. Via Conduit, Dec 31 2011, 4:06 PM

I think this needs to be better documented. Is it already somewhere where I overlooked it? Please add some documentation on mediawiki.org - I am not sure where, but

Release notes only mentions:

  • (bug 23816) A tracking category is now added for any pages with broken images.

I figured out that the attachment uses Mediawiki:broken-file-category which needs to be executed on local wiki to find the local string. Some results:

http://www.mediawiki.org/wiki/Category:Pages_with_broken_file_links
in German:
http://www.mediawiki.org/wiki/Kategorie:Seiten_mit_defekten_Dateilinks
in German Wikipedia changed to:
http://de.wikipedia.org/wiki/Kategorie:Wikipedia:Defekter_Dateilink
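As a hedged alternative to evaluating the message by hand, the text of a message can also be requested through the MediaWiki API's `meta=allmessages` module. The sketch below only builds the request URL; the helper name `allmessages_url` and the `api.php` location passed to it are assumptions, and what the API returns on a given wiki will reflect that wiki's local overrides.

```python
from urllib.parse import urlencode


def allmessages_url(api_base, message_key, lang):
    """Build an API URL asking for one interface message in one language.

    api_base is the wiki's api.php endpoint (assumed, varies per wiki).
    """
    params = {
        "action": "query",
        "meta": "allmessages",
        "ammessages": message_key,  # e.g. "broken-file-category"
        "amlang": lang,             # e.g. "de"
        "format": "json",
    }
    return api_base + "?" + urlencode(params)
```

For example, `allmessages_url("https://www.mediawiki.org/w/api.php", "broken-file-category", "de")` gives a URL that can be opened in a browser or fetched by a bot.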

Nemo_bis added a comment. Via Conduit, Dec 31 2011, 4:29 PM

(In reply to comment #17)

I think this needs to be better documented. Is it already somewhere where I
overlooked it? Please add some documentation on mediawiki.org - I am not sure
where, but

AFAIK tracking categories are not documented on mww (or anywhere else, actually), they're left to local categorization. Which is not very good, but that's bug 1 I guess.

I figured out that the attachment uses Mediawiki:broken-file-category which
needs to be executed on local wiki to find the local string.

Not really, you just have to follow the usual system: [[:translatewiki:Special:PrefixIndex/MediaWiki:Broken-file-category/]].

G.Hagedorn added a comment. Via Conduit, Dec 31 2011, 5:10 PM

Not really, you just have to follow the usual system:
[[:translatewiki:Special:PrefixIndex/MediaWiki:Broken-file-category/]].

(translatewiki interwiki link is not a feature of mediawiki, although present on WMF sites).

Yes, rather than checking only your own translation, you can find all applicable translations under:

http://translatewiki.net/w/i.php?title=Special%3APrefixIndex&prefix=Broken-file-category%2F&namespace=8

However, if logged in, you need to look under the "native" language of the wiki, which may differ from the language in which the Wiki user interface is displayed.

AFAIK tracking categories are not documented on mww (or anywhere else,
actually), Which is not very good, but that's bug 1 I guess.

I agree; created Bug 33448.

Krinkle added a comment. Via Conduit, Dec 31 2011, 6:04 PM

(In reply to comment #8)

(In reply to comment #7)
> btw, just to clarify - I can commit said patch, I'm just not sure if this is a
> bug we want to fix, or if its a wontfix type thing due to duplication (with the
> rather borked) special:wantedfiles.
Perhaps we could do this, and then also have wantedpages display the results of
the category (basically fixing two things in one then.... because a category
would be easier if any bots wanted to access it I would assume).

No, bots can access querypages just as easily as categorymembers. Both have APIs:
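A minimal sketch of what the two requests could look like from a bot, using the API's `list=categorymembers` and `list=querypage` modules. Only the URLs are built here; the helper names, the default limit and the `api.php` endpoint passed in are assumptions for illustration.

```python
from urllib.parse import urlencode


def category_members_url(api_base, category, limit=50):
    """URL listing the pages in a (tracking) category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,  # full title, including the "Category:" prefix
        "cmlimit": limit,
        "format": "json",
    }
    return api_base + "?" + urlencode(params)


def query_page_url(api_base, page="Wantedfiles", limit=50):
    """URL listing the results of a query page such as Special:WantedFiles."""
    params = {
        "action": "query",
        "list": "querypage",
        "qppage": page,
        "qplimit": limit,
        "format": "json",
    }
    return api_base + "?" + urlencode(params)
```

Note the difference in what comes back: the category request returns the pages that embed missing files, while the query page request returns the wanted files themselves.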

Bawolff added a comment. Via Conduit, Dec 31 2011, 11:22 PM

(In reply to comment #18)

(In reply to comment #17)
> I think this needs to be better documented. Is it already somewhere where I
> overlooked it? Please add some documentation on mediawiki.org - I am not sure
> where, but

AFAIK tracking categories are not documented on mww (or anywhere else,
actually), they're left to local categorization. Which is not very good, but
that's bug 1 I guess.

I've added [[mw:Help:Tracking_categories]] which should hopefully help somewhat
