Red interwiki links -- check for page existence across wikis
Open, Public

Description

Author: xmlizer

Description:
It is important to have information about whether the target page exists, as
we do for links within the current Wikimedia instance.

As far as I know, they are on the same database, so it is *not* technically
infeasible.

It is especially necessary for Wiktionary.


Version: unspecified
Severity: enhancement
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=37902

bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz11.
bzimport created this task. Via Legacy, Aug 10 2004, 5:19 PM
hashar added a comment. Via Conduit, Aug 14 2004, 5:49 PM

We currently have no way to know whether an article exists on another wiki. The
easiest choice is not to show any link.

Moving as an enhancement request.
Doesn't block #17

brion added a comment. Via Conduit, Aug 27 2004, 1:15 AM
*** Bug 222 has been marked as a duplicate of this bug. ***
Sj added a comment. Via Conduit, Aug 27 2004, 1:22 AM

Changing summary from "show interwiki and link to wikibooks and wiktionary in
different color if they do not exist", to highlight general nature of the
problem. I think the most common use of such an interwiki check would be to
help correct broken links
to other language-versions of a project, and broken links to/from meta.

For the specific case of the wiktionary links shown on a "this article does not
exist" page, we could keep a list of {title, project} pairs for all extant
wikiprojects in a given language, and only show a "you might want to check
related articles on other projects:"
message when titles *do* exist on other projects.
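
A minimal sketch of the {title, project} list idea, assuming a small in-memory set of pairs. The names `EXTANT_PAGES`, `related_projects` and `missing_page_hint` are hypothetical and not part of MediaWiki; a real index would need a proper lookup structure rather than a linear scan.

```python
# Hypothetical in-memory index of (title, project) pairs for one language.
EXTANT_PAGES = {
    ("Apple", "wiktionary"),
    ("Apple", "wikibooks"),
    ("Banana", "wiktionary"),
}


def related_projects(title, index=EXTANT_PAGES):
    """Return the sorted list of projects that have a page with this title."""
    return sorted(project for (t, project) in index if t == title)


def missing_page_hint(title, index=EXTANT_PAGES):
    """Build the hint for a "this article does not exist" page, or None."""
    projects = related_projects(title, index)
    if not projects:
        return None  # nothing exists elsewhere, so show no message at all
    return ("You might want to check related articles on other projects: "
            + ", ".join(projects))
```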

bzimport added a comment. Via Conduit, Nov 10 2005, 7:46 AM

brian wrote:

See bug 3917.

bzimport added a comment. Via Conduit, Nov 10 2005, 7:47 AM

brian wrote:

Also see bug 2463 - if the same article exists on
another wiki, but is in a different language, perhaps
we should automatically translate and display it.

bzimport added a comment. Via Conduit, Nov 10 2005, 12:11 PM

robchur wrote:

How do you propose to automatically translate the stuff? And how do you know
what article is the same one? Article titles aren't English in non-English
wikis, after all - so how do we set about determining what wikis have our
article? And if we could, what would happen if we had two or more wikis
with the same one? How could the software tell which to translate?

bzimport added a comment. Via Conduit, Nov 10 2005, 8:39 PM

wikimedia-bugzilla wrote:

I think the proposal is just to show if there happens to be an article with the
same name in another wiki.

bzimport added a comment. Via Conduit, Nov 11 2005, 1:52 AM

webmaster wrote:

The proposal, as I understand it, pertains to the following code:

[[w:Example_Article]]

Which would be red if Example_Article does NOT exist on Wikipedia
OR
would be blue if Example_Article DOES exist on Wikipedia.

Simply for interwiki linking, no translation at all.

This would cause HUGE cross-site SQL querying and suck up untold amounts of
bandwidth from the sender and receiver. Although this would be FANTASTIC for my
dual-database setup, I think this may be a pipe-dream.

I could possibly see it paired with some sort of caching system which queries
once a week and stores the link-state (red or blue) locally until next updated.
A lot of work, but could encourage a flurry of edits across several sites.

What do you guys think?
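
A minimal sketch of the weekly caching idea, assuming the actual cross-site lookup is supplied as a callable. `LinkStateCache` and `check_remote` are hypothetical names, not MediaWiki code; the clock is injectable only so the expiry logic can be exercised without waiting a week.

```python
import time

WEEK_SECONDS = 7 * 24 * 60 * 60


class LinkStateCache:
    """Cache remote page existence, refreshing entries older than max_age."""

    def __init__(self, check_remote, max_age=WEEK_SECONDS, clock=time.time):
        self._check_remote = check_remote  # callable (wiki, title) -> bool
        self._max_age = max_age
        self._clock = clock
        self._entries = {}  # (wiki, title) -> (exists, checked_at)

    def exists(self, wiki, title):
        key = (wiki, title)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[1] < self._max_age:
            return cached[0]  # still fresh: no cross-site query
        state = self._check_remote(wiki, title)
        self._entries[key] = (state, now)
        return state
```

With this shape, a link is queried at most once per `max_age` window no matter how often the page is rendered, which bounds the cross-site traffic by the number of distinct links rather than by page views.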

bzimport added a comment. Via Conduit, Nov 11 2005, 2:51 AM

wikimedia-bugzilla wrote:

Once a week is a lot better than nothing. But as you said, a lot of work.

bzimport added a comment. Via Conduit, Nov 11 2005, 4:51 AM

webmaster wrote:

Upon further thought, even a caching system with weekly/monthly queries would be
a heavy load and would lead to security questions like the ability for one db's
rights to query another (if not all on the same database). An alternate thought
I have is a negative-option link table in the database, which assumes all links
as non-existent red-linked pages until someone clicks one, at which time it
queries the other database and if necessary updates the local cache of 'existing
pages'.
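
A minimal sketch of the negative-option idea, assuming the remote lookup is passed in as a callable. `NegativeOptionLinks` and its method names are hypothetical; a real version would persist the set in the database rather than in memory.

```python
class NegativeOptionLinks:
    """Treat every interwiki link as red until a click proves otherwise."""

    def __init__(self, check_remote):
        self._check_remote = check_remote  # callable (wiki, title) -> bool
        self._existing = set()  # local cache of 'existing pages'

    def colour(self, wiki, title):
        # Rendering never queries the other database.
        return "blue" if (wiki, title) in self._existing else "red"

    def on_click(self, wiki, title):
        # Only a click triggers the remote check and the cache update.
        if self._check_remote(wiki, title):
            self._existing.add((wiki, title))
        else:
            self._existing.discard((wiki, title))
        return self.colour(wiki, title)
```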

Just tossing out ideas here... Again, my ultimate thought here is 'pipe-dream'.
One other scenario could see some sort of function-call passing a boolean result
back in the url of the resulting page on the 2nd database. (Which is uglier than
Steve Buscemi...)

Anyone have any other thoughts, or should we kill this?

bzimport added a comment. Via Conduit, Jan 7 2006, 1:28 AM

ssd.wiki wrote:

Another option would be to create an inter-wiki protocol (perhaps http based to
make it simple) that allows one wiki to query another to ask if the page exists,
and then cache it. This poll could be done (as you suggested) only when a user
follows the link. Doing this at the http level would remove the need to break
security and query the other database, at the expense of a small penalty (double
page load -- one for the interwiki query, one for the user's web browser).

bzimport added a comment. Via Conduit, Sep 18 2007, 10:34 PM

webmaster wrote:

How about a ?action=pageexists function which outputs a raw 'true' or 'false' which could be retrieved via an HTTP GET.
Similar in fashion to the way that Special:Statistics does it:

http://en.wikipedia.org/w/index.php?title=Special:Statistics&action=raw

Should be light on both websites, could be cached for a period and invalidated thereafter.
Even lighter if instead of true/false it were 1/0. ;)
Every byte counts!

Of course, this would be 'off' by default on both sides.
The 'client' wiki would have to turn on $wgDoCrossWikiChecks=TRUE (to do the checking)
The 'server' wiki would have to enable $wgAllowCrossWikiChecks=TRUE (to provide the raw ?action=pageexists output)

If both aren't enabled, it won't work. (Handled gracefully, of course...)
This allows the greatest flexibility and control.
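
A minimal sketch of the two-sided opt-in, assuming the HTTP GET is abstracted as a `fetch` callable. The function names are hypothetical, and the two boolean arguments stand in for the proposed `$wgAllowCrossWikiChecks` and `$wgDoCrossWikiChecks` settings, which are suggestions from this comment and not existing MediaWiki configuration.

```python
def serve_pageexists(allow_cross_wiki_checks, local_titles, title):
    """'Server' wiki: return the raw body for ?action=pageexists, or None."""
    if not allow_cross_wiki_checks:
        return None  # feature off: behave as if the action did not exist
    return "1" if title in local_titles else "0"


def check_remote_page(do_cross_wiki_checks, fetch, title):
    """'Client' wiki: return True/False, or None when no answer is available."""
    if not do_cross_wiki_checks:
        return None  # checking disabled locally, so never contact the server
    try:
        body = fetch(title)  # fetch stands in for the HTTP GET
    except OSError:
        return None  # network failure: degrade gracefully
    if body is None:
        return None
    body = body.strip()
    if body in ("1", "0"):
        return body == "1"
    return None  # unexpected output: treat as unknown
```

Returning `None` for "unknown" keeps the graceful case distinct from a definite "does not exist", so the client can fall back to today's behaviour (a plain interwiki link) instead of showing a misleading red link.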

NB: I am changing the summary to reflect the fact that this isn't just for across Wikimedia projects, but rather across any two MediaWiki installations that support this function.

bzimport added a comment. Via Conduit, Sep 18 2007, 11:04 PM

robchur wrote:

Previous discussions have favoured some sort of API-based check, although in cases where the foreign database is directly readable (such as in the Wikimedia, Wikia, etc. cases), fetching the information straight out of that is preferable.

Sj added a comment. Via Conduit, Dec 6 2007, 7:07 AM

It would be quite useful to have an API on the calling wiki's side that says "please update this link with data about the target" if that is possible -- that could even allow for checking on existence or status of a page on an arbitrary site (say, linking to a Bugzilla bug, and getting a different display of the link text based on the bug's status... based on proper use of a similar API on the target site).

bzimport added a comment. Via Conduit, Dec 23 2007, 2:01 PM

robert wrote:

There appears to be a two-way solution to this problem, since some interwiki links will be local and others remote. To adequately accommodate both types, a mixture of HTTP access and SQL access would have to be involved. To decide which one will be used, the easiest solution would be to add two extra columns to the interwiki table, one for the database the wiki is on, and one for that wiki's database table prefix (if applicable) - they would be optional.

When an interwiki link to a wiki with a database listed is made, an SQL query is made to the target's database that will decide whether it is red or blue; this could be cached (see explanation below).
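
A minimal sketch of that decision, assuming the two proposed optional columns are exposed as `db_name` and `table_prefix`. These field names, `InterwikiRow` and `plan_existence_check` are hypothetical; only the interwiki prefix, the URL pattern and the `page` table correspond to things MediaWiki already has.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InterwikiRow:
    prefix: str                          # interwiki prefix, e.g. "wikt"
    url: str                             # existing URL pattern column
    db_name: Optional[str] = None        # proposed column: target database
    table_prefix: Optional[str] = None   # proposed column: table prefix


def plan_existence_check(row):
    """Pick SQL for wikis with a listed database, HTTP for the rest."""
    if row.db_name:
        # Qualify the target's page table with its database and prefix.
        page_table = f"{row.db_name}.{row.table_prefix or ''}page"
        return ("sql", page_table)
    return ("http", row.url)
```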

If it is a remote wiki then a call to the API could be made (another database column would be required for the path to the API), e.g. http://en.wikipedia.org/w/api.php?action=query&titles=pagename - if it returns <page missing="" /> then it is missing, otherwise it is not. Currently MediaWiki returns 200 status codes for non-existent pages, so just going to the page would not be reliable. Caching would also be essential with this method.
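
A minimal sketch of the remote check, assuming the response is requested with `format=json` in the default (formatversion 1) layout, where a missing page carries a `missing` key much like the XML attribute above. The helper names are hypothetical, and the HTTP request itself is left out so the parsing can be shown on its own.

```python
import json
from urllib.parse import urlencode


def existence_query_url(api_path, title):
    """Build an action=query URL asking about a single title, as JSON."""
    params = {"action": "query", "titles": title, "format": "json"}
    return api_path + "?" + urlencode(params)


def page_exists(response_text):
    """Return True unless the API marked the page as missing or invalid."""
    pages = json.loads(response_text).get("query", {}).get("pages", {})
    if not pages:
        return False  # nothing returned for the title: treat as absent
    for page in pages.values():
        if "missing" in page or "invalid" in page:
            return False
    return True
```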

Caching would involve an extra column in the pagelinks table indicating whether a page is a red link or not - this could be periodically updated by a maintenance script or when the page is purged (but not when edited as this could generate too much traffic).
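
A minimal sketch of such a cache column, using an in-memory SQLite table as a stand-in. The table name `pagelinks_demo`, the column `pl_is_red` and the function names are hypothetical; only `pl_from` and `pl_title` mirror real pagelinks columns.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory stand-in for pagelinks with a cache column."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE pagelinks_demo ("
        " pl_from INTEGER, pl_title TEXT,"
        " pl_is_red INTEGER NOT NULL DEFAULT 1)"  # hypothetical column
    )
    return db


def refresh_link_states(db, page_id, check_exists):
    """Recompute the cached flag for one page, e.g. on purge."""
    rows = db.execute(
        "SELECT pl_title FROM pagelinks_demo WHERE pl_from = ?", (page_id,)
    ).fetchall()
    for (title,) in rows:
        db.execute(
            "UPDATE pagelinks_demo SET pl_is_red = ?"
            " WHERE pl_from = ? AND pl_title = ?",
            (0 if check_exists(title) else 1, page_id, title),
        )
    db.commit()


def red_links(db, page_id):
    """Read the cached state without contacting the target wiki."""
    rows = db.execute(
        "SELECT pl_title FROM pagelinks_demo"
        " WHERE pl_from = ? AND pl_is_red = 1 ORDER BY pl_title",
        (page_id,),
    )
    return [title for (title,) in rows]
```

Rendering only calls `red_links`, so page views never trigger remote traffic; the cost is paid in `refresh_link_states`, which a maintenance script or a purge would run.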

Comments etc. are appreciated, and I will consider working on this in the new year if a flaw in my solution is not found.

bzimport added a comment. Via Conduit, Sep 26 2008, 7:01 AM

mattj wrote:

I'm working on this bug at the moment, using a (hopefully) extensible system for both remote (API) and local (DB) sites. Not sure on an ETA, although I'll be merging in changes once sections get done.

bzimport added a comment. Via Conduit, Jul 22 2011, 1:47 PM

sumanah wrote:

Matt, is your code available for us to look at? Or perhaps you've put this project aside? Per bug 20646 this may depend on the interwiki table.

bzimport added a comment. Via Conduit, Aug 28 2011, 10:45 PM

mattj wrote:

I have an old implementation of most of this at http://svn.wikimedia.org/svnroot/mediawiki/branches/remotesite/ - I'm happy to bring it up to HEAD and fill in the missing bits if there's still interest, and people think this is the right approach.
(cross-posted to bug 20646 as this would address both bugs)

Kozuch added a comment. Via Conduit, Dec 30 2011, 3:50 PM

Because of votes, raising importance/priority according to the following scheme:
15+ votes - highest
5-15 votes - high
Community must have a voice within development.

Regards, Kozuch
http://en.wikipedia.org/wiki/User:Kozuch

jayvdb added a comment. Via Conduit, Sep 20 2013, 1:41 AM

In case anyone else runs into this, ...

While {{#ifexist:file:...}} doesn't work for files hosted on Wikimedia Commons...

It is possible to use {{#ifexist:media:...}} on WMF projects, and it does accurately determine whether the media exists on Wikimedia Commons. There is at least one bug: bug 32031 about combining ifexist media: with file redirects.

I haven't tested this with InstantCommons.

mxn added a comment. Via Conduit, Sep 20 2013, 9:47 AM

Yes, it does work on third-party wikis. On the OpenStreetMap Wiki, {{#ifexist: Media:Wikivoyage-logo.svg | yes | no }} returns “yes”.

Solstag added a comment. Via Conduit, Apr 10 2014, 5:09 PM

So, first this is set to "highest" from some reasonable scheme, then some bot reduces it to "low" without any explanation, and now it's again arbitrarily set to "lowest". What about taking input seriously?

This bug represents an important barrier for collaboration and coordination between wikimedia projects. Thank you.

Qgil added a comment. Via Conduit, Apr 10 2014, 6:03 PM

Hi Al-Scandar, sorry for not having clarified the action. Note to self: comment always when changing the prioritization of a report.

Bug status, priority, and target milestone fields summarize and reflect reality and do not cause it. This report has been open since 2004, and currently nobody seems to be working or planning to work on it. The "Lowest" just reflects that.

Bug 20646 - Store more target site metadata in interwiki table (which is blocking this report) seems to be in a similar situation, inactive.

On the other hand, it looks like the VisualEditor team is working on the related Bug 37902 - Implement rendering of redlinks and stubs (in a post-processor?)

If the Platform team or someone else wants to include this request in their plan, then they can set priority accordingly.

Withoutaname added a comment. Via Conduit, May 10 2014, 1:10 AM

I'm going to have to suggest another configuration setting (even though MediaWiki is already bloated enough with those). We need a way for wikis to opt out if they neither need nor want this change.

I'm not sure if this has been suggested, but should we just keep this feature confined to the same wiki farm? Or ask the target wiki if the source wiki wants to check the existence of the target wiki's article? I'd imagine possible abuse like {{#ifexist:google:foo}} when we would not feasibly check if a google page exists or not.

jayvdb added a comment. Via Conduit, May 10 2014, 3:42 AM

I suspect this feature request will be solved/solvable when Wikidata integrates all of the projects, esp. Wiktionary (bug number?)

Then instead of [[w:Blah]] magically being red based on weekly updates, wikis would use a template like {{ifexistson|enwiki}} to call entity:getSitelink( 'enwiki' ) in order to determine whether there is an enwiki page for the local page.

More interesting possibilities open up once we also have Lua access to any item (bug 47930).
Then the page [[wikt:ru:Foo]] can do magic relating to [[w:en:Blah]] using calls like {{ifexistson|enwiki|Q527633}}

Jdforrester-WMF added a comment. Via Conduit, May 10 2014, 5:32 AM

(In reply to TeleComNasSprVen from comment #24)

> I'm going to have to suggest another configuration setting (even though
> MediaWiki is already bloated enough with those already). We need a way for
> wikis to opt out if they do not need nor want this change.
>
> I'm not sure if this has been suggested, but should we just keep this
> feature confined to the same wiki farm? Or ask the target wiki if the source
> wiki wants to check the existence of the target wiki's article? I'd imagine
> possible abuse like {{#ifexist:google:foo}} when we would not feasibly check
> if a google page exists or not.

I think just doing it for the current wiki farm is a sane approach; later we might do a wider system, but it would need us to significantly change the information held in the interwiki map.

Jdforrester-WMF added a comment. Via Conduit, May 10 2014, 5:40 AM

(In reply to John Mark Vandenberg from comment #25)

> I suspect this feature request will be solved/solvable when Wikidata
> integrates all of the projects, esp. Wiktionary (bug number?)

No.

This is about skins and appearance, not structural data relating items. (Also, Wikidata isn't remotely the right way to go about this.)

The system we will build as part of the switch from the PHP parser to Parsoid for generating the read HTML will be able to achieve this as an extension of the existing system built for VisualEditor in fixing bug 37901, a precursor to 37902.

VisualEditor requests the existence status of each of the links on the page and sets them to be red or otherwise based on this status. The same styling can be calculated server-side and returned as an API call (without client-side JavaScript), which means that this can work for all users. Extending the status checking to other MediaWiki instances in the same farm (or even further afield) is a relatively simple extension of this principle.
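
A minimal sketch of that principle, assuming a batch lookup callable that returns an existence flag per title. `batches`, `link_classes` and `lookup_batch` are hypothetical names and this is not the VisualEditor or Parsoid implementation; the batch size of 50 is an assumption based on the usual per-request title limit of the query API, and "new" is the CSS class MediaWiki puts on red links.

```python
def batches(titles, size=50):
    """Split link targets into groups small enough for one API request."""
    titles = list(titles)
    return [titles[i:i + size] for i in range(0, len(titles), size)]


def link_classes(titles, lookup_batch, size=50):
    """Map each title to a CSS class based on its existence status."""
    classes = {}
    for batch in batches(titles, size):
        status = lookup_batch(batch)  # callable: list of titles -> {title: bool}
        for title in batch:
            # Titles absent from the reply are treated as nonexistent (red).
            classes[title] = "" if status.get(title, False) else "new"
    return classes
```

Batching matters for the load question: a page with 60 links costs two lookups here rather than 60.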

Withoutaname added a comment. Via Conduit, May 18 2014, 10:06 AM

(In reply to James Forrester from comment #27)

> (In reply to John Mark Vandenberg from comment #25)
>
> VisualEditor requests the existence status of each of the links on the page
> and sets them to be red or otherwise based on this status; the same styling
> can be calculated server-side and returned as an API call (without
> client-side Javascript), which means that this can work for all users, and
> extending the status checking to other MediaWiki instances in the same farm
> (or even further afield) is a relatively simple extension of this principle.

Can the checks feasibly be done without placing too much load and performance worry on the servers? As someone noted above, even if some of the work were offloaded to a cache, such querying would already put a strain on the servers.

Jdforrester-WMF added a comment. Via Conduit, May 22 2014, 8:07 PM

(In reply to TeleComNasSprVen from comment #28)

> (In reply to James Forrester from comment #27)
> > (In reply to John Mark Vandenberg from comment #25)
> >
> > VisualEditor requests the existence status of each of the links on the page
> > and sets them to be red or otherwise based on this status; the same styling
> > can be calculated server-side and returned as an API call (without
> > client-side Javascript), which means that this can work for all users, and
> > extending the status checking to other MediaWiki instances in the same farm
> > (or even further afield) is a relatively simple extension of this principle.
>
> Can the checks be feasibly done without placing too much load and
> performance worry on the servers? As someone noted above, even if some of
> the work was offloaded to cache such querying would already put a strain on
> the servers.

Sure; caching the state of the pages is already inside the API cluster's bailiwick, and this would just be a (large) client load on that. It's almost certainly feasible, albeit we may need to bump up the API cluster a little.

Nemo_bis awarded a token. Via Web, Dec 12 2014, 8:44 AM
Krenair added a subscriber: Krenair. Via Web, Dec 25 2014, 3:51 PM
Liuxinyu970226 added a subscriber: Liuxinyu970226. Via Web, Jan 9 2015, 2:40 AM
Liuxinyu970226 removed a subscriber: Liuxinyu970226. Via Web, Mar 1 2015, 3:27 PM
