Implement a convenient way to link to ISBNs without magic links
Open, Needs Triage, Public

Description

Since we're removing magic link functionality, we should still have a convenient way to link to ISBNs, so wikis don't have to copy the same basic template over and over again.

A virtual namespace or parser function has been proposed:

  • [[ISBN:0-7475-3269-9]]

or

  • {{#isbn:0-7475-3269-9}}

I'm leaning toward the virtual namespace, since that is the syntax editors would expect for a link.

Event Timeline

Change 314478 had a related patch set uploaded (by Legoktm):
Implement ISBN virtual namespace for linking to Special:BookSources

https://gerrit.wikimedia.org/r/314478

I would probably go with a parser function such as {{#isbn:}}/{{#rfc:}} etc in a small extension and then see how many people actually care about it.

Using a namespace to link to a special page seems to deviate a bit from the expected workings and function when people go to do linking and look stuff up.

Yeah, parser functions seem simpler both in terms of code complexity and cognitive load on editors.

Personally, I think the virtual namespace will be confusing. I would favor either a parser function or nothing. Some wikis already have an ISBN template, which seems to be a decent solution. FWIW, I've never used Special:BookSources and always either look for a link from the book title or search Google for the title, which is a much faster way to find it. Clicking through the various links at Special:BookSources seems tedious and archaic.

Will virtual namespace support special:Linksearch?

Neither would - Special:LinkSearch is for external links, and the ISBNs link to Special:Booksources, which is on the same wiki.

Sorry, I made a mistake. I meant special:whatlinkshere!

I'd be fine with either one, although if a parser function is chosen, I don't think it should include a #. That seems to me to be invoking the idea of the C preprocessor (like #if and friends), which is not really relevant here.

I like the idea of {{isbn|0-7475-3269-9}} possibly?

  • Not a fan of the # sign
  • Pipe seems more logical than : but I defer to the experts here
  • I use {{OCLC|123456}} all the time. This seems to be similar, assuming?

Is there a consensus as to the placement of the dashes? That drives me bonkers.

Also, will this linking work retroactively to incorporate 10 digit ISBNs?

Apologies if these are tangential to the discussion.

No # means you'd have to deal with all the wikis which have a template by that name.

I don't think so. The MW parser is smart enough to notice that {{ucfirst|xxx}} and {{ucfirst}} are calls to Template:Ucfirst, while {{ucfirst:xxx}} is a call to the parser function (see https://test.wikipedia.org/wiki/Ucfirst_test). So the same would surely hold true for ISBNs.
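
To illustrate the distinction described above, here is a minimal Python sketch of the dispatch rule. It is a simplified model, not MediaWiki's actual parser: the `PARSER_FUNCTIONS` registry and the `classify_call` helper are hypothetical names, and the real parser handles many more cases (namespaces, the # prefix, magic-word variables).

```python
# Hypothetical registry of parser function names (assumption for illustration).
PARSER_FUNCTIONS = {"ucfirst", "isbn"}


def classify_call(wikitext: str) -> str:
    """Classify a single {{...}} call as 'template' or 'parser function'.

    Simplified rule: a registered name followed by a colon is a parser
    function call; a bare name, or a name followed by a pipe, is a template.
    """
    text = wikitext.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("expected a {{...}} call")
    inner = text[2:-2]
    # Everything before the first pipe is the call target.
    head = inner.split("|", 1)[0]
    name, colon, _ = head.partition(":")
    if colon and name.strip().lower() in PARSER_FUNCTIONS:
        return "parser function"
    return "template"


for call in ("{{ucfirst|xxx}}", "{{ucfirst}}", "{{ucfirst:xxx}}"):
    print(call, "->", classify_call(call))
```

Under this model, {{isbn|0-7475-3269-9}} would keep resolving to a local Template:Isbn, while {{isbn:0-7475-3269-9}} would reach the parser function.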

One of the benefits of using a template is that it allows for easy validation of ISBNs, the display of error messages for invalid ISBNs, and categorization of pages with invalid ISBNs. Are those features available with parser functions?

If not, I suppose a given WP could create a template that called the parser function and then adopt a guideline that the template is preferred, but that seems like a convoluted way to achieve basic functionality.

Should this be linked somehow to T47942 and T28207 to show some history? I don't know how to do that.

Sorry, I made a mistake. I meant special:whatlinkshere!

The answer is still neither then, as we currently don't track links to special pages.

One of the benefits of using a template is that it allows for easy validation of ISBNs, the display of error messages for invalid ISBNs, and categorization of pages with invalid ISBNs. Are those features available with parser functions?

Yes :) Parser functions are implemented in PHP server-side code, so in general they are more flexible and more featured (if wanted) than templates can be.

No # means you'd have to deal with all the wikis which have a template by that name.

I don't think so. The MW parser is smart enough to notice that {{ucfirst|xxx}} and {{ucfirst}} are calls to Template:Ucfirst, while {{ucfirst:xxx}} is a call to the parser function (see https://test.wikipedia.org/wiki/Ucfirst_test). So the same would surely hold true for ISBNs.

Both {{#parserfunction1:}} and {{parserfunction2:}} formats are used depending on the parser function and from what I've seen there's no rhyme or reason (of course, you can review at [[Help:Magic words]]).

more featured (if wanted) than templates can be

We have a number of identifier templates (and especially [[Module:Citation/CS1/Identifiers]]) with identifier validation, so I don't know that I agree with "more featured". Is there something we would want for identifiers that a module couldn't provide?

My "more featured" comment was about parser functions in general; in this case I can't think of anything. The main advantage of a parser function is that it is automatically available without requiring editors to copy modules/templates around (hence this task!).

Prior to global templates (which is vaporware IMO :D), yes, that is an advantage.

I would probably go with a parser function such as {{#isbn:}}/{{#rfc:}} etc in a small extension and then see how many people actually care about it.

The approach at T47942#2241781 with e.g. {{#source:pmid|number}} makes the most sense to me. Related is also T12867: Create Special:Sources and {{#sources:}}, also linked therein.

Change 319263 had a related patch set uploaded (by Legoktm):
Add ISBN parser function

https://gerrit.wikimedia.org/r/319263

I would probably go with a parser function such as {{#isbn:}}/{{#rfc:}} etc in a small extension and then see how many people actually care about it.

The approach at T47942#2241781 with e.g. {{#source:pmid|number}} makes the most sense to me. Related is also T12867: Create Special:Sources and {{#sources:}}, also linked therein.

I think creating a generic #source parser function is out of scope for this task.

I don't think a {{#isbn:}} parser function is appropriate for MediaWiki core.

First, there's no # in the current patch, it's just plain {{isbn:...}}. But, I think as long as Special:Booksources is in core, an ISBN parser function is acceptable. My proposal was that once magic linking was removed from core, we would move Booksources and the ISBN parser function out of core into a separate extension. But since there was no consensus (yet) on removing magic links from core, we can't be certain about that part yet.

I don't think a {{#isbn:}} parser function is appropriate for MediaWiki core.

First, there's no # in the current patch, it's just plain {{isbn:...}}.

Okay. I thought we'd agreed to stop adding new parser functions that aren't prefixed with #. There's a task about standardizing the rest of them... somewhere. @Danny_B might know where.

But, I think as long as Special:Booksources is in core, an ISBN parser function is acceptable.

We currently use ISBN 1234 syntax and we want to update that syntax to be more explicit and less implicit. If we're going to update all of this plain/magic link syntax, we'll likely want to switch to a template. This is consistent with how we'd treat any other piece of reference markup in an article (cf. https://en.wikipedia.org/wiki/Category:Citation_templates and its subcategories).

Going a bit further, it's my understanding that parser functions are generally prohibited in article wikitext. It's rare to see an {{#if:}} or a {{#switch:}} in an article and the limited instances of parser functions in article wikitext are almost entirely accidental. Parser functions are relegated to the template namespace.

If we end up switching to a template such as {{isbn|1234}}, then for now we're going to have local wiki pages. As Izno notes, global templates don't exist yet. It's only in these local wiki pages that we'll then be using an isbn parser function, right? Why have a parser function? Would/will we add a parser function to MediaWiki core for any other reference identifier?

In the local templates, instead of a parser function, we would (and can already) use [[Special:BookSources/{{{1}}}|ISBN {{{1}}}]] syntax or similar. This takes advantage of all the features templates offer: cache invalidation, usage tracking, a centralized place to update markup across many pages, complex/advanced markup being segregated from regular/common wikitext, etc.
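
As a rough illustration of that migration path, the following Python sketch rewrites bare "ISBN 1234..." magic links into template calls and shows what such a template would expand to. The regular expression is my own approximation of the magic-link pattern, not the one MediaWiki uses, and `to_template_call` and `expand_isbn_template` are hypothetical helper names.

```python
import re

# Approximate pattern for a bare ISBN magic link (assumption, not MediaWiki's
# own regex): "ISBN ", an optional 978/979 prefix, nine digits optionally
# separated by hyphens or spaces, and a final digit or X.
MAGIC_ISBN = re.compile(r"\bISBN ((?:97[89][- ]?)?(?:[0-9][- ]?){9}[0-9Xx])\b")


def to_template_call(wikitext: str) -> str:
    """Replace bare 'ISBN <number>' text with an explicit {{ISBN|<number>}} call."""
    return MAGIC_ISBN.sub(r"{{ISBN|\1}}", wikitext)


def expand_isbn_template(isbn: str) -> str:
    """Mimic a local template whose body is [[Special:BookSources/{{{1}}}|ISBN {{{1}}}]]."""
    return f"[[Special:BookSources/{isbn}|ISBN {isbn}]]"


print(to_template_call("See ISBN 0-7475-3269-9 for details."))
print(expand_isbn_template("0-7475-3269-9"))
```

A real bot run would also need to skip text that is already inside links, templates, or nowiki sections, which this sketch does not attempt.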

Okay. I thought we'd agreed to stop adding new parser functions that aren't prefixed with #. There's a task about standardizing the rest of them... somewhere.

I finally found T18715: All parser functions and magic words should allow # preceding.

If we're going to update all of this plain/magic link syntax, we'll likely want to switch to a template. This is consistent with how we'd treat any other piece of reference markup in an article (cf. https://en.wikipedia.org/wiki/Category:Citation_templates and its subcategories).

Yes, I imagine big wikis will want to switch to their own custom template. Small wikis will want to continue using built-in MediaWiki functionality, without having to re-implement ISBN validation, the tracking category, etc. themselves.

Going a bit further, it's my understanding that parser functions are generally prohibited in article wikitext.

That's a very enwiki-centric view of the world, isn't it? As I noted above, enwiki will probably want to continue using its existing Template:ISBN.

Going a bit further, it's my understanding that parser functions are generally prohibited in article wikitext.

I don't know about enwiki, but elsewhere DISPLAYTITLE and DEFAULTSORT are widely used.

Yes, I imagine big wikis will want to switch to their own custom template. Small wikis will want to continue using built-in MediaWiki functionality, without having to re-implement ISBN validation, the tracking category, etc. themselves.

In my experience, small wikis want to do what the big wikis do. And they implement this with copy and paste. If big wikis are using a template, small wikis will too.

Why have a tracking category when we already have Special:WhatLinksHere and templatelinks? This seems like a lot of unnecessary work that further ingrains a mistake into MediaWiki.

I don't see a reason to treat ISBNs differently than we would treat any other reference identifier. If you look at a page such as https://en.wikipedia.org/wiki/Template:Citation/identifier#Usage, you can see the practice of using templates is very common.

That's a very enwiki-centric view of the world, isn't it?

Not really. The idea is to keep article wikitext as clean as possible. We obviously fail at this in many ways, but I think it's a worthwhile goal. Do you disagree?

I don't know about enwiki, but elsewhere DISPLAYTITLE and DEFAULTSORT are widely used.

Sure. Perhaps https://en.wikipedia.org/wiki/Template:Italic_title is a better example. We refer to DISPLAYTITLE and DEFAULTSORT as variables, not parser functions in https://www.mediawiki.org/wiki/Help:Magic_words#Statistics, but yes, the two you mention are treated differently than {{#switch}} and friends.

Why have a tracking category when we already have Special:WhatLinksHere and templatelinks? This seems like a lot of unnecessary work that further ingrains a mistake into MediaWiki.

A tracking category for invalid ISBNs. enwiki already has this set up via Scribunto, so like I said, it probably wouldn't want to use this.

I don't see a reason to treat ISBNs differently than we would treat any other reference identifier.

The fact that we have Special:Booksources suggests that we have always treated ISBNs differently. That is a big legacy to cope with.

My long-term hope is that the book source/ISBN stuff can be moved out of core into an extension, which may eventually turn into something like T12867. That way, ISBNs are no longer treated differently. In the short term, moving the ISBN logic into an extension will at least mean that core developers can stop fussing about this one type of reference identifier; I was surprised that there was any opposition to this in the RFC chat on IRC.

Not really. The idea is to keep article wikitext as clean as possible. We obviously fail at this in many ways, but I think it's a worthwhile goal. Do you disagree?

It is indeed a worthwhile goal, but I don't think this magic word/parser function/what have you is going to make wikitext any less clean. Certainly by some measures, the auto-linking of ISBN 123456789X is as pure and clean as one can possibly get, but {{isbn:123456789X}} is the purest and clearest built-in markup-based solution I can possibly imagine.

In my experience, small wikis want to do what the big wikis do. And they implement this with copy and paste. If big wikis are using a template, small wikis will too.

I have to agree. Small wikis tend to emulate whatever English Wikipedia does as far as templates. If we're talking about migrating thousands of links, copying over a template seems like a pretty trivial part of the process, but do we have a good idea of which option English Wikipedia is actually going to prefer:

  1. Implementing an ISBN template
  2. Using an ISBN parser function
  3. Retaining the magic word functionality for ISBNs

Small wikis tend to emulate whatever English Wikipedia does as far as templates.

Mostly because they have never had a choice: if they wanted the functionality they had to copy and paste templates (and modules). Now that we know better, why don't we build this particular feature so that for most purposes you don't even need templates? Wouldn't that be nice?

If we can make sure the ISBN validation and formatting code from the enwiki module is reimplemented as part of this new MW feature, we might even be able to convince enwiki users to drop Template:ISBN, although they're generally very stubborn and will most likely want to keep using the template.

(Snip) we might even be able to convince enwiki users to drop Template:ISBN, although they're generally very stubborn and will most likely want to keep using the template.

The template would probably just become a "meta" template that just calls the parser function (and there is nothing wrong with that)

Small wikis tend to emulate whatever English Wikipedia does as far as templates.

Mostly because they have never had a choice: if they wanted the functionality they had to copy and paste templates (and modules). Now that we know better, why don't we build this particular feature so that for most purposes you don't even need templates? Wouldn't that be nice?

As I mentioned above, even with a parser function many wikis will want to use a wrapper template for other reasons. We should definitely have centralized/global templates, but in the meantime, I don't think making custom parser functions as a replacement for templates is sensible or scalable.

If we can make sure the ISBN validation and formatting code from the enwiki module is reimplemented as part of this new MW feature, we might even be able to convince enwiki users to drop Template:ISBN, although they're generally very stubborn and will most likely want to keep using the template.

For what it's worth, that template is pretty new. There were minor objections to the template existing because it "competes" with magic linking functionality, but I've tried to convince people on the English Wikipedia that templates are better than magic linking for a variety of reasons (e.g., we already have hundreds of similar citation templates) and that magic linking is likely going away.

Some wikis have templates such as https://en.wikipedia.org/wiki/Template:ISBNT, which omit the output of the text "ISBN" for space reasons. Do you think MediaWiki core's new parser function should also support that functionality? With a template, individual communities and template authors can use the flexibility of templates to make their own decisions. With a parser function, extending capability is needlessly more difficult.

I've said all I can really say here. I think a parser function in MediaWiki core is overkill to output a link to Special:BookSources. In most cases, I think most MediaWiki developers would agree. People seem to really want to treat "ISBN" as special. A new custom parser function like this doesn't fit in with MediaWiki's development philosophy and is simply not a great idea for the reasons mentioned in this task.

If we're talking about migrating thousands of links, copying over a template seems like a pretty trivial part of the process

It never is. For example, en:Template:ISBN and fr:Template:ISBN are two completely different templates (and the latter rather widely used by other wikis).

Usage for en:Template:ISBN: {{ISBN|0-7475-3269-9}}
Usage for fr:Template:ISBN: {{ISBN|1-2345-6789-X}}

Both templates are used for International Standard Book Numbers. There are differences between the two templates, of course, but calling them completely different is pretty weird to me. How do you mean?

The en: template can be used anywhere, the fr: one is for use in references specifically (also it can take multiple ISBN numbers as parameters). Then there is simple: which is a maintenance tag (used on several other wikis as well), ar: which translates the word "ISBN" plus does some RTL/LTR juggling, and cs: which adds an extra CSS class for some script.

German Wikipedia:

  • TL;DR Use template for ISBN, use interwiki format for both RFC and PMID by default.

Reason:

  • Use template for ISBN, since they evaluate to internal link as Special:Booksources.
    • And the ID is long, error-prone, and has a lot of syntactical restrictions and pitfalls.
  • Expect interwiki for RFC and PMID, since they evaluate to external link.
    • There are already interwiki entries for both RFC and PMID ([[PMID:1]] and [[RFC:1776]]).
    • Leave it to local projects and let them decide, whether they want to migrate to a template format. This will be done once when a bot is changing source codes.
    • They might want to expand with a linked [[PubMed]] [[pmid:{{{1}}}]] and check the parameter whether it is a number, perhaps with trailing 2795#section_5.4.7 etc., or leaving out the heading keyword in a list of multiple RFCs.

ISBN issues:

  • This is a complex business.
  • A template could be a simple [[Special:Booksources/{{{1}}}]].
  • German Wikipedia already does hyphenation of ISBN, see Module:URIutil (english doc)
  • Validity check of ISBN is a difficult matter.
    • The number of digits must be 10 or 13; the last character of a 10-digit ISBN may be X or x; and a 13-digit ISBN must start with 978 or with certain values in the 979 range.
    • Even if check digit is invalid the entire ISBN might be valid.
      • If a printer prints an invalid ISBN into a book, the national libraries do register the bad number and answer requests by this number, sometimes by the invalid number only (no auto-correction permitted).
      • See Wikidata Q7611261 templates for invalid numbers.
      • ISBN 0912647160 – LoC has an entry.
      • Even an 11-digit ISBN 3-87472-073-10 is resolved by German national library, and no correct version is assigned.
      • Issuing an error message in certain situations should be left to local project, with appropriate user guidance.
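
The format rules in the list above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the code from Module:URIutil or any MediaWiki patch: `classify_isbn` is a hypothetical helper, it accepts any 979 prefix rather than only the assigned ranges, and it reports a category instead of rejecting input, since a number with a bad check digit may still be registered by a library.

```python
import re


def classify_isbn(raw: str) -> str:
    """Return 'valid', 'bad-checksum', or 'malformed' for an ISBN string."""
    # Drop hyphens and whitespace; normalise a lowercase check character.
    s = re.sub(r"[\s-]", "", raw).upper()

    if re.fullmatch(r"\d{9}[\dX]", s):
        # ISBN-10: weights 10 down to 1, X counts as 10, sum divisible by 11.
        total = sum((10 - i) * (10 if ch == "X" else int(ch))
                    for i, ch in enumerate(s))
        return "valid" if total % 11 == 0 else "bad-checksum"

    if re.fullmatch(r"97[89]\d{10}", s):
        # ISBN-13: alternating weights 1 and 3, sum divisible by 10.
        # Assumption: every 979 prefix is accepted here.
        total = sum(int(ch) * (1 if i % 2 == 0 else 3)
                    for i, ch in enumerate(s))
        return "valid" if total % 10 == 0 else "bad-checksum"

    return "malformed"


for example in ("0-7475-3269-9", "0912647160", "3-87472-073-10"):
    print(example, "->", classify_isbn(example))
```

A local project could map "bad-checksum" and "malformed" to a warning, a tracking category, or nothing at all, in line with the last point above.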

In general I am happy with the goal of eliminating magic links, since they were an inexplicable exception in the syntax, for newbies and scripts alike. Nobody knew why just these three keywords were supported and no others. It was an accident of the first years of Wikipedia, when no one knew how the seed would grow, and it should be remedied in the long run.

(Unassigning myself because I'm not actively working on this right now, and it's unclear whether we even want this feature)

The simple way is to reinstate magic links.

This is a very limiting proposal compared to the present functionality. I don't like it, and I don't think it will work for anything but a few special cases.

Note that a link like [[isbn:3-87472-073-10]] can be used if configured. See https://en.wikipedia.org/wiki/Special:Interwiki for more examples.

Not convinced that replacing magic links with yet more magic links is a good way to go about this. Perhaps going around the affected wikis and adding the requisite templates so that people only need to learn to add {{templatename|somenumber}} works better.

German Wikipedia:

  • TL;DR Use template for ISBN, use interwiki format for both RFC and PMID by default.

Actually using templates within flow-text is strictly deprecated in the German WP. You know that. Why do you propose such a solution?

I really dislike the whole idea of removing functionality the editors expect to work. It is a very "technical" approach, "fix it" and leave the actual users with more markup. The editors, the actual users, want to write their stuff without learning a whole lot of markup.

If you're using Visual Editor, it would auto-complete the proper ISBN markup without anyone having to learn it specifically.

@Matthiasb I didn't actually know that -- is that documented on dewiki somewhere? I'd appreciate a pointer for future reference.

I didn't actually know that -- is that documented on dewiki somewhere? I'd appreciate a pointer for future reference.

  • Well, it is simply not the truth.
  • First, there are many templates to be used within main flow text, like template:Bruch or template:Höhe and many more. IIRC once there was something like harvnb or like that in flow text.
  • Second, ISBNs occur almost exclusively in citations and bibliographic lists. And those are full of templates, e.g. even cite book and cite web in German.

Change 314478 abandoned by Legoktm:
Implement ISBN virtual namespace for linking to Special:BookSources

https://gerrit.wikimedia.org/r/314478

Change 319263 abandoned by Legoktm:
Add ISBN parser function

https://gerrit.wikimedia.org/r/319263

I do not really see the value here. First, I would like to see Special:BookSources moved out of core (e.g., into an extension) not unlike how magic links are likely to be handled anyway (see T252054). And what is wrong with links like [[Special:BookSources/{{{ISBN}}}]]? If you really want something shorter why not just make Special:ISBN be an alias for Special:BookSources (I believe several Special pages have aliases as well as language localization names)? Then ISBN {{{ISBN}}} magic links can just be changed to [[Special:ISBN/{{{ISBN}}}]] style links (e.g., via templates), etc.

So in summary, I would like someone to create an Extension:ISBN that effectively adds the same functionality as Special:BookSources but under a different name. Then we can remove Special:BookSources from core, and MediaWiki sites that want to keep that functionality (like WMF deployments) can install Extension:ISBN with an alias for Special:BookSources. I think magic links should go the same way (although I also believe WMF deployments should work to have them disabled, as English Wikipedia already has).

As I proposed somewhere in the community wishes survey, every reference that Wikimedia projects use should be stored in Wikidata. That would let users in any Wikimedia project reuse reference metadata entered by users of another language version or project. It would reduce the work of maintaining references: the information could be entered into Wikidata once instead of being entered and maintained separately in each language version or sister project. Each language version would then use the reference metadata through something like cite q (in the English WP), or maybe something smarter in the future.

Newer books should be referenced by their ISBN, journal articles by DOI, web pages by URL, and other media by another unique identifier. There must be a fallback for media that lack the usual standard identifiers, i.e. books without an ISBN; for example, Mr. Gutenberg did not provide an ISBN for his Bible.

Having all references' metadata on Wikidata might enable the community to detect and repair rotten external links faster. There could be automatic checks against WorldCat's metadata.