[RfC] Future of magic links

Description

The RFC proposes three steps:

  1. Disable the magic link functionality by default for the MediaWiki 1.28 release, and mark it as deprecated. (approved in E287)
  2. Deprecate magic links on Wikimedia wikis (e.g. Wikipedia), providing alternatives for this functionality and tools to aid the migration. We agreed to start building these tools in E287.
  3. Disable magic links functionality a year or so after the MediaWiki 1.28 release (in time for the next MediaWiki LTS release)

Full details: https://www.mediawiki.org/wiki/Requests_for_comment/Future_of_magic_links


Marking this as a 1.28 release blocker as I've proposed we change the default functionality before this release goes out.

RobLa-WMF moved this task from Inbox to Request IRC meeting on the TechCom-RfC board.
RobLa-WMF added a subscriber: RobLa-WMF.

@tstarling volunteered to shepherd this one last week in E272. We think this might even be ripe for our IRC discussion next week (at E287)

We think this might even be ripe for our IRC discussion next week (at E287)

Thanks, that would be great. Just a heads up that I'm back in school now, and am unable to make the Wednesday meeting times due to class :( I'm fine with this being discussed in my absence or I'm also available Tuesday/Thurs at the same time.

Ooops, thanks for the reminder about that. I kept the usual time for Wednesday, but I think there are some asynchronous discussions we can try to have. Let's try to spin up the conversation on wikitech-l about this topic.

Legoktm added a subscriber: cscott. Oct 4 2016, 11:31 PM

@cscott will be representing me/the RfC during the meeting tomorrow. :)

cscott added a comment. Oct 5 2016, 3:54 PM

An alternative to the {{ISBN|xxxx}} parser function (or global template) would be to add isbn to the interwiki table. Then [[isbn:0-7475-3269-9]] could resolve to Special:BookSources on the local wiki. This wouldn't require any new code (or parser functions), just a table update. On the other hand, the interwiki pattern would have to be slightly different on each wiki in order to ensure it always resolves to the wiki itself. (I wonder if a relative URL would work in the interwiki table?)
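To make the interwiki idea concrete, here is a minimal Python sketch of how a prefix table with a `$1` placeholder could resolve `[[isbn:...]]`. The table contents, the `resolve_interwiki` name, and the RFC URL are assumptions for illustration, and the relative `isbn` URL is exactly the open question above, not a confirmed feature of the interwiki table.

```python
from typing import Optional
from urllib.parse import quote

# Hypothetical interwiki table: prefix -> URL pattern, with $1 standing in
# for the link target. The "isbn" row is the proposed addition; whether a
# relative URL is accepted there is an open question, so treat it as an
# assumption.
INTERWIKI = {
    "rfc": "https://tools.ietf.org/html/rfc$1",
    "isbn": "/wiki/Special:BookSources/$1",
}


def resolve_interwiki(link: str) -> Optional[str]:
    """Resolve 'prefix:target' against the table; None if the prefix is unknown."""
    prefix, sep, target = link.partition(":")
    pattern = INTERWIKI.get(prefix.lower())
    if not sep or pattern is None:
        return None
    # Percent-encode the target before substituting it into the pattern.
    return pattern.replace("$1", quote(target))


print(resolve_interwiki("isbn:0-7475-3269-9"))
```

With a relative pattern the same row would work on every wiki; with absolute URLs each wiki would need its own row, which is the drawback mentioned above.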

RobLa-WMF updated the task description. Oct 5 2016, 5:27 PM
Legoktm updated the task description. Oct 5 2016, 6:19 PM

Discussed at E287: ArchCom RFC Meeting W40: Future of magic links. (2016-10-05, #wikimedia-office). Important bits of the summary:

  • AGREED: 1. Disable the magic link functionality by default for the MediaWiki 1.28 release (TimStarling, 21:56:26)
  • AGREED: 2. Deprecate magic links on Wikimedia wikis (e.g. Wikipedia), providing alternatives for this functionality and tools to aid the migration (TimStarling, 21:56:58)
    • note: some nuance around this agreement was discussed in the full log below (lines 202-208)
  • no consensus on removing the functionality from MW or flagging its pending removal via deprecation (TimStarling, 21:58:35)

Log of the meeting:

1​21:01:02 <TimStarling> #startmeeting RFC meeting
2​21:01:02 <wm-labs-meetbot> Meeting started Wed Oct 5 21:01:02 2016 UTC and is due to finish in 60 minutes. The chair is TimStarling. Information about MeetBot at http://wiki.debian.org/MeetBot.
3​21:01:02 <wm-labs-meetbot> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
4​21:01:02 <wm-labs-meetbot> The meeting name has been set to 'rfc_meeting'
5​21:01:05 <cscott> good afternoon
6​21:01:18 <TimStarling> #topic RFC: Future of magic links | Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE) |​ Logs: https://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/
7​21:01:54 <robla> #link https://phabricator.wikimedia.org/T145604
8​21:02:08 <robla> #link https://www.mediawiki.org/wiki/Requests_for_comment/Future_of_magic_links
9​21:02:49 <robla> hi everyone! hi legokt^H^H^H^H^H^Hcscott :-)
10​21:03:32 <cscott> i will be serving as your legokt today
11​21:04:28 <robla> there are three steps we discussed that are summarized in the description of T145604
12​21:04:29 <stashbot> T145604: [RfC] Future of magic links - https://phabricator.wikimedia.org/T145604
13​21:05:04 <Scott_WUaS> Hi
14​21:05:12 <robla> step 1: Disable the magic link functionality by default for the MediaWiki 1.28 release, and mark it as deprecated.
15​21:05:30 <robla> 2. Deprecate magic links on Wikimedia wikis (e.g. Wikipedia), providing alternatives for this functionality and tools to aid the migration
16​21:05:30 <robla> 3. Disable magic links functionality a year or so after the MediaWiki 1.28 release (in time for the next MediaWiki LTS release)
17​21:05:50 <robla> is there anyone that objects to step 1?
18​21:06:34 <TimStarling> is legoktm online?
19​21:06:43 <cscott> maybe we should find out who's here? (that is, actively participating in this discussion.)
20​21:07:03 <cscott> TimStarling: no, legoktm is in class, so I'm being his surrogate for this meeting.
21​21:07:16 <robla> legoktm is at a class right now, so if there are objections or things he needs to clarify, we'll need to plan a followup
22​21:08:02 <Scott_WUaS> (Hi CScott - I'm reading along, typing occasionally)
23​21:08:05 <robla> it seems, though, that this is a "yeah, we should just do this" kind of thing for step 1
24​21:08:08 * subbu is around and is generally happy with the direction, modulo details that might need tweaking
25​21:08:31 <TimStarling> is there going to be an extension to replace it?
26​21:08:39 <bd808> step 1 is normal deprecation procedure I think
27​21:09:02 <brion> a workalike extension could be created if someone wants one but i don't know if anyone does :)
28​21:09:08 <cscott> i think there's still a semi-open question w/ what the replacement should be: 1) ordinary template, 2) parser function, 3) extension, 4) interwiki link, 5) something else?
29​21:09:13 <TimStarling> if there's no migration path for external users then I don't think it should be deprecated
30​21:09:13 <bd808> Kunal wrote something about that in email/phab today...
31​21:09:21 <TimStarling> just disabled by default
32​21:09:50 <brion> my inclination is ordinary template is the right thing for wikipedia
33​21:09:54 <cscott> 1) and 2) look like {{ISBN|xxx}} and require content edits, 3) would be a workalike using a parser hook, 4) would look like [[isbn:xxx]] and require content edits.
34​21:09:58 <robla> legoktm's position is that disabling just disables the hyperlinks, but it's still readable
35​21:10:27 <brion> it is a relatively clean 'failure mode' yes
36​21:10:38 <brion> no explosions/php fatals :)
37​21:10:40 <cscott> i think we already have a config option for disabling it in the parser? legoktm wrote that patch. so it's just a matter of turning that config option off on WMF wikis.
38​21:11:27 <cscott> we can discuss removing the feature from the codebase and/or moving it to a parser extension as a separate thing.
39​21:11:31 <subbu> I don't like option (3)
40​21:11:38 <bd808> TimStarling: "We would move the Booksources code and ISBN parser function to an
41​21:11:39 <bd808> a extension." -- https://lists.wikimedia.org/pipermail/wikitech-l/2016-October/086716.html
42​21:12:11 <subbu> i.e. option (3) for parsing "ISBN ....." as a magic link with a parser hook
43​21:14:15 <cscott> i don't like it for WMF wikis, but it would be a reasonable thing to allow 3rd parties to continue to support magic links on their external wikis if they felt strongly about it
44​21:14:21 <TimStarling> using square brackets seems elegant to me
45​21:14:27 <cscott> re TimStarling's "if there's no migration path for external users then I don't think it should be deprecated"
46​21:14:32 <TimStarling> considering it is an actual link
47​21:15:03 <subbu> I can go with either {{isbn|..}} or [[isbn:..]] personally
48​21:15:04 <bd808> pmid and rfc are proposed to become default interwiki links
49​21:15:24 <bd808> isbn is the only oddball right?
50​21:15:27 <cscott> [[rfc:1234]] would work fine
51​21:15:37 <cscott> isbn is the oddball because it goes to *the wiki itself*
52​21:15:42 <bd808> *nod*
53​21:15:54 <TimStarling> yeah, it would be like a namespace alias
54​21:15:55 <cscott> although i don't think there's anything preventing interwiki links from being relative URLs?
55​21:15:58 <TimStarling> like media
56​21:16:29 <TimStarling> it could even be a negative namespace ID rather than an interwiki prefix
57​21:17:15 <brion> interesting idea
58​21:17:37 <brion> i think i prefer an interwiki, with special:booksources as a special service that happens to be the link target
59​21:18:27 <robla> is there a precedent for magic link functionality in other wiki markups? e.g. does anyone else automatically turn "ISBN \d+" into a hyperlink automatically?
60​21:18:36 <brion> but negative (magic) namespace has a certain elegance to it too
61​21:19:11 <cscott> robla: external link hyperlinking is the precedent
62​21:19:16 <cscott> ie, http://foo.bar/bat
63​21:19:34 <cscott> most other markup languages have that
64​21:19:43 <brion> bugzilla autolinks textual things like 'bug 1234'
65​21:19:49 <cscott> when given a bare url
66​21:19:57 <brion> and of course many mobile browsers autolink phone numbers (which i hate ;)
67​21:20:00 <cscott> github markdown autolinks hashes and bug references
68​21:20:25 <TimStarling> $RFCPattern = "RFC\\s?(\\d+)";
69​21:20:25 <TimStarling> $ISBNPattern = "ISBN:?([0-9- xX]{10,})";
70​21:20:32 <TimStarling> s/$RFCPattern/&StoreRFC($1)/geo;
71​21:20:38 <TimStarling> s/$ISBNPattern/&StoreISBN($1)/geo;
72​21:20:39 <cscott> https://help.github.com/articles/autolinked-references-and-urls/
73​21:20:40 <TimStarling> says usemod
74​21:20:49 <TimStarling> so this feature was in fact carried over from usemodwiki
75​21:21:08 <cscott> PMID was a more recent addition
76​21:21:47 <robla> that would imply a precedent that we're continuing rather than merely an oddball legacy feature of our own
77​21:22:17 <cscott> weeelll.
78​21:22:30 <robla> that said, I think I agree with brion's hatred of autolinked phone numbers
79​21:22:38 <cscott> precent becomes oddball legacy given enough time (and lack of external adoption)
80​21:22:45 <TimStarling> usemod is also responsible for the "free link" terminology
81​21:23:25 <robla> it's quite likely just an oddball legacy feature that we inherited from grandpa usemod :-)
82​21:24:45 <TimStarling> the argument for deprecation would be simpler parser code?
83​21:25:27 <cscott> simple parser spec & removal of an english-specific bit of markup
84​21:25:42 <TimStarling> but as long as we magically link free URLs, we will still have Parser::doMagicLinks()
85​21:25:48 <cscott> RFCs in particular are english-only and not really relevant for most of our projects
86​21:26:15 <TimStarling> that's the argument for step 2
87​21:26:16 <James_F> TimStarling: And the magic of image links -> remote transclusion when that evil flag is enabled.
88​21:26:20 <TimStarling> I mean the argument for step 1
89​21:26:23 <cscott> sure, but doMagicLinks() could be greatly simplified and only match https?: prefixes
90​21:26:30 <cscott> for example
91​21:26:39 <cscott> instead of the pretty complicated regexp for ISBNs
92​21:26:40 <TimStarling> I am fine with step 2 and 3 but not so much with step 1
93​21:27:02 <cscott> and our free URLs autolinking is pretty crazy complicated because of all the different protocol schemes we support in theory.
94​21:27:30 <subbu> I don't like "magic links" because they suddenly add special meaning to some text substrings ...
95​21:27:59 <cscott> it complicates WTS as well, you have to watch for all sorts of boundary conditions when serializing
96​21:28:01 <TimStarling> if it's disabled on WMF wikis then parsoid doesn't need to support it
97​21:28:19 <James_F> Parsoid is for more than just WMF users, eventually.
98​21:28:22 <subbu> instead of using uniform syntax .. and as someone argued, RFC xyz is ocfusing since mediawiki has RFCs.
99​21:28:25 <bd808> o_0 parsoid is Wikimedia only TimStarling?
100​21:28:29 <subbu> *confusing
101​21:28:44 <cscott> and if we're going to continue to do round-trips for content migration, whether to wikitext 2.0 or markdown or <fill in the blank>, then simplifying WTS (wikitext serialization) will continue to be desirable
102​21:29:02 <TimStarling> parsoid has a reduced feature set compared to MW
103​21:29:04 <robla> it seems like protocol links (like mailto:.*, http?s:.*) have ample precedent in both email and wikis. Magic words as arbitrary barewords are relatively rare
104​21:29:05 <subbu> and all the nowiki crap.
105​21:29:09 <TimStarling> it doesn't even support wikipedia properly
106​21:29:24 <bd808> well that I can't argue against
107​21:29:38 <cscott> robla: yes, but have you looked at the full list of protocols we support?
108​21:29:54 <TimStarling> there are no plans to make parsoid support every single MW parser extension
109​21:30:01 <robla> cscott: not in a while
110​21:30:05 * robla goes to look
111​21:30:22 <cscott> $wgUrlProtocols = [ 'bitcoin:', 'ftp://', 'ftps://', 'geo:', 'git://', 'gopher://', 'http:// ', 'https://', 'irc://', 'ircs://', 'magnet:', 'mailto:', 'mms://', 'news:' , 'nntp://', 'redis://', 'sftp://', 'sip:', 'sips:', 'sms:', 'ssh://', 'svn://', 'tel:', 'telnet://', 'urn:', 'worldwind://', 'xmpp:', '//' ];
112​21:30:27 <subbu> TimStarling, yes .. i've proposed that parser hooks that rely on parser internals should instead be deprecated.
113​21:30:46 <subbu> parsoid and php parser have different internals.
114​21:30:47 <cscott> i'm pretty sure we could drop gopher:// and worldwind://
115​21:31:05 <bd808> I hear gopher is going to make a comeback ;)
116​21:31:25 <TimStarling> free external links could support a reduced set of protocols compared to bracketed external links
117​21:31:27 <cscott> but in general, I like {{bitcoin|<hash>}} or [[bitcoin:hash]] better than autolinking bitcoin:cafebabe in text
118​21:31:42 <subbu> but, anyway ... we do want to get to a unified model .. and that parsoid doesn't support wikipedias is neither here nor there .. whatever parser it is that supports wikipedias will need to grapple with the roundtripping and html2wt issue.
119​21:31:45 <TimStarling> that's why we have so many protocols, because non-WMF users want to link to those things
120​21:32:25 * robla doesn't think the list cscott provided seems so crazy
121​21:32:29 <cscott> TimStarling: perhaps, but I think they could manage to use slightly different markup to accomplish the same task.
122​21:32:55 <TimStarling> like bracketed external links?
123​21:32:59 <cscott> exactly
124​21:33:23 <cscott> or double-bracketed links for some things perhaps
125​21:33:26 <TimStarling> [[Gopher (protocol)]] does of course have many gopher links on it
126​21:33:29 <Scott_WUaS> robla: agreed - cscott's list could make sense
127​21:33:40 <TimStarling> but all bracketed as far as I can see
128​21:33:50 <marktraceur> cscott: Pretty sure gopher:// is officially deprecated now, from what I vaguely recall
129​21:35:55 <cscott> so i'm going to be a poor advocate for legoktm and say I'm not particularly chuffed by magic links. i'd get rid of them for wikitext 2.0 as part of a broader simplification, but now that they are implemented and working it's not really helping us much to get rid of them.
130​21:36:24 <robla> so...perhaps we can agree to make the linking aspect off by default, without necessarily declaring them deprecated in 1.28
131​21:36:35 <cscott> i guess the question is whether you think we can slim down wikitext incrementally by turning off these weird crufty bits one by one, or whether we should treat it as a completely new markup language
132​21:36:48 <cscott> and use something like "parsoid2" to do round-trips between "wikitext 1" and "wikitext 2.0"
133​21:37:03 <subbu> what i think of wikitext 2.0 is probably different from what cscott thinks of wikitext 2.0 probably :)
134​21:37:15 * marktraceur may have been wrong, the minnpost article he must have seen was just a random summary of Gopher
135​21:37:27 <cscott> i imagine a language with a formal grammar, small enough to fit on a single sheet of paper.
136​21:37:36 * robla doesn't want to rabbithole on that topic, when we might actually be able to make a decision about Mediawiki 1.28
137​21:37:36 <cscott> but otherwise looking as much as possible like wikitext 1.0
138​21:37:36 <subbu> marktraceur, ah ... now it makes sense gopher came form UMn ... and that is why gopher
139​21:37:53 <marktraceur> cscott: How big is the piece of paper? Implementation details...
140​21:38:13 <robla> can we turn magic words off by default in 1.28 without deprecating?
141​21:38:27 <cscott> and so, listing 28 different external link prefixes and 12 or so separate productions for ISBN/RFC/PMID is not going to help wikitext 2.0 fit on a single sheet of paper
142​21:39:16 <robla> er...I suppose I should have said "magic links"
143​21:39:28 <robla> can we turn magic links off by default in 1.28 without deprecating?
144​21:40:01 <subbu> sure .. i guess the discussion is more about whether deprecation is required / desirable.
145​21:40:12 <cscott> https://github.com/wikimedia/parsoid/blob/master/lib/wt2html/pegTokenizer.pegjs.txt#L449 is the grammar for RFC/PMID/ISBN in parsoid, for reference.
146​21:40:46 <cscott> as far as i'm concerned the real question is: can we get the *wiki communities to start rewriting content to use whatever our preferred replacement markup is?
147​21:40:46 <robla> subbu: yes, but that seems to be sending us down the rabbithole of talking about Wikitext 2.0, which we aren't going to finish in this hour
148​21:41:06 <subbu> personally, i don't think we need to talk about wikitext 2.0 there.
149​21:41:20 <cscott> https://en.wikipedia.org/wiki/Template:ISBN was created by MZMcBride
150​21:42:01 <cscott> and it seems to already be used by quite a number of pages
151​21:42:15 <TimStarling> as a compromise, how about not deprecating it immediately, but revisit after a year or two and see how many people are turning on the option?
152​21:42:31 <subbu> i think it is a worthwhile discussion in its own merit. but, like cscott i don't have strong feelings ... but yes, it will simplify some code in parsoid .. but i won't be heartbroken if it is kept as is.
153​21:42:37 <cscott> one intermediate step might be to hack the parser to add a parser warning if you use magic link syntax, suggesting an appropriate rewrite
154​21:42:46 * robla likes TimStarling's suggestion
155​21:43:08 <cscott> we can do that w/o committing to deprecation, just encouraging people not to use that syntax.
156​21:43:14 <bd808> THen we'd just need a way of collecting that data
157​21:43:19 <cscott> and then, as TimStarling suggests, give it a year or so and see what our trends are.
158​21:43:30 <subbu> ya what bd808 says .. do we have a mechanism for collecting that data?
159​21:43:45 <cscott> we could probably hack together a dumpGrepper script that would collect repeatable numbers for long-term comparison purposes
160​21:43:47 <TimStarling> there's that pingback feature ori introduced...
161​21:44:06 <cscott> we could also add [[Category:Uses Magic Link Syntax]]
162​21:44:38 * robla looks for the link to the aforementioned pingback feature
163​21:44:44 <subbu> i know (or am i imagining it?) kaldari had strong feelings about magic links. curious if he is around.
164​21:44:49 <TimStarling> currently it doesn't send any configuration
165​21:45:31 <TimStarling> #link https://www.mediawiki.org/wiki/Manual:$wgPingback
166​21:46:03 <kaldari> My suggestion was to just kill the RFC magic linking entirely, as it's usefulness is very marginal
167​21:46:22 <kaldari> otherwise, I support making it configurable
168​21:46:42 <kaldari> and depricating
169​21:46:46 <bd808> I think that part has been done now. there's a feature flag for it
170​21:47:14 <bd808> and this discussion is about deprecation and removal from core
171​21:47:26 <TimStarling> the dynamic dates was removed, there is that precedent, but dynamic dates was never on by default and was rarely used by non-WMF wikis as far as we know
172​21:47:57 <robla> we could conceivably deprecate in 1.29 if the 1.28 release goes well
173​21:48:03 <TimStarling> whereas we've been told that magic links are regularly used
174​21:48:22 <subbu> ISBN in citations only from what I can tell.
175​21:48:31 <TimStarling> I mean outside WMF
176​21:48:35 <subbu> ah ..
177​21:48:40 <TimStarling> for WMF we can run bots, outside WMF not so much
178​21:48:57 <TimStarling> outside WMF we don't know if people even want to run bots, or if they like their magic links the way they are
179​21:49:04 <robla> the pingback feature would give us more certainty, but I think even just "did anyone complain?" might be a good enough test in this case
180​21:49:35 <bd808> or would anyone who would complain actually upgrade anyway...
181​21:50:35 <cscott> it would be nice if WMF published a bot framework and scripts for it w/ every major upgrade
182​21:50:53 <cscott> like how we upgrade database schemas, we could provide tools for external wikis to update their content
183​21:51:02 <bd808> I guess I'm not sure why it needs to die if there is a feature flag other than hypothetical future parser world that would like to not support the feature
184​21:51:18 <bd808> cscott: I think we call that pywikibot
185​21:51:22 <robla> I suppose deprecating in 1.28 would be the "be bold" way of doing it. we could be prepared to "undeprecate" if people hate that we turned it off
186​21:51:43 <cscott> bd808: sure, it's the "official" part and "with scripts for every major upgrade" which is missing.
187​21:51:47 <James_F> Do "off by default" changes need to go through the one-version-notice process?
188​21:52:07 <TimStarling> heh
189​21:52:14 <TimStarling> disable by default then silently remove in 1.29?
190​21:52:20 <James_F> Tut. :-)
191​21:53:05 <robla> James_F: I don't *think* we have a policy like that, but I suppose that may be what legoktm's other rfc covers in more depth
192​21:53:27 <bd808> cscott: I'm not sure I agree on "official" and WMF being intrinsically intertwined, but some suggested wikitext cleanup scripts would be an interesting thing to add when changing the syntax
193​21:53:38 <James_F> Sorry, yes, I more meant "would it be covered by the semi-in-practice policy?".
194​21:53:46 <robla> the deprecation rfc: T146965
195​21:53:47 <stashbot> T146965: [RfC] Deprecation policy for PHP code in MediaWiki - https://phabricator.wikimedia.org/T146965
196​21:53:52 <cscott> i'm always interested in process improvements to make it easier for us to evolve wikitext syntax
197​21:54:57 <TimStarling> so should we call the non-controversial parts of this approved?
198​21:55:13 * subbu cares less about syntax and more about underlying processing model / semantics except where the syntax gets in the way
199​21:56:09 <subbu> wfm .. but, explicitly listing out the non-controversial parts would be useful.
200​21:56:26 <TimStarling> #agreed 1. Disable the magic link functionality by default for the MediaWiki 1.28 release
201​21:56:58 <TimStarling> #agreed 2. Deprecate magic links on Wikimedia wikis (e.g. Wikipedia), providing alternatives for this functionality and tools to aid the migration
202​21:57:26 <robla> hmmm....I'd feel more comfortable with broader input on #2
203​21:57:42 <cscott> sure. and the controversial step 3 would be removing the magic link code from core (perhaps moving it to an extension)?
204​21:57:52 <robla> I suppose "start step #2" isn't yet controversial
205​21:58:01 <James_F> Yes.
206​21:58:10 <cscott> i'd propose 2(a) -- add a category and parser warning for magic links on wikimedia wikis, without officially deprecating the feature.
207​21:58:22 <James_F> cscott: To move to an extension it'd have to keep the horrible hooks into the PHP parser, though?
208​21:58:28 <cscott> then sit back and see if folks are getting the magic links cleaned up or not.
209​21:58:35 <TimStarling> #info no consensus on removing the functionality from MW or flagging its pending removal via deprecation
210​21:59:21 <TimStarling> no, step 3 is disabling magic links on WMF wikis
211​21:59:23 <cscott> James_F: yes, although in my big picture worldview we're adding a pluggable parser API, and gradually moving from the "old" PHP parser to a newer one.
212​21:59:32 * James_F nods.
213​21:59:34 <TimStarling> removing from MW core was 1b
214​21:59:45 <cscott> so the hooks would only stay in the legacy PHP parser. which, of course, you could keep running if you like and are willing to keep it maintained.
215​22:00:05 <bd808> that's one possible universe, yes cscott
216​22:00:06 <Scott_WUaS> (Thanks, All)
217​22:00:27 <cscott> bd808: i keep pushing the universe towards my platonic ideal of it. ;)
218​22:00:35 * cscott has to turn into a pumpkin
219​22:00:54 <bd808> I like the "everything in the core php app" universe better
220​22:01:03 <bd808> but that's why we have these chats
221​22:01:10 * robla plans to turn into a different type of squash
222​22:01:29 <bd808> its a dangerous time of year to be a pumpkin
223​22:01:37 <robla> srsly
224​22:01:46 <bd808> could be gutted and filled with candles at any point
225​22:01:47 <TimStarling> what was next week's RFC again robla?
226​22:01:57 <robla> next week we'll plan on talking about CREDITs file
227​22:01:59 * robla finds link
228​22:02:18 <robla> https://phabricator.wikimedia.org/T139300
229​22:02:20 <cscott> bd808: pluggable doesn't mean not monolithic or not php, btw
230​22:02:25 <cscott> just that things are decoupled
231​22:02:39 <cscott> so you can have a pure-PHP single-process markdown wiki if you like.
232​22:02:42 <TimStarling> ok, all done
233​22:02:45 <TimStarling> #endmeeting
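For reference, the two UseModWiki patterns quoted at log lines 68-69 can be tried out with a short Python sketch. The regular expressions are direct translations of the quoted Perl strings; the `find_magic_links` helper and the sample sentence are made up for illustration and are not part of UseModWiki or MediaWiki.

```python
import re
from typing import List, Tuple

# Python translations of the UseModWiki patterns quoted in the log.
RFC_PATTERN = re.compile(r"RFC\s?(\d+)")
ISBN_PATTERN = re.compile(r"ISBN:?([0-9- xX]{10,})")


def find_magic_links(text: str) -> List[Tuple[str, str]]:
    """Return (kind, value) pairs for every bare RFC/ISBN reference in text."""
    found = [("RFC", m.group(1)) for m in RFC_PATTERN.finditer(text)]
    # The ISBN character class includes spaces, so trim the captured value.
    found += [("ISBN", m.group(1).strip()) for m in ISBN_PATTERN.finditer(text)]
    return found


print(find_magic_links("See RFC 2616 and ISBN 0-7475-3269-9 for details."))
```

Nothing in the text marks these references as links; the keyword alone triggers the match, which is the implicit behavior the meeting discussed.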

Thanks all for participating in the meeting, and I'll file subtasks for implementing points 1 and 2, and get to work! :)

So the Wikipedia communities have no say in this change? It was proposed on MediaWiki 20 days ago, and that's it?

As I said on-wiki, people are welcome to present arguments for keeping magic link functionality. Magic links can cause confusion because they are inconsistent with other forms of wiki syntax such as {{templates}}, they often mis-link in the case of "RFC", and they are a frequent source of unexpected behavior from the software (links appearing where people forgot they would).

I think all of us are interested if there are reasons to keep these magic links working, but templates are way more common and have more functionality such as link tracking and a centralized on-wiki process for getting updates. Plus Scribunto/Lua support. The input validation in https://en.wikipedia.org/wiki/Module:Check_isxn is probably better than in MediaWiki core currently. :-)

As far as I know, magic links weren't ever documented on mediawiki.org and are essentially very old Easter eggs. There seems to be very little support for keeping "RFC" working as a magic link, which leaves us with "ISBN" and "PMID". This list could be expanded to include others such as "ISSN", as you've proposed, but what's the point of doing this?

Why is a template (perhaps wrapped around a parser function, perhaps wrapped around an interwiki link, it doesn't matter) the wrong solution here? Why are magic links the right answer? Magic links are based on regex searching through blobs of strings and usually guessing that a link is wanted, unless marked with <nowiki> to disable it. Why not be explicit instead of implicit?

I am guessing having a plan for replacing all currently in use magic links with template'd links would be desirable.

In some pages this will increase the number of transcluded templates by a lot.

I am guessing having a plan for replacing all currently in use magic links with template'd links would be desirable.

Yeah, there's no rush here, as far as I know. The task description calls for "deprecat[ing] magic links on Wikimedia wikis (e.g. Wikipedia), providing alternatives for this functionality and tools to aid the migration." We don't need to act hastily and I don't think anyone is suggesting that we do so.
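As a rough idea of what a migration tool could do, here is a Python sketch that rewrites bare magic links into explicit markup while leaving `<nowiki>` spans alone. The patterns are simplified stand-ins rather than the parser's real rules, and the replacement targets ({{ISBN|...}}, [[pmid:...]], [[rfc:...]]) are assumptions based on the options discussed in the meeting; `migrate` is a hypothetical name, not an existing bot script.

```python
import re

# Simplified stand-ins for the magic link patterns (assumption: the real
# parser rules differ in detail).
ISBN_RE = re.compile(r"\bISBN\s+((?:97[89][ -]?)?(?:[0-9][ -]?){9}[0-9Xx])\b")
PMID_RE = re.compile(r"\bPMID\s+([0-9]+)\b")
RFC_RE = re.compile(r"\bRFC\s+([0-9]+)\b")
# The capturing group makes re.split keep the nowiki spans in the result.
NOWIKI_RE = re.compile(r"(<nowiki>.*?</nowiki>)", re.DOTALL)


def migrate(wikitext: str) -> str:
    """Rewrite bare magic links to explicit markup, skipping <nowiki> spans."""
    parts = NOWIKI_RE.split(wikitext)
    for i, part in enumerate(parts):
        if i % 2 == 1:
            continue  # odd indices are the captured <nowiki>...</nowiki> spans
        part = ISBN_RE.sub(r"{{ISBN|\1}}", part)
        part = PMID_RE.sub(r"[[pmid:\1]]", part)
        part = RFC_RE.sub(r"[[rfc:\1]]", part)
        parts[i] = part
    return "".join(parts)


print(migrate("Read ISBN 978-1-4907-1441-7, PMID 12345 and RFC 2616."))
```

Because the rewritten forms no longer match the patterns, running the function twice gives the same result as running it once, which matters for a bot that may revisit pages.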

In some pages this will increase the number of transcluded templates by a lot.

Sure. On some pages this will mean potentially using an ISBN template a few hundred times, for example https://en.wikipedia.org/wiki/List_of_Canadian_plays_(P%E2%80%93Z) somehow has over 590 ISBNs. Most pages, if they use magic linking, only do so once or twice, though.

We track template links in a binary fashion (i.e., a page either uses a template one or more times or does not use it at all), this template should be pretty cheap to render, and we already use certain templates a few hundred times on a single page, for example citation/reference templates on a large article. Some citation/reference templates accept ISBNs as a template parameter. And eventually we may move references to a more serialized format altogether such as Wikibase/Wikidata. None of this should be precluded by killing off magic link functionality.

Many comments in Phabricator (previously Bugzilla) have indicated that magic links aren't a feature that people particularly like or have an interest in maintaining. If someone wants to defend the three existing magic links and possibly adding more, now is the time to do so, in my opinion.

No longer a 1.28 blocker, as those parts were completed.

I am not a defender of magic sources.

This magic syntax breaks the understanding of wiki syntax, for newbies and scripts alike, since it is an unexplainable exception. Nobody knows why just these three keywords are supported and no others. It was an accident of the first years of Wikipedia, when no one knew how the seed would grow, and it should be remedied in the long run.

The easiest replacement for the fast ISBN magic would be an interwiki prefix pointing to /Special:Booksources/$1.

In German Wikipedia I would like to see replacement by templates.

Templates give the opportunity to check the validity of the input value (PMID and RFC are numeric, RFC perhaps with a #fragment for a section; no RFC number goes beyond 9999 yet). If the value is invalid, the template can add a project-defined maintenance category and show an appropriate error message.

The same goes for ISBN, but things are much worse than a simple parser function would help with. See T148274#2788217 for details. The cite book counterpart already checks PMID and ISBN and adds warnings and categories.
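The kind of input checking described here can be sketched in a few lines of Python. The ISBN-10 and ISBN-13 check-digit rules are the standard ones; the function names are hypothetical, the four-digit RFC cap simply mirrors the "no RFC number beyond 9999 yet" remark above, and none of this is the actual code of any template or module.

```python
import re
from typing import Optional, Tuple


def is_valid_isbn(raw: str) -> bool:
    """Check the ISBN-10 or ISBN-13 check digit, ignoring spaces and dashes."""
    digits = re.sub(r"[ -]", "", raw).upper()
    if re.fullmatch(r"[0-9]{9}[0-9X]", digits):
        # ISBN-10: weights 10..1, 'X' counts as 10, sum divisible by 11.
        total = sum((10 - i) * (10 if ch == "X" else int(ch))
                    for i, ch in enumerate(digits))
        return total % 11 == 0
    if re.fullmatch(r"97[89][0-9]{10}", digits):
        # ISBN-13: alternating weights 1 and 3, sum divisible by 10.
        total = sum(int(ch) * (1 if i % 2 == 0 else 3)
                    for i, ch in enumerate(digits))
        return total % 10 == 0
    return False


def is_valid_pmid(raw: str) -> bool:
    """A PMID is a positive integer without leading zeros."""
    return re.fullmatch(r"[1-9][0-9]*", raw) is not None


def parse_rfc(raw: str) -> Optional[Tuple[int, Optional[str]]]:
    """Parse 'NNNN' or 'NNNN#fragment'; the 4-digit cap follows the comment above."""
    m = re.fullmatch(r"([1-9][0-9]{0,3})(?:#(\S+))?", raw)
    return (int(m.group(1)), m.group(2)) if m else None


print(is_valid_isbn("978-1-4907-1441-7"), parse_rfc("2616#section-4"))
```

A template built on checks like these could add the maintenance category whenever a function returns False or None.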

Some people have started replacing magic links with other syntax (templates) on some wikis, and a strange case turned up: magic links appear to break when the ISBN contains several consecutive filler characters.

For example, this version of Ahmad Shah Durrani has a broken magic link at the end of the Bibliography. I experimented a bit and found that the magic link doesn't work when there are several consecutive filler characters in the ISBN (two spaces, a space and a dash, ...). None of the following work, for example: ISBN 978- 1-4907 - 1441-7 ; ISBN 978- 1-4907-1441-7 ; ISBN 978 14907 14417 ; ISBN 97814907 14417

I think we should fix that before removing the magic links because the articles using this broken syntax are probably not listed in the tracking categories, so the bot won't go over them...
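To illustrate the behaviour described above, here is a minimal Python sketch. It assumes the magic-link matcher allows at most one space or dash after each digit; the pattern below is an approximation written for this comment, not MediaWiki's actual parser code, and `is_magic_isbn` is a hypothetical helper name.

```python
import re

# Approximation of an ISBN magic-link pattern (an assumption, not MediaWiki's
# code): optional 978/979 prefix, nine digits, then a check digit, with AT MOST
# ONE space or dash allowed after each digit.
ISBN_MAGIC = re.compile(
    r"\bISBN[ \t]+((?:97[89][ -]?)?(?:[0-9][ -]?){9}[0-9Xx])\b"
)


def is_magic_isbn(text: str) -> bool:
    """Return True if the text contains something this pattern would link."""
    return ISBN_MAGIC.search(text) is not None


if __name__ == "__main__":
    samples = [
        "ISBN 978-1-4907-1441-7",     # single separators
        "ISBN 978- 1-4907-1441-7",    # dash followed by a space
        "ISBN 978-1-4907 - 1441-7",   # space, dash, space
        "ISBN 97814907  14417",       # two consecutive spaces
    ]
    for sample in samples:
        print(repr(sample), is_magic_isbn(sample))
```

Under this assumption, only the first sample is recognised; any run of two or more filler characters stops the match, which would also keep such pages out of the tracking categories.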

Could someone say what the status of this is, please? There has been very little discussion about it on enwiki. Editors who don't use citation templates use magic links for ISBN and PMID. There are around 340,000 pages using the former and 6,000 the latter. They are easy and fast to use, and it seems a pity to get rid of them.

Could someone say what the status of this is, please?

The status of the MediaWiki RfC (https://www.mediawiki.org/wiki/Requests_for_comment/Future_of_magic_links) is "in draft", and it has not yet been listed as "Ready for discussion" at https://www.mediawiki.org/wiki/Requests_for_comment. Unless the plan is to open it for discussion only after it is a fait accompli, phabricating an implementation seems premature.

Could someone say what the status of this is, please?

See T145604#2695311 and https://lists.wikimedia.org/pipermail/wikitech-ambassadors/2016-November/001500.html

There has been very little discussion about it on enwiki. Editors who don't use citation templates use magic links for ISBN and PMID. There are around 340,000 pages using the former and 6,000 the latter. They are easy and fast to use, and it seems a pity to get rid of them.

Right, if you don't use the standard citation templates, adding ISBN/PMID links will get a little harder. Interestingly, out of the 6.4k pages that use PMID magic links, less than half are in articles (https://quarry.wmflabs.org/query/15293). ISBN is ~317k out of ~350k in mainspace (which I think we expected, hence T148274).

The status of the MediaWiki RfC (https://www.mediawiki.org/wiki/Requests_for_comment/Future_of_magic_links) is "in draft", and it has not yet been listed as "Ready for discussion" at https://www.mediawiki.org/wiki/Requests_for_comment. Unless the plan is to open it for discussion only after it is a fait accompli, phabricating an implementation seems premature.

That was out of date sorry, I've changed it to "accepted" now. But this has been open for discussion since mid-September when I announced it on wikitech-l (https://lists.wikimedia.org/pipermail/wikitech-l/2016-September/086515.html), there was a formal RfC meeting (announcement: https://lists.wikimedia.org/pipermail/wikitech-l/2016-October/086713.html), and then there was more discussion when I announced it on wikitech-ambassadors in November (https://lists.wikimedia.org/pipermail/wikitech-ambassadors/2016-November/001500.html). So saying that it wasn't open for discussion is wrong.

The part that has been agreed to is that magic links should be considered deprecated on Wikimedia wikis, and that we (MediaWiki core developers) should work with communities to provide alternatives for this functionality and tools to aid the migration. How the migration happens and what alternatives are available are still open to discussion. T148274 has seen some pretty good discussion for ISBN links.

Hi Legoktm, you wrote that it was open for discussion because several places were notified, but none of these are where Wikimedia writers tend to hang out, and it's the writers who rely on these links. Can we hold a more central discussion about whether the ISBN and PMID links should be deprecated, rather than assuming it's a done deal?

Hi Legoktm, you wrote that it was open for discussion because several places were notified, but none of these are where Wikimedia writers tend to hang out, and it's the writers who rely on these links. Can we hold a more central discussion about whether the ISBN and PMID links should be deprecated, rather than assuming it's a done deal?

Discussion is always ongoing on https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Future_of_magic_links and I'm more than willing to continue discussing use cases and figuring out alternatives. However I'm not "assuming it's a done deal" - it's been approved by the architecture committee (see "AGREE" points in T145604#2695311) that at some point in the future magic links are going to go away on Wikimedia wikis with specifics about timelines and alternatives still TBD. If you want to re-open that for discussion you should contact an ArchCom member to figure out what the process is.

Scott added a subscriber: Scott.Jan 7 2017, 2:01 AM

I agree with SV above. This is a major (and somewhat foolish) change. It should not be imposed by a technocracy.

jeblad added a subscriber: jeblad.Apr 15 2017, 1:37 PM

At first I thought this was a good idea, but not anymore. This type of magic formatting is a good thing. The problem isn't that it happens, but that it isn't smart enough. We should actively limit the number of weird templates that editors must remember. If we can add a little smartness to wikitext, then we should do it.

Whether the architecture committee agrees on something has virtually no bearing in this case; this is at its core about editing, not about boxes. If the editing communities want a feature, then the devs should deliver that feature. It is the editors who are the customer in this case, not the architecture committee or any other dev.

At first I thought this was a good idea, but not anymore. This type of magic formatting is a good thing. The problem isn't that it happens, but that it isn't smart enough. We should actively limit the number of weird templates that editors must remember. If we can add a little smartness to wikitext, then we should do it.

I think this argument conflates the ease with which editors can insert formatted markup and whether we should support magic formatting. As noted repeatedly, these three magic links are anomalies. Editors expect to be able to use templates for formatting and it often causes confusion when editors notice that plain syntax has become a link. Templates can also be localized and have a built-in and supported means of tracking uses. Templates are what we use for nearly every other kind of similar inline formatting. Adding a little smartness to wikitext is historically and demonstrably where we get into trouble.

Adding a little smartness to wikitext is historically and demonstrably where we get into trouble.

Which is an unsubstantiated claim.

If I were to make an unsubstantiated claim, I would say that the current template solutions are what creates havoc, as they invite an editing style that produces extremely messy and unreadable markup. Don't replace simple markup with markup that is messier and more difficult to read and understand.

I'm not sure that the template markup "{{PMID|somenumber}}" is in any way messier or less readable than "PMID somenumber". The very existence of this task is a kind of proof that magic links can cause problems, as are the many instances of "RFC somenumber" producing false positive links on enwiki.

Ghilt added a subscriber: Ghilt.May 23 2017, 12:11 PM

Since the vast majority of editors are non-coders, as well as potential new authors (especially older people), increasing the amount of special characters in the source text (e.g. for templates) is imho a step in the wrong direction. I wish more of mediawiki would work like magic words...

I think that among all the bad ideas that have been discussed in MediaWiki development, removing magic links, and especially ISBN, is one of the worst. I do not agree with MZMcBride when he talks about confusion among editors because an ISBN is suddenly linked. Editors, and especially new editors, are happy that this happens.

I also contest PerfektesChaos' claim that he would like to see the issue resolved by templates in the German WP. I have been part of the German WP a good while longer than he has, and I can assure you that that community is absolutely template-hostile, and that they explicitly decided against Vorlage:ISBN and Vorlage:PMID in 2007. The main reason is that the German WP community dislikes, and part of it hates, templates within "Fließtext", as they call the part of the source text outside infoboxes and categories.

If you remove the magic links, the following will happen (of course only if the community agrees to reinstate the formerly deleted template):

  • some editors will use the ISBN template and some will not
  • some editors will add templates which other editors had omitted
  • they will start edit wars about the necessity of linking a specific ISBN (using the template)
  • regular bot runs will be needed to fix places where editors were too lazy/scatty to use the template

Or the outcome will be what happened with doi: most DOIs stand in articles unformatted (unlinked) because editors don't care. Yes, I think that will happen to ISBN links. Most editors won't care, and Wikipedia will again become one step less user-friendly.

Thus I call for the decision to remove magic links to be revised. I also call for the doi magic link to be reintroduced and for yet more useful magic links to be introduced.

I also contest PerfektesChaos' claim that he would like to see the issue resolved by templates in the German WP. I have been part of the German WP a good while longer than he has, and I can assure you that that community is absolutely template-hostile, and that they explicitly decided against Vorlage:ISBN and Vorlage:PMID in 2007. The main reason is that the German WP community dislikes, and part of it hates, templates within "Fließtext", as they call the part of the source text outside infoboxes and categories.

What about references? ISBNs are typically used as part of references, correct?

Does the German Wikipedia use <ref></ref> tags with templates? On the English Wikipedia, it's common to use references with templates such as <ref>{{cite book|isbn=1234}}</ref>, which can then accept the ISBN as an argument. This means no change to the markup of many pages, since only the underlying "Template:Cite book" needs to be potentially updated.

How do you explain to users that almost every citation identifier, including DOI, needs extra markup, except ISBN, PMID, and RFC? These three are the exception to otherwise consistent behavior. Why not remove the magical behavior and be explicit instead? If you look at https://en.wikipedia.org/wiki/Template:Citation/identifier#Usage, you can see many other kinds of citation identifiers that use wrapper templates such as {{ASIN}} or {{OCLC}}. Why is this bad? Why does it make sense to have the software try to parse the page and guess what is and is not a valid citation identifier?

You could also just write [[Special:BookSources/1234]] in the page instead of using a template such as {{ISBN|1234}}. Do you and other German Wikipedians dislike all wiki markup, including link markup? If so, perhaps VisualEditor is the answer?

Or the outcome will be what happened with doi: most DOIs stand in articles unformatted (unlinked) because editors don't care. Yes, I think that will happen to ISBN links. Most editors won't care, and Wikipedia will again become one step less user-friendly.

Your argument is that nobody will care enough to make links? This seems to point to the questionable value of having the links at all, if nobody is using them or interested in supporting them.

Ghilt added a comment.EditedMay 25 2017, 12:18 AM

Hi MZMcBride,
yes, ISBN are part of references. We generally use ref-tags, but citing templates are much less used than in en.wp. For example, in natural sciences afaik mostly a PMID- and DOI-converter is used (http://www.hbz-da.de/wikipedia/PMID2reference.php) instead of templates. This also reduces imho unnecessary code and increases readability of the source text. I wish magic links were extended to reduce the amount of special characters in the source text. And due to the necessity of mouse, trackpad or trackpoint for all operations besides typing words in the visual editor, it is much slower than writing wikitext. Which could be simplified with more magic links. So VE is only the answer for some. Most wikipedians i know personally (mainly active and very active editors) write wikitext. And i don't think his argument was that nobody would care to make links but most would prefer not to have to and some won't care to.

By the way, how would a bot do the job?

By the way, how would a bot do the job?

The idea is to have a bot do a one-time conversion. Then we can stop supporting these magic links and this magical linking behavior in MediaWiki core forever. RFC in particular has been very problematic since the wiki universe has its own concept of requests for comment that is distinct from the Internet Engineering Task Force's concept of RFCs.

Another nice advantage to templates is that we can then translate/localize these identifiers, instead of relying on the English "ISBN" or "PMID". For example, perhaps the abbreviation would be different in German or French or Spanish. By using templates, we also can track usage more easily.

I completely agree that {{template syntax}} and [[link syntax]] is ugly, but it's what we're already using everywhere, so I'm struggling to see a reason to continue supporting magic links or expanding their use.

A bot is currently running on the English Wikipedia to do this one-time conversion: https://en.wikipedia.org/wiki/Special:Contributions/Magic_links_bot. For wikis that do not want to do this conversion, the ISBNs and PMIDs will likely become unlinked at some point in the future. The options then will be to use link or template markup to make ISBN and PMID clickable, the same way we do with everything else currently, or to have the markup be unlinked. A third option might be a JavaScript hack, if the local wiki really were interested in setting that up and maintaining it.
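As a rough illustration of what such a one-time conversion could look like, here is a minimal Python sketch that rewrites plain "ISBN ..." and "PMID ..." text into {{ISBN|...}} and {{PMID|...}} template calls. This is not the code of the bot linked above; the patterns and the `convert_magic_links` function are assumptions made for the example, and a real bot would also need to skip contexts such as nowiki blocks, existing links, and URLs.

```python
import re

# Assumed approximations of the magic-link patterns (not MediaWiki's code).
ISBN_RE = re.compile(
    r"\bISBN[ \t]+((?:97[89][ -]?)?(?:[0-9][ -]?){9}[0-9Xx])\b"
)
PMID_RE = re.compile(r"\bPMID[ \t]+([0-9]+)\b")


def convert_magic_links(wikitext: str) -> str:
    """Replace plain magic-link text with explicit template calls."""
    text = ISBN_RE.sub(lambda m: "{{ISBN|" + m.group(1) + "}}", wikitext)
    text = PMID_RE.sub(lambda m: "{{PMID|" + m.group(1) + "}}", text)
    return text


if __name__ == "__main__":
    before = "See ISBN 978-1-4907-1441-7 and PMID 12345."
    print(convert_magic_links(before))
```

Text that already uses the templates is left alone, because "ISBN|" and "PMID|" are not followed by whitespace and so do not match either pattern.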

Ghilt added a comment.EditedMay 25 2017, 12:53 AM

what we're already using everywhere

Tradition by itself is imho not a very strong argument. And a one-time conversion, does it mean we have to manually correct missing ISBN- and PMID-templates in future edits? I guess, the RfC isn't used in de.wp, as we have polls and requests for decisions instead. And, scnr, do i have a chance of convincing you of the good that is in our child?

patilise added a subscriber: patilise.EditedJun 19 2017, 4:14 PM

Hope I am not too late to chip in a reply -

In T145604#3290707, @MZMcBride wrote:
How do you explain to users that almost every citation identifier, including DOI, needs extra markup, except ISBN, PMID, and RFC?

Why is an explanation needed? For average users they probably only want to see them linked, and that's it.

Your argument is that nobody will care enough to make links? This seems to point to the questionable value of having the links at all, if nobody is using them or interested in supporting them.

We don't have unlimited manpower. I am from the Japanese Wikipedia, where the user base is broad but the number of people able (let alone interested) to provide support is really thin. Simply being "different and inconsistent" (or other technical reasons) is not a valid argument for cutting a working feature and increasing an already too heavy workload. Unless MediaWiki is going to break down tomorrow, can you please revise this decision to drop magic links?
(Adding one footnote: using a bot increases the number of revisions and uses up some manpower so that doesn't change my argument.)

In T145604#3290707, @MZMcBride wrote:

How do you explain to users that almost every citation identifier, including DOI, needs extra markup, except ISBN, PMID, and RFC?

Why should there be a need to explain that? That is, how it works, period.

Do you and other German Wikipedians dislike all wiki markup, including link markup?

It's not me, but some people want a clean, easily readable source text, and they dislike the VE as well. However, as Ghilt wrote, the usage of citation templates in DE:WP is not common: less than 10 percent of all articles use them, and there are even users who remove them and convert them into plain link citations. What is much more needed is making en:Template:Citation a global template so it works in each and every language version. It is the same people who block every development of citation templates (de:Vorlage:Citation has been fully protected for years!)

Your argument is that nobody will care enough to make links?

Not really. My argument is that they won't care enough to make links which formerly had been generated automatically, i.e. they will create the link only if it has value for their own interest, as in an article, for instance. They won't if it would only be a courtesy, e.g. in a discussion. My argument is also that people won't care to add such links if other editors did not add them.

In T145604#3290882, @MZMcBride wrote:

Another nice advantage to templates is that we can then translate/localize these identifiers, instead of relying on the English "ISBN" or "PMID".

Well, ISBN is ISBN everywhere in the world. Besides, localization has been the worst idea WP developers implemented so far. Why?

  1. Users who want to use localized syntax still need to learn English syntax when they want to work on commons – [[Datei:Irgendwas.jpg|mini|hochkant|Irgendeinbild]] won't work on Commons.
  2. There are thousands of useless edits each and every day because users needlessly convert File: to Datei: and thumb to mini, and those edits spam watchlists and make it more difficult for authors to detect vandalism.

If it were up to me I would remove all syntax localizations. However, I like the usage of centralized templates with localized output, e.g. en:Template:Infobox UK settlement and de:Vorlage:Infobox Ort im Vereinigten Königreich use the same syntax, though of course the appearance is in the respective language. (That, basically, rests on the belief that, for example, the Italian Wikipedia is most comprehensive for Italian settlements, so DE:WP uses their (the Italian) syntax, and it uses the French WP syntax for French settlement infoboxes.) Sorry for the digression, but I felt it was needed to avert the impression that I was some kind of fundamentalist ;-)

The explanation is needed because otherwise the feature is unused or people make bad markup for things that don't have the feature.

The explanation is needed because otherwise the feature is unused or people make bad markup for things that don't have the feature.

That's not a valid argument. If the feature is removed it will be unused as well. Or the other way around: so far it isn't possible to not use the feature, as ISBN <numbers> marks up the number as an ISBN link. Also "people making bad markup" is no argument. There are two possibilities: the ISBN a user typed is correct ( -> good markup) or the ISBN is wrong ( -> still good markup but the number does not exist). "ISBN 1234" is creating a bad ISBN-Link, as "1234" is not a valid ISBN. However, {{ISBN|1234}} still creates the same bad link. There is no difference between them. And the last part of your argument isn't valid either. Actually that's kind of [[:en:WP:POINT]], and most major language versions won't accept it. There is no reason to remove a feature because another feature does not exist. The question should be whether that other feature should be a magic link.

To make it clear: IMHO we need many more magic links, not the other way around.

I was just contesting the claim that people will figure out by themselves how these things work.

"ISBN 1234" is creating a bad ISBN-Link, as "1234" is not a valid ISBN. However, {{ISBN|1234}} still creates the same bad link. There is no difference between them.

On the English Wikipedia, https://en.wikipedia.org/wiki/Template:ISBN uses https://en.wikipedia.org/wiki/Module:Check_isxn for input validation. If a user adds {{ISBN|1234}} to a page, the page will be automatically added to https://en.wikipedia.org/wiki/Category:Pages_with_ISBN_errors and the user is informed with red error text.
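For readers unfamiliar with what this kind of input validation involves, here is a minimal Python sketch of the standard ISBN-10 and ISBN-13 checksum rules. It is not the code of Module:Check_isxn; the function name `is_valid_isbn` is hypothetical, and the sketch checks only length, characters, and check digit (it does not, for instance, require a 978/979 prefix).

```python
DIGITS = "0123456789"


def is_valid_isbn(raw: str) -> bool:
    """Check an ISBN-10 or ISBN-13 string using the standard checksums."""
    # Drop the usual separators before validating.
    chars = [c for c in raw if c not in " -"]

    if len(chars) == 13 and all(c in DIGITS for c in chars):
        # ISBN-13: weights alternate 1, 3, 1, 3, ...; total must be 0 mod 10.
        total = sum(
            int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(chars)
        )
        return total % 10 == 0

    if (
        len(chars) == 10
        and all(c in DIGITS for c in chars[:9])
        and (chars[9] in DIGITS or chars[9] in "Xx")
    ):
        # ISBN-10: weights run 10 down to 1; a final "X" counts as 10;
        # total must be 0 mod 11.
        values = [int(c) for c in chars[:9]]
        values.append(10 if chars[9] in "Xx" else int(chars[9]))
        total = sum(v * (10 - i) for i, v in enumerate(values))
        return total % 11 == 0

    return False


if __name__ == "__main__":
    for candidate in ["978-1-4907-1441-7", "0-306-40615-2", "1234"]:
        print(candidate, is_valid_isbn(candidate))
```

A template backed by a check like this can flag "1234" as invalid, which plain magic-link text has no way of doing.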

An "intentionally" invalid ISBN is handled by a separate template, as noted at https://en.wikipedia.org/wiki/Template_talk:ISBN#Allow_hiding_.22Invalid_ISBN.22_error.

Besides, localization has been the worst idea WP developers implemented so far.

This is wrong, period. I suggest you read https://www.mediawiki.org/wiki/Principles, where #2 is "internationalized, with equal support for all languages;"