Blockers for Wikimedia wiki domain renaming
Open, Needs Triage, Public

Description

The domains of several Wikimedia wikis should be renamed: T21986.

This was done once, when be-x-old.wikipedia.org was renamed to be-tarask.wikipedia.org, but this renaming exposed several issues. Performing any more renaming is not advised until these issues are resolved. This task tracks these issues.

Amire80 created this task. Jul 29 2017, 1:05 PM
deryckchan added a subscriber: deryckchan.

@deryckchan I have no idea why you gave this token. Is there anything that disappointed you?

It's intended as encouragement: fire for the engine, or a burning desire to
make things happen!

I think that token (and fire in general) is generally associated with destruction but sure.

(off topic) While I sometimes use them, I don't really understand what tokens are for. We could just live with like/love; dislike/heartbreak and "the world burns" and get rid of the whole lot of others which do not really add anything meaningful. Or just drop all tokens. I wouldn't really mind. If you have to say something, say it on the task. That said, let's get back on the task and ignore the tokens.

Liuxinyu970226 added a subscriber: Nsaa. Edited Sep 18 2017, 4:25 AM

PS: From investigating T173602, it looks like some community members such as @Nsaa still oppose the renaming, for reasons like: α. those codes have been used on our Wikimedia servers for years or decades? β. the renaming can break SQL/SPARQL? γ. it's too hard to "fix" those URLs on paper materials? and others. So how to encourage communities to accept renaming could also be a blocker, IMO.


Here are my comments on those so-called opposing concerns:

  1. Renaming the databases is really the hardest action of all, and shouldn't be discussed on this task (though if you know how to handle it, feel free to answer on T83609).
  2. Fixing language codes in wikitext, Lua modules and local interface messages can just be a lot of bot work; is there anything there that really needs to be requested on Phabricator?
  3. Regarding fixing codes on paper materials, I would suggest getting help from some (mainly Chinese-made) products: "Correction Tapes (修正带)" or "Correction Fluid (涂改液)".

...

(3) doesn't concern the vast majority of requests because the old URLs would be kept as domain name and interwiki aliases.
e.g. [[yue:頭版]] and https://yue.wikipedia.org/wiki/頭版 already work, but they redirect to https://zh-yue.wikipedia.org/wiki/頭版 and the rename request simply reverses these redirects.
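To make the aliasing idea concrete, here is a minimal sketch of an old hostname being kept as an alias that answers with a redirect to the new hostname. This is not how the Wikimedia infrastructure actually implements it: the alias table, the function name redirect_target and the use of Python are assumptions for illustration only. The be-x-old to be-tarask pair is the one rename mentioned in the task description.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical alias table: old hostname -> current hostname.
OLD_TO_NEW = {
    "be-x-old.wikipedia.org": "be-tarask.wikipedia.org",
}


def redirect_target(url):
    """Return the URL on the renamed domain, or None if no redirect is needed."""
    parts = urlsplit(url)
    new_host = OLD_TO_NEW.get(parts.netloc.lower())
    if new_host is None:
        return None
    # Only the host changes; scheme, path, query and fragment are preserved.
    return urlunsplit(parts._replace(netloc=new_host))
```

A server would send the returned value in the Location header of a 301 response. Reversing a redirect, as in the yue / zh-yue case, would then amount to swapping the key and the value of one table entry.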

I believe the SPARQL part of (2) can also be handled by aliasing, though we need to work with the Wikidata developers to get SPARQL and interwiki linking to work.

jeblad added a subscriber: jeblad. Sep 18 2017, 1:01 PM

Slightly off-topic, but the nowiki community would probably accept a move if old URLs keep working for some time. I also believe that the communities at both nowiki and nnwiki would accept an intercept of page requests for the no domain a few years into the future, with a manual rerouting of the request to an nn or nb domain. The only thing they probably won't accept is broken URLs.

Nsaa added a comment. Edited Sep 18 2017, 6:09 PM

@Liuxinyu970226 I'm only opposing the renaming of no.wikipedia.org to nb.wikipedia.org. I have no strong opinion on what other projects decide to do.

Slightly off-topic: @jeblad claims "the nowiki community would probably accept a move if old URLs will work for some time." This subject has come up many times and no, there is no consensus for moving the project to another language code. (See the last round here: https://no.wikipedia.org/wiki/Wikipedia:Avstemninger/Prefiks where the vote was 85 to keep no versus 39 to move to nb.)

Remember that no.wikipedia.org covers the Norwegian language, and since 2005* it has covered two norms that together account for some 90 % of the usage of Norwegian: bokmål as normed by Språkrådet (https://en.wikipedia.org/wiki/Language_Council_of_Norway) and riksmål, the de facto standard most Norwegians follow, as normed by Det norske akademi (https://en.wikipedia.org/wiki/Norwegian_Academy_for_Language_and_Literature). We cover the Norwegian language at no.wikipedia.org, and out of respect for nn.wikipedia.org, which left us in 2004 or thereabouts, we do not write nynorsk.

The language situation in Norway is complex and has been a big battleground for 100 years. We have a stable situation on the project and it's a pity that this has come up as an issue again. We should be writing articles and making no.wikipedia.org better, not handling unnecessary noise like this.

If someone wants to start an nb.wikipedia.org, I suppose that is up to the Foundation to accept. But no.wikipedia.org stays at no.wikipedia.org (unless the Foundation eventually makes a board case of it and forces us to move to nb, since they are the owners of the domains).

Remember there are a lot of reasons why we should not move:
https://meta.wikimedia.org/wiki/Talk:Requests_for_comment/Rename_no.wikipedia_to_nb.wikipedia#No_relocation_from_nowiki_to_nbwiki

Liuxinyu970226 added a subscriber: Mdennis-WMF. Edited Oct 8 2017, 12:33 AM

@Nsaa To answer your list of common reasons against, I worked hard on analysing ICANN, IETF, IEEE, ITU, ... resources over the past 8 days, and now (I removed your ref tags since they are shown as U+FFFD characters in my browser):

'External links' – nowiki has built up great value in links pointing to its pages. At present there are up to 1.3 million such links. How will we get external stakeholders to update over 1,300,000 links to us (counted with Yahoo) that contain no.wikipedia.org?

Maybe this would be a bug regarding SEO.

'Brand' no.wikipedia.org – NO domain coincides with the country code no, and that no has been in use since this wiki was created. Thus, it is advantageous to retain the familiar prefix no, that all Norwegian speakers have a relationship with, versus the totally unknown nb, who barely the most educated language people know.

Again, we use ISO 639, not ISO 3166. Are we discussing the SAME ISO standard? "it is advantageous to retain the familiar prefix no": so, again, are nowikibooks, nowikinews, nowikisource, and a potential nowikivoyage bokmål only?

'Visibility' – All links from articles on no.wikipedia.org will no longer count as much for Google's PageRank algorithm (one can assume) if one does not 'permanently' add correct redirect codes for the .com domain. The proposal allows for 'removing the 301 redirect' after 5 years.
'Visibility and value' – 'Britannica boss' Jorge Cauz says the following on Wikipedia: If I were to be the CEO of Google or the founders of Google I would be very [displeased] that the best search engine in the world continues to provide as a first link, Wikipedia,

By that reasoning, you are still opposing ALL Wiki-Setup (renaming) requests. And what does ".com" have to do with it? How is that important?

'Index and traffic figures'. A switch will 'reduce the main page's importance'. At present, the traffic is 57 694 hits, which amounts to 5.04 % of traffic on nowiki. Traffic on nnwiki's index is 1 496 hits per day, which amounts to 1.37 % of traffic on nn. Sources: source nn (archive nn 2009-01-26), source en (archive en 2009-01-26)

This looks more like a bug (hence ask SEO) than a good argument.

'Uncertainty' – Likewise, we have no control over what other search engines and others who follow links do with this kind of redirects and to what extent this has a negative effect.
'Uncertainty' – there is no agreement on what one would possibly do with the .no domain afterwards, ergo there is an inherent great danger that external links will no longer point directly to our articles and main compartment.

Still the macrolanguage problem; I believe Estonian, Latvian and Lithuanian could also have such problems...

'Brand' – no domain is built up throughout the Norwegian speaking population consciousness as the site of Wikipedia in bokmål and riksmål.

Still, you have still not provided examples of articles on the de facto nowiki that are written in riksmål; you only provided two enwiki articles about orthography, which are meaningless under any source-reviewing criteria.

'Technical' – That it is technically a not insignificant job to implement (many bots must clean incredible amounts, one must set things up properly by developers on Wikipedia's servers – we'll use these scarce resources to such policy changes like this?).

I can't believe that you couldn't find what @Krenair has said many times before: the only "hard" thing is renaming the databases, which is no longer a topic of this task, but of T83609.

'Bias of bokmål / riksmål' – the proposal is indirectly an attempt at impartiality in distributing the languages written on no (bokmål and riksmål), a language used by the vast majority. (A survey with a sample of over 4,000 people came up with the following: "7.5 % responded that they write only nynorsk, 5.5% that they write about the same amount in both language variants, and 86.3% that they write riksmål / bokmål" (Only 7.5% nynorsk, archived 2009-01-09). The bottom line, to put that in perspective, is that over 90% of the writing population uses bokmål / riksmål.) It would be quite discriminatory to destroy, by a relocation, all the value already created on the no domain simply because someone pushes a semantic argument that the no ISO code also comprises the entire riksmål.

Now you claim that most nowiki articles can also be considered as riksmål, and bokmål is just riksmål, with those "resources".

'For all eternity .no domain is unusable for other purposes' – With a permanent relocation via an en:301 redirect, one possibly avoids some of the problems with external links, but it will keep the no.wikipedia.org domain occupied forever.

But without renaming the domain, nb.wiki* will permanently return one of 404/405/504; how is a 301 a bigger problem than those three? Just count the number of error codes here.

'ISO code no is more correct than nb for this wiki' – Officially Bokmål is normed by no:Språkrådet, while riksmål normed of no:Det Norske Akademi for Sprog og Litteratur. In ISO description of nb says bokmål and riksmål is not defined in this. no covers the Norwegian language and are therefore both national target and on Bokmål under no-ISO code, but not directly under nb-ISO code, at least not name terms. Thus riksmål and moderate bokmål (since this is not used by the Official Languages Council sticks to radical forms) will be an immeasurably [http://www.ordnett.no/ordbok.html?search=forfordele&search_type=&publications=23 (in meaning 1)] by moving from uk to nb.

By this claim, I could say that Persian is just Persian, not Dari, with no dialect problems. By the way, an ask.com discussion (I lost its full URL) says that at some point "no" can also include a series of Sami languages, so if you'd love to keep no.wiki*, please include Sami right in those no.* wikis instead of creating them on Incubator, okay?!

'Wikimedia should not not make a change, which entails serious consequences just to satisfy a small minority' – The move does not apply to equate two equal languages. We're not Wikipedia for Norway, but Wikipedia in Norwegian (bokmål, nynorsk, riksmål others). Here we are not official bias of language. 90% of the population uses bokmål/riksmål and then mainly the moderate form. Thus the argument that no recording a domain which should also cover nynorsk correct, but weak. There are many varieties of German, but they just follow the new orthography (not Low German, Swiss German, Austrian German etc.). I think I see that dewiki being accused of discriminating against them.

"not not" = just do it, and please be aware that Swiss German has dialects too, and alswiki contain 4 of em. It would be in case that eswiki will face-to-face dialects of es in many countries problems by the same claim (still, which dialect/orthography the Spanish-Sites are following?).

'Definition Power' – The impact of vertical search engines: the 'relative position' of nn will be improved (ergo people will choose nn articles instead of nb articles to a greater extent). Thus nn acquires more definition power (is it called no:Vedavågen or Veavågen, etc.)

still, the SEO bugs

(added after poll) 'External links on paper'. There is no one who knows how many (permanent) links are currently in print and, after the proposed five-year period, will no longer point to that content.
(added after poll) 'External links in papers'. After a move, (permanent) links will no longer be the same. After five years they will probably not point to the correct content. This is unfortunate from an academic perspective (hampers reference checking).

copied from my T172035#3613648 above

Regarding fixing codes on paper materials, I would suggest getting help from some (mainly Chinese-made) products: "Correction Tapes (修正带)" or "Correction Fluid (涂改液)".

Or, as deryck said above, this doesn't concern the vast majority of requests, because you can just reprint them after the domain renaming; that only wastes a little ink in the ink cartridge or selenium in the toner cartridge.

(added after poll, 2011-08-14) 'ownership' to no.wikipedia.org gained by everyone who has helped in no.wikipedia.org on the articles per currently exist. Each of the users has mixed his work into these articles and it will in practice mean that every one of all the user needs to be requested and accepting delivery of this landed right. Presumably, only a Board decision in Wikimedia Foundation that can move the project since it is they who own the domain formally.

@Mdennis-WMF Is this really so? Is there a hold-up from the WMF Board?

(added after poll, 2011-08-14) 'W3C' strongly recommend that you do 'NOT' modify URLs.

W3C also suggests that browsers support audio track selection, (likely video too), MPEG-4 ASP, H.265 (surveillance-controller?)... but are those all actually required to be supported in today's browser market?
On the other hand, IETF RFC 1035 seems to suggest renaming within a necessary period.

(added after poll, 2011-08-14) Adding riksmål under bokmål is very wrong when bokmål covers almost the entire Norwegian written language, even the many nynorsk forms. Riksmål has a unique hundred-year history with many of the leading cultural bearers of the Norwegian language. To illustrate how remote bokmål may be from riksmål, we can take this example

You said many times that nowiki articles are riksmål, but now you contradict YOURSELF, psst.

Please continue discussions about the merits of renaming particular
languages' / language groups' sites on their respective tasks. This thread
is about the issues blocking the actual act of renaming.

Liuxinyu970226 added a comment. Edited Oct 8 2017, 12:25 PM

Please continue discussions about the merits of renaming particular
languages' / language groups' sites on their respective tasks. This thread
is about the issues blocking the actual act of renaming.

Well, I'm just pointing out some common advantages of renaming domains; nothing above is specific to one language.

C933103 added a comment. Edited Dec 9 2017, 12:59 PM

Is it possible to do the renaming for those wikis in their current state first, and then deal with whatever bugs appear after the renaming is done? As mentioned by others, it's been almost a decade since the issue was <del>raised</del> submitted, and the longer it drags on, the more legacy issues there will be to deal with (CX didn't even exist back in the day). Things like CX would be broken but those seem to be less important.

Edit: Alternatively, how about closing a project into incubator and then reopen it with the desired language code immediately?

@C933103:

Is it possible to do the renaming task for those wikis as of current status first

If that were possible, we wouldn't need this task; we could just rename those wikis, even if that resulted in a large number of database conflicts.

and then deal with whatever bugs that would appear after the renaming is to be done?

(the main topic of this task)

As mentioned by others, it's been almost a decade since the issue was raised

Those "raising" actions are illegal, please see Bug management/Phabricator etiquette, especially:
Report status and priority fields summarize and reflect reality and do not cause it. Read about the meaning of the Priority field values and, when in doubt, do not change them, but add a comment suggesting the change and convincing reasons for it.

and there will be more and more legacy issues to deal with the longer it drags on (CX didn't even exist back in the day).

A full list of those things is needed in order to investigate.

Things like CX would be broken but those seem to be less important.

So if we don't work on those tasks before e.g. 2050, then which "broken" is much more inappropriate to you?

Edit: Alternatively, how about closing a project into incubator and then reopen it with the desired language code immediately?

I was suggesting that as an alternative approach to T25216, but @Verdy_p doesn't agree with it, and still suggests doing something with CNAME (which is also stuck per T133548).

T133548 is about certificates for HTTPS. If we create a CNAME from one project subdomain to another, it should have no impact if the certificate covers not just specific subdomains but everything under the parent domain (e.g. wikipedia.org): there are hundreds of subdomains and having one certificate for each is costly.
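For reference, a sketch of how a wildcard name in a certificate is usually matched against a hostname: the "*" stands for exactly one left-most label, so *.wikipedia.org covers every direct subdomain but neither the bare parent domain nor deeper names. The function name and the sample hostnames are illustrative assumptions; how Wikimedia's certificates are actually issued is not stated in this thread.

```python
def wildcard_matches(pattern, hostname):
    """Check a certificate name such as '*.wikipedia.org' against a hostname.

    Simplified sketch: the wildcard may only be the whole left-most label
    and matches exactly one label.
    """
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    # A wildcard never spans several labels, so the label counts must agree.
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        return h_labels[0] != "" and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels
```

With this rule, both the old and the new subdomain of a renamed wiki fall under the same wildcard name, while any name with an extra label (the sample be-tarask.m.wikipedia.org) would need its own entry.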

But maybe there are reasons I don't know of why you don't want to use "wildcards" to allow all subdomains (e.g. you have subdomains actually delegated to and managed by third parties with their own servers and administrators, though in my opinion those should not be within *.wikipedia.org but only in *.wikimedia.org; or you want to create distinct subsets of authoritative DNS for several languages to manage deployments in your server farms; or there are legal reasons, if some Wikipedias follow different copyright policies and need separate trust relationships with separate certificates; or there are international issues with some countries that want to block some subdomains).

But maybe you could then just renew each certificate, including in it both the old CNAME'd subdomain and the new one. Finally, you may not want CNAMEs if they cause issues in your front caching proxies (duplicate cache storage for what is actually the same page from two distinct subdomains).

Another reason is possibly that the configuration of your web servers does not accept different "Host:" values in HTTP/HTTPS requests, as your servers are actually used to host multiple virtual web servers and need the header to know which site to render.

The advantage of a CNAME is that it does not force clients to perform successive requests to follow a redirect (by HTTP result code at best, or by JavaScript at worst). How were subdomains renamed and aliased in the past?

I'm just curious why T133548 is then also blocking the CNAME aliasing.

C933103 added a comment. Edited Dec 9 2017, 5:02 PM

As mentioned by others, it's been almost a decade since the issue was raised

Those "raising" actions are illegal, please see Bug management/Phabricator etiquette, especially:
Report status and priority fields summarize and reflect reality and do not cause it. Read about the meaning of the Priority field values and, when in doubt, do not change them, but add a comment suggesting the change and convincing reasons for it.

Sorry, a better term would be, "since the issue was submitted".

Things like CX would be broken but those seem to be less important.

So if we don't work on those tasks before e.g. 2050, then which "broken" is much more inappropriate to you?

(There is no workable MTL for yue/nan/etc. anyway, so CX serves little purpose there; it's probably only as important as any random widget, but I can't speak for others about which is more important.) The langlink query issues seem to affect more things, but their impact can only be determined based on how much further the reliance on Wikidata increases. As for other potentially undiscovered problems, there is probably no way to know about them without first exposing them?

Liuxinyu970226 added a comment. Edited Dec 15 2017, 2:12 PM

@Verdy_p:

The advantage of CNAME is that it does not force clients to perform successive requests to follow a redirect (by HTTP result code at best, or by Javascript at worst): How were subdomains renamed and aliased in the past?

But it also has a disadvantage: at the least, that method will still block the creation of Narom content under the formal Wp/nrm/*, and so we would have to "set up a temporary code", which I, at least, don't want to see. (Please believe me: until we have time to do a FULL export and import of db dumps from the nrmwiki dbname to the nrfwiki dbname ON THE ACTUAL s3 machine, we don't yet know how to unlock it, as we don't have a whitelist mechanism in WikimediaIncubator.class.php;c34aeead7cadeedf906247512ea76ec6eacba73a$386.)

Verdy_p added a comment. Edited Dec 15 2017, 2:47 PM

Narom is another unrelated problem: "nrm" was an incorrect code for Norman and no aliasing between "nrm" and "nrf" should be kept if we want to have Narom contents.
So it's impossible to preserve the links to Norman contents except transitionally (but this must have an end).
Of course "nrm" must be completely freed to leave space to Narom (however we've still not seen for now any interested community in creating contents for Narom)
And subdomains is not really a blocker for creating contents in Incubator (where the language would only be a path).

All past contents on Incubator for Norman (previously using "nrm") can very easily be moved to "nrf". The issue is elsewhere: in templates still assuming "nrm" means Norman, in interwiki links, in Wikidata, and in Translatewiki.net: these must all be changed to free the incorrect "nrm" code they use or assume. And this work cannot be done only by Norman contributors (who don't care about Narom) or by Narom contributors (who don't care about Norman): it has to be done administratively (after approval by the language committee) and taken on by contributors in any Wikimedia projects working proactively on this cleanup.

As long as we don't do that, the only way to create Narom contents in Incubator (or later for a localized wiki) will be to use a private use code (such as "x-narom"), meaning that Narom will not be considered a plain language equal to others even if it has equal treatment in ISO 639-3. Such private use code may be acceptable in Incubator (Wp/x-narom, but there may be issues with Incubator templates to recognize it as a valid language code), but certainly not for a localized wiki project and also not for preparing the core UI translations needed in Translatewiki.net that will be needed first before a Wp project is created (and it would not avoid the later needed renaming to use the standard code).

Note: there's an alternative: ISO 639 approved the "nrm" code for Narom but as far as I know it has never been really used. Narom could be allocated another distinctive ISO 639-3 code by ISO, which could then deprecate "nrm" (this has already occurred for other languages). If "nrm" is deprecated, then it could remain associated with Norman in Wikimedia (note that "nrf" is really a language code assigned for Continental Norman, and not the 3 major variants for France, Jersey, Guernsey, and Jersey may also request its own distinctive code for its official language; in that case "nrm" would be a legacy macrolanguage, but deprecated, and another standard code could be assigned to the new macrolanguage, leaving "nrm" deprecated and used privately by Wikimedia).

However Wikimedia should not pressure the ISO working group like this: it has to do its own local job of cleaning up the situation. For now the use of "nrm" as a subdomain for Norman is conforming (subdomains are not restricted to being language codes), just like the Wikimedia interwiki prefixes. But we know that these Wikimedia interwiki prefixes and domain name labels are not always language codes (e.g. "simple" would then be "en-x-simple" if Wikimedia really conformed to the BCP 47 standard for these interwiki prefixes and domain names for its own localization purposes).
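As a small illustration of the private-use mechanism discussed above: in BCP 47, the singleton subtag "x" starts a private-use sequence, so everything after it lies outside the standard registry. The sketch below only splits a tag at that point. It does not validate subtags against the IANA registry, and the function name is made up for this example.

```python
def split_private_use(tag):
    """Split a language tag into its standard part and its private-use subtags."""
    subtags = tag.lower().split("-")
    if "x" in subtags:
        i = subtags.index("x")
        # Everything before the 'x' singleton is the standard part;
        # everything after it is private use.
        return "-".join(subtags[:i]), subtags[i + 1:]
    return "-".join(subtags), []
```

For "en-x-simple" this yields the standard part "en" plus the private subtag "simple", whereas "x-narom" has no standard part at all, which is why such a code would not put Narom on an equal footing with languages that have a plain code.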

Liuxinyu970226 added a comment. Edited Thu, Jan 18, 1:13 PM

@Verdy_p

Narom is another unrelated problem

It is the same problem: you are still suggesting a "temporary deprecation", which won't help that Incubator link either.

All past contents on Incubator for Norman (previously using "nrm") can very easily be moved to "nrf"... in Wikidata, and in Translatewiki.net: these must all be changed to free the incorrect "nrm" code they use or assume.

So leave the SIL page as just "Guernésiais" + "Jèrriais", without "Norman"? What is the benefit of playing a double-standard game?

meaning that Narom will not be considered a plain language equal to others

Then why don't you consider creating separate wikis for Guernésiais as nrf-gg.wikipedia.org and Jèrriais as nrf-je.wikipedia.org? Still, you're applying a double standard.

ISO 639 approved the "nrm" code for Narom but as far as I know it has never been really used.

Then which language is https://incubator.wikimedia.org/wiki/Wt/nrm/fa%CA%94 using?

Narom could be allocated by ISO another distinctive ISO 639-3 code and could then deprecate "nrm" (this has already occurred for other languages).

Changing the ISO codes themselves is simply impossible. If that were possible, we could ask for Wawa (www) to be re-assigned to e.g. wwe, so that there would be no conflict between the potential different usages of "www" (but then you would have to distinguish between World Wrestling Entertainment and Wawa).

If "nrm" is deprecated, then it could remain associated to Norman in Wikimedia (note that "nrf" is really a language code assigned for Continental Norman, and not the 3 major variants for France, Jersey, Guernsey, and Jersey may also request its own distinctive code for its official language; in that case "nrm" would be a legacy macrolanguage, but deprecated, and another standard code could be assigned to the new macrolanguage, leaving "nrm" deprecated and used privately by Wikimedia).

Jèrriais - Language Status: 8a (Moribund) < Narom - Language Status: 6b (Threatened). Still a good comment?

However Wikimedia should not pressure the ISO working group like this

[self-published source?]

subdomains are not restricted to be language codes

[clarification needed] [examples needed] [according to whom?] [by whom?] [dubious] [neutrality is disputed] [vague] and [needs update]

But we know that these Wikimedia interwiki prefixes and domain name labels are not always language codes

[when defined as?] [weasel words] and [not in citation given]

"simple" would then be "en-x-simple" if Wikimedia really conformed to BCP47 standard for these interwiki prefixes and domain names for its own localization purposes

No need to add "-x", since "en-simple" already conforms to your standard.