Clean up "easter egg" short URLs before extension goes live
Closed, Declined · Public

Description

At time of writing, w.wiki/b redirects to the Hungarian Wikipedia article for Budapest, w.wiki/m redirects to the English Wikipedia article for Mexico City, and w.wiki/q redirects to the English Wikipedia article "Queer" (in the LGBT+ sense). This strikes me as odd, because I would have thought it would be much more appropriate for those to redirect to Wikibooks, Meta-Wiki and Wikiquote respectively to match the interwiki table.

Since the extension is still yet to be activated for everyone, I think it would be appropriate to prune the single-character URLs on or before 11 April so that it remains possible for the WMF to use most of them for new projects and initiatives in future without breaking a lot of incoming links.


Possible issues with the "Easter egg" links

  • The links may be perceived to have symbolic value regardless of the original intention, and they may appear to be reflective of strong systemic biases (e.g. under-representation of women, over-representation of Europeans)
  • The one-character Easter eggs have already been archived to the Wayback Machine, so there is not as much value in keeping them active just for the sake of it (the information will not be lost if the links are changed)
  • The relevance of most of the Easter eggs will diminish as time passes, and they may potentially become viewed as a conspicuous sign of the software's age
  • Some of the links are clearly presented as having a purpose (convenient access to important WMF projects/initiatives), whereas others are random
  • People may unsuccessfully try to use these links to get to Wikibooks and the five other projects which don't have one-character links (Wikiquote, Wikispecies, Incubator, Meta, MediaWiki)
  • It may be in WMF and community interest to use one-character strings for Wikibooks, etc., as well as for future projects/initiatives (possible branding/outreach benefits)
  • It will be more disruptive to change the links after the extension is enabled, because incoming links will be broken
  • News sites and blogs, particularly those which are tech- or Internet-focused, often cover Easter eggs of other popular websites (e.g. Google), so if someone finds the Easter eggs then there may be coverage which closely examines these links
  • Wikipedia readers who use w.wiki/w would easily discover the Easter eggs, so they could quickly become highly visible outside the Wikimedia community

Possible methods of removing or replacing Easter eggs

  • Software removal – stewards can delete links, but cannot change them
  • Database entry removal – changing links in the database results in severe caching issues
  • Database wipe – [unconfirmed]
  • Reinstallation of Extension:UrlShortener – [unconfirmed]

Possible outcomes

  • Easter eggs are kept; ticket is closed with no further resolution
  • Easter eggs are removed by stewards and/or WMF staff, but cannot be reused in future without technical changes
  • Easter eggs are manually changed in the database, causing caching issues
  • (other)

Details

Related Gerrit Patches:
mediawiki/extensions/UrlShortener : master · Add URL creation script

Event Timeline

Base awarded a token. Apr 9 2019, 6:31 AM
Jc86035 added a comment. (Edited) Apr 9 2019, 6:47 AM

This discussion indicates that changing URLs directly in the database causes severe caching issues, but is the only way to change existing URLs. This may contradict the information given on the main information page, which indicates that stewards have the ability to delete all links. Clarification would be appreciated.

I tested the short URLs for all printable ASCII characters:

char          ASCII  HTTP  target
space         20     404
!             21     404
"             22     404
#             23     404
$             24     301   https://donate.wikimedia.org/
%             25     404
&             26     404
'             27     404
(             28     404
)             29     404
*             2A     404
+             2B     404
,             2C     404
-             2D     404
.             2E     404
/             2F     400
0             30     404
1             31     404
2             32     404
3             33     301   https://www.wikimedia.org/
4             34     301   https://www.wikidata.org/wiki/Q42
5             35     301   https://en.wikipedia.org/wiki/Wikipedia:Five_pillars
6             36     301   https://phabricator.wikimedia.org/T183647#3871427
7             37     301   https://www.wikidata.org/wiki/Lexeme:L7
8             38     301   https://www.wikidata.org/wiki/Q8
9             39     301   https://phabricator.wikimedia.org/T44085
:             3A     404
;             3B     404
<             3C     404
=             3D     404
>             3E     404
?             3F     404
@             40     404
A             41     301   https://fa.wikipedia.org/wiki/%D8%A2%D9%84%D9%86_%D8%AA%D9%88%D8%B1%DB%8C%D9%86%DA%AF
B             42     301   https://de.wikipedia.org/wiki/Bier
C             43     301   https://fr.wikipedia.org/wiki/Croissant_(viennoiserie)
D             44     301   https://en.wikipedia.org/wiki/Darth_Vader
E             45     301   https://en.wikipedia.org/wiki/Easter_egg_(media)
F             46     301   https://en.wikipedia.org/wiki/FOSS
G             47     301   https://pl.wikipedia.org/wiki/Gda%C5%84sk
H             48     301   https://he.wikipedia.org/wiki/%D7%97%D7%99%D7%A4%D7%94
I             49     404
J             4A     301   https://he.wikipedia.org/wiki/%D7%99%D7%A8%D7%95%D7%A9%D7%9C%D7%99%D7%9D
K             4B     301   https://ko.wikipedia.org/wiki/%EA%B9%80%EC%B9%98
L             4C     301   https://en.wikipedia.org/wiki/LGBT
M             4D     301   https://en.wikipedia.org/wiki/MediaWiki
N             4E     301   https://en.wikipedia.org/wiki/NetHack
O             4F     404
P             50     301   https://en.wikipedia.org/wiki/Jean-Luc_Picard
Q             51     301   https://www.wikidata.org/wiki/Help:Items
R             52     301   https://en.wikipedia.org/wiki/Dennis_Ritchie
S             53     301   https://sv.wikipedia.org/wiki/Stockholm
T             54     301   https://zh.wikipedia.org/wiki/%E8%87%BA%E5%8C%97%E5%B8%82
U             55     301   https://en.wikipedia.org/wiki/URL_shortening
V             56     301   https://en.wikipedia.org/wiki/V_for_Vendetta
W             57     301   https://en.wikipedia.org/wiki/Wikipedia
X             58     301   https://en.wikipedia.org/wiki/42
Y             59     301   https://en.wiktionary.org/wiki/why
Z             5A     301   https://de.wikipedia.org/wiki/Z%C3%BCrich
[             5B     404
\             5C     400
]             5D     404
^             5E     404
_             5F     404
`             60     404
a             61     301   https://en.wikipedia.org/wiki/Alan_Turing
b             62     301   https://hu.wikipedia.org/wiki/Budapest
c             63     301   https://commons.wikimedia.org/
d             64     301   https://www.wikidata.org/
e             65     301   https://en.wikipedia.org/wiki/Easter_egg
f             66     301   https://en.wikipedia.org/wiki/Free_software
g             67     301   https://gerrit.wikimedia.org/
h             68     301   https://en.wikipedia.org/wiki/Planck_constant
i             69     301   https://en.wikipedia.org/wiki/Wikipedia:Ignore_all_rules
j             6A     301   https://en.wikipedia.org/wiki/Jabberwocky
k             6B     301   https://en.wikipedia.org/wiki/Boltzmann_constant
l             6C     404
m             6D     301   https://en.wikipedia.org/wiki/Mexico_City
n             6E     301   https://www.wikinews.org/
o             6F     301   https://ores.wikimedia.org/
p             70     301   https://phabricator.wikimedia.org/
q             71     301   https://en.wikipedia.org/wiki/Queer
r             72     301   https://en.wikipedia.org/wiki/R_(programming_language)
s             73     301   https://wikisource.org/
t             74     301   https://www.wiktionary.org/
u             75     301   https://www.wikiversity.org/
v             76     301   https://www.wikivoyage.org/
w             77     301   https://www.wikipedia.org/
x             78     301   https://en.wikipedia.org/wiki/Project_Xanadu
y             79     301   https://hy.wikipedia.org/wiki/%D4%B5%D6%80%D6%87%D5%A1%D5%B6
z             7A     301   https://de.wikipedia.org/wiki/Konrad_Zuse
{             7B     404
vertical bar  7C     404
}             7D     404
~             7E     404

(The two 400 error codes, for / and \, were caused by making the requests through the Internet Archive. Normally they're 404s.)
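
For anyone who wants to repeat this kind of survey, here is a minimal Python sketch. It is not the method used for the table above (those requests went through the Internet Archive); the function names are made up for illustration, and the only thing taken from this task is the w.wiki host. It lists the 95 printable ASCII characters, percent-encodes each one into a path, and offers a helper that sends a single HEAD request without following redirects.

```python
import http.client
from urllib.parse import quote

HOST = "w.wiki"  # short-URL host named in this task


def printable_ascii():
    """Return the 95 printable ASCII characters, space (0x20) through ~ (0x7E)."""
    return [chr(code) for code in range(0x20, 0x7F)]


def short_url_path(char):
    """Percent-encode one character so it can be used as a URL path."""
    # safe="" makes quote() encode "/" as well; unreserved characters stay as-is.
    return "/" + quote(char, safe="")


def probe(char, host=HOST):
    """Send one HEAD request and return (status, Location header or None).

    http.client does not follow redirects, so a 301 is reported as a 301.
    This helper needs network access and is not called automatically.
    """
    conn = http.client.HTTPSConnection(host, timeout=10)
    try:
        conn.request("HEAD", short_url_path(char))
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling probe(char) for each character in printable_ascii() and printing ord(char) in hexadecimal next to the result would give rows in the same shape as the table. Results from direct requests may differ from archived ones, as the note about the two 400 codes shows.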

Quite frankly, you could probably write a whole newspaper article about how the "Easter eggs" alone are able to show a systemic bias towards Europe or the northern hemisphere or the English language or the English Wikipedia or males or white people or software developers or the 20th century or science fiction. (There are no women in this list! For reference, the Wikidata gender gap is at about 82:18.) Alone, I don't think they're terrible choices for Easter eggs, but if it's technically feasible I do think reserving these for other purposes would be better.

Michael added a subscriber: Michael. Apr 9 2019, 8:28 AM

Copying my answer on Meta:
Just like Q items on Wikidata, it is possible to delete a link, but not to recreate it or reassign it to another URL. The first URLs were created in order to test the system before it becomes available to all users. The main point was to link to different wikis, to make sure that the feature works as expected. A new round of testing will fix the gender bias: more women have been included. Once the feature is deployed, everyone will be able to create short URLs and the first ones that were created won't be so important anymore.

Jc86035 added a comment. (Edited) Apr 9 2019, 9:26 AM

As a hypothetical idea, would it be prohibitively difficult to empty the database (or even reinstall the extension) and recreate the appropriate links, if it is desirable to remove the Easter eggs?

Jc86035 added a comment. (Edited) Apr 9 2019, 9:33 AM

I agree that this isn't extremely important, although the first "testing" URLs could be seen to have or to indicate certain symbolic value (hence my earlier comment proposing that it would be seen to reflect systemic biases). Similarly, the very lowest-numbered Wikidata items (universe, human, happiness, etc.) appear to have strong symbolic value and are reflective of the developers and their intentions, although those items might have a higher visibility than these URLs would.

If the intention is that the shortener will be used by readers (i.e. the general public), then I think to a certain extent it does matter how these are presented, in addition to my earlier concern about the potential use of these "shortest" URLs to link to new projects/initiatives. The links could potentially end up being more visible than Wikidata's first items, given how much "look at this funny trivia" coverage Easter eggs of other popular websites can get (e.g. List of Google Easter eggs).

Jc86035 updated the task description. Apr 9 2019, 9:59 AM
Restricted Application added a project: Internet-Archive. Apr 9 2019, 10:40 AM

As I've noted on the Meta talk page, a method should be found to set up memorable, 2- or 3-letter custom links for each Wikipedia (say /en for en.Wikipedia; /de for de.wp, /arz for arz.Wikipedia etc) and other projects.

Jc86035 updated the task description. Apr 9 2019, 2:17 PM

I'm going to defer to @Ladsgroup since he's taken over the main deployment, but I do think we went a bit overboard on the easter eggs. My initial plan was to have no more than 10 joke/easter egg style ones, and then allocate "b" -> wikibooks, "c" -> commons, etc.

The point is that I couldn't assign anything to "c" for Commons unless I had already assigned everything before it (around 25 URLs). So lots of them are not Easter eggs in that sense; they are tests to make sure everything works (for example, while deploying the first batch I realized mediawiki.org was not whitelisted). That's why we had such a diverse set of links (from Phabricator comments to articles in wikis with non-Latin scripts, not just articles on the English Wikipedia).

Jc86035 added a comment. (Edited) Apr 9 2019, 4:57 PM

@Ladsgroup Does the extension not allow the creation of custom links altogether? I'm confused by this, since it was evidently possible to create https://w.wiki/$.

No, it does not. $ is just another one of the characters that are used for the short code (the 58th, I believe, perhaps ±1), so to create that short code in particular, all the other one-character codes also had to be filled (because $ is the last character in the list).

Jc86035 added a comment. (Edited) Apr 9 2019, 5:09 PM

Out of interest, is there any particular reason $ was added to the list, and was there a reason 2 was omitted? (I can't find any documentation on how these parts of the extension work.)

If it is desirable to hide the "test" URLs, then I think the most practical solution at this point would be to delete the ones which don't redirect to project home pages using the interface, considering the current limits of the extension.

Just my two cents, but wouldn't it make sense to reserve all one-, two-, and three-letter URLs for future use? The extension could start the counter at four-letter codes (like 'aaaa' or something) and leave the shorter ones open. That would allow the single letter ones to be used for important things like projects, and the two- and three-letter ones to be used for language codes and such.

If there have only been a few generated so far, would it be too much trouble to clear the existing ones and start over?

Just my two cents, but wouldn't it make sense to reserve all one-, two-, and three-letter URLs for future use?

Yes, this looks like a good idea - start issuing actual short URLs from 4 characters, and permanently reserve 1-3 letter short URLs for internal use. Then we can decide what we do with them much more freely I think.

It's not possible. The characters are mapped from the table's auto_increment primary key. I can't jump over them and assign them later.

can't jump over them and assign them later.

Why not? AFAIK these columns are allowed to be non-contiguous (otherwise you couldn't delete things) and can easily be set to any value. From what I understand about auto_increment columns, if you create an element with value X, then the next one would be X+1:

After the auto-increment counter has been initialized, if you do not explicitly specify a value for an AUTO_INCREMENT column, InnoDB increments the counter and assigns the new value to the column. If you insert a row that explicitly specifies the column value, and the value is greater than the current counter value, the counter is set to the specified column value.

So if we create a value in that column which equals strlen($wgUrlShortenerIdSet)**3 then all following IDs would be 4 characters long. It can also be set manually by running something like ALTER TABLE tbl AUTO_INCREMENT = 100;

Am I reading the wrong manual?

Assigning later should also be possible by direct insert into the table, no?
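
The arithmetic behind the strlen($wgUrlShortenerIdSet)**3 figure can be sketched in Python. This assumes the short code is an ordinary positional (base-N) rendering of the row ID, which is what the comment above implies; the ALPHABET below is a stand-in and not the extension's real $wgUrlShortenerIdSet, and the function names are hypothetical.

```python
# Stand-in character set for illustration only; the real ID set differs.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def encode_id(n, alphabet=ALPHABET):
    """Render a non-negative integer ID as a base-len(alphabet) string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    base = len(alphabet)
    if n == 0:
        return alphabet[0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(alphabet[remainder])
    # Digits were produced least significant first, so reverse them.
    return "".join(reversed(digits))


def first_id_with_length(length, alphabet=ALPHABET):
    """Smallest ID whose code has exactly `length` characters (length >= 2)."""
    return len(alphabet) ** (length - 1)
```

Under these assumptions, every ID from first_id_with_length(4) upwards encodes to four or more characters, and the ID just below it is the last three-character code. That matches the suggestion to start the counter at the cube of the ID set's length.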

Well, it's not as easy as it looks. For example, you need to explicitly set the table to accept a PK on insert, which our DBAs might not like and which endangers the integrity of the data. Plus, all of this requires manual query execution on the production database (I would log into the x1 master and manually change things in production), which could be justified if we had an emergency, data corruption or other serious issues, not for Easter eggs.

for example you need to explicitly set the table to accept PK on insert

The link you provided talks about SQL Server. We're not using SQL Server AFAIK?

which our DBAs might not like and it endangers integrity of the data

How does it endanger the integrity of the data? It's just an initializer; what's the difference if the count starts with 1 or 1001?

Plus, all of this requires manual query execution on the production database

We can make a maintenance script for this. It's just a single DB INSERT statement, should be pretty easy. AFAIK there's no problem with maintenance scripts writing to the database, it's done every day.

Smalyshev added a comment. (Edited) Apr 9 2019, 7:34 PM

The script, excluding the boilerplate, should be roughly this:

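// Sketch only: $newId must hold the explicit ID to insert before this runs.
$dbw = self::getDB( DB_MASTER );
$url = 'https://meta.wikimedia.org/';
$rowData = [
	'usc_id' => $newId, // explicit primary key instead of the auto_increment value
	'usc_url' => $url,
	'usc_url_hash' => md5( $url )
];
// IGNORE makes the insert a no-op if it would collide with an existing key
$dbw->insert( 'urlshortcodes', $rowData, __METHOD__, [ 'IGNORE' ] );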

If you want to, I can make the whole script with all the boilerplate, shouldn't take long. BTW, can use same script to create easter eggs, if you'd like ;)

Just my two cents, but wouldn't it make sense to reserve all one-, two-, and three-letter URLs for future use?

Yes, this looks like a good idea - start issuing actual short URLs from 4 characters, and permanently reserve 1-3 letter short URLs for internal use. Then we can decide what we do with them much more freely I think.

I agree we should do this, otherwise we miss a huge opportunity for short urls here. Another (but hacky) possibility would be to modify the encoder to omit these IDs so we can decide what to do with them later on.

Krinkle removed a subscriber: Krinkle. Apr 9 2019, 10:41 PM

Do we really need to reserve 729,000 URLs for special use? Reserving 1-letter URLs makes sense, and maybe 2-letter, but 3-letter seems overly cautious, IMO.

I don't expect them all to be used. This is mostly if we wanted to make shortcuts for 3-letter language codes, like arz. But if this is not desired then doing just 2 letters is fine.

Change 502683 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/UrlShortener@master] Add URL creation script

https://gerrit.wikimedia.org/r/502683

This thread makes me cry. We are not opening a domain registry or something like this here. We are not going to install a bidding or voting process for the most wanted IDs. These IDs are effectively random. You get the next ID that is free the moment you use the service. And then it's gone. There is no going back. There is no reassigning, no veto, no discussion. That's not the purpose of a URL shortener. If you need a meaningful, self-explaining URL, then for heaven's sake use the original URL.

Get over it.

The best solution I can see at this point is to ignore this thread and continue as planned, with the easter egg URLs in place.

  • Ditching them would create the horrifying situation where people start running and fighting over the first few hundred short URLs, and whoever is able to run a bot fast enough will get all of them, resulting in an even more unbalanced situation.
  • It would be possible to change the code so that 1, 2, and 3 character shortcuts just do not exist.
  • Another idea is to change the code so it works more like YouTube, where every ID does have the same length, and they are random instead of sequential. But this rewrite would block the deployment of this highly wanted feature for an even longer time, if it is even possible.
Jc86035 added a comment. (Edited) Apr 10 2019, 7:48 AM

We are not opening a domain registry or something like this here.

If it is possible to patch it so that it can reserve shorter URLs for internal purposes (i.e. WMF developers set them), and a patch is already in progress, and multiple people have voiced support for this sort of change, would it still be preferable to retain the status quo?

Get over it.

Is there anything to get over? Multiple other users have stated that they would prefer that changes be made to address issues with the current implementation. I am not particularly offended by the choice of Easter eggs (in spite of the issues I have noted in the task description and in previous comments), but regardless, reserving all of the shortest URLs for links to projects would have usability benefits and would be in line with the current use of several of the existing URLs as typeable redirects to projects' home pages.

We can make a maintenance script for this. It's just a single DB INSERT statement, should be pretty easy. AFAIK there's no problem with maintenance scripts writing to the database, it's done every day.

According to @jcrespo at IRC: "you can jump over ids with no problem, what you cannot do is insert values with lower than the last insert id, if you are doing that you should not use autoinc"

can't jump over them and assign them later.

Why not? AFAIK these columns are allowed to be non-contiguous (otherwise you couldn't delete things)

Side note – deleting short URLs doesn’t delete the row, it just sets the “deleted” flag. (Otherwise you couldn’t undelete them.)

jcrespo removed a subscriber: jcrespo. Apr 10 2019, 7:54 AM
jrbs added a comment. (Edited) Apr 10 2019, 7:56 AM

Personally, while I think the easter egg URLs are generally cute, they have almost zero rhyme or reason right now so I'm not sure they should remain as they are. I like the idea of using two-letter codes for language projects though you obviously run into the issue of only being able to link to one project in that language that way (e.g. w.wiki/de would presumably go to de.wikipedia.org, not de.wiktionary.org, etc).

I would greatly appreciate people not using language like "Get over it" in an environment such as Phabricator - there's no need to cause drama or division over something as trivial as easter egg urls.

Edit: To clarify - I don't really care one way or the other if they remain baked in like this ;) I just think that if I was the one making the call, they probably would serve other functions, but I don't really know what those functions would be.

I like the idea of using two-letter codes for language projects though you obviously run into the issue of only being able to link to one project in that language that way (e.g. w.wiki/de would presumably go to de.wikipedia.org, not de.wiktionary.org, etc).

Perhaps some more patterns could be reserved, e.g. dewk (dewt?) for the languages of the various non-Encyclopedia projects.

Pigsonthewing added a comment. (Edited) Apr 10 2019, 8:31 AM

Do we really need to reserve 729,000 URLs for special use? Reserving 1-letter URLs makes sense, and maybe 2-letter, but 3-letter seems overly cautious, IMO.

If we don't do this, then a large number of potentially very useful three- (and possibly two-) letter codes are going to be burned up by people trying out the tool, using their favourite (or a random) article or their user page.

revi removed a subscriber: revi. Apr 10 2019, 9:02 AM

It is quite common that a software feature is developed for a certain purpose, but end users later find it helpful for more, or even entirely different, applications.

  • The original aim was to generate random codes, starting at one letter, with no special meaning.
  • It turns out that it is very useful to reserve 4 letter codes for mnemonic shortcuts like wikt or voya or whatever; leaving 2 and 3 letter codes for languages of Wikipedia as the most frequented WMF project, one letter codes like b c d v for (English) project entries, and 4 letter codes for various internal assignments not conflicting with language codes.
  • Random shorteners start at aaaaa and have no meaning at all. That length is quite common for short URLs and is not difficult to type, write on a piece of paper, or tell by phone.

If we don't do this, then a large number of potentially very useful three- (and possibly two-) letter codes are going to be burned up by people trying out the tool, using their favourite (or a random) article or their user page.

Thanks Andy for summarizing the issue, I feel like the core need expressed in this discussion lies here. Can we try to estimate how important this need is? I have the feeling that the need to have specific short URLs linking to projects appeared along the way, and was not part of the initial user story. Moreover, we already have short URLs linking to Wikimedia projects, for example enwp.org.

As a reminder, this feature was stalled for a long time in the depths of Phabricator, and thanks to a few people who started working actively on it again, we are now about to have it live. Is the need crucial enough to delay the deployment, rewrite the whole structure of the product to add the ability to assign URLs, and risk this great improvement falling into the "to do later" pile for a long time?

Jc86035 added a comment. (Edited) Apr 10 2019, 10:15 AM

I did consider creating a new task to address the reservation of URLs other than the 1-character short URLs which are currently assigned, since the task was originally only about changing the Easter eggs, but I decided not to since it would be disruptive to the conversation.

Addressing the original issue at this point in time, without reserving those short URLs, would only require the Easter eggs to be deleted/hidden, which is supposed to be possible in the interface. I think doing this for now, if that is desirable, would be appropriate; however, preventing two- and three-character URLs from being created would require a software change. I'm not familiar with the extension or most of the other software that it relies on, but my impression (having looked at the Gerrit commit) is that at first a software change preventing one-, two- and three-character short URLs from being created by end users would not be overly disruptive and could potentially occur without the deployment being delayed. The "convenience" links could then be enabled at a later date after discussion on what exactly they should point to; for me, at least, they are not absolutely crucial but would count as usability improvements.

From the technical side, reserving URLs is not possible. We could decide to start incrementing from URLs with 4 letters, but then the previous ones, those with 3 letters, would be unavailable for later use: it would not be possible to assign or reassign them later.

Is it impossible because it is prohibitively difficult to patch the code, or for some other technical reason? We know it's definitely possible in theory, because, well, existing URL shortening websites can create custom URLs, so we can't just claim that it's not possible (or not desirable).

That's not the purpose of a URL shortener.

Is the purpose of a URL shortener unequivocally defined somewhere? If it's useful, it's useful.

It is impossible because of the current structure of the database. Earlier in this ticket, a person from the DBA team confirmed this. More technical details can be provided if needed.

I'd like to go back to the main purpose of this feature: to provide a URL shortener to be used in the Wikimedia projects, because all the other URL shorteners are blacklisted for security reasons. For example, long links generated by the Wikidata Query Service. So the main goal of this URL shortener is to be used directly in the Wikimedia projects.

However, in the Wikimedia projects, different ways already exist to create links to other projects quickly: not only the existing short URLs like enwp.org, but also the templates [[d:...]], [[de:wikt:...]] etc. Adding more to the new URL shortener would not solve any problem that is not already covered by these existing methods.

At this point, I don't see any evidence that this request is important enough to consider completely rewriting the structure of the database and the code, potentially delaying for several months the release of this feature that plenty of people in the communities are looking forward to.

If adding new functionality (i.e. changing the database structure) is out of the question:

  • Would it be appropriate to temporarily increase the minimum character length, to leave open the possibility of creating the links to project home pages?
  • Otherwise, would it be appropriate to remove the Easter eggs using the extension's interface?
Smalyshev added a subscriber: jcrespo. (Edited) Apr 10 2019, 3:43 PM

According to @jcrespo at IRC: "you can jump over ids with no problem, what you cannot do is insert values with lower than the last insert id, if you are doing that you should not use autoinc"

This sounds strange, I checked multiple times on my setup and there's absolutely no problem back-inserting any values in MySQL column with ID smaller than last insert ID. Is this some peculiarity of our replication setup? @jcrespo could you comment on that?
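
The kind of local check described here can be sketched as follows. Note the substitution: this uses Python's built-in sqlite3 module and SQLite's AUTOINCREMENT, not MySQL/InnoDB, so it only illustrates the general counter behaviour on a single local database. It says nothing about replication, backups or the other operational concerns of the production setup, and the table and column names are made up.

```python
import sqlite3


def demo():
    """Insert rows with explicit and automatic IDs; return the IDs observed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE codes (id INTEGER PRIMARY KEY AUTOINCREMENT, url TEXT NOT NULL)"
    )
    # Jump the counter forward by inserting an explicit, higher ID.
    conn.execute(
        "INSERT INTO codes (id, url) VALUES (?, ?)",
        (1000, "https://example.org/boundary"),
    )
    # The next automatic ID continues after the explicit one.
    cur = conn.execute(
        "INSERT INTO codes (url) VALUES (?)", ("https://example.org/first",)
    )
    next_id = cur.lastrowid
    # Back-insert a row with an ID lower than the last inserted ID.
    conn.execute(
        "INSERT INTO codes (id, url) VALUES (?, ?)",
        (5, "https://example.org/back-filled"),
    )
    # The counter is not pulled back down by the lower explicit ID.
    cur = conn.execute(
        "INSERT INTO codes (url) VALUES (?)", ("https://example.org/second",)
    )
    after_backfill_id = cur.lastrowid
    ids = [row[0] for row in conn.execute("SELECT id FROM codes ORDER BY id")]
    conn.close()
    return next_id, after_backfill_id, ids
```

In this local SQLite sketch the back-insert succeeds and the automatic IDs keep increasing. Whether the same is safe on a replicated production database is a separate question that this example cannot answer.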

jcrespo removed a subscriber: jcrespo. (Edited) Apr 10 2019, 3:58 PM

Backup systems, schema changes, monitoring, distributed HA and replication automation and consistency, as well as other operational tasks may rely on monotonically increased auto_increment keys. Please do not subscribe me back (my subscription here was not a voluntary one, and I can't be involved in this ticket at the moment; use Wikimedia-Rdbms or DBA to bring it to the attention of someone else who may be able to help you).

Change 502683 abandoned by Smalyshev:
Add URL creation script

Reason:
jscrespo: Backup systems, schema changes, monitoring, distributed HA and replication automation and consistency, as well as other operational tasks may rely on monotonically increased auto_increment keys.

https://gerrit.wikimedia.org/r/502683

Reedy added a comment. Apr 10 2019, 4:05 PM

Do we really need to reserve 729,000 URLs for special use? Reserving 1-letter URLs makes sense, and maybe 2-letter, but 3-letter seems overly cautious, IMO.

If we don't do this, then a large number of potentially very useful three- (and possibly two-) letter codes are going to be burned up by people trying out the tool, using their favourite (or a random) article or their user page.

See also Q1-100 on Wikidata.

Sorry, not sorry.

jrbs added a comment. Apr 10 2019, 4:17 PM

I'd like to go back to the main purpose of this feature: to provide a URL shortener to be used in the Wikimedia projects, because all the other URL shorteners are blacklisted for security reasons. For example, long links generated by the Wikidata Query Service. So the main goal of this URL shortener is to be used directly in the Wikimedia projects.
However, in the Wikimedia projects, different ways already exist to create links to other projects in a quick way: not only the existing short URLs like enwp.org, but the templates [[d:...]], [[de:wikt:...]] etc. Adding more to the new URL shortener would not solve any problem that is not already covered by these existing methods.

I don't think enwp.org is futureproof though, is it? Do we know who owns it? (Apologies if we do and I'm just being silly)

This discrepancy in understanding (whether this is shortening wikimedia-related stuff or long urls in general) is quite large and doesn't seem to be shrinking :) With the obvious caveat that it's not much of my business, I am not sure how many people will actually be interested in the usecases you're referring to (shortening query URLs), but I suppose if that number is at least more than 1 it's worth doing.

kaldari added a comment. (Edited) Apr 10 2019, 4:26 PM

I'm with @thiemowmde. I originally filed the URL shortener bug, created the RFC, and have been following its progress for 7 years. At no time did I or anyone else list "reserved URLs that are easy to remember" as a requirement (AFAIK), and no one's really made a good argument about why they're needed. I say launch it as is and let people start using it.

I don't think enwp.org is futureproof though, is it? Do we know who owns it? (Apologies if we do and I'm just being silly)

See https://phabricator.wikimedia.org/T32861 and https://phabricator.wikimedia.org/T88859

Restricted Application removed a subscriber: Liuxinyu970226. Apr 11 2019, 9:04 AM

Since the task was declined and the extension has been deployed, I have been using a shell script to generate short URLs for all of the project homepages. It should be done in about two hours.

MJL awarded a token. Jun 5 2019, 6:42 AM