
RelatedArticles constructs an unofficial page URL instead of amending an official one
Closed, Resolved | Public | Bug Report

Description

When you visit any Wikipedia article using the Minerva skin and look at the related articles, their links have URLs like https://en.wikipedia.org/w/index.php?title=Damian_Fernando&wprov=rarw1 instead of using the regular article path. This is because RelatedArticles takes the page name and constructs a URL using mw.util.getUrl with the wprov parameter, instead of taking the page URL and appending the wprov parameter to it.
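The difference between the two approaches can be sketched roughly as follows. This is a simplified stand-in and not the real mw.util.getUrl: the name getUrlSketch is hypothetical, and the /wiki/$1 and /w/index.php paths are assumptions modelled on the English Wikipedia URL quoted above.

```javascript
// Simplified model of the behaviour described above; NOT the real mw.util.getUrl.
// ARTICLE_PATH and SCRIPT_PATH are assumed values matching English Wikipedia.
const ARTICLE_PATH = '/wiki/$1';
const SCRIPT_PATH = '/w/index.php';

// Hypothetical stand-in: article path without params, script path with params.
function getUrlSketch( title, params = {} ) {
	const dbKey = title.replace( / /g, '_' );
	if ( Object.keys( params ).length === 0 ) {
		// No query parameters: use the regular article path.
		return ARTICLE_PATH.replace( '$1', encodeURIComponent( dbKey ) );
	}
	// Query parameters present: build the URL on the script path instead.
	const query = new URLSearchParams( { title: dbKey, ...params } );
	return SCRIPT_PATH + '?' + query.toString();
}

// What this task describes: wprov is passed in as a query parameter.
const constructed = getUrlSketch( 'Damian Fernando', { wprov: 'rarw1' } );
// → /w/index.php?title=Damian_Fernando&wprov=rarw1

// What this task proposes: take the article URL first, then append wprov.
const appended = getUrlSketch( 'Damian Fernando' ) + '?wprov=rarw1';
// → /wiki/Damian_Fernando?wprov=rarw1
```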

In T276737#11838460, where I also explained in more detail the minor inconvenience that prompted me to make this task, Krinkle talked about how every other user of the wprov parameter does it by taking an official URL and appending the parameter to it, which seems to be true from a cursory glance at Codesearch. I'm opening this task to separate these concerns.

Event Timeline

Change #1275482 had a related patch set uploaded (by KockaAdmiralac; author: KockaAdmiralac):

[mediawiki/extensions/RelatedArticles@master] Use canonical URL for article links.

https://gerrit.wikimedia.org/r/1275482

Jdlrobson-WMF subscribed.

This seems like a bug with mw.util.getUrl or a misuse of that function.

We should not be introducing our own URL parsing code in RelatedArticles IMO...

As Krinkle pointed out in T276737 ("mw.util.getUrl using params should use short urls if they exist"), it's not a bug with mw.util.getUrl. It is indeed supposed to return paths relative to $wgScript when query parameters are passed, which ensures the generated URL is excluded by robots (in the Wikimedia environment). Other components that generate URLs with wprov in the same way that my change proposes are:

None of them, as far as I can tell, use mw.util.getUrl aside from RelatedArticles.

Thanks @KockaAdmiralac for the link. @egardner, could Readers Growth take a closer look at this, please, since ReaderExperiments is already using this? In general, it seems like we should have a more standard way of adding wprov. @EBernhardson @dcausse?

@matmarex encountered similar issues and suggested a change in how we handle wprov when arriving on the landing page (see https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/1267282). I haven't found a way to reproduce this, but I did not look closely at what happens with the RelatedArticles extension. I'm all for more standardisation on this.

@KockaAdmiralac's change makes sense to me.

It's also possible to do approximately the same thing in code that can't make an API query, like this:

const url = new URL( mw.util.getUrl( title ), location.href );
url.searchParams.set( 'wprov', 'rarw1' );
doStuff( url.toString() );

We could just declare that this is the standard way. You could maybe add a helper function to wrap this somewhere, but a) it's just 2.5 lines of code b) wprov is a somewhat WMF-specific thing and I'm not sure if such a helper belongs in core.
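If such a wrapper were wanted, it might look something like the sketch below. The name appendWprov and its parameters are hypothetical and do not exist in core or in any extension. To keep the sketch runnable outside a MediaWiki page, the article URL and the base URL are passed in as arguments; in the browser they would come from mw.util.getUrl( title ) and location.href, as in the snippet above.

```javascript
// Hypothetical helper; the name appendWprov and its parameters are illustrative.
// articleUrl: an article URL or path (in a MediaWiki page, mw.util.getUrl( title ) ).
// base: the URL used to resolve relative paths (in a browser, location.href).
function appendWprov( articleUrl, wprov, base ) {
	const url = new URL( articleUrl, base );
	// set() replaces any existing wprov value instead of adding a duplicate.
	url.searchParams.set( 'wprov', wprov );
	return url.toString();
}

const link = appendWprov(
	'/wiki/Damian_Fernando',
	'rarw1',
	'https://en.wikipedia.org/wiki/Main_Page'
);
// → https://en.wikipedia.org/wiki/Damian_Fernando?wprov=rarw1
```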

In PHP code you can use wfAppendQuery().

All that said, while this would be IMO the preferred way of constructing URLs with this param, you don't always control how they are constructed. wprov could be coming from an <input type="hidden"> in a form, or from a &returntoquery=wprov=... URL parameter to some special page. So we should still handle it when it's provided differently, and generate a pretty URL when removing it.
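As a rough illustration of "generate a pretty URL when removing it", the sketch below strips wprov and rewrites the URL to the article path when title is the only parameter left. The function name stripWprov is hypothetical, the /w/index.php and /wiki/ paths are assumed English Wikipedia values, and this is not a description of what the WikimediaEvents change does.

```javascript
// Sketch only: assumes /w/index.php as the script path and /wiki/ as the
// article path prefix. stripWprov is a hypothetical name.
function stripWprov( href ) {
	const url = new URL( href );
	url.searchParams.delete( 'wprov' );

	const title = url.searchParams.get( 'title' );
	const remainingKeys = [ ...url.searchParams.keys() ];
	// If only "title" is left on the script path, switch to the article path.
	if ( url.pathname === '/w/index.php' && title !== null && remainingKeys.length === 1 ) {
		url.pathname = '/wiki/' + encodeURIComponent( title );
		url.search = '';
	}
	return url.toString();
}

// wprov arrived on a script-path URL (e.g. built from a form or returntoquery).
const pretty = stripWprov( 'https://en.wikipedia.org/w/index.php?title=Damian_Fernando&wprov=rarw1' );
// → https://en.wikipedia.org/wiki/Damian_Fernando

// Other parameters are present, so the script-path URL is kept.
const kept = stripWprov( 'https://en.wikipedia.org/w/index.php?title=Foo&action=history&wprov=rarw1' );
// → https://en.wikipedia.org/w/index.php?title=Foo&action=history
```

Real titles may need MediaWiki's own title encoding rather than plain encodeURIComponent, so treat the encoding step as a placeholder.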

Change #1275482 merged by jenkins-bot:

[mediawiki/extensions/RelatedArticles@master] Use canonical URL for article links.

https://gerrit.wikimedia.org/r/1275482