
Ignore unknown query parameters in canonical url
Closed, Duplicate (Public)

Description

Right now, if you go to a page (including a content page) with a query string, that query string is copied into the canonical URL. For example:

<link rel="canonical" href="http://en.wikipedia.org/wiki/Schadenfreude?cat" />

should be

<link rel="canonical" href="http://en.wikipedia.org/wiki/Schadenfreude" />

This has caused issues: search engines have indexed bad URLs (with people's names or libel threats in the query string) that should obviously never have gotten into their systems in the first place. When they later check whether such a URL 'exists', they get an answer of 'yes' (as far as they are concerned), because the canonical link repeats the query string.
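
As a rough illustration of the requested behavior (not MediaWiki's actual implementation), the sketch below builds a canonical URL that keeps only query parameters from an allow-list and drops everything else. The function name `canonical_url` and the contents of `KNOWN_PARAMS` are assumptions made for this example; the task does not say which parameters should count as "known".

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allow-list, chosen only for illustration.
# The task does not specify which parameters are "known".
KNOWN_PARAMS = frozenset({"title", "oldid"})


def canonical_url(url, known_params=KNOWN_PARAMS):
    """Return url with unknown query parameters and any fragment removed."""
    parts = urlsplit(url)
    # keep_blank_values=True so bare keys such as "?cat" are parsed
    # (as ("cat", "")) and can then be filtered out like any other key.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in known_params
    ]
    # An empty query string makes urlunsplit omit the "?" entirely.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


print(canonical_url("http://en.wikipedia.org/wiki/Schadenfreude?cat"))
# prints http://en.wikipedia.org/wiki/Schadenfreude
```

With an allow-list, a crawler that requests a URL carrying arbitrary text in the query string is pointed back at the clean page URL, instead of having the bad URL confirmed as canonical.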


Version: 1.24rc
Severity: normal

Details

Reference
bz65016

Event Timeline

bzimport raised the priority of this task from to Normal. Nov 22 2014, 3:08 AM
bzimport added a project: MediaWiki-General.
bzimport set Reference to bz65016.
bzimport added a subscriber: Unknown Object (MLST).
Krinkle renamed this task from ignore query string in canonical url to Ignore unknown query parameters in canonical url. Jun 18 2015, 12:42 AM
Krinkle set Security to None.
Krinkle removed a subscriber: wikibugs-l-list.