
Special:Linksearch should default to all protocols (protocol-less column for externallinks)
Open, Low, Public, Feature

Description

Special:Linksearch should default to all protocols instead of just http. This quirk is documented in the default MediaWiki:Linksearch-text. As it stands, this allows linkspammers to evade scrutiny by placing https links.

Example query: Special:Linksearch/*.spam.example.com should pick up the link to https://spam.example.com/blah that I placed in my personal sandbox, but at present I have to search Special:Linksearch/https://*.spam.example.com to find it.
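Until the default changes, the workaround described above amounts to running one search per protocol. A minimal Python sketch of that expansion follows; the helper name `linksearch_targets` and the two-entry protocol list are assumptions made for illustration, not the list of protocols a given wiki actually accepts.

```python
# Hypothetical helper: the protocol list is an assumption for illustration,
# not the set of protocols configured on any particular wiki.
ASSUMED_PROTOCOLS = ["http://", "https://"]


def linksearch_targets(pattern, protocols=ASSUMED_PROTOCOLS):
    """Return one Special:Linksearch page title per protocol.

    A pattern that already names a protocol is passed through unchanged.
    """
    if "://" in pattern:
        return ["Special:Linksearch/" + pattern]
    return ["Special:Linksearch/" + proto + pattern for proto in protocols]


targets = linksearch_targets("*.spam.example.com")
for title in targets:
    print(title)
```

Each resulting title still has to be visited and paged separately, which is exactly the inconvenience this task asks to remove.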


Version: unspecified
Severity: enhancement

Details

Reference
bz12810

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 10:06 PM
bzimport set Reference to bz12810.
bzimport added a subscriber: Unknown Object (MLST).

Currently you'd have to do separate queries for every possible protocol, and paging the list wouldn't be cleanly possible.

To work cleanly, another index field would have to be added to the externallinks table which doesn't include the protocol.
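As a rough illustration of what such a protocol-less index value could look like, here is a Python sketch. The function name `protocolless_index_key` is hypothetical, and reversing the host labels (so that a `*.example.com` wildcard becomes a simple prefix match) is an assumption about how the key might be laid out, not a description of the actual externallinks schema. Ports and fragments are ignored for brevity.

```python
from urllib.parse import urlsplit


def protocolless_index_key(url):
    """Build a lookup key that is identical for http and https links.

    Assumed layout: reversed host labels, a trailing dot, then path and query.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # "spam.example.com" -> "com.example.spam."
    reversed_host = ".".join(reversed(host.split("."))) + "."
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return reversed_host + path


key_https = protocolless_index_key("https://spam.example.com/blah")
key_http = protocolless_index_key("http://spam.example.com/blah")
print(key_https)
print(key_https == key_http)
```

With a key like this, a single indexed prefix scan on `com.example.spam.` would cover every protocol at once, so paging would work over one result set instead of several.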

The extension is now part of MediaWiki core (1.14alpha), so changing product and component.

In fact, when pagination occurs, even if you place a domain without protocol, pagination links convert the domain to http:// automatically. For example, try following the next page at https://www.mediawiki.org/wiki/Special:LinkSearch/commons.wikimedia.org and see how http:// is added to the domain in the search input field.

It would be really nice if this got a bit of love. It is 13 years old, and with so many links being https by default it is becoming a festering bedsore of a problem.

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:01 AM

Please fix. If you can really only handle one protocol by default, it should probably be https://

@Certes: Please feel free to provide a patch if you'd like to get things closer to getting fixed. Thanks.

As a note, it is trivial to work around the protocol issue in link search by using Special:Search with insource: and a regex. For example, a query with insource:theguardian insource:/theguardian\.com/ returns about 154k pages, divided into 147k HTTPS links and 17k plain HTTP links.
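The reason the regex workaround sidesteps the protocol problem is that the pattern never mentions a protocol at all. The Python sketch below shows this with the same pattern; note that the wiki search engine's regex dialect is not Python's `re`, and the sample URLs are made up for illustration.

```python
import re

# Same pattern as in the insource: query above; the escaped dot matches a literal ".".
pattern = re.compile(r"theguardian\.com")

# Made-up sample links covering several protocol forms.
samples = [
    "https://www.theguardian.com/world",
    "http://www.theguardian.com/uk",
    "//theguardian.com/x",
    "https://theguardianXcom/",
]

matches = [bool(pattern.search(s)) for s in samples]
for sample, matched in zip(samples, matches):
    print(sample, matched)
```

The trade-off is that this searches page wikitext rather than the externallinks table, so it will miss links generated by templates and may match plain text that is not a link.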