Provide an alternate mapping for bad words
Closed, Resolved (Public)

Description

The shortener will eventually start generating bad words in various languages (for example http://w.wiki/FU).

Filtering out bad words has a lot of problems:

  • Generating a list of all bad words for every language
  • Updating that list with yet-to-be-invented bad words
  • Existing bad words in the system can't be deleted without blacklisting the URL they point to
  • Deciding for the user which words are offensive, or if they will be inappropriate in a given context (e.g. the filter list might need to be stricter for use in schools)

One way to work around this would be to provide users with an alternative mapping if they click a button saying "this URL contains an offensive word".

We could introduce a new character not already in the map, e.g. underscore _. Any URL beginning with this character would use the existing mapping but in reverse order or shuffled, so w.wiki/rudeword becomes w.wiki/_Q6efgome. This second URL would silently resolve to the same destination as the rudeword URL without ever showing the rude word to the user.

The chance of both URLs being offensive words would be very small.
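
The prefix idea above can be sketched roughly as follows. This is only an illustration of the "reverse order" variant, not the code that was merged: the alphabet, the function names and the base-N scheme are assumptions made for the example, and the real shortener's character map may differ.

```python
# Hypothetical alphabet for illustration only; NOT the shortener's real map.
ALPHABET = "23456789abcdefghijkmnpqrstuvwxyz"
ALT_ALPHABET = ALPHABET[::-1]  # same characters, reverse order
ALT_PREFIX = "_"               # must not appear in ALPHABET


def _encode(n: int, alphabet: str) -> str:
    """Encode a non-negative integer ID as a base-len(alphabet) string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    base = len(alphabet)
    digits = []
    while True:
        n, rem = divmod(n, base)
        digits.append(alphabet[rem])
        if n == 0:
            break
    return "".join(reversed(digits))


def _decode(code: str, alphabet: str) -> int:
    """Inverse of _encode; raises ValueError on empty or unknown input."""
    if not code:
        raise ValueError("empty short code")
    base = len(alphabet)
    n = 0
    for ch in code:
        n = n * base + alphabet.index(ch)
    return n


def encode_id(n: int, alternate: bool = False) -> str:
    """Return the normal short code, or the prefixed alternative one."""
    if alternate:
        return ALT_PREFIX + _encode(n, ALT_ALPHABET)
    return _encode(n, ALPHABET)


def decode_code(code: str) -> int:
    """Map either form of short code back to the same numeric ID."""
    if code.startswith(ALT_PREFIX):
        return _decode(code[len(ALT_PREFIX):], ALT_ALPHABET)
    return _decode(code, ALPHABET)
```

Both forms decode to the same ID, so no second database row is needed in this sketch: `decode_code(encode_id(n))` and `decode_code(encode_id(n, alternate=True))` return the same `n`, and the prefix alone tells the decoder which alphabet to use.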

Event Timeline

Change 568680 had a related patch set uploaded (by Esanders; owner: Esanders):
[mediawiki/extensions/UrlShortener@master] Provide an alternative encoding for every ID using a fixed prefix

https://gerrit.wikimedia.org/r/568680

Change 568680 merged by jenkins-bot:
[mediawiki/extensions/UrlShortener@master] Provide an alternative encoding for every ID using a fixed prefix

https://gerrit.wikimedia.org/r/568680

Change 634937 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[operations/puppet@production] Add _ to the allowed list of short url characters

https://gerrit.wikimedia.org/r/634937

Change 634937 merged by Ema:
[operations/puppet@production] Add _ to the allowed list of short url characters

https://gerrit.wikimedia.org/r/634937

Ladsgroup assigned this task to Esanders.

This is done.