
When inserting link in VE, automatically convert referer links copied from Google search results into actual/proper URLs
Open, Low · Public · 1 Estimated Story Points

Description

To quote from T130506: Automatically convert referer links copied from Google search results into actual/proper URLs:

In dewiki (and enwiki) many users try to copy URLs from their Google searches directly into an article. Unfortunately those Google URLs, e.g.,

https://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0ahUKEwjci9-m6p3LAhVrIpoKHcf3BjQQFggdMAA&url=https%3A%2F%2Fwww.mouser.com%2Fpdfdocs%2FAND8424-D.PDF&usg=AFQjCNENO76mctNPySoFd_vrceHLGYtKZw

which actually leads to https://www.mouser.com/pdfdocs/AND8424-D.PDF

  • are tracking our users' browsing behaviour,
  • could be used to circumvent the spam-blacklist, see T34159
  • are ugly

That's why

\bgoogle\..*?\/url\?.*

is blocked globally at https://meta.wikimedia.org/wiki/Spam_blacklist.
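As an illustration of what this pattern catches, here is a minimal Python sketch that applies the expression on its own to the example URL above and to its target. The name `is_blocked` is hypothetical, and the sketch assumes the bare pattern is searched anywhere in the URL; the SpamBlacklist extension may combine entries and apply flags differently.

```python
import re

# The blacklist entry quoted above, used verbatim.
# Assumption: it is applied as an unanchored search over the URL text.
BLACKLIST_PATTERN = re.compile(r"\bgoogle\..*?\/url\?.*")


def is_blocked(url: str) -> bool:
    """Return True if the URL matches the quoted blacklist entry."""
    return BLACKLIST_PATTERN.search(url) is not None


google_link = (
    "https://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1"
    "&ved=0ahUKEwjci9-m6p3LAhVrIpoKHcf3BjQQFggdMAA"
    "&url=https%3A%2F%2Fwww.mouser.com%2Fpdfdocs%2FAND8424-D.PDF"
    "&usg=AFQjCNENO76mctNPySoFd_vrceHLGYtKZw"
)
target_link = "https://www.mouser.com/pdfdocs/AND8424-D.PDF"

print(is_blocked(google_link))
print(is_blocked(target_link))
```

The redirect link matches while the target does not, which is why saving the edit fails for users who paste the Google form of the link.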

This block on Meta irritates normal users (e.g. w:de:WP:FzW (German)). The edit filter (extension AbuseFilter) would not help here.

A better solution would be for the MediaWiki software itself to convert such Google URLs to the original URLs automatically. So if somebody adds a URL like ...google.de/url... it will be replaced by ...www.example.org/original_url...
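The conversion described here could look roughly like the following Python sketch. It only illustrates the logic; code in VE itself would be JavaScript. The function name `unwrap_google_redirect` and the host check are assumptions, and only the `url` parameter seen in the example above is handled.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Assumption: a loose check for "google.<tld>" hosts such as www.google.de.
# A real implementation would probably want a stricter list of domains.
_GOOGLE_HOST = re.compile(r"(^|\.)google\.[a-z.]+$")


def unwrap_google_redirect(url: str) -> str:
    """Return the target of a Google /url redirect link, else the input."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme not in ("http", "https"):
        return url
    if not _GOOGLE_HOST.search(host) or parts.path != "/url":
        return url
    # parse_qs percent-decodes values, so %3A%2F%2F becomes ://
    targets = parse_qs(parts.query).get("url", [])
    if not targets:
        return url
    target = targets[0]
    # Only accept web links as the replacement.
    if not target.startswith(("http://", "https://")):
        return url
    return target


example = (
    "https://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1"
    "&ved=0ahUKEwjci9-m6p3LAhVrIpoKHcf3BjQQFggdMAA"
    "&url=https%3A%2F%2Fwww.mouser.com%2Fpdfdocs%2FAND8424-D.PDF"
    "&usg=AFQjCNENO76mctNPySoFd_vrceHLGYtKZw"
)
print(unwrap_google_redirect(example))
```

Links that are not recognised are returned unchanged, so the function can be applied to every inserted link without special-casing.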

These converted URLs could also be checked more easily against the spam-blacklist (T34159).

The link insertion widget in VE seems to be a good place to do this conversion. While this obviously only works for editors using VE and thus doesn't solve the original task, it seems to be easy to do the conversion there, and especially on wikis where VE is the default editor many users (including many inexperienced ones) are likely to benefit from the conversion there.

Event Timeline

Restricted Application added a subscriber: Aklapper.

Interesting idea. I'm a bit worried about the potential for this to get out of hand (every line of code slows down the user experience), but I can see the merit, too.

Are there any other common easy transforms like this? It's an easier sell if it covers various use cases, rather than one site-specific thing. Granted, "google search results" is a pretty high-profile and common thing so there's an argument there...

Most of what comes to mind is the general case of "un-shortening shortened URLs" (bit.ly, t.co, goo.gl, etc), which is unfortunately not doable in-browser because it's a matter of loading the URLs and following redirects until you get to a stable page. (With some degree of needing to actually load it in a browser and follow HTML meta redirects if being thorough.) Technically, we could do it, but it'd be a matter of introducing a new service (citoid-esque tell-me-about-a-link).
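To make the "follow redirects until you get to a stable page" step concrete, here is a rough server-side sketch in Python. All names (`expand_url`, `http_location`, `max_hops`) are hypothetical, it only follows HTTP `Location` headers (not HTML meta redirects), and it is not a design for the service mentioned above.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so each hop can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def http_location(url: str, timeout: float = 5.0) -> Optional[str]:
    """Return the redirect target of one HEAD request, or None if there is none."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout):
            return None  # a non-redirect response: the URL is stable
    except urllib.error.HTTPError as err:
        if err.code in REDIRECT_CODES:
            location = err.headers.get("Location")
            return urljoin(url, location) if location else None
        raise


def expand_url(
    url: str,
    lookup: Callable[[str], Optional[str]] = http_location,
    max_hops: int = 10,
) -> str:
    """Follow redirects hop by hop until the URL stops changing."""
    seen = {url}
    for _ in range(max_hops):
        target = lookup(url)
        if target is None or target in seen:  # stable page or redirect loop
            return url
        seen.add(target)
        url = target
    return url


# Offline usage with a fake lookup table instead of real network requests.
fake_redirects = {
    "https://short.example/a": "https://short.example/b",
    "https://short.example/b": "https://www.example.org/final",
}
print(expand_url("https://short.example/a", lookup=fake_redirects.get))
```

The `lookup` parameter keeps the hop-following loop separate from the network call, so the loop can be exercised offline while the HEAD-based default does the real work on a server.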

Jdforrester-WMF moved this task from To Triage to TR1: Releases on the VisualEditor board.
Jdforrester-WMF set the point value for this task to 1.