
Detect if a page is presented as machine translation
Closed, Resolved (Public)

Description

When a page is served to a user by a machine translation service, most probably outside a Wikipedia domain, there should be a standard way to detect that context.

  1. It should be independent of the translation service if possible
  2. It should be able to identify the source language and the target language
  3. It should be able to identify the machine translation service

This may require some customization on the side of the external services as well.

Event Timeline

In the case of Google Translate, we have the following signals:

  • <meta http-equiv="X-Translated-By" content="Google"> is inserted into the <head> of the translated page. This is a strong indicator that the page is translated, and it also gives the service name.
  • The lang attribute of the <html> element is changed to the target language. But that is a weak indicator, since MediaWiki pages rendered in an interface language different from the content language also have that lang attribute changed. By comparing its value with wgContentLanguage and wgUserLanguage, we may be able to detect an external change to the lang attribute.
  • If document.referrer is a translation server that the extension knows about and is configured for, it tells us that the content is translated and which service did it. But it does not tell us the target language.
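
The three signals above could be combined roughly as follows. This is only a minimal sketch, not an existing extension API: the function names, the shape of the result object, and the host list (`translate.example.org` is a placeholder, not a real service host) are assumptions made for illustration. The lang mismatch is treated as a weak signal, so on its own it only suggests a target language and does not mark the page as translated.

```javascript
// Hypothetical map of referrer hosts to service names. The entry below is a
// placeholder; a real deployment would configure the hosts it knows about.
const KNOWN_TRANSLATION_HOSTS = {
  'translate.example.org': 'ExampleMT',
};

// Pure function: takes the raw signals and returns what can be inferred.
function detectMachineTranslation(input) {
  const {
    metaTranslatedBy, // content of <meta http-equiv="X-Translated-By">, if any
    htmlLang,         // lang attribute of <html>
    contentLanguage,  // wgContentLanguage
    userLanguage,     // wgUserLanguage
    referrer,         // document.referrer
    knownHosts = KNOWN_TRANSLATION_HOSTS,
  } = input;

  const norm = (code) => (code || '').toLowerCase();
  const result = {
    translated: false,
    service: null,
    sourceLanguage: contentLanguage || null,
    targetLanguage: null,
    signals: [],
  };

  // Strong signal: the meta tag marks the page and names the service.
  if (metaTranslatedBy) {
    result.translated = true;
    result.service = metaTranslatedBy;
    result.signals.push('meta');
  }

  // Weak signal: lang matches neither the content nor the interface language,
  // so something outside MediaWiki probably changed it.
  if (htmlLang && norm(htmlLang) !== norm(contentLanguage) &&
      norm(htmlLang) !== norm(userLanguage)) {
    result.targetLanguage = htmlLang;
    result.signals.push('lang');
  }

  // Referrer signal: gives the service name, but not the target language.
  if (referrer) {
    let host = null;
    try {
      host = new URL(referrer).hostname;
    } catch {
      host = null; // not a parseable URL, so ignore it
    }
    if (host && Object.prototype.hasOwnProperty.call(knownHosts, host)) {
      result.translated = true;
      result.service = result.service || knownHosts[host];
      result.signals.push('referrer');
    }
  }

  return result;
}

// Browser-side helper (not called here): gathers the signals from the page.
function collectSignalsFromPage() {
  const meta = document.querySelector('meta[http-equiv="X-Translated-By"]');
  return {
    metaTranslatedBy: meta ? meta.content : null,
    htmlLang: document.documentElement.lang,
    contentLanguage: mw.config.get('wgContentLanguage'),
    userLanguage: mw.config.get('wgUserLanguage'),
    referrer: document.referrer,
  };
}
```

In a page context this would be called as `detectMachineTranslation(collectSignalsFromPage())`. Keeping the detection logic separate from the DOM access makes it possible to add further services by extending the host map, without touching the comparison logic.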