Make Web2Cit support web servers which adapt to "Accept-Language" request headers
Open, Needs Triage, Public

Description

Some web servers may return different responses based on the Accept-Language header sent by the client. Some may even perform a redirect.

Citoid does handle the Accept-Language header sent by the client, forwarding it to the target. Web2Cit currently doesn't handle this header.
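
As a rough illustration of the forwarding described above, here is a minimal Python sketch. It is not Citoid's or Web2Cit's actual code; the function name `forwarded_headers` and the `default_language` parameter are hypothetical. It only shows the idea of copying the client's Accept-Language value into the headers of the request sent to the target.

```python
from typing import Dict, Mapping, Optional


def forwarded_headers(
    client_headers: Mapping[str, str],
    default_language: Optional[str] = None,
) -> Dict[str, str]:
    """Build headers for the outbound request to the target.

    Hypothetical helper: copies the client's Accept-Language value if
    present, otherwise falls back to default_language (if given).
    """
    outbound: Dict[str, str] = {}
    for name, value in client_headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "accept-language" and value.strip():
            outbound["Accept-Language"] = value.strip()
            break
    else:
        # No usable Accept-Language header was sent by the client.
        if default_language:
            outbound["Accept-Language"] = default_language
    return outbound
```

For example, a client sending `accept-language: es-ES,es;q=0.9` would result in the same value being sent to the target, while a client sending no such header would leave the outbound request unchanged unless a default is configured.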

As a result, the citations returned by Citoid and by Web2Cit's fallback template may differ considerably in these cases. This also affects Citoid raw citations returned by Web2Cit-Server with option citoid=true (as used by Web2Cit-Gadget).

See for example what happens with this URL from www.independent.co.uk:

  • From a browser in English, the Citoid citations returned by (1) the original Citoid extension, and (2) the userscript-modified extension, match.
  • However, from a browser in Spanish, the Citoid citation returned by the original Citoid extension is in Spanish (and points to a redirected URL), whereas the one returned by the userscript-modified extension is in English.

(I can't reproduce the redirection right now, but it was happening yesterday.)

  • Web2Cit-Server should handle Accept-Language headers from the client and pass them to Web2Cit-Core: T308710
  • Web2Cit-Core should support sending Accept-Language headers to the target's webpage, and to the Citoid API: T308711
  • Web2Cit-Core should follow redirects before attempting to translate a target URL: T304773
  • Web2Cit-Core should support Header selection (to be used inside "Control" fields) to decide what translation template to use based on any Content-Language headers in the response: T304333
  • Web2Cit-Core should support URL selection (to be used inside "Control" fields) to decide what translation template to use based on URL patterns indicative of the content's language: T304326
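
The last two items could be sketched roughly as follows. This is only an illustration in Python, not Web2Cit-Core's actual template or selection mechanism; the names `language_from_headers`, `language_from_url`, `pick_template_language`, the pattern list and the "en" default are all assumptions. It assumes redirects have already been followed, so `final_url` and `response_headers` come from the final response.

```python
import re
from typing import Mapping, Optional, Sequence, Tuple


def language_from_headers(response_headers: Mapping[str, str]) -> Optional[str]:
    """Return the primary language subtag from Content-Language, if any."""
    for name, value in response_headers.items():
        if name.lower() == "content-language":
            # The header may list several tags (e.g. "es-ES, en"); use the first.
            first_tag = value.split(",")[0].strip().lower()
            return first_tag.split("-")[0] or None
    return None


def language_from_url(
    url: str, patterns: Sequence[Tuple[str, str]]
) -> Optional[str]:
    """Return the language of the first URL pattern that matches, if any."""
    for pattern, language in patterns:
        if re.search(pattern, url):
            return language
    return None


def pick_template_language(
    final_url: str,
    response_headers: Mapping[str, str],
    patterns: Sequence[Tuple[str, str]],
    default: str = "en",
) -> str:
    """Prefer the response header, then URL patterns, then the default."""
    return (
        language_from_headers(response_headers)
        or language_from_url(final_url, patterns)
        or default
    )
```

With a hypothetical pattern list such as `[(r"/espanol/", "es")]`, a response carrying `Content-Language: es-ES` or a final URL containing `/espanol/` would both select the Spanish template, and anything else would fall back to the default.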