RFC Meeting: Support language variants in the REST API (2016-04-27, #wikimedia-office)

Hosted by daniel on Apr 27 2016, 9:00 PM - 10:00 PM.

Description

  • Location: #wikimedia-office IRC channel
  • Meeting type: Field narrowing
  • Time: Weekly, Wednesday 21:00 UTC (2pm PDT, 23:00 CEST)
  • Agenda:
    • T122942: RFC: Support language variants in the REST API

See the Architecture meetings page for more general information about this meeting (also: Phab query: list of upcoming RFC meetings, Phab query: list of all RFC meetings).

Recurring Event

Event Series
This event is an instance of E66: ArchCom RFC Meeting Wxx: <topic TBD> (<see "Starts" field>, #wikimedia-office), and repeats every week.
RobLa-WMF renamed this event from RFC Meeting: <topic TBD> (<see "Starts" field>, #wikimedia-office) to RFC Meeting: Support language variants in the REST API (2016-04-27, #wikimedia-office). Apr 21 2016, 11:22 PM
RobLa-WMF updated the event description.
RobLa-WMF updated the event description. Apr 27 2016, 1:42 AM

3:03 PM <wm-labs-meetbot> Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2016/wikimedia-office.2016-04-27-21.00.html
3:03 PM <wm-labs-meetbot> Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2016/wikimedia-office.2016-04-27-21.00.txt
3:03 PM <wm-labs-meetbot> Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2016/wikimedia-office.2016-04-27-21.00.wiki
3:03 PM <wm-labs-meetbot> Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2016/wikimedia-office.2016-04-27-21.00.log.html

Meeting summary

  • RFC: Support language variants in the REST API | Wikimedia meetings channel | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE) | Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/ (TimStarling, 21:00:45)
    • LINK: https://en.wikipedia.org/api/rest_v1/?doc (gwicke, 21:05:35)
    • LINK: https://en.wikipedia.org/api/rest_v1/?doc current REST API documentation (robla, 21:05:54)
    • question discussed "do the namespace of variants overlap with the namespace of languages?" (robla, 21:18:56)
    • TimStarling and brion propose rejecting the "adding domains" option (robla, 21:20:03)
    • question discussed: should we use the "option 3" syntax proposed in the meeting? (robla, 21:29:02)
    • LINK: https://phabricator.wikimedia.org/T114662 describes some of the use cases (cscott, 21:42:33)
    • ACTION: gwicke to update T122942 to summarise the options discussed here and remove the rejected option (TimStarling, 22:02:26)
    • ACTION: DanielK_WMDE to write an RFC discussing the philosophical nature of language (TimStarling, 22:03:25)

Meeting ended at 22:03:58 UTC.

Action items, by person

  • DanielK_WMDE
    • DanielK_WMDE to write an RFC discussing the philosophical nature of language
  • gwicke
    • gwicke to update T122942 to summarise the options discussed here and remove the rejected option

People present (lines said)

  • cscott (81)
  • gwicke (73)
  • brion (69)
  • DanielK_WMDE (51)
  • TimStarling (48)
  • robla (13)
  • subbu (12)
  • stashbot (11)
  • SMalyshev (9)
  • Scott_WUaS (4)
  • wm-labs-meetbot (3)
  • Zppix (1)

Full minutes:

1​21:00:24 <TimStarling> #startmeeting RFC meeting
2​21:00:24 <wm-labs-meetbot> Meeting started Wed Apr 27 21:00:24 2016 UTC and is due to finish in 60 minutes. The chair is TimStarling. Information about MeetBot at http://wiki.debian.org/MeetBot.
3​21:00:24 <wm-labs-meetbot> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
4​21:00:24 <wm-labs-meetbot> The meeting name has been set to 'rfc_meeting'
5​21:00:35 <cscott> ah, i was wondering if I was in the right place
6​21:00:45 <TimStarling> #topic RFC: Support language variants in the REST API | Wikimedia meetings channel | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE) | Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/
7​21:01:21 <gwicke> hi
8​21:01:32 <Zppix> hello
9​21:01:49 <robla> o/
10​21:01:49 <Scott_WUaS> Hello
11​21:02:05 <gwicke> so, we have been thinking about the best way of exposing different languages and their variants in the REST API
12​21:02:07 <Scott_WUaS> Congratulations, Rob re Architecture Committee!!!
13​21:03:10 <gwicke> while the focus is on the REST API (which brings some requirements and constraints like caching), it is closely related to the bigger question of how we represent different language selections in URLs / requests
14​21:03:35 <robla> phab meeting: https://phabricator.wikimedia.org/E168 rfc task: T122942
15​21:03:35 <stashbot> T122942: RFC: Support language variants in the REST API - https://phabricator.wikimedia.org/T122942
16​21:03:49 <gwicke> DanielK_WMDE__ has started a related non-API discussion at https://phabricator.wikimedia.org/T114662
17​21:04:42 <gwicke> in that thread, one of the key questions that emerged was about the desired granularity of language selection
18​21:04:52 <cscott> i think we can come up with a reasonable consensus for the REST API. I don't know about settling any broader questions.
19​21:05:09 <gwicke> purodha summarizes this question well in https://phabricator.wikimedia.org/T114662#2005122
20​21:05:14 <TimStarling> can we have a link to the current REST API documentation?
21​21:05:33 <brion> to clarify, this rfc settles the URL interface for specifying when you're pulling a particular variant, which then opens the further separate question of how to implement the conversion in parsoid etc. correct?
22​21:05:35 <gwicke> https://en.wikipedia.org/api/rest_v1/?doc
23​21:05:54 <robla> #link https://en.wikipedia.org/api/rest_v1/?doc current REST API documentation
24​21:05:56 <gwicke> this RFC is about selecting content languages in the REST API
25​21:06:50 <cscott> T114662 is about URLs for mediawiki articles; I'm not sure that's strictly related. that is, we can pick some solution for the REST API w/o changing how we do article URLs, or vice-versa.
26​21:06:50 <stashbot> T114662: RFC: Per-language URLs for multilingual wiki pages - https://phabricator.wikimedia.org/T114662
27​21:07:04 <robla> I made an attempt to enumerate the options under consideration in T122942 . cscott tried to narrow it down
28​21:07:04 <stashbot> T122942: RFC: Support language variants in the REST API - https://phabricator.wikimedia.org/T122942
29​21:07:12 <gwicke> another aspect related to the granularity is whether we should expose whether something is a variant / auto-translated vs. a separate project
30​21:07:57 <cscott> is that really a question?
31​21:08:01 <gwicke> I personally think that we should have a consistent plan for selecting languages, and an idea for the granularity we are shooting for
32​21:08:35 <cscott> i think projects and variants are distinct. I don't see any reason to conflate them, or to obscure which project is responsible for a given bit of content.
33​21:08:42 <brion> to clarify further: we already have domains for projects, correct?
34​21:09:04 <cscott> brion: arguably too many, if I understand ops correctly.
35​21:09:08 <brion> :)
36​21:09:11 <gwicke> yes, we have domains for projects, but some of the projects are actually variants
37​21:09:14 <TimStarling> I agree with cscott
38​21:09:15 <gwicke> like zh-yue
39​21:09:22 <brion> just wondering how one would add a second layer of subdomains without ops killing us
40​21:09:23 <cscott> gwicke: that is not correct.
41​21:09:35 <gwicke> zh-yue is a variant of zh
42​21:09:43 <SMalyshev> but some projects have multiple languages?
43​21:09:43 <gwicke> in the language sense
44​21:09:44 <TimStarling> yeah yeah
45​21:09:49 <TimStarling> and italian is a variant of latin
46​21:09:51 <cscott> cantonese is a distinct language. the cantonese wiki is a distinct project.
47​21:10:17 <cscott> importantly, the decisions about which languages (or variants) get their own projects is not a technical matter, it's an issue decided by the community.
48​21:10:26 <TimStarling> but it's not really about linguistics, like cscott says, we have a very clear user-facing concept of a wiki and there's no sense conflating it with automatic translation
49​21:10:50 <TimStarling> trying to do so would potentially cause conflicts
50​21:10:55 <brion> secondarily we have commons, meta, mediawiki.org, etc which may carry pages of multiple different languages, have templates that render differently in different languages etc
51​21:11:05 <TimStarling> it would be quite possible to have a zh-yue variant of zh, and simultaneously a zh-yue wiki
52​21:11:20 <brion> i would tend towards a solution that treats language variants and alternate language renderings of the same page on the same project similarly
53​21:11:31 <cscott> brion: yes, that's more what T114662 is discussing. i'm not convinced it is best to discuss that at the same time as T112942.
54​21:11:31 <stashbot> T112942: [Regression] PHP version check broken in load.php and api.php - https://phabricator.wikimedia.org/T112942
55​21:11:31 <stashbot> T114662: RFC: Per-language URLs for multilingual wiki pages - https://phabricator.wikimedia.org/T114662
56​21:11:35 <gwicke> cscott brought up the prospect of conflicts, but I think it's not clear to me that we actually want to have many different ways of accessing content in a given language variant
57​21:11:36 <brion> while treating "project id" as totally separate
58​21:11:47 <brion> cscott: how would they differ?
59​21:12:23 <SMalyshev> I'd say we should treat zh-yue variant on zh wiki the same way as we treat zh language on commons - one way of many of representing same wiki's data
60​21:12:31 <cscott> brion: the commons issue introduces the notion of "interface language" as concept distinct from "source language" or "rendered variant".
61​21:13:07 <DanielK_WMDE> brion: +1
62​21:13:08 <cscott> the language-neutral parts of commons or metawiki are presented in an "interface language". the actual content has a "source language" which may (or may not) be rendered into a particular variant.
63​21:13:19 <brion> hmm
64​21:13:31 <DanielK_WMDE> brion: conflating project ids with language ids causes no end of pain for wikidata & co.
65​21:13:34 <cscott> templates are part of the "interface", and so separating "content" from "interface" is not always straightforward.
66​21:13:35 <DanielK_WMDE> it's a *bad* idea
67​21:13:45 <cscott> it's a very interesting discussion, i just don't think it's strictly related to the REST API discussion.
68​21:13:47 <SMalyshev> cscott: it's not only interface, as GUI - the data itself can have different language representations
69​21:13:58 <brion> are template localizations selected against $wgUserLang or something else?
70​21:13:59 <cscott> SMalyshev: right. wikidata has that issue as well.
71​21:14:15 <gwicke> you are discussing the granularity bit
72​21:14:18 <brion> i think it's related insofar as if they're not the same thing they're almost identical and need to be treated similarly
73​21:14:24 <DanielK_WMDE> some wikidata api modules allow language filters to be specified
74​21:14:25 <cscott> brion: i don't think that was quite decided. we discussed that at the last dev summit, w/o reaching consensus.
75​21:14:47 <DanielK_WMDE> but i don't see a good way to generalize this. the semantics and specifics really depend on the module
76​21:14:51 <brion> so localized templates today are done based on user language, so far as i know, as there is no other mechanism yet
77​21:15:11 <DanielK_WMDE> in some cases, you have a target language, in others, you specify a fallback chain. in some cases, the languages act as a filter, in others they trigger transliteration
78​21:15:21 <cscott> T114640 is also related to the interface language question.
79​21:15:21 <stashbot> T114640: RFC: make Parser::getTargetLanguage aware of multilingual wikis - https://phabricator.wikimedia.org/T114640
80​21:15:50 <cscott> and DanielK_WMDE has been the lead on those issues (just establishing context for others new to the discussion)
81​21:15:50 <DanielK_WMDE> cscott: btw, the api also has an interface message, for error messages.
82​21:15:56 <gwicke> generally, the REST API is exposing content in a given language, and optimizes for cacheability
83​21:15:58 <brion> so it sounds like we have two distinct language settings: content language and UI language, each of which may have a variant
84​21:16:11 <brion> *and* the namespace of variants overlaps with the namespace of languages potentially
85​21:16:19 <brion> can we verify that last point as true/false?
86​21:16:28 <cscott> brion: strictly speaking, you also have various fallbacks based on logged-in user preferences as well.
87​21:16:41 <cscott> brion: consider an en-gb variant on enwiki and simplewiki.
88​21:16:48 <gwicke> yes, variants can be either auto-translated, or separate projects as with zh-yue
89​21:17:11 <brion> cscott: right, so en-gb may be either a standalone language or a variant of en
90​21:17:12 <cscott> we *could* rename things as necessary to ensure projects and variants never overlap, but that's never been necessary before.
91​21:17:20 <subbu> brion, ah interesting reg. content and ui language and each of them having variants ... i hadn't realized that additional complexity. so, ui language only affects ui messages and content language affects content represented in wikitext?
92​21:17:23 <TimStarling> is there anyone other than gwicke who supports option #1?
93​21:17:33 <TimStarling> everyone else seems to have spoken against it
94​21:18:03 <brion> ui language affects any wikitext content that varies based on {{#userlang}} (is that the right function name?) or whatever equivalent lua magic, i suppose
95​21:18:24 <cscott> subbu: it should be said that "ui language" is partially a fiction at this point. that is, it exists in our minds but doesn't have a clear expression in mediawiki code... yet.
96​21:18:24 <TimStarling> I would like to propose that we reject option #1 and move on to discussing the other options
97​21:18:29 <gwicke> TimStarling, I think you are a bit premature with your question
98​21:18:52 <cscott> T114640 and T114662 are attempts to codify "ui language" in the codebase (among other things)
99​21:18:53 <stashbot> T114662: RFC: Per-language URLs for multilingual wiki pages - https://phabricator.wikimedia.org/T114662
100​21:18:53 <stashbot> T114640: RFC: make Parser::getTargetLanguage aware of multilingual wikis - https://phabricator.wikimedia.org/T114640
101​21:18:56 <robla> #info question discussed "do the namespace of variants overlap with the namespace of languages?"
102​21:18:58 <brion> i would reject option 1 (adding domains)
103​21:19:31 <brion> domains should map directly to high-level projects, which are separate and distinct places you can interact with
104​21:19:37 <brion> which may, or may not, have anything to do with languages
105​21:19:42 <cscott> brion: +1
106​21:19:54 <brion> eg meta and commons are not languages :)
107​21:20:03 <gwicke> those are different levels of the domain
108​21:20:03 <robla> #info TimStarling and brion propose rejecting the "adding domains" option
109​21:20:33 <brion> wikisource.org is one site (multilingual), de.wikisource.org is another (which happens to be centered on one content language)
110​21:20:40 <robla> I attempted to clarify what the 4 options under consideration are here: https://phabricator.wikimedia.org/T122942#2244988
111​21:21:34 <cscott> can i suggest renarrowing the discussion to the REST APIs? Or do we think that it's worthwhile to discuss DanielK_WMDE's more general questions about article URL paths? (T114*)
112​21:21:34 <stashbot> T114: The order of tasks in Phabricator Boards doesn't always save - https://phabricator.wikimedia.org/T114
113​21:21:37 <SMalyshev> I agree, I think domain should be project. for some projects it defines language, but for others it doesn't, so putting it there will only confuse matters
114​21:21:49 <cscott> ah, i was trying to shut up stashbot by using the wildcard. didn't work...
115​21:22:13 <TimStarling> ok, so can we discuss option #2 versus option #3?
116​21:22:31 <DanielK_WMDE> cscott: we should at least answer the question why the two should be different.
117​21:22:38 <subbu> in https://phabricator.wikimedia.org/T122942#2144512 .. bianjiang proposes http headers in case that is a candidate worth considering ...
118​21:22:38 <brion> cscott: article url paths are distinct from rest api urls, but language selection for on-wiki translations seems roughly identical to this problem in scope and rules and should probably be treated together
119​21:23:00 <DanielK_WMDE> or whether they should follow the same pattern. having two different solutions to the same problem isn't nice. if it is the same problem.
120​21:23:03 <DanielK_WMDE> that's the question
121​21:23:32 <cscott> DanielK_WMDE: the API paths in https://en.wikipedia.org/api/rest_v1/?doc don't look (to me) anything like article paths.
122​21:23:35 <gwicke> fwiw, we are using domains heavily internally in the REST API, and will very likely continue to use them for variants as well
123​21:23:51 <gwicke> the question is whether this should be hidden from view (only internal), or exposed
124​21:24:01 <cscott> domains specify the project aka database
125​21:24:10 <TimStarling> I don't think you should use domains internally for variants
126​21:24:16 <TimStarling> you could always fix that
127​21:24:24 <DanielK_WMDE> cscott: no, but if we decide to use subdomains for content variants, we probably should do the same for the api, no?
128​21:24:34 <gwicke> and whether we really want both https://zh.wikipedia.org/zh-yue/ * and https://zh-yue.wikipedia.org/zh-yue/*
129​21:24:53 * DanielK_WMDE does not think we should have variants in the subdomains
130​21:25:05 <gwicke> TimStarling, domains are unique ids for projects
131​21:25:06 <cscott> DanielK_WMDE: not sure. the article paths have all sorts of human-friendly features, like if you're logged in and hit the generic /wiki/{title} path you get content according to your user preferences.
132​21:25:36 <cscott> DanielK_WMDE: that should probably be a redirect or something eventually to preserve cacheability. but the point is that article URLs are meant for people to consume. the REST URLs are not.
133​21:25:39 <DanielK_WMDE> gwicke: to me, they mean different things: the first one denotes a variant transformation, the second one separate content.
134​21:26:05 <TimStarling> option 2 would be something like /en-gb/page/html/Australia , right?
135​21:26:08 <cscott> gwicke: i don't think you can decide "whether we really want both https://zh.wikipedia.org/zh-yue/ * and https://zh-yue.wikipedia.org/zh-yue/*". that's a matter for the communities of those wikis to decide.
136​21:26:19 <DanielK_WMDE> cscott: true, but why not follow the same patterns, and use the same mechanisms?
137​21:26:22 <brion> gwicke: if they exist, they exist
138​21:26:28 <gwicke> cscott, it is also a product question for wmf
139​21:26:32 <DanielK_WMDE> cscott: note that API responses *are* specific to the user language
140​21:26:42 <TimStarling> and option 3 something like /page/variant/en-gb/html/Australia ?
141​21:26:43 <cscott> gwicke: it is not a question that can be settled in an RFC meeting.
142​21:26:57 <gwicke> DanielK_WMDE: no, api responses are based on the wiki's content language
143​21:27:08 <TimStarling> i.e. with option 3 you avoid mixing language codes with API endpoint identifiers
144​21:27:20 <TimStarling> so you can have an Apiaka language or whatever
145​21:27:26 <TimStarling> which seems elegant to me
146​21:27:34 <DanielK_WMDE> gwicke: they support uselang. and i think it defaults to the user's ui language - but i could be wrong
147​21:27:45 <DanielK_WMDE> gwicke: i think it's only used for error messages, but still
148​21:27:49 <gwicke> DanielK_WMDE, the REST API does not support uselang
149​21:27:50 <cscott> i think your option 3 syntax is fine.
150​21:28:10 <gwicke> TimStarling, so the proposal is /zh-yue/api/rest_v1/...?
151​21:28:24 <DanielK_WMDE> gwicke: so, no localized error messages?
152​21:28:25 <gwicke> or /api/rest_v1/variant/zh-yue/...?
153​21:28:30 <gwicke> DanielK_WMDE, nope
154​21:28:38 <DanielK_WMDE> shame ;)
155​21:28:40 * gwicke shudders
156​21:28:40 <TimStarling> the second one
157​21:28:53 <cscott> /api/rest_v1/page/variant/en-gb for option 3. i'm not sure what the full path for option 2 is.
158​21:28:54 <TimStarling> /variant introduces the language code
159​21:29:02 <robla> #info question discussed: should we use the "option 3" syntax proposed in the meeting?
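
For illustration, the two path shapes compared above can be sketched in a few lines. This is a minimal sketch, not the REST API's implementation: the function names are hypothetical, the option 2 shape is copied from TimStarling's example because its full path was not settled in the meeting, and the option 3 shape combines TimStarling's /page/variant/... example with the /api/rest_v1 prefix cscott mentions.

```python
from urllib.parse import quote

API_ROOT = "/api/rest_v1"  # prefix seen in the documentation URL linked earlier


def option2_path(variant: str, title: str) -> str:
    """Option 2 (hypothetical helper): the language code leads the path."""
    return f"/{quote(variant, safe='')}/page/html/{quote(title, safe='')}"


def option3_path(variant: str, title: str) -> str:
    """Option 3 (hypothetical helper): a 'variant' keyword introduces the code."""
    return (
        f"{API_ROOT}/page/variant/{quote(variant, safe='')}"
        f"/html/{quote(title, safe='')}"
    )


print(option2_path("en-gb", "Australia"))
print(option3_path("en-gb", "Australia"))
```

In the option 3 shape the language code only ever appears after the literal keyword, which is why it cannot collide with an endpoint name.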
160​21:29:15 <DanielK_WMDE> i don't really like the "variant" bit. in my mind, you specify the desired target language
161​21:29:31 <DanielK_WMDE> if the desired target language is a variant of the actual content language, we can transform the content
162​21:29:39 <SMalyshev> this would be solution only for variants, not multilingual content, right?
163​21:29:41 <TimStarling> but like cscott says, and like my example just now, I'm not saying it should be a prefix for the whole REST API like /api/rest_v1/variant/zh-yue
164​21:29:42 <DanielK_WMDE> if it's something else, well, then we can't transform
165​21:29:52 <cscott> instead of "variant" maybe "langconvert"? LanguageConverter is the name of the code which is doing the transformation.
166​21:29:56 <TimStarling> I think it could be under /page
167​21:30:08 <brion> ok, so on zh.wikipedia.org i can confirm I can both set my ui language *and* pick a content variant
168​21:30:08 <DanielK_WMDE> how about just "language" instead of "variant"?
169​21:30:22 <DanielK_WMDE> if the original content is multilingual or language neutral, all kinds of target languages could be supported
170​21:30:26 <DanielK_WMDE> think commons or wikidata
171​21:30:34 <gwicke> ui language is a separate concept, and largely irrelevant for the REST API
172​21:30:46 <gwicke> generally, the REST API is aimed at client-side UI composition
173​21:30:52 <cscott> To take DanielK_WMDE's side for a moment -- what if you want to specify user interface language, not just a variant conversion.
174​21:30:58 <brion> gwicke: rest api can serve HTML of rendered pages correct? if so, it must know the UI language to pass it through for rendering of templates
175​21:31:02 <gwicke> which means that clients can use whatever UI language they like, but consume data in another language
176​21:31:07 <brion> or else we need an alternate way to handle translatable templates
177​21:31:12 <TimStarling> ideally a REST API would have HATEOAS-style hyperlinks, right?
178​21:31:19 <cscott> unfortunately, the REST API *does* have UI stuff, in so far as there are templates on commons/etc which are part of the UX, not part of the "content" per se.
179​21:31:19 <DanielK_WMDE> SMalyshev: i really want variants and multilang content to work the same. translated content is different.
180​21:31:37 <TimStarling> so you should be able to get a listing of available variants with /variant/
181​21:31:46 <DanielK_WMDE> cscott: the target language isn't the "ui" language. it is the desired language for the content.
182​21:31:48 <SMalyshev> DanielK_WMDE: me too, but the proposed option 3 wouldn't do that, iiuc
183​21:32:15 <cscott> DanielK_WMDE/SMalyshev: right now I think we're agreed that translated content is handled as an article suffix, right? Foo is the article, Foo/en-gb is the translation.
184​21:32:25 <brion> TimStarling: if we have per-page content rev, need to make sure we can ask for list of variants per-page right?
185​21:32:40 <gwicke> cscott, that is already taken, so would break existing apis
186​21:32:51 <SMalyshev> cscott: it's not translation. I.e. commons description can be in English and German, they are not translations
187​21:33:06 <TimStarling> brion: yes, I suppose so
188​21:33:17 <SMalyshev> cscott: if you had API that gets commons image data, including description, wouldn't you want to specify which language description you want?
189​21:33:26 <cscott> DanielK_WMDE: you can be viewing zhwiki in the simplified variant, yet have your UX language set to english or german. the templates in File:* (in theory) should respect your UX language. as far as I understand it.
190​21:33:47 <subbu> I am getting confused by this discussion .. I thought the proposal that seemed like it might work was "/page/variant/en-gb/html/Australia". Am I mistaken?
191​21:33:59 <gwicke> cscott, UI language is mostly irrelevant
192​21:34:03 <TimStarling> but having subpaths of articles is a bit awkward when they can contain slashes in the titles
193​21:34:25 <gwicke> TimStarling, as you know, slashes are encoded
194​21:34:26 <TimStarling> you would have to encode them as %2F
195​21:34:33 <cscott> subbu: i think the discussion on the REST api is pretty solid, Tim's suggestion seems good. but DanielK_WMDE wants to use a consistent mechanism for article URLs as well, and that's a harder problem.
196​21:34:55 <gwicke> but, as I said, the suffix path is already in use (for revision selection)
197​21:35:14 <DanielK_WMDE> cscott: yes, translated content uses suffixes. independent content uses subdomains. but the "target language" for multilang content should be as the "variant" for transformable text
198​21:35:22 <cscott> i'm sympathetic to DanielK_WMDE's desire, of course. i think it's an interesting question. but it does make the structure of this meeting somewhat challenging. ;)
199​21:35:33 <cscott> gwicke: Foo%2Fen-gb
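
The %2F encoding mentioned here can be reproduced with Python's standard library. This is a minimal sketch of the encoding step only; the helper names are hypothetical and nothing here is taken from the REST API's own code.

```python
from urllib.parse import quote, unquote


def encode_title(title: str) -> str:
    """Percent-encode a title so it occupies exactly one path segment.

    quote() treats "/" as safe by default, so safe="" is needed to turn
    the slash of a subpage title into %2F.
    """
    return quote(title, safe="")


def decode_title(segment: str) -> str:
    """Reverse encode_title for a single path segment."""
    return unquote(segment)


segment = encode_title("Foo/en-gb")
print(segment)                # Foo%2Fen-gb
print(decode_title(segment))  # Foo/en-gb
```

Because the slash inside the title is encoded, splitting the request path on "/" still yields the title as one segment, leaving any later segment free to carry a revision or a keyword.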
200​21:35:37 <TimStarling> ok, so if you have a /variant suffix then that can achieve brion's goal of listing variants on a per-page basis
201​21:36:11 <gwicke> the only thing I can see working so far is something like /api/rest_v1/page/variant/{something}/...
202​21:36:13 <brion> i'm not a big fan of overloading suffixes, but at least it's 100% distinguishable from revision numbers
203​21:36:25 <TimStarling> /page/Australia/variant/ could give a list of variants
204​21:36:27 <cscott> DanielK_WMDE: so in my example for zhwiki, how do you solve that problem? you can't localize text in a variant if your UX language is different?
205​21:36:43 <DanielK_WMDE> gwicke: i'm good with that if we replace "variant" with "language"
206​21:36:51 <TimStarling> /page/Australia/variant/en-gb/html could give the HTML in the en-gb variant
207​21:37:11 <brion> cscott, DanielK_WMDE: i tend to agree it'd be ideal to merge variant and language but i may be wildly incorrect ;)
208​21:37:14 <gwicke> ../page/html/Australia/12345 is already that revision of Australia
209​21:37:16 <brion> *UI language
210​21:37:34 <gwicke> and ../page/html/Australia/12345/ lists renders of that revision
211​21:37:35 <brion> gwicke: /\d+/ does not match "variant"
212​21:37:40 <cscott> brion: i'm just not sure how to actually render a zhwiki page if you set "en-gb" as your language.
213​21:37:40 <DanielK_WMDE> cscott: not if the target language is defined to be the UI language. this is the case for wikidata. for zhwiki, the target language is taken from the url path, so no problem, right?
214​21:37:49 <brion> it's very easy to distinguish those, though you may not wish to ;)
215​21:37:50 <TimStarling> gwicke: that's why you always need a keyword in the path, for extensibility
216​21:37:53 <gwicke> brion, that's.. ewww...
217​21:38:02 <brion> gwicke: yeah :)
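
The distinction brion points to, that a numeric revision suffix can never match the literal keyword "variant", can be sketched as a small classifier. This is an illustrative sketch under assumptions: the function name and the returned labels are hypothetical, and the pattern is an ASCII-only version of brion's /\d+/.

```python
import re

# ASCII-only equivalent of the /\d+/ pattern quoted above.
REVISION_RE = re.compile(r"[0-9]+")


def classify_suffix(segment: str) -> str:
    """Classify the path segment that follows a page title (hypothetical)."""
    if REVISION_RE.fullmatch(segment):
        return "revision"
    if segment == "variant":
        return "variant-keyword"
    return "unknown"


for seg in ("12345", "variant", "12a"):
    print(seg, "->", classify_suffix(seg))
```

The sketch also shows the cost gwicke objects to: one path position now means different things depending on its content, which is the overloading TimStarling's "always need a keyword" rule avoids.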
218​21:38:11 <brion> there's a conflict between positional parameters and named parameters here
219​21:38:13 <DanielK_WMDE> cscott: for commons and wikidata, the target language should probably always be the user's ui language. but for zhwiki and co, perhaps it shouldn't. not sure
220​21:38:24 <brion> you can always add positional parameters but urls get ugly when there's a million empty ones
221​21:38:35 <brion> and named parameters in url path part pairs feel weird
222​21:38:48 <cscott> DanielK_WMDE: unfortunately, if you don't run languageconverter for *some* specific variant, you get text which is a mishmash of character sets which basically no one can read.
223​21:39:04 <TimStarling> it's not strictly named parameters, it's still hierarchical, there's a defined order
224​21:39:07 <brion> cscott: sure, in that case run to some reasonable default .... oh shit politics ;)
225​21:39:16 <cscott> DanielK_WMDE: hence my feeling that it's best to separate the "pick a variant" part from the "ux language" part.
226​21:39:26 <cscott> brion: yeah.
227​21:39:42 <DanielK_WMDE> brion: variant and target language are handled in the same place internally: Content::getParserOutput gets Content that is in language X (or multilang) and is asked for output in language Y. if Y is a variant of X, a transformation can be applied.
228​21:39:42 <brion> wow this is a way more controversial topic than i expected
229​21:39:55 <cscott> i mean, i could be convinced that we can just pick some behavior arbitrarily and this is a corner case and it won't matter in the end. i just haven't quite been convinced of that yet.
230​21:40:13 <subbu> brion, as far as i know this has always been a controversial topic.
231​21:40:17 <DanielK_WMDE> cscott: my point is that it's "pick a target language", not "pick a variant". the target language may or may not be tied to the ui language.
232​21:40:32 <gwicke> there's only two hard problems in computer science..
233​21:40:38 <TimStarling> for user language you can have /variant/en-gb/userlang/en-au
234​21:40:38 <subbu> brion, sorry misinterpreted .. you said: "way more" ..
235​21:40:44 <brion> DanielK_WMDE: i tend to like that model, but agree that we may not know what importance of corner cases will be
236​21:40:45 <cscott> DanielK_WMDE: yes, but isn't the point of the T114* bugs to try to separate those languages internally?
237​21:40:45 <stashbot> T114: The order of tasks in Phabricator Boards doesn't always save - https://phabricator.wikimedia.org/T114
238​21:40:54 <DanielK_WMDE> cscott: when viewing commons content, you want to specify the output language. that's not a variant. and it might be different from your ui language (though i find that a bit pointless)
239​21:40:58 <TimStarling> but it's hierarchical, it's not key-value, you can't have /userlang/en-au/variant/en-gb, it's not in the schema
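
TimStarling's point that the path is hierarchical rather than key-value can be sketched as a parser that accepts keywords only in a fixed order. This is a minimal sketch under assumptions: the function name is hypothetical, and treating each keyword as optional (so that /variant/en-gb alone is valid) is an assumption, not something stated in the meeting.

```python
# Keywords in the only order the sketch accepts.
SCHEMA = ("variant", "userlang")


def parse_language_suffix(path: str) -> dict:
    """Parse '/variant/<code>/userlang/<code>' into a dict, enforcing order."""
    segments = [s for s in path.split("/") if s]
    if len(segments) % 2 != 0:
        raise ValueError("every keyword needs a value")
    result = {}
    position = 0  # index into SCHEMA of the next keyword still allowed
    for keyword, value in zip(segments[0::2], segments[1::2]):
        try:
            index = SCHEMA.index(keyword, position)
        except ValueError:
            raise ValueError(f"{keyword!r} is not allowed at this position") from None
        result[keyword] = value
        position = index + 1
    return result


print(parse_language_suffix("/variant/en-gb/userlang/en-au"))
```

With this ordering rule, /userlang/en-au/variant/en-gb raises ValueError, so each combination of values has exactly one URL, which matters for the cacheability gwicke raised earlier.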
240​21:41:03 <brion> (subbu: url structure for apis is usually boring stuff)
241​21:41:14 * subbu nods
242​21:41:34 <brion> the actual details of the converter yeah :DD
243​21:41:52 <gwicke> what is the use case for this userlang stuff?
244​21:41:53 <DanielK_WMDE> cscott: the point is to internally have a clear notion of the (stored) content language, the desired target language, and the effective output language.
245​21:42:00 <DanielK_WMDE> ...and the UI language
246​21:42:15 <cscott> gwicke: labels for commons and wikidata metadata, like field labels, etc.
247​21:42:20 <gwicke> remember that this is an API exposing data
248​21:42:22 <gwicke> not UX
249​21:42:25 <DanielK_WMDE> four languages instead of two-plus-odd-bits
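
The model DanielK_WMDE describes (stored content language, desired target language, effective output language) can be sketched as a small decision function. This is only an illustration of the rule as stated in the discussion, not MediaWiki code: the function name and the variant table are hypothetical, and falling back to the stored language when no conversion applies is an assumption.

```python
from typing import Optional, Tuple

# Illustrative table only: target codes treated as convertible variants of a
# stored content language. The real mapping lives in MediaWiki.
VARIANTS_OF = {"zh": {"zh-hans", "zh-hant"}}


def effective_output(content_lang: Optional[str], target_lang: str) -> Tuple[str, bool]:
    """Return (effective output language, whether variant conversion runs).

    content_lang is None for multilingual or language-neutral content.
    """
    if content_lang is None:
        # Multilingual content: any target language can be requested.
        return target_lang, False
    if target_lang in VARIANTS_OF.get(content_lang, set()):
        # Target is a variant of the stored language: transform the content.
        return target_lang, True
    # Anything else cannot be produced by conversion (assumed fallback).
    return content_lang, False


print(effective_output("zh", "zh-hans"))
print(effective_output("zh", "en-gb"))
print(effective_output(None, "de"))
```

The UI language is deliberately absent from the function: in this model it is a fourth, separate value that a project may or may not tie to the target language.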
250​21:42:33 <cscott> https://phabricator.wikimedia.org/T114662 describes some of the use cases
251​21:43:00 <DanielK_WMDE> gwicke: i'm not sure, i'm talking about a target language. i don't see how the user language plays into this.
252​21:43:50 <gwicke> in MW terms, what we are interested here is the *content language*
253​21:43:54 <DanielK_WMDE> cscott: in wikidata, we would tie the target language to the UI language. but the api shouldn't know or care, and it could be different on other projects
254​21:44:00 <brion> the main reason to specify both would be to say 'i'm viewing in language X but need to look at content for language Y'... but i think in a world where UI is more separate from content things may change a bit in the semantics
255​21:44:28 <brion> eg is it ok for the template that links to translations to *not* be translated in french when i look at https://www.mediawiki.org/wiki/Manual:Extension_registration?uselang=fr ?
256​21:44:52 <brion> currently https://www.mediawiki.org/wiki/Manual:Extension_registration english and https://www.mediawiki.org/wiki/Manual:Extension_registration/fr french pages are distinct, but the template at the top localizes to whatever my uselang is
257​21:44:53 <cscott> gwicke: again, the problem is that some of our "content" contains "interface" elements. it sucks, but that's how it is.
258​21:45:06 <brion> is the template content? or is it meta-ui?
259​21:45:16 <gwicke> templates are content as far as I am concerned
260​21:45:21 <brion> even if we remove crap like labeling the "Table of contents" or "edit links" we still have those
261​21:45:22 <DanielK_WMDE> brion: yes, that's the question of when and how the target language should be tied to the ui language. it's an interesting one, but not one we need to answer in the context of today's rfc, i think
262​21:45:46 <brion> DanielK_WMDE: my concern is just that if we add "/variant" on the end do we have to scramble next week to add "/uselang" ?
263​21:45:55 <TimStarling> DanielK_WMDE: right, it doesn't need to be answered, and really a lot of your comments have been a distraction
264​21:46:01 <brion> hehe
265​21:46:04 <cscott> The {{int}} template/parser function is also interesting.
266​21:46:16 <brion> if we think it's ok to treat those at different times, then i withdraw much of my conversation for now :)
267​21:46:18 <TimStarling> what we need is to answer gwicke's actual implementation problem in a way that is reasonably forwards-compatible
268​21:46:39 <TimStarling> and we can discuss all the things we can do with that forwards-compatibility some other day
269​21:46:39 <cscott> i still like /page/variant/{foo}
270​21:46:56 <DanielK_WMDE> TimStarling: i'm sorry to hear that. all i want is really to not call it a variant, but a target language, and think in these terms. no further derailment intended
271​21:47:08 <gwicke> cscott, I think so far that's the only proposal that would not break existing apis
272​21:47:09 <cscott> sorry, /page/langconvert/{foo}
273​21:47:22 <cscott> that will be specific to "invoke the language converter as a post processor"
274​21:47:27 <gwicke> (apart from domains, which everybody seems to dislike)
275​21:47:32 <brion> does langconvert return html same as /page/html/{foo}?
276​21:47:33 <cscott> we can figure out some cool way to unify these later, maybe.
277​21:48:00 <cscott> brion: yeah, sorry. it should be like tim wrote it. /page/langconvert/en-gb/html/...
278​21:48:07 <gwicke> brion, it would be a mirror of the page hierarchy
279​21:48:21 <brion> hmm, that sounds ok for that
280​21:48:23 <cscott> or /page/langconvert/en-gb/page/html/... even.
281​21:48:24 <gwicke> so /api/rest_v1/page/variant/zh-yue/html/Foo
282​21:48:36 <brion> but if we add a second option, how do we reconcile the two tree prefixes?
283​21:49:19 <gwicke> a second option for language selection?
284​21:49:28 <brion> or is it safe to in future extend semantics of /page/variant/zh-yue/html/Foo to support /page/variant/fr/html/Foo ?
285​21:49:29 <cscott> brion: best case: I take everything after /langconvert/{code} and pass it back into REST, and do the language conversion on the output.
286​21:49:31 <gwicke> are you thinking about regions?
287​21:49:35 <brion> gwicke: for target language that isn't a variant
288​21:49:52 <cscott> so if /page/coolness/ is ever a thing, then /page/langconvert/en-gb/coolness/... will Just Work.
289​21:49:55 <gwicke> does it matter whether it's a variant?
290​21:50:03 * DanielK_WMDE is good with /page/langconvert/{foo}
291​21:50:08 <subbu> .. /langconvert/<content_lang>:<ui_lang>/ if ever that ui_lang needs to be added? otherwise /langconvert/<content_lang>/ works?
292​21:50:33 <subbu> .. /page/langconvert/... i mean
293​21:50:35 <cscott> subbu: ui_lang is actually part of template expansion, not language conversion. sadly.
294​21:50:36 <DanielK_WMDE> gwicke: what do you mean by "language selection" exactly?
295​21:50:42 <gwicke> zh.wikipedia.org/api/rest_v1/page/lang/en-gb/html/Foo
296​21:50:47 <cscott> ie, it influences how the {{int
297​21:50:53 <cscott> }} template is expanded.
298​21:51:04 <subbu> oye.
299​21:51:05 <gwicke> DanielK_WMDE, select the content language
300​21:51:19 <brion> langconvert feels like a very specialized filter, like mobile-text
301​21:51:33 <cscott> so it would be /page/langconvert/en-gb/ui_lang/de/html/ArticleTitle, in one version of the future.
302​21:51:39 <DanielK_WMDE> gwicke: does that select where the content is loaded from? i.e. the project?
303​21:51:40 <gwicke> one thing I'm concerned about with schemes like this is what it does to the API documentation
304​21:51:57 <gwicke> it will basically duplicate the bulk of the API docs in a second hierarchy
305​21:52:13 <TimStarling> I would be happy to approve a range of possible path-based schemes at this point
306​21:52:22 <cscott> gwicke: some of the api endpoints shouldn't be necessary for /langconvert/
307​21:52:26 <TimStarling> with the exact scheme at the discretion of the implementor
308​21:52:34 <cscott> ie, listing revisions. that can be done on the main /page endpoint.
309​21:53:04 <brion> cscott: revision comments need to be run through the converter don't they?
310​21:53:07 <gwicke> yeah, which makes it even more subtle
311​21:53:16 <cscott> i'd like to suggest that we discuss DanielK_WMDE's general language questions in a follow-up meeting, not too long from now.
312​21:53:33 <cscott> brion: those come from the action api, not from rest.
313​21:53:42 <robla> it sounds like there's a tradeoff between cachable URL schemes and ease of documentation with tools like Swagger
314​21:53:46 <brion> ugh
315​21:53:56 <cscott> brion: and parsoid doesn't really implement "revision comment" parsing, which differs from normal parsing in a bunch of obscure and painful ways.
316​21:53:59 <TimStarling> I don't want to bikeshed, I just want it to be done
317​21:54:20 <Scott_WUaS> cscott: sounds good - i'd like to suggest that we discuss DanielK_WMDE's general language questions
318​21:54:31 <gwicke> TimStarling, the reason we wrote this RFC is that we want to do this consistently with the general strategy of language selection
319​21:54:34 <gwicke> so lets not rush it
320​21:54:59 * subbu is happy with path-based schemes
321​21:55:27 <robla> I'm happy to help someone (cscott?) to come up with a concise list of open questions for this RFC
322​21:55:34 <cscott> well, i think that variant conversion is currently "next" on my plate, after balanced templates. but it will still be a while before any patch i write is actually ready to be deployed into production.
323​21:55:34 <gwicke> it wouldn't make sense to have several different path-based ways of selecting language variants, for example
324​21:56:02 <cscott> robla: i think we've got a reasonable consensus on an interim solution, but concern over the more general questions of DanielK_WMDE is preventing us from finalizing anything.
325​21:56:07 <cscott> (which i actually agree with)
326​21:56:29 <cscott> so i think the way to make further progress here is to actually grapple with the more general url scheme question, then return here and see if the solution to that problem bears on this one.
327​21:57:01 <gwicke> what we are looking for is basically option 2
328​21:57:08 <DanielK_WMDE> i don't want to derail or stonewall this or related rfcs.
329​21:57:18 <gwicke> a uniform path-based way of selecting language variants
330​21:57:21 <brion> are we otherwise happy with the notion of zh.wikipedia.org/api/rest_v1/page/lang/zh-hant/html/Foo with the open question of whether zh-hant can be replaced with en/fr/etc in a way that will be consistent?
331​21:57:29 <cscott> well, we're not holding anything up until i've actually got a patch in hand. which i don't yet.
332​21:57:43 <DanielK_WMDE> i just want to make sure we have a good concept of how we handle languages in general
333​21:57:54 <brion> or do we need to ponder more before committing to that model?
334​21:57:57 <subbu> brion, i think DanielK_WMDE preferred /langconvert/ over /lang/ i think unless i misunderstood it.
335​21:58:12 <brion> ist egal zu mir, as the germans say :D
336​21:58:15 <SMalyshev> I think /page/lang/ would be the most neutral one without overfocusing the semantics
337​21:58:15 <brion> i'll take langconvert
338​21:58:23 <gwicke> the issue with /langconvert/ et al is that it's a one-off solution for the REST API
339​21:58:24 <DanielK_WMDE> subbu, brion: i'm good with /lang/. "convert" is an implementation detail.
340​21:58:30 <brion> though lang is happy yeah
341​21:58:41 * brion "take it to #wikimedia-bikeshed!" ;)
342​21:58:42 <gwicke> rather than something that will work for articles as well
343​21:58:50 <DanielK_WMDE> subbu: i just don't want /variant/, because i think it's too narrow
344​21:58:50 <robla> DanielK_WMDE: cscott : is there an action item for DanielK_WMDE to write up a generalized RFC for URL policy?
345​21:59:06 <subbu> DanielK_WMDE, ok .. thanks for clarifying.
346​21:59:16 <cscott> gwicke: yeah, but a one off solution might be enough for now. it might turn out that the more general /page/html/lang/foo/balh solution internally dispatches to /page/langconvert/ to do the actual language conversion part.
347​21:59:26 <brion> TimStarling: what say you? we're coming up on time
348​21:59:36 <TimStarling> yes, fine
349​21:59:38 <DanielK_WMDE> robla: i could write an rfc that is just about terms and concepts, not about code at all.
350​21:59:41 <cscott> so maybe /page/langconvert doesn't actually have to be a part of the public api in the end. but it's a useful narrow solution to the immediate implementation issue.
351​21:59:44 <gwicke> cscott, that doesn't make sense
352​21:59:53 <gwicke> the url you propose is already in use
353​22:00:19 <cscott> i don't like /lang/ specifically because it's more general than i'm happy with right now. i'm not convinced that language converter and the other languages involved can be unified in the end.
354​22:00:25 <brion> agh did i mean /api/rest_v1/lang/zh-hant/page/html/Foo ?
355​22:00:32 <cscott> maybe they can be. but at the moment i'd like a narrow solution to a specific problem.
356​22:00:42 <TimStarling> time's up now
357​22:01:02 <gwicke> brion, it might make sense to shift it up one or more levels, yes
358​22:01:23 <brion> [it may be worth considering it a filter like /page/mobile-text/{title} that might go away some day in favor of a more general solution]
359​22:01:25 <gwicke> also in the running: /zh-yue/api/rest_v1/...
360​22:01:32 <gwicke> and /zh-yue/wiki/...
361​22:01:33 <TimStarling> who is going to update the RFC page? gwicke or cscott?
362​22:01:44 <robla> May 4 meeting: https://phabricator.wikimedia.org/E169 about PSR-6
363​22:01:49 <cscott> i think it's gwicke's RFC
364​22:02:04 <gwicke> we should perhaps update DanielK_WMDE's RFC as well
365​22:02:18 <cscott> brion: "[it may be worth considering it"... yes, that's what i'm suggesting.
366​22:02:24 <DanielK_WMDE> gwicke: in what way?
367​22:02:26 <TimStarling> #action gwicke to update T122942 to summarise the options discussed here and remove the rejected option
368​22:02:27 <stashbot> T122942: RFC: Support language variants in the REST API - https://phabricator.wikimedia.org/T122942
369​22:02:37 <brion> cscott: +1
370​22:03:25 <TimStarling> #action DanielK_WMDE to write an RFC discussing the philosophical nature of language
371​22:03:37 <robla> lol
372​22:03:44 <DanielK_WMDE> TimStarling: hehe ;)
373​22:03:49 <Scott_WUaS> :) +1
374​22:03:53 <brion> haha
375​22:03:58 <TimStarling> #endmeeting
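
Editor's sketch (not part of the log): cscott's "best case" at 21:49:29 is to take everything after /langconvert/{code}, pass it back into REST, and run language conversion on the output. The Python below illustrates only the path-splitting half of that idea, under stated assumptions. The /page/langconvert/ prefix is one of several candidates discussed (alongside /page/variant/ and /page/lang/) and was not agreed on; the names PREFIX, ConvertRequest and split_langconvert are hypothetical and do not correspond to any existing RESTBase or Parsoid code.

```python
from typing import NamedTuple, Optional

# Hypothetical prefix; the meeting did not settle on a name.
PREFIX = "/page/langconvert/"


class ConvertRequest(NamedTuple):
    variant: Optional[str]  # e.g. "en-gb"; None when no conversion is requested
    inner_path: str         # path to hand back to the ordinary /page hierarchy


def split_langconvert(path: str) -> ConvertRequest:
    """Split a langconvert-style path into (variant, inner /page path).

    Paths without the prefix are returned unchanged with variant=None,
    so existing /page/... URLs keep their current meaning.
    """
    if not path.startswith(PREFIX):
        return ConvertRequest(None, path)
    rest = path[len(PREFIX):]
    variant, sep, tail = rest.partition("/")
    if not variant or not sep or not tail:
        raise ValueError(f"expected {PREFIX}{{code}}/..., got {path!r}")
    # Everything after the variant code mirrors the normal /page tree.
    return ConvertRequest(variant, "/page/" + tail)


if __name__ == "__main__":
    print(split_langconvert("/page/langconvert/en-gb/html/Foo"))
    print(split_langconvert("/page/html/Foo"))
```

Because the remainder is forwarded untouched, a future endpoint such as the /page/coolness/ example from the log would be reachable under the prefix without extra routing code, which is the property cscott describes as "will Just Work". The sketch does not address gwicke's concern that the prefixed tree duplicates the API documentation.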

daniel renamed this event from RFC Meeting: Support language variants in the REST API (2016-04-27, #wikimedia-office) to ArchCom RFC Meeting Wxx: <topic TBD> (<see "Starts" field>, #wikimedia-office).Nov 21 2016, 6:11 PM
daniel changed the host of this event from RobLa-WMF to daniel.
daniel updated the event description. (Show Details)
ssastry renamed this event from ArchCom RFC Meeting Wxx: <topic TBD> (<see "Starts" field>, #wikimedia-office) to RFC Meeting: Support language variants in the REST API (2016-04-27, #wikimedia-office).Nov 30 2016, 4:50 PM
ssastry changed the start date for this event from Apr 27 2016, 9:00 PM to Apr 27 2016, 9:00 PM.
ssastry changed the end date for this event from Apr 27 2016, 10:00 PM to Apr 27 2016, 10:00 PM.
daniel updated the event description. (Show Details)Dec 9 2016, 7:43 AM