
Better varargs for templates
Open, Needs Triage, Public

Description

The current ad hoc mechanisms used to pass a variable number of arguments to a template/parser function/Scribunto module don't play nicely with TemplateData (and thus with VisualEditor). Typically you see something like:

{{MyBigTemplate
|file1=....
|caption1=....
|file2=....
|caption2=....
}}

But then you have to localize all these parameter names (the file part) and ideally handle the fact that different languages use different numeral systems and potentially even need to conjugate the parameter name to agree with the quantity. (Different suffixes for 1, 2, or 3 or more of a kind....) TemplateData doesn't support this type of argument cleanly, and maybe shouldn't...
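To make the cost of the numbered-suffix convention concrete, here is a minimal Python sketch of the bookkeeping a module has to do today to rebuild a list from file1, caption1, file2, and so on. It is not MediaWiki or Scribunto code: the function name `collect_numbered` and the dict-of-strings argument model are assumptions made purely for illustration.

```python
import re

def collect_numbered(args, bases):
    """Group parameters like file1/caption1 into one record per index.

    `args` models the template's named arguments as a plain dict of strings;
    `bases` lists the (unlocalized) parameter name stems to look for.
    """
    # Deliberately ASCII-only digits: this is the assumption that breaks
    # as soon as a wiki localizes the stem or uses another numeral system.
    pattern = re.compile(r"^(%s)([0-9]+)$" % "|".join(map(re.escape, bases)))
    records = {}
    for key, value in args.items():
        match = pattern.match(key)
        if match:
            base, index = match.group(1), int(match.group(2))
            records.setdefault(index, {})[base] = value
    # Return the records ordered by their numeric suffix.
    return [records[i] for i in sorted(records)]

args = {"file1": "A.png", "caption1": "The letter 'A'",
        "file2": "B.png", "caption2": "The letter 'B'"}
gallery = collect_numbered(args, ["file", "caption"])
```

In this model a call that uses a localized stem (say `datei1`) simply matches nothing unless the module is also told about every translated name, which is the localization burden described above.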

So what would be better?

Scarecrow #0: Let's just work with what the editors are already doing, and try to find some way to describe this pattern in TemplateData, along with good tools to localize the parameter names.

Scarecrow #1: varargs arguments should really take JSON inputs, and we can use heredoc syntax (T114432) to make embedding JSON not-so-awful:

{{MyBigTemplate
|files=<<<[
{"name": "...", "caption":"..."},
{"name": "...", "caption":"..."},
{"name": "...", "caption":"..."},
{"name": "...", "caption":"..."}
]>>>}}

This would presumably look fine in VE when editing, and getting TemplateData to describe a typed JSON argument is probably not too hard, but it's a jarring syntax shift.
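As a rough illustration of what the receiving side of Scarecrow #1 might do, the sketch below decodes and sanity-checks the heredoc body. It assumes the heredoc delimiters have already been stripped by the parser; `parse_files_argument` and the rule that every entry needs a `name` are hypothetical, not part of any existing API.

```python
import json

def parse_files_argument(raw):
    """Decode a JSON array of {"name", "caption"} objects passed as one argument."""
    items = json.loads(raw)  # raises ValueError (JSONDecodeError) on bad syntax
    if not isinstance(items, list):
        raise ValueError("expected a JSON array")
    for entry in items:
        if not isinstance(entry, dict) or "name" not in entry:
            raise ValueError("each entry needs at least a 'name'")
    return items

raw = '''[
{"name": "A.png", "caption": "The letter 'A'"},
{"name": "B.png", "caption": "The letter 'B'"}
]'''
files = parse_files_argument(raw)
```

The type check is the part TemplateData would presumably have to describe: the argument is not a string but an array of records with known keys.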

Scarecrow #1b: Keeping a fundamental "JSON array" semantics, we could use helper templates to make the syntax look nicer for humans:

{{MyBigTemplate
|files=<<<
{{item|name=...|caption=....}}
{{item|name=...|caption=<<<...>>>}}
>>>}}

Some hand-waving here to make sequences of {{item}} generate a JSON array with the proper head/tail.
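One way the hand-waving could be filled in, sketched in Python rather than wikitext: each {{item}} emits a JSON object followed by a comma, and the consumer trims the last comma and adds the brackets. The names `item` and `wrap_items` and the trailing-comma convention are assumptions for illustration only.

```python
import json

def item(**fields):
    """Model {{item|...}}: emit one JSON object followed by a separator."""
    return json.dumps(fields) + ","

def wrap_items(body):
    """Model the head/tail step: turn a run of item outputs into a JSON array."""
    body = body.strip()
    if body.endswith(","):
        body = body[:-1]  # drop the separator after the last item
    return json.loads("[" + body + "]")

body = "\n".join([
    item(name="A.png", caption="The letter 'A'"),
    item(name="B.png", caption="a caption with | and }} in it"),
])
files = wrap_items(body)
```

Because the helper does the JSON quoting, the editor never has to escape quotes or special characters by hand.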

Scarecrow #2: Something "wikitext-native", which reuses our existing parameter syntax.

{{MyBigTemplate
|files=
<<<name=... | caption=...>>>
<<<name=... | caption=...>>>
<<<name=... | caption=<<<
  really long caption
>>> >>>
}}

In this last option we'd take advantage of the fact that we don't have any special behavior defined for >>>\s*<<< and use that as a magic marker to indicate "pass this sequence of things as an array". The template author would have to be aware that they would either get a string as an argument (if there was only one item) or an array (if there was more than one).
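The string-or-array behavior can be sketched as follows. This is a simplified Python model, not preprocessor code: it only handles un-nested items (a nested heredoc like the long caption above would need a real parser), and `split_heredoc_sequence` is a hypothetical name.

```python
import re

def split_heredoc_sequence(value):
    """Return a str for one <<<...>>> item, or a list of str for several.

    Only handles un-nested items; nested heredocs would need a real parser.
    """
    value = value.strip()
    if not (value.startswith("<<<") and value.endswith(">>>")):
        return value  # ordinary argument, passed through unchanged
    inner = value[3:-3]
    # The "magic marker": a close immediately followed by an open.
    parts = re.split(r">>>\s*<<<", inner)
    return parts[0] if len(parts) == 1 else parts

one = split_heredoc_sequence("<<<name=A.png | caption=A>>>")
many = split_heredoc_sequence(
    "<<<name=A.png | caption=A>>>\n<<<name=B.png | caption=B>>>"
)
```

The two return types are exactly the wrinkle mentioned above: template code would have to branch on whether it received a single string or a list.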

I'm not too thrilled with any of these options. Do folks have better ideas?

Event Timeline

Amire80 subscribed.

+ Million for the general idea.

I've wondered for many years why there isn't a proper mechanism to pass a real dynamic array, without a predefined length. The current common way, predefining an arbitrary number of parameters with names like author1, author2, author3, is quite atrocious.

I don't care very much about the actual implementation. Passing JSON can probably work, although it doesn't seem entirely optimal. For example, it still relies on everything being just strings, while I imagine future templates and modules having true and varied data types. A more dedicated syntax for an array could probably be nice.

> and potentially even need to conjugate the parameter name to agree with the quantity. (Different suffixes for 1, 2, or 3 or more of a kind....)

I wonder whether that's actually true. In natural languages you have such conjugations, but template parameter names aren't necessarily "natural language".

> Scarecrow #1: var args arguments should really take JSON inputs,

Embedding JSON blobs in wikitext seems like a good idea to programmers like us, but I'm not so sure it would make sense to wiki editors who aren't already extremely familiar with JSON's quoting and such.

> Some hand-waving here to make sequences of {{item}} generate a JSON array with the proper head/tail.

Indeed. The general idea is better than JSON for wikitext, why not lose the JSON entirely?

It wouldn't surprise me if someone already has done this somewhere, having the equivalent of the {{item}} template output a delimited record and parsing that record format in a module.
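A JSON-free variant along these lines might look like the sketch below, written in Python for illustration. The choice of the ASCII record and unit separators as delimiters is an assumption (whether such characters would survive the wikitext pipeline is not established here), as are the names `item` and `parse_records`.

```python
RECORD_SEP = "\x1e"  # ASCII record separator, assumed never to occur in values
UNIT_SEP = "\x1f"    # ASCII unit separator between fields
FIELDS = ("name", "caption")

def item(name, caption=""):
    """Model {{item|name=...|caption=...}}: emit one delimited record."""
    return UNIT_SEP.join((name, caption)) + RECORD_SEP

def parse_records(text):
    """Split the concatenated item outputs back into a list of dicts."""
    records = []
    for chunk in text.split(RECORD_SEP):
        # Not a bare strip(): Python treats the separators as whitespace too.
        chunk = chunk.strip(" \t\r\n")
        if chunk:
            records.append(dict(zip(FIELDS, chunk.split(UNIT_SEP))))
    return records

body = item("A.png", "The letter 'A'") + "\n" + item("B.png")
files = parse_records(body)
```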

> Scarecrow #2: Something "wikitext-native", which reuses our existing parameter syntax.

That doesn't really reuse our existing parameter syntax. And the nested "<<< >>>" I see in there is pretty ugly.

> Indeed. The general idea is better than JSON for wikitext, why not lose the JSON entirely?
>
> It wouldn't surprise me if someone already has done this somewhere, having the equivalent of the {{item}} template output a delimited record and parsing that record format in a module.

I agree wrt "why JSON"; I'd really like something better. But it does seem to be the de facto standard -- I've seen a number of modules accept a JSON parameter, or extensions accept a JSON body (i.e., <mapframe> in Extension:Kartographer).

I wonder if we're over-thinking this. Perhaps all we need is to allow the same parameter name to be repeated:

{{MyBigTemplate
|file=....
|caption=....
|file=....
|caption=....
}}

And provide a new alternative to {{{file}}} that would allow accessing the values individually or as a group. Strawman:

{{{#|file}}} - the number of varargs
{{{#1|file}}} - the first vararg in file
{{{#2|file}}} - the second vararg in file
{{{#=|file}}} - all varargs - used like so:  {{subtemplate|arg|{{{#=|file}}}|{{{#=|caption}}}|other=argument}}

Conceptually, the #= modifier expands to file={{{#1|file}}}|file={{{#2|file}}}|...|file={{{#n|file}}} to pass all the varargs. (Presumably this would be a special case in the preprocessor and wouldn't actually need to be expanded out.)
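The intended semantics of the strawman accessors can be modeled in a few lines of Python. This is only a sketch of the proposal, not of any existing preprocessor behavior; the class `VarArgs`, its method names, and the choice to return an empty string for an out-of-range index are all assumptions.

```python
class VarArgs:
    """Model of a template call in which a parameter name may repeat."""

    def __init__(self, pairs):
        # `pairs` is the ordered list of (name, value) from the call site.
        self.values = {}
        for name, value in pairs:
            self.values.setdefault(name, []).append(value)

    def count(self, name):
        """{{{#|name}}}: the number of varargs."""
        return len(self.values.get(name, []))

    def nth(self, n, name, default=""):
        """{{{#n|name}}}: the n-th vararg, counting from 1."""
        items = self.values.get(name, [])
        return items[n - 1] if 1 <= n <= len(items) else default

    def expand_all(self, name):
        """{{{#=|name}}}: all varargs, as name=v1|name=v2|..."""
        return "|".join(f"{name}={v}" for v in self.values.get(name, []))

call = VarArgs([
    ("file", "A.png"), ("caption", "The letter 'A'"),
    ("file", "B.png"), ("caption", "The letter 'B'"),
])
```

Here `call.count("file")` corresponds to {{{#|file}}}, `call.nth(2, "file")` to {{{#2|file}}}, and `call.expand_all("file")` to the conceptual expansion of {{{#=|file}}}.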

Note that {{{file|1}}} can't be used because that syntax specifies 1 as the default value if file is not present. And {{{1|file}}} specifies the first numbered parameter with a default value. And we've indexed varargs starting at 1 to be consistent with numbered arguments, which also start at {{{1}}}. But # isn't a valid numeric or named parameter, so it can be used as an unambiguous prefix for the new varargs functionality.

> But # isn't a valid numeric or named parameter, so it can be used as an unambiguous prefix for the new varargs functionality.

That's incorrect, it's perfectly valid as a named parameter.

> I wonder if we're over-thinking this. Perhaps all we need is to allow the same parameter name to be repeated:
>
> {{MyBigTemplate
> |file=....
> |caption=....
> |file=....
> |caption=....
> }}

A drawback to that syntax is that it doesn't allow you to skip a caption. For a silly example,

{{MyBigTemplate
|file=A.png
|caption=The letter 'A'
|file=B.png
|file=C.png
|caption=The letter 'C'
}}

would have {{{#2|file}}} (by whatever syntax) as "B.png" but {{{#2|caption}}} as "The letter 'C'".

But maybe that's ok?
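The misalignment is easy to reproduce in a small Python model of repeated parameters. The helper `group` and the list-of-pairs representation of the call are assumptions for illustration; no such function exists in MediaWiki.

```python
def group(pairs):
    """Collect repeated parameters into per-name lists, in call order."""
    grouped = {}
    for name, value in pairs:
        grouped.setdefault(name, []).append(value)
    return grouped

call = [
    ("file", "A.png"), ("caption", "The letter 'A'"),
    ("file", "B.png"),
    ("file", "C.png"), ("caption", "The letter 'C'"),
]
args = group(call)

second_file = args["file"][1]        # what {{{#2|file}}} would yield
second_caption = args["caption"][1]  # what {{{#2|caption}}} would yield
```

With three files but only two captions, the second caption is the one written next to C.png, so pairing the lists by index attaches it to B.png.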

>> But # isn't a valid numeric or named parameter, so it can be used as an unambiguous prefix for the new varargs functionality.
>
> That's incorrect, it's perfectly valid as a named parameter.

Yuck. It's hopefully (?) uncommon enough we could lint uses away then steal it? Other prefix symbol suggestions welcome as well.

> A drawback to that syntax is that it doesn't allow you to skip a caption.

Yeah, no arrays with holes. You'd have to specify |caption=| as a placeholder to keep everything lined up. The implication is that a "variable length list of optionally-typed parameters" doesn't work very well, i.e. if you could specify video1 or sound1 or media1 to go along with each caption1. The merged video / sound / media varargs arrays end up with a lot of placeholders.

It could be mitigated if there were support for structure elsewhere, so it was just entry=...|entry=...|entry=... and each entry had a caption and one of several optional properties, etc. But that seems like an orthogonal issue.

Returning to this after some time: maybe the full syntax should be:

|file[1]=....
|caption[1]=....
|file[2]=....
|caption[2]=...

with the repeated-argument form used as a shorthand? So for most cases:

|file=
|caption=
|file=
|caption=

would work fine, and localizers wouldn't have any issues with conjugation or localized numerals, but for unusual cases (where you were skipping a caption, for instance), you could fall back to the explicit form:

|file=
|caption=
|file=
|file=
|caption[3]=

That allows for 'arrays with holes' in the odd corner cases where they are useful.
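A sketch of how the shorthand and the explicit form could be resolved into indices, in Python. The rule used here is an assumption borrowed from PHP-style array appends: a bare `name=` takes the highest index seen so far for that name plus one, while `name[n]=` sets index n directly. `resolve_indices` is a hypothetical helper, not an existing function.

```python
import re

def resolve_indices(pairs):
    """Assign 1-based indices to repeated and explicitly indexed parameters.

    Assumption (PHP-like): a bare `name=` takes the previous highest index
    for that name plus one; `name[n]=` sets index n explicitly.
    """
    arrays = {}
    for key, value in pairs:
        match = re.fullmatch(r"(.+)\[([0-9]+)\]", key)
        if match:
            name, index = match.group(1), int(match.group(2))
        else:
            name = key
            index = max(arrays.get(name, {}), default=0) + 1
        arrays.setdefault(name, {})[index] = value
    return arrays

call = [
    ("file", "A.png"), ("caption", "The letter 'A'"),
    ("file", "B.png"),
    ("file", "C.png"), ("caption[3]", "The letter 'C'"),
]
arrays = resolve_indices(call)
```

Under this rule the caption array has entries at 1 and 3 and nothing at 2, which is the hole the explicit form is meant to express.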
I think some of the special forms are still useful:

{{{#|file}}} - the maximum index in the vararg (same as how PHP handles $foo[])
{{{#=|file}}} - all varargs as a list, preserving holes

but then for individual arguments, should we be using {{{#1|file}}} or instead {{{file[1]}}} ?

Personally, I think I'm going to conclude that arrays with holes are more trouble than they are worth. The array index syntax looks bad and probably doesn't localize well. Figuring out the right index to use in order to properly 'reset' the numbering after a hole is certainly not obvious to an editor (counting from the beginning of a long list will get tedious!), and the numbering will break horribly if something is added earlier in the list. The whole point of not using the current de facto syntax was to avoid having to make a bunch of nonlocal changes to indexes further down in the list when you insert something at the start, and we've reintroduced that.

So after thinking about it out loud, in case it spurs further discussion, I'm going to fall back to my earlier recommendation, T204366#5787301, and say that arrays with holes are explicitly not supported.

For parallel structures, structured values are probably best, e.g. |entry={{GalleryEntry|file=Foo|caption=bar}} or even just {{GalleryEntry|file=Foo}}, which can expand to both a file= and a caption= entry. That latter form presumes that "key-value pairs" are one of the primitive "types" that templates can expand to, but I think that's pretty accepted/acceptable, given the frequency of usage of templates of that general form on wiki. There's a wrinkle that key names aren't unique, and that the pair list is therefore also ordered, but I think that's acceptable/expected.
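The "templates expand to ordered key-value pairs" idea can be sketched like this in Python. `gallery_entry` and `build_call` are hypothetical stand-ins for {{GalleryEntry}} and for the splice into the outer call; the empty-string caption placeholder is an assumption about how a missing caption would be represented.

```python
def gallery_entry(file, caption=None):
    """Model {{GalleryEntry|...}} expanding to ordered key-value pairs.

    Always emits both keys, so the file and caption lists stay aligned.
    """
    return [("file", file), ("caption", caption if caption is not None else "")]

def build_call(*expansions):
    """Splice each expansion into one ordered argument list; keys may repeat."""
    args = []
    for pairs in expansions:
        args.extend(pairs)
    return args

args = build_call(
    gallery_entry("A.png", "The letter 'A'"),
    gallery_entry("B.png"),
    gallery_entry("C.png", "The letter 'C'"),
)
files = [value for key, value in args if key == "file"]
captions = [value for key, value in args if key == "caption"]
```

Because every entry contributes exactly one file and one caption, the two lists cannot drift apart, and no index ever appears at the call site.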

Many languages with both numbered and named parameters (as wikitext and Python have) also accommodate this in their varargs, which we should probably consider. That is, "the first item" of the vararg list might have both a *name* and a *number*. In theory {{{#=|file}}} (or whatever syntax we choose) should be able to generate not only x|y|z but also k1=x|k2=y|k3=z as needed. More thought needed here.
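As a starting point for that further thought, here is a small Python sketch of a vararg list whose items carry an optional name, rendered either positionally or by name. The function `expand_varargs`, the `named` switch, and the fallback to the position number for unnamed items are all assumptions, not a proposed syntax.

```python
def expand_varargs(items, named=False):
    """Render a vararg list as `x|y|z` or as `k1=x|k2=y|k3=z`.

    `items` is an ordered list of (name, value); name may be None for
    purely positional entries, which then fall back to their position.
    """
    parts = []
    for position, (name, value) in enumerate(items, start=1):
        if named:
            key = name if name is not None else str(position)
            parts.append(f"{key}={value}")
        else:
            parts.append(value)
    return "|".join(parts)

items = [("k1", "x"), ("k2", "y"), ("k3", "z")]
positional = expand_varargs(items)
by_name = expand_varargs(items, named=True)
```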