The current ad hoc mechanisms for passing a variable number of arguments to a template, parser function, or Scribunto module don't play nicely with TemplateData (and thus with VisualEditor). Typically you see something like:
{{MyBigTemplate |file1=.... |caption1=.... |file2=.... |caption2=.... }}
But then you have to localize all these parameter names (the file part), and ideally handle the fact that different languages use different numeral systems and may even need to inflect the parameter name to agree with the quantity. (Different suffixes for 1, 2, or 3 or more of a kind...) TemplateData doesn't support this type of argument cleanly, and maybe shouldn't...
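To make the pain concrete, here is a rough Python sketch of the regrouping a template or module has to do today to turn numbered parameters back into a list. The function name `group_numbered_params` and the pattern are hypothetical, not an existing MediaWiki API. It deliberately recognizes only ASCII digits, so a localized parameter written with another numeral system falls through silently.

```python
import re
from collections import defaultdict

# Hypothetical pattern: a run of letters followed by ASCII digits,
# e.g. "file1" or "caption12". Other numeral systems are NOT matched.
NUMBERED = re.compile(r"([^\W\d_]+)([0-9]+)")


def group_numbered_params(params):
    """Group {"file1": ..., "caption1": ...} into one dict per index."""
    groups = defaultdict(dict)
    for key, value in params.items():
        match = NUMBERED.fullmatch(key)
        if match is None:
            continue  # unnumbered (or differently numbered) parameter
        base, index = match.group(1), int(match.group(2))
        groups[index][base] = value
    # Return the records ordered by their numeric suffix.
    return [groups[i] for i in sorted(groups)]
```

Every template that uses this convention needs some version of this logic, plus whatever extra handling each language's parameter names and numerals require.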
So what would be better?
Scarecrow #0: Let's just work with what the editors are already doing, and try to find some way to describe this pattern in TemplateData, along with good tools to localize the parameter names.
Scarecrow #1: Variadic arguments should really take JSON input, and we can use heredoc syntax (T114432) to make embedding JSON not-so-awful:
{{MyBigTemplate |files=<<<[ {"name": "...", "caption":"..."}, {"name": "...", "caption":"..."}, {"name": "...", "caption":"..."}, {"name": "...", "caption":"..."} ]>>>}}
This would presumably look fine in VE when editing, and getting TemplateData to describe a typed JSON argument is probably not too hard, but it's a jarring syntax shift.
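On the receiving side, decoding such an argument could be as simple as the Python sketch below. It assumes the raw value may still carry its `<<<` and `>>>` delimiters (the parser might well strip them first), and `parse_json_argument` is a hypothetical name.

```python
import json


def parse_json_argument(raw):
    """Decode a heredoc-wrapped JSON array argument into a list."""
    text = raw.strip()
    # Assumption: the delimiters may or may not have been stripped already.
    if text.startswith("<<<") and text.endswith(">>>"):
        text = text[3:-3]
    value = json.loads(text)
    if not isinstance(value, list):
        raise ValueError("expected a JSON array")
    return value
```

Malformed JSON raises `json.JSONDecodeError` here; a real implementation would presumably want to turn that into a friendlier editor-facing error.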
Scarecrow #1b: Keeping a fundamental "JSON array" semantics, we could use helper templates to make the syntax look nicer for humans:
{{MyBigTemplate |files=<<< {{item|name=...|caption=....}} {{item|name=...|caption=<<<...>>>}} >>>}}
Some hand-waving here to make sequences of {{item}} generate a JSON array with the proper head/tail.
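One way to fill in the hand-waving, sketched in Python: each {{item}} expands to a JSON object followed by a comma, and the receiving side trims the final comma and adds the brackets. Both function names are hypothetical, and this is only one possible scheme for the head/tail handling.

```python
import json


def render_item(**params):
    """Hypothetical {{item}} expansion: one JSON object plus a trailing comma."""
    return json.dumps(params, ensure_ascii=False) + ","


def assemble_array(expanded):
    """Wrap a sequence of expanded items in brackets and decode it."""
    # Drop surrounding whitespace and the comma left by the last item.
    body = expanded.strip().rstrip(",")
    return json.loads("[" + body + "]")
```

Whitespace between the items is harmless because JSON ignores it, but any stray text between items would make the decode fail.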
Scarecrow #2: Something "wikitext-native", which reuses our existing parameter syntax.
{{MyBigTemplate |files= <<<name=... | caption=...>>> <<<name=... | caption=...>>> <<<name=... | caption=<<< really long caption >>> >>> }}
In this last option we'd take advantage of the fact that we don't have any special behavior defined for >>>\s*<<< and use that as a magic marker to indicate "pass this sequence of things as an array". The template author would have to be aware that they would either get a string as an argument (if there was only one item) or an array (if there was more than one).
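A rough Python sketch of how that splitting might work, including the nested heredoc from the example. The names are hypothetical. As a simplification it tracks nesting depth and collects each top-level `<<<...>>>` body, ignoring whatever sits between items rather than checking that it is only whitespace.

```python
def split_heredoc_sequence(value):
    """Return the bodies of the top-level <<<...>>> blocks in value."""
    items = []
    depth = 0   # current heredoc nesting level
    start = 0   # index where the current top-level body begins
    i = 0
    while i < len(value):
        if value.startswith("<<<", i):
            if depth == 0:
                start = i + 3
            depth += 1
            i += 3
        elif depth > 0 and value.startswith(">>>", i):
            depth -= 1
            if depth == 0:
                items.append(value[start:i])
            i += 3
        else:
            i += 1
    return items


def heredoc_argument(value):
    """Give the template a string for one item, a list otherwise."""
    items = split_heredoc_sequence(value)
    return items[0] if len(items) == 1 else items
```

The last function shows the wart: the template author has to check whether they received a string or a list before using the argument.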
I'm not too thrilled with any of these options. Do folks have better ideas?