Guess parameters for templates without TemplateData
Closed, Resolved | Public | 5 Estimated Story Points

Description

If a template has no templatedata, we should attempt to extract the basic parameter names from the template and present them as-is (i.e. with basic text inputs and no labels or types or defaults etc.).

As is done by
https://en.wikipedia.org/wiki/User:%D7%A7%D7%99%D7%A4%D7%95%D7%93%D7%A0%D7%97%D7%A9/TemplateParamWizard.js
and https://de.wikipedia.org/wiki/Wikipedia:Technik/Skin/Gadgets/Vorlagenmeister

(Requested at https://en.wikipedia.org/wiki/User_talk:Samwilson#Template_parameters_wizard )

The TemplateData editor now supports scanning the template source code and inferring a list of parameter names to bootstrap the documentation editor.

We can use the same logic server-side to provide a basic TemplateData object for templates lacking it, possibly with a warning output alongside the object, similar to how MediaWiki's query API communicates things like "warnings", "normalized", and "badrevids".
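
A minimal Python sketch of what such a fallback object might look like. Only the "params" key mirrors the TemplateData format; the function name, the outer "templatedata" and "warnings" keys, and the warning text are assumptions for illustration, not the actual API output.

```python
def basic_template_data(param_names):
    """Build a bare TemplateData-like object from guessed parameter names.

    Each parameter maps to an empty object: no label, type, or default,
    because none of that can be inferred from the template source.
    """
    return {
        "templatedata": {
            "params": {name: {} for name in param_names},
        },
        # Hypothetical key, loosely modelled on the "warnings" idea above.
        "warnings": ["Parameters were guessed from the template source."],
    }
```

A client could check for the warning before deciding how much to trust the parameter list.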

Event Timeline

Samwilson renamed this task from ' to Guess parameters for templates without TemplateData. (Apr 9 2018, 2:43 AM)
Samwilson updated the task description.

It seems like it might be nice to add an option to the TemplateData API, e.g. extend the doNotIgnoreMissingTitles parameter to also return guessed parameters, or add a new guessmissing parameter that does the same. No metadata about these parameters would be available, of course.

Alternatively, we could build this into the TemplateWizard (it's a matter of retrieving the raw template page's wikitext and extracting the possible parameters with /{{3,}(.*?)[<|}]/m), but that'll be an extra API call, and it does seem that this functionality could be useful for others.
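
As a rough illustration of the extraction described above, here is a minimal Python sketch. It ports the quoted regular expression as-is; the function name guess_params, the whitespace trimming, and the de-duplication step are assumptions for illustration, not the code from either patch.

```python
import re

# Python port of the expression quoted above: three or more opening braces,
# then the shortest run of text up to the first "<", "|" or "}".
RAW_PARAM = re.compile(r"\{{3,}(.*?)[<|}]", re.MULTILINE)


def guess_params(wikitext):
    """Return guessed parameter names in first-seen order, without repeats."""
    names = []
    for match in RAW_PARAM.finditer(wikitext):
        name = match.group(1).strip()
        # Skip empty names and names that were already collected.
        if name and name not in names:
            names.append(name)
    return names


print(guess_params("{{{1}}} by {{{author|unknown}}} {{{author}}}"))
# prints ['1', 'author']
```

Nested fallbacks such as {{{a| {{{b|}}} }}} are reported as two separate names, since every triple brace starts a new match.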

I agree that it seems like a better idea to add this to TemplateData. @Krinkle - thoughts?

Change 427839 had a related patch set uploaded (by Samwilson; owner: Samwilson):
[mediawiki/extensions/TemplateData@master] Guess template parameters when templatedata is missing

https://gerrit.wikimedia.org/r/427839

Change 427853 had a related patch set uploaded (by Samwilson; owner: Samwilson):
[mediawiki/extensions/TemplateWizard@master] Use rawparams for templates without templatedata

https://gerrit.wikimedia.org/r/427853

Note that the template data editor already does this, so that would be the first place to look.

Looks like you found it already.

Change 427839 merged by jenkins-bot:
[mediawiki/extensions/TemplateData@master] Guess template parameters when templatedata is missing

https://gerrit.wikimedia.org/r/427839

@Deskana James just pointed out that we didn't get your approval for the above change. My apologies about that. Do you have any objections/thoughts? It's merged but won't be out until the next train, so we have some time to follow up on any changes we need to do.

There's no description of the user problem here, so let me see if I understand first. From what I can tell, the problem here is that some templates have no template data written for them; this means that when a user tries to insert or modify a template, the visual editor doesn't tell the user what the parameters are or what they do, making the process harder for the editor. Is that right?

The best way to solve this problem is to get people to write template data for templates that are missing it. To some degree, this is a bit naive; some wikis don't have enough editors to do so, or the person that actually understands the template might not be around any more. But, it'll be very reliable, and it's the "proper" solution.

Trying to guess parameters is risky. Firstly, it might get it wrong, which potentially lets the user input something that's incorrect. Is giving an editor incorrect guidance better than giving them no guidance at all? I don't know. Secondly, it probably won't be able to provide descriptions, since those are semantic and require a person with an understanding of the template. Thirdly, there could be a paradoxical reaction where people might not bother writing template data since there's something there kind of automating it for them, perpetuating the problem. Editors frequently get used to the "magic" solution, choose not to take the time to help with the "proper" solution (which isn't unreasonable considering there's something working already), then when the fragile "magic" solution breaks, they get angry. That seems like a bit of a stretch, but things like this do happen, particularly in the editing space.

I'm not an engineer, so I can't tell purely from the code what the probability of the guesses being wrong is; if it is minuscule, then my concerns are probably huge overreactions, although I'd still prefer that the software not encourage sloppiness. I can't say I object because I don't know what the probability is that this won't work.

There's no description of the user problem here, so let me see if I understand first. From what I can tell, the problem here is that some templates have no template data written for them; this means that when a user tries to insert or modify a template, the visual editor doesn't tell the user what the parameters are or what they do, making the process harder for the editor. Is that right?

That's correct. VE instead lets them type out fields and corresponding values. The aim is to make that process easier for the user.

Trying to guess parameters is risky. Firstly, it might get it wrong, which potentially lets the user input something that's incorrect. Is giving an editor incorrect guidance better than giving them no guidance at all? I don't know.

Here's how I see it:

  • Option 1: the software does nothing to help the user, and we leave it up to them to either look at the template code and figure out what they want, or find an article that uses the same template, copy it over to the current article, and modify it.
  • Option 2: the software makes an educated guess about the template parameters. We can be wrong sometimes, but it won't be very common. The user can end up with an extraneous field or a missing field, but they can always fall back on what they would have done in Option 1 to fix it.

I understand that there are potential pitfalls to this. That's why we give the user a warning message saying that the parameters are guessed and should not be relied on absolutely, and encourage them to add TemplateData for the template.

Secondly, it probably won't be able to provide descriptions, since those are semantic and require a person with an understanding of the template.

Agreed.

Thirdly, there could be a paradoxical reaction where people might not bother writing template data since there's something there kind of automating it for them, perpetuating the problem. Editors frequently get used to the "magic" solution, choose not to take the time to help with the "proper" solution (which isn't unreasonable considering there's something working already), then when the fragile "magic" solution breaks, they get angry. That seems like a bit of a stretch, but things like this do happen, particularly in the editing space.

Maybe this could trigger the opposite reaction: people start adding TemplateData more often because the lack of descriptions and of custom inputs for field types is annoying, or because fields end up being missing or incorrect. My understanding is that a lot of people currently don't know about TemplateData or understand how it works, especially on smaller projects like Hindi, where hardly any template has TemplateData (from the 20+ templates I looked at).

I'm not an engineer, so I can't tell purely from the code what the probability of the guesses being wrong is; if it is minuscule, then my concerns are probably huge overreactions, although I'd still prefer that the software not encourage sloppiness. I can't say I object because I don't know what the probability is that this won't work.

I'll chat with Sam to see if we can get some idea on what the probability is that the parameters will be wrong. My understanding is that it will be low and only for very complex templates. We can of course improve the regex and add tests as we go along and false positives turn up.

I do have a concern about this feature going live for VE automatically without any messaging that warns people that the parameters are guessed. I'm not sure if that will be happening with this change rollout though. @Esanders could you chime in? Thanks.

I totally agree that the best solution is for people to add templatedata to templates that don't have it. But there are lots of reasons they don't, such as not knowing about it. There's also the case (I've seen it in a few places) where template documentation is shared between a bunch of templates, but the specific parameters vary a bit.

Anyway, from what I've seen so far, there aren't that many false positives with the raw parameters. (They're extracted simply by finding all the triple-braces and grabbing the text after them.)

The most common error is where there are alternative or deprecated or alias names, such as:

{{{Other_versions| {{{other_versions| {{{other versions|}}} }}} }}}

So I'll make a new patch to support collapsing these into the first found wherever there is only a space, letter-case, underscore, or hyphen difference between parameter names. Does that sound okay? Of course, there might be templates that do want to be able to have other_versions and Other_versions in the same call, but I can't find any and there are certainly more doing it this way.
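
One way the proposed collapsing could work, as a minimal Python sketch. The function name collapse_aliases and the choice of a normalised lookup key are assumptions for illustration; the thread only specifies which differences should be ignored (space, letter-case, underscore, hyphen) and that the first name found is kept.

```python
import re


def collapse_aliases(names):
    """Keep only the first of any names that differ solely by
    letter-case, spaces, underscores, or hyphens."""
    kept = {}
    for name in names:
        # Normalised key: drop spaces, underscores and hyphens, then lowercase.
        key = re.sub(r"[ _-]", "", name).lower()
        # setdefault only stores the first name seen for each key.
        kept.setdefault(key, name)
    return list(kept.values())


print(collapse_aliases(["Other_versions", "other_versions", "other versions", "date"]))
# prints ['Other_versions', 'date']
```

The trade-off noted above still applies: a template that really does treat other_versions and Other_versions as distinct parameters would lose one of them.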

We'll show a good message whenever we're using raw params, and try to encourage people to fix errors by creating templatedata. Can we link directly to the templatedata generator, or even better, to the generator prefilled with the raw params? I'm assuming not, because it's an OOUI dialog. Is it worth thinking about modifying it to support being launched by a link in some way?

I do have a concern about this feature going live for VE automatically without any messaging that warns people that the parameters are guessed. I'm not sure if that will be happening with this change rollout though. @Esanders could you chime in? Thanks.

I would expect that the API shouldn't return the guessed fields unless explicitly requested, so that VE's editor isn't changed in any way. The template editors should be kept as close as possible in UX, so after adding this feature to the old editor, we should work on getting it into VE's editor.

The doNotIgnoreMissingTitles parameter appears to be used in two places in VE - in ve.dm.TransclusionModel and in ve.ui.MWTemplateTitleInputWidget. I don't see any documentation for what it needs the missing fields for. I guess one of the places is the suggested fields feature in the TemplateData dialog. Anywhere else?

Anyway, from what I've seen so far, there aren't that many false positives with the raw parameters. (They're extracted simply by finding all the triple-braces and grabbing the text after them.)

The most common error is where there are alternative or deprecated or alias names, such as:

{{{Other_versions| {{{other_versions| {{{other versions|}}} }}} }}}

So I'll make a new patch to support collapsing these into the first found wherever there is only a space, letter-case, underscore, or hyphen difference between parameter names. Does that sound okay?

That sounds good to me. I created a ticket for that here - T193265.

https://gerrit.wikimedia.org/r/#/c/427853/ is ready for review.

But I was wondering if we wanted to make the field label for numbered parameters a bit more descriptive, because at the moment it'll look like this:

[Screenshot: Screenshot-2018-4-30 Editing Main Page - Dev Wiki wiki1.png (388×530 px, 12 KB)]

We could change the numbered ones to say something like "#1 (un-named)", or maybe add help-text of "Numbered parameter." It just feels a bit odd to have them appear as the other ones do, but then the numbers not be present when the template is inserted.
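
The first of these suggestions could be sketched as below, in Python. The function name field_label is hypothetical, and the label text is just the wording floated in this comment, not anything implemented in TemplateWizard.

```python
def field_label(param_name):
    """Return a display label for a guessed parameter name.

    Purely numeric names are positional parameters, so they get a label
    that makes clear the number will not appear in the inserted wikitext.
    """
    if param_name.isdigit():
        return f"#{param_name} (un-named)"
    return param_name
```

Named parameters pass through unchanged, so only the positional ones look different in the form.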

Switched to:

Due to missing [TemplateData], parameters for this template have been auto-generated. Please be aware that they may not be accurate.

And no changes made for now to the numbered params.

Change 427853 merged by jenkins-bot:
[mediawiki/extensions/TemplateWizard@master] Use rawparams for templates without templatedata

https://gerrit.wikimedia.org/r/427853

Niharika moved this task from Needs Review/Feedback to Q1 2018-19 on the Community-Tech-Sprint board.

Calling this done! \o/

This is excellent \o/

+1

peace

Change 445656 had a related patch set uploaded (by Jforrester; owner: Samwilson):
[mediawiki/extensions/TemplateData@REL1_31] Guess template parameters when templatedata is missing

https://gerrit.wikimedia.org/r/445656

Change 445656 merged by jenkins-bot:
[mediawiki/extensions/TemplateData@REL1_31] Guess template parameters when templatedata is missing

https://gerrit.wikimedia.org/r/445656