
Types of formatting in TemplateData based on whitespace alignment by longest parameter
Open, Needs Triage, Public

Description

On Wikipedias in many languages, the documentation tells you to edit and format infobox code like this:

{{Template
| name              = 
| short             = 
| longest parameter = 
| another           = 
}}

In some places this is only mentioned in passing in the guideline text, as on the English wiki (1, 2); in others, as on the Russian wiki, it is explicitly stated as a guideline or established by consensus: here or here.
To format infoboxes this way, you have to count the characters in the longest parameter name by hand each time, and then enter a custom format string like this:

{{_↵| ________ = _↵}}↵

(where "_" is repeated once for each character in the longest parameter name). I can safely say that this is the standard template formatting practice on the largest language editions of Wikipedia.
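
As a rough illustration of the counting step, here is a minimal Python sketch that derives such a format string from a list of parameter names. The function name `build_format_string` and the example parameter list are made up for this illustration, and the sketch writes real newlines where the string above shows ↵.

```python
def build_format_string(param_names):
    """Return a custom format string padded to the longest parameter name.

    Hypothetical helper: it only automates the manual character count
    described above, it is not part of TemplateData.
    """
    width = max(len(name) for name in param_names)
    # One underscore per character of the longest parameter name.
    return "{{_\n| " + "_" * width + " = _\n}}\n"


fmt = build_format_string(["name", "short", "longest parameter", "another"])
print(fmt)
```

For the four parameters above, the longest name ("longest parameter") has 17 characters, so the string contains 17 underscores between "| " and " = ".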

At the same time, this practice breaks down when a long parameter is not used in a particular use of the template: the visual editor still lays out the code as if that parameter were present. The template is then formatted like this:

{{Template
| name                        = 
| short                       = 
| another                     = 
}}

This creates unnecessary spaces and stretches the code. In some cases the large number of spaces looks particularly unnatural, and in a long list it can be hard to match a parameter to its value quickly without mentally drawing dotted lines. This defeats the purpose of such formatting, which is to make the code easier to read.

Because of this, when writing the format string in TemplateData you sometimes have to ignore the longest parameters if they are rarely used (e.g. in 5% of uses), so that the infobox is not stretched out in most articles (the other 95% of uses). A fully filled infobox then looks like this:

{{Template
| name              = 
| short             = 
| longest parameter = 
| the longest parameter in the infobox = 
| the second longest parameter in the infobox = 
| another           = 
}}

And this already violates the template formatting guidelines.


This situation is handled by gadgets that standardise formatting within a particular article. Such a gadget trims the spaces based on the longest parameter actually used in each case. On Russian Wikipedia there is this one gadget, but I don't know of any such gadgets on English Wikipedia (and I wish I did).

That's why I think it would be useful to have additional formatting options in TemplateData that automatically align the code to the longest template parameter used in a particular article, so that in one article the visual editor would save the template like this:

{{Template
| name              = 
| short             = 
| longest parameter = 
| another           = 
}}

And in the other one like this:

{{Template
| name    = 
| short   = 
| another = 
}}

So I propose to deal with this situation by adding two separate new formatting options:
1. Fixed space alignment
In this case TemplateData automatically finds the longest parameter, and any edit to the template through the visual editor saves the code with a fixed number of characters between | and =, regardless of whether the long parameter is used in the article. This avoids switching between fewer and more spaces on wikis that frown on changing whitespace. It is essentially an automation of what we currently do by hand through custom formatting, but always with a fixed length. That is, this option will always save the template this way:

{{Template
| name                  = 
| short                 = 
| another               = 
}}
{{Template
| name                  = 
| short                 = 
| longest parameter     = 
| another               = 
}}
{{Template
| name                  = 
| short                 = 
| longest parameter     = 
| the longest parameter = 
| another               = 
}}

and the second option is
2. Dynamic space alignment
This is more flexible formatting, where the visual editor adjusts the number of spaces in each particular article based on the longest parameter present in that use. It is the best formatting option: the most readable, and compliant with Wikipedia's guidelines in every case. I expect it will be a bit more difficult to implement, though. It should work this way:

{{Template
| name              = 
| short             = 
| longest parameter = 
| another           = 
}}
{{Template
| name    = 
| short   = 
| another = 
}}
{{Template
| name                                 = 
| short                                = 
| longest parameter                    = 
| the longest parameter in the infobox = 
| another                              = 
}}
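
To make the difference between the two options concrete, here is a minimal Python sketch. It is only an illustration under assumptions, not how VisualEditor or TemplateData work internally: the helper names (`format_template`, `fixed_alignment`, `dynamic_alignment`) are hypothetical, and a template use is modelled as a plain dict of parameter names to values.

```python
def format_template(template_name, params, width):
    """Serialise one template use, padding every parameter name to `width`."""
    lines = ["{{" + template_name]
    for key, value in params.items():
        lines.append(f"| {key.ljust(width)} = {value}")
    lines.append("}}")
    return "\n".join(lines)


def fixed_alignment(template_name, params, all_param_names):
    """Option 1: pad to the longest parameter the template defines."""
    width = max(len(name) for name in all_param_names)
    return format_template(template_name, params, width)


def dynamic_alignment(template_name, params):
    """Option 2: pad to the longest parameter used in this particular article."""
    width = max((len(name) for name in params), default=0)
    return format_template(template_name, params, width)


# Assumed full parameter list of the example template.
ALL_PARAMS = ["name", "short", "longest parameter",
              "the longest parameter", "another"]
used = {"name": "", "short": "", "another": ""}

fixed = fixed_alignment("Template", used, ALL_PARAMS)
dynamic = dynamic_alignment("Template", used)
print(fixed)
print(dynamic)
```

With only name, short and another in use, the fixed variant still pads every name to the 21 characters of "the longest parameter", while the dynamic variant pads only to the 7 characters of "another".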

It seems to me that such a change would be extremely useful, as this is the dominant formatting on most of the largest Wikipedias and also the one most often prescribed in the rules. The fact that the visual editor has no way of automating this, requires you to count characters for every template, and then stretches spaces where that could have been avoided, is a huge omission. The new styles would make working with the visual editor and TemplateData considerably more convenient and automated, and in many cases would fix non-uniform formatting: in some cases option 1 would be more useful (it automates the current practice), and in others option 2 (which is impossible with the current version of the visual editor). Such a change would make the visual editor much more compatible with Wikipedia norms.

Event Timeline

It sounds like number 1 is pretty much what people already do: We count how long the longest parameter is and add that number of _ characters to the format string.

To be honest, I'm not really sure what the benefits of fully automating this are. The counting typically happens only once per template. We would need to add quite a bit of complexity for barely any benefit. Instead we could run into new issues: let's say someone adds a debugging parameter to the template that is not meant to be used in articles and will be removed shortly. Something like this could start messing up all existing usages of the template for no obvious reason. We would need some kind of "maximum length" to avoid this, but that's pretty much exactly what the existing format string already does, isn't it?

The second proposal makes me wonder for a similar reason. Not only is it confusing that the same template behaves differently in every article; just adding another parameter to an existing template usage in an article can cause a full re-format of the entire template usage. To avoid that we would need some kind of "minimum length". But again: isn't this the purpose of the format string?

The situation is that people do option 1, but the guidelines require option 2. Option 1 is what custom formatting in the current TemplateData allows you to do, and the better option 2 is simply not technically possible right now. In that sense option 1 isn't ideal either, but at least it automates the current state of TemplateData formatting.

Motivation:
The benefit of automation is that it gets rid of monotonous actions and occasional updates, and brings the code into the style required by the guidelines of most major Wikipedias. Very few people within their projects are familiar with the relevant code layout guidelines. Even fewer know how to use the custom formatting field in TemplateData. There are thousands of templates on Wikipedia. So we get monotonous character counting and manual code writing by a small number of people, multiplied thousands of times. And yet most templates go unnoticed for years and are laid out in a messy way. Considering that newbies often use the visual editor, these are all reasons for the constant mess of template code in articles. (And I've even seen a few edit wars over spaces in template parameters.)

It is also worth noting that one of the current formatting options (the "block" type) is not recommended at all by the enwiki and ruwiki guidelines. That is, TemplateData does not let you follow Wikipedia's guidelines automatically, and it has to be done manually each time by a small group of people who know what to do.

By my estimates, only about 30% of the infoboxes on enwiki and ruwiki have proper formatting. It is very rare for templates to be properly customized even within a single project. I mostly work within music and film/TV projects on Russian and English Wikipedia, and I can assure you that counting these characters to comply with style guidelines has been done by literally 3-5 people on each Wikipedia over the years, and the situation is far from uniform even within these two projects. Over the course of several years, previously unnoticed nuances keep turning up.

Replacing this set of repetitive actions with a single button would improve the quality of the code layout produced by TemplateData + visual editor. It would also be great if such a layout could later become the default for all infoboxes at once, instead of the default "no format" option. Or at least it would become possible to change the style of all templates at once with a bot.

So I think the benefit of automation is obvious here.

Problems:
I don't see the debugging parameter case you describe as a problem. First of all, such a parameter would have to be added to the TemplateData (which is rarely done in such cases). And if it is added, it can be marked as deprecated right away. It should not be a problem to exclude deprecated parameters from the character count (which is actually already done with the current formatting options).

The problem with using the format string to calculate the maximum length is that it would have to be done again each time, and by people who know what to do, so that is not optimal. And I don't know in which cases the maximum custom length of individual templates might be needed. The correct way would be to specify the maximum length for the entire Wikipedia in the templatedata configuration. We had such a discussion in the technical chatroom of the Russian wikipedia discord, and there we agreed that 25 characters would be a normal limit for the maximum length of all templates. This should be sufficient for the two proposed options.
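
A rough Python sketch of how a wiki-wide limit could be combined with skipping deprecated parameters when computing the alignment width. Everything here is an assumption made for illustration: the name `alignment_width`, the simplified parameter dict, and in particular the choice to clamp the width at the limit (ignoring over-long names entirely would be another reading). The value 25 is the one from the Discord discussion.

```python
MAX_WIDTH = 25  # assumed wiki-wide limit, taken from the discussion above


def alignment_width(params, max_width=MAX_WIDTH):
    """Return the padding width for a template.

    `params` is a simplified, hypothetical stand-in for TemplateData:
    a dict mapping each parameter name to a dict of properties.
    Deprecated parameters are not counted, and the result never
    exceeds `max_width`.
    """
    lengths = [
        len(name)
        for name, spec in params.items()
        if not spec.get("deprecated")
    ]
    longest = max(lengths, default=0)
    return min(longest, max_width)


example = {
    "name": {},
    "longest parameter": {},
    "debug parameter for testing only": {"deprecated": True},
}
width = alignment_width(example)
print(width)
```

In this example the deprecated debugging parameter is ignored, so the width comes from "longest parameter"; a non-deprecated name longer than 25 characters would be clamped to 25.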

I don't really understand what the "minimum length" would be for. We have to follow the template usage guidelines. They require padding with spaces to the length of the longest parameter in every template usage. (Note that the current custom formatting does not allow us to follow this rule.) In the proposed option 2, the count should always be based on the current template usage in the article, i.e. it would be a dynamic count over the parameters that are present, so a "minimum length" is not needed there. And in option 1 it seems that a "minimum length" is not needed either?

Let's say there is

{{Template
| name    = …
| short   = …
| another = …
}}

in the article and all I want to do as a normal editor is to add the line | longest parameter = …. All of a sudden I have to re-format the entire template? Why?

Let's say my "edit area font style" isn't even set to "monospaced". How am I supposed to do this then? Does this imply that the "guidelines of most major Wikipedias" actually force users to set their user preferences in a specific way?

I don't think this is the best place to discuss whether or not the current styling consensus is correct. If you're not using a monospaced font, that's probably your problem. It is like installing a custom dark theme on Wikipedia and then arranging colors in a way that violates accessibility rules: that would be more serious, but it is roughly the same situation.

Here I can only confirm that there is a consensus for such a layout in most major wikipedias at this point. And speaking of that, I mean that I see this formatting on literally every top10+ wikipedia in infoboxes on various topics (you can check the code of infoboxes on any major recent topic, e.g. https://www.wikidata.org/wiki/Q64441774 ). I don't know how much in those languages it's written into their rules, but it's unlikely to be the other way around.

And yes, if you add "the longest parameter", you should properly reformat all other spaces by that parameter. That's the consensus on formatting, and it's explicitly stated on the Russian wiki. It's not really enforced, but it should be in the final version of the code. And by the way, such reformatting is now happening when you edit the manually written code with a visual editor, and if you also cut off extra spaces with a gadget (as required by the guideline), then the next time you edit with a visual editor, all the spaces are returned. So what happens is constant trimming and appending with spaces, since TemplateData + visual editor doesn't allow you to avoid this.

The last part I don't understand. VisualEditor strictly follows what is specified via TemplateData and always formats the template the same way. There is no "constant trimming and appending". This only happens when another editor uses a gadget that doesn't follow the agreed-upon specification.

My main point is that the proposals in this ticket – especially the second one – don't make this situation better but worse.

What I meant in the last part is that technically there is no way to follow the infobox markup guidelines (option 2) in every case using the TD+VE pair. Gadgets follow Wikipedia's agreed specification, not the other way around. For infoboxes, TemplateData + VisualEditor mostly do not. This is because of technical limitations (which is the point of this request).

VisualEditor only formats the code correctly if the longest infobox parameter is used in the current template use. We specifically look at the formatting of each individual use of the template because there is no need to specify all the parameters in each case, and in practice that doesn't happen. In some cases, the longest parameter is used in a tiny fraction of uses. Like here: 5 times out of 20.5k "infobox album" uses. Some parameters are used only in specific articles, in 10% of uses, and elsewhere there will never be a need for the extra spaces. VisualEditor can't take that into account: you have to ignore rarely used parameters with it, instead of directly following the guidelines.

You're probably looking at usage in the abstract, but I'm talking about practical cases. Switching spaces back and forth happens regularly in articles, because VisualEditor can't do it any other way. Every time VisualEditor edits properly formatted code, it makes the formatting worse. And as I already mentioned, this leads to unnecessary edits: I've encountered a few people just going through articles and cutting extra spaces after the visual editor, and that's a problem. The point of my request is to add options so that this doesn't happen.

And I don't understand why you think option 2 makes the situation worse. This markup is both more convenient and follows the guidelines. Adding it is a definite improvement over what's happening now, as none of the options offered by TemplateData+VE are really suitable for most infobox use cases.