Page MenuHomePhabricator

Interlaced, single and other non-matching brackets are not highlighted
Open, Needs TriagePublic

Description

When paired or single brackets are mixed with paired brackets of a different type neither of them will be highlighted. See these examples and following screen cast.

  • {[{]
  • }[{]
  • ([)]
  • ([)]

Peek 2021-01-20 12-31.gif (226×540 px, 65 KB)

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript
WMDE-Fisch renamed this task from Brackets not highlighted when mixed with unmatched brackets to Brackets not highlighted when mixed with different bracket type.Jan 21 2021, 2:54 PM

We kept wondering about situations like these, e.g. in T262255#6521041, and decided to disable the red error highlighting for non-matching brackets. This is part of the acceptance criteria in T261857. I can't find the ticket any more where this was originally discussed, sorry.

All the examples above are bad for the same reason HTML like <a><b>…</a></b> is bad. While browsers try to make sense of strange interlaced pairs like this, it's almost certainly a mistake and should be fixed.

I'm not sure what the expected behavior is. There are multiple ways to render these examples, and none of them is more "correct" than the other. It might just be a misunderstanding. See, the matchbracket algorithm does not simply search for a matching character, wherever it might find one. That's not really helpful when we talk about source code. The algorithm actually understands the structure of nested brackets. It finds the position where the matching bracket should be and only then checks if the character at this position is the expected one. Non-matching brackets are currently not highlighted via T261857, but could be highlighted in red.

thiemowmde renamed this task from Brackets not highlighted when mixed with different bracket type to Interlaced, single and other non-matching brackets are not highlighted.Jan 27 2021, 5:20 PM
thiemowmde removed a project: Community-Tech.
thiemowmde added subscribers: Lena_WMDE, ECohen_WMDE.
thiemowmde added a subscriber: lilients_WMDE.

@ECohen_WMDE, do you remember why we disabled the (red) highlighting of non-matching brackets, and where this decision is documented?

In the description of this ticket it says:

Scope
Excludes:
...
Unmatched bracket error highlighting

I left another comment in T272454#6795430 that might help to understand what's going on.

I can think of at least one real-world example where this is relevant: Templates like {{(}} actually exist and are used a lot, see https://en.wikipedia.org/wiki/Template:(.

There are also situations where a single ] or an extra bracket at the end of [[a]]] or {{a}}} isn't syntax but a literal bracket. But this should be rare.

A possible improvement is to change the behavior so that only [ and { (only these are syntax) look for a strict match, but ( (it's never syntax, at least not in wikitext) is much more relaxed and does not interfere with the other bracket types. This might help in situations like {{(}}. But it can't solve everything.

Personally I don't like the idea to "dumb down" the algorithm so that it ignores syntax and nesting structure.

A possible improvement is to change the behavior so that only [ and { (only these are syntax) look for a strict match, but ( (it's never syntax, at least not in wikitext) is much more relaxed and does not interfere with the other bracket types. This might help in situations like {{(}}. But it can't solve everything.

This makes sense to me, especially if we already know of a common situation. If we implemented this, ( ) would still be matched right? I think otherwise, @thiemowmde you're right that it shouldn't need to necessarily detect these interlaced matches.

do you remember why we disabled the (red) highlighting of non-matching brackets, and where this decision is documented?

I think this decision is documented in the [wish] doc, but I can summarize here. There were a couple reasons:

  • It was discussed that defining what is an error and consistently catching all of them would be a large amount of effort, possibly impossible
  • Generally users assumed in testing that if errors are caught sometimes, that implies they are all being caught and assumed that if there was no error message then it was correct. If we cannot consistently detect errors, then this could cause additional confusion.
  • Highlighting in red only when the cursor was next to the bracket was not much help in catching errors since you have to already know the location in order to get the feedback (vs. showing at the line number - but changing this way of displaying the error was said to be too much effort/out of scope).

[…] If we implemented this, ( ) would still be matched right?

In theory, yes. Hard to tell how this would turn out. Depends a lot on the details. As of now, in an extreme example like {{(}} {{)}} nothing matches properly, because the different brackets interfere with each other. The main reason to design the highlighter like this is to make it focus on possible errors, and reliably mark them in red.

But we don't mark errors. We can relax the highlighter and make it focus on positive matches instead, without loosing functionality (reliable error detection).

  1. One possibility is to have two groups of brackets, [ ] { } and (), that work entirely independent from each other. For example, an opening ( would find the next ), but ignore everything from the 1st group (and vice-versa). Examples:
    • ( ( ) would still be an error (conflict with a bracket from the same group).
    • {{(}} is not an error any more (brackets from two different groups ignore each other).
    • Stuff like [[Link|{{Template}]] or {{Template|[[Link]}} is still an error, as before (both bracket types are in the same group and can interfere with each other).
  2. Another possibility is to simply make all bracket types independent from each other. Example:
    • Notice the missing } in [[Link|{{Template}]]. When the different bracket types are allowed to interfere with each other, the outer [[]] are never highlighted, or wrongly highlighted. While we don't show this in red, it is still a strong hint that there is something wrong, simply because the highlighting is missing or obviously broken. This would not happen any more when we make the bracket types independent from each other.

As said, in wikitext [ and { are almost always syntax, but ( never is. Therefor I would love to implement #1, if possible. Should we create a ticket for this, or rephrase this one here accordingly?

Thanks for the breakdown @thiemowmde !

Based on your description option #2 makes a lot of sense to me, helping users find the actual missing bracket instead of breaking the matching for brackets which are in fact matched. Are there any downsides to that version that you're aware of?

Therefor I would love to implement #1, if possible. Should we create a ticket for this, or rephrase this one here accordingly?

I agree that it seems like a pretty straightforward way to improve the situation with one of these options. Otherwise it sounds like it could be relatively often that the bracket matching seems broken. Maybe it makes sense to update the ticket with one of the options and @Lena_WMDE can then prioritize and decide when/if to implement?

… but option 2 makes it harder to find a missing bracket. It's what I described as "dumb down" above. Let me try to illustrate:

  1. [◟[◞Link|{{Template}]◟]◞ is the current behavior. Note how the wrong ] at the end is highlighted. This will not change with option 1. Errors often "propagates" outwards and are visible in a wide range of possible cursor positions. Either that, or nothing is highlighted – which is also a strong indication that something is wrong.
  2. Option 2 means the error is pretty much invisible. It always behaves as if everything is ok:
    • ◟[◞[Link|{{Template}]◟]◞
    • [◟[◞Link|{{Template}◟]◞]
    • [[Link|◟{◞{Template◟}◞]]
    • [[Link|{◟{◞Template◟}◞]].

[…] which are in fact matched.

That's the problem: there is no way to know which interpretation of the situation is the correct one. In [[Link}] the two outer […] belong together. This is the assumption the current algorithm makes. With option 2 both ◟[◞[Link}◟]◞ and [◟[◞Link}◟]◞ are highlighted as if everything is ok, even if one of the two is wrong. Which one? We can't know.

The argument to put []{} in one group and () in another is not only because one is syntax and the other is not. []{} are pretty much always used as pairs. They form a syntax tree, and this is what the current algorithm respects. The chance that a stray [ is an error is extremely high. On the other hand ( is not necessarily an error, as seen in {{(}}.

Conclusion from discussions during story time are:
Since [ and { are almost always syntax, but ( never is*, we will implement option 1: two groups of brackets, [ ] { } and (), that work independent from each other. This will likely solve some of the most common cases without changing the logic significantly.

*CodeEditor takes over for code content, e.g. Javascript and Lua, where round brackets are syntax.