
Syntax for list item attributes
Open, Needs Triage, Public


This is a placeholder task, because I'm not sure exactly what syntax is best here.

Currently, in order to add class/id/etc. attributes to section headings or list items (common elements in talk pages), we need to switch from wikitext to HTML syntax. That is, <h2 class="foo">....</h2> instead of == .... == and <dl><dd class="foo">....</dd></dl> instead of : ....

That's unfortunate: this syntax looks ugly, which means it is hard to use attributes to record additional information about comments, for example comment outdents (class="outdent" or data-talk="outdent") or human-readable ids on individual comments (id="foo"). See T230659: Automatically-assigned id attributes for list items for more information on how these list item attributes could be useful.

Tables, table rows, table cells, and table captions already have wikitext syntax for attributes, which may or may not be a good model here.

Additionally, talk page list items are expected to use heredoc syntax when the contents get 'complicated', so we might imagine tying attribute syntax to heredoc syntax in order to minimize backward compatibility concerns.

Existing table syntax:

{| class="wikitable"
|+ class="bar" | caption
|- style="foo"
! class="bar" | cell
|- style="foo"
| style="foo" | cell
|}

Some options for lists and headings (for discussion only; I'm not actually endorsing any of these at this point):

:::<attr id=foo class=bar/> xyz ("magic extension")
:::{{#attr|id=foo|class=bar}} ("magic parser function")
:::|id=foo|class=bar| ("like table syntax")
:::[id=foo][class=bar]  ("like CSS syntax")
::: id=foo class=bar <<< xyz >>> ("requires use of heredoc syntax")
::<dd id=foo class=bar> ("explicit tag, but don't require dl wrapper, etc")

===<attr id=foo> foo ===
=== id=foo class=bar <<< heading >>> ===
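
To make the heredoc-style option a little more concrete, here is a minimal Python sketch of how a line such as `::: id=foo class=bar <<< xyz >>>` might be split into list depth, attributes, and content. The function name `parse_heredoc_item` and both regular expressions are assumptions made for illustration; nothing here reflects how Parser.php or Parsoid would actually implement it, and it only handles a single-line item.

```python
import re

# Hypothetical pattern for the heredoc-style option from the task description:
#   <list markers> name=value ... <<< content >>>
HEREDOC_ITEM = re.compile(
    r"^(?P<markers>[:*#;]+)\s*(?P<attrs>.*?)\s*<<<\s*(?P<body>.*?)\s*>>>\s*$",
    re.DOTALL,
)
# name=value pairs; the value is either double-quoted or a run of non-space characters.
ATTR = re.compile(r'([\w-]+)=("[^"]*"|\S+)')


def parse_heredoc_item(line):
    """Return depth, attributes and body for a heredoc-style list item, or None."""
    m = HEREDOC_ITEM.match(line)
    if m is None:
        # No <<< ... >>> delimiters: treat as an ordinary list item.
        return None
    attrs = {name: value.strip('"') for name, value in ATTR.findall(m.group("attrs"))}
    return {"depth": len(m.group("markers")), "attrs": attrs, "body": m.group("body")}
```

Because the attributes are only recognized when the `<<< ... >>>` delimiters are present, an existing list item that merely contains something like `id=foo` in its text would be left alone, which is the backward-compatibility argument for tying the two syntaxes together.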

Event Timeline

cscott created this task. Aug 17 2019, 2:07 PM
Restricted Application added subscribers: Liuxinyu970226, Aklapper. Aug 17 2019, 2:07 PM
cscott updated the task description. Aug 17 2019, 2:17 PM
Anomie added a subscriber: Anomie. Aug 19 2019, 5:33 PM

Tables, table rows, table cells, and table captions already have wikitext syntax for attribute, which may or may not be a good model here.

Probably not a very good model. I can't recall anyone ever actually liking wikitext table syntax beyond that it saves a few keystrokes for simple tables.

:::<attr id=foo class=bar/> xyz ("magic extension")
:::{{#attr|id=foo|class=bar}} ("magic parser function")
:::|id=foo|class=bar| ("like table syntax")
:::[id=foo][class=bar]  ("like CSS syntax")
::: id=foo class=bar <<< xyz >>> ("requires use of heredoc syntax")
===<attr id=foo> foo ===
=== id=foo class=bar <<< heading >>> ===

IMO all of these are pretty awful, in that they're complicated syntax subject to user confusion and accidental breaking.

Another idea would be almost like the first:

:::<dd id=foo class=bar> xyz

Difference is that it's not a self-closing tag and the tag matches the list element.

That actually more or less works already since Remex does almost the right thing to the invalid HTML that Parser.php outputs for that wikitext (namely <dd><dd id=foo class=bar> xyz</dd><dd></dd><dd id=foo class=bar> xyz</dd>), and for <li>-based lists we somehow wind up with <li class="mw-empty-elt"></li> which does a right-ish-looking thing too.

Still rather confusing, but a bit more logical if you're already used to the HTML equivalence for the same reason it more or less works with Remex.

cscott updated the task description. Aug 22 2019, 5:59 PM

That's a reasonable alternative; I've added it to the list in the task description. There are some weird corner cases w/r/t properly closing the list; I think we want some sort of multiline list syntax anyway (T230683: New syntax for multiline list items / talk page comments), so it might make sense to tie the attribute syntax to that. But there are multiple proposals for multiline lists, too.

Anyway, early days. Interested to hear continuing thoughts. I personally am starting to favor the heredoc syntax both for this and for T230683 because it seems to unify the proposals, but it's fair to say we're nowhere near consensus yet.

Izno added a subscriber: Izno. Edited Oct 5 2019, 7:41 PM

Syntax for attributes is something I made a task for at T202083: First-class wikitext support for ordered list item value, which I closed as a duplicate because this task covers the problem I was looking to solve in that context. The general gist of what I suggested was similar to the table syntax, namely: # id=A class=X | BCD where BCD is the list item content. I can see users going for that syntax. I don't really understand the proposed "table" syntax above (:::|id=foo|class=bar| ("like table syntax")), since that's not how table syntax works. "Like", but definitely not.
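
As a rough illustration of the `# id=A class=X | BCD` form, the following Python sketch separates the list markers, the attributes before the first pipe, and the item content. The name `parse_piped_item` and the rule "everything before the first pipe must be name=value pairs" are assumptions of this sketch, not anything MediaWiki implements; quoted values that themselves contain a pipe are not handled.

```python
import re

# name=value pairs; the value is double-quoted or a run of characters
# containing neither whitespace nor a pipe.
ATTR = re.compile(r'([\w-]+)=("[^"]*"|[^\s|]+)')


def parse_piped_item(line):
    """Split a list item into (markers, attrs, content); None if not a list item."""
    m = re.match(r"^([:*#;]+)\s*(.*)$", line)
    if m is None:
        return None
    markers, rest = m.groups()
    head, sep, tail = rest.partition("|")
    # Only treat the text before the first pipe as attributes when it
    # consists of nothing but name=value pairs (hypothetical rule).
    if sep and head.strip() and ATTR.sub("", head).strip() == "":
        attrs = {name: value.strip('"') for name, value in ATTR.findall(head)}
        return markers, attrs, tail.strip()
    return markers, {}, rest.strip()
```

A link such as `# [[Foo|bar]]` falls through to the plain-item branch, but an existing item that happens to read `# x=1 | y` would be reinterpreted, which is the kind of backward-compatibility risk any table-like syntax would have to address.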

stjn added a subscriber: stjn. Oct 10 2019, 4:00 PM

I can’t comment on syntax itself, but we really shouldn’t use (or add the ability to use) definition lists for this (: syntax). It makes extremely broken and inaccessible HTML, which is something to avoid in a tool written by WMF. See explanation here:

I agree, but I think changing the DOM tag is out of scope for this ticket, as doing so would probably break thousands of on-wiki style rules and gadgets.

stjn added a comment. Oct 16 2019, 4:33 PM

Wouldn’t syntax like this be used only by newer talk pages anyway? So if they will use the bad tags, it is concerning. Maybe I should’ve written this in T230659, though.

Jc86035 added a subscriber: Jc86035. Edited Tue, Nov 5, 6:56 PM

How is the syntax to get into the page source? Is it going to be saved directly or is something like an extension tag going to generate the HTML?

I imagine it is almost certain that some users (particularly experienced users) will continue to reply to comments by editing the whole page or section; I assumed throughout phase 2 that this choice would be left open, and this was the way that the phase 2 proposed direction was presented. Given how the options in the task description are presented (I'm assuming that the attributes are going to be saved into the page source), the software would have to add something like id=foo class=bar on page save for every new comment, so it would need to detect the start and end of each comment automatically in order to generate a "real" comment. The syntax could also pose problems e.g. with actual wikitext lists being used at the start of a comment.

I suggested during the community consultation that signatures could be modified to insert extension tags within or after the signature HTML. This would essentially create delimiters between comments, which would allow the software to distinguish different comments without matching timestamps. The id=foo etc. would be inside the extension tag, so the workflow would not change in this regard for users adding new comments through the section/page editing interfaces, and presumably the software could build all the styling and extra interface buttons around the delimited comments (e.g. replacing certain list item syntax while retaining list item syntax that isn't being used to delimit comments). It would still be possible to create dummy signatures without the extension tag by signing ~~~ ~~~~~.
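
To sketch what "signatures as delimiters" could look like in practice, here is a small Python example that splits saved page text into comments at a signature tag. The tag name `<sig id="..."/>` and the function `split_comments` are purely hypothetical; no such extension tag exists, and the real design would be up to whoever implements it.

```python
import re

# Hypothetical self-closing extension tag written out at the end of each signature.
SIG_TAG = re.compile(r'<sig\s+id="([^"]*)"\s*/>')


def split_comments(wikitext):
    """Split page text into comments, each ending at a signature tag."""
    comments = []
    start = 0
    for m in SIG_TAG.finditer(wikitext):
        # Everything since the previous tag belongs to this comment.
        comments.append({"id": m.group(1), "text": wikitext[start:m.start()].strip()})
        start = m.end()
    return comments
```

Note that the list markers stay inside each comment's text untouched, so users editing the whole section would keep writing `:` and `*` as they do today; only the signature carries the metadata.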

I don't know how the underlying code works, so it wouldn't be up to me to determine what approach is the most feasible, but none of the presented options seem appealing to me just from a feasibility POV; users would presumably be quite annoyed if they'd have to substitute in a revision ID for every comment written in the 2010 editor.

Perhaps both a legacy syntax and a new syntax could be supported at the same time (if the new syntax is going to be sufficiently different to the original syntax), but this would be another can of worms and it's probably not worth discussing it further unless it's going to be seriously considered.

Jc86035 added a comment. Edited Tue, Nov 5, 7:11 PM

More or less, an experienced user is going to want to be able to save this sort of comment through any of the available editing interfaces, regardless of the level that the comment is nested at, using the syntax that they learned years ago (or something very close to that).

* Text<ref>text</ref>
*; Text : text<ref>text</ref>
*; Text {{Smiley}}{{#invoke:Bananas|hello}}
*;: text<ref>test</ref>
Lorem ipsum;

lorem ipsum;

lorem ipsum.

{| class="wikitable"
! A !! B
|-
| A || B
|}

# This
# That
# The other<ref>text</ref>
Thus, lorem ipsum. ~~~~
{{reflist-talk}}

Note that {{reflist-talk}} here is placed after the signature. I have done this before, several times; there isn't really a convention for doing so, but ideally this entire comment would be indented correctly without any issues and would not display badly. Of course, it's very possible that it might be infeasible to get the software to this point, but there are a lot of things that could be broken by a syntax change.

(Ideally, experienced users will also need to be able to get their comment indentation wrong and not have to fix it, because a lot of experienced users do get indentation wrong, especially if the discussion is one where the first-level nesting is done using bullet points; this is very common in e.g. RFCs, but editors tend to habitually do it in arbitrary discussions if stating their own opinions in succession. Right now, this is fine because it doesn't result in major visual hiccups, and it could plausibly be worked around by e.g. mandating semicolons for indentation for new comments, but I imagine a lot of older discussions may display very incorrectly if this is not taken into account. Additionally, community processes that use numbered lists, such as RFAs and other confirmation votes, will need to be taken into account in some way.)

Jc86035 added a comment. Edited Tue, Nov 5, 7:42 PM

The syntax that I suggested back in February (during the phase 1 consultation) was:

>4*
[arbitrary wikitext] ~~~~

The number(s) would indicate the indentation level, and the list item marker after the number(s) would indicate some sort of comment styling (as opposed to being directly analogous to the current list item markers). Use of the new syntax would perhaps be optional (i.e. >4* could be omitted in favour of **** or :::*). The metadata would be provided entirely by attributes within an extension tag in the signature (as proposed in T230653).

Note that there is an (optional) newline after the >4*. The obvious technical fault with most or all of the syntaxes proposed in the task description is that it would be impossible to start comments with wikitext list items (or at least that would have to be changed), as those are only recognized as list items if preceded by a newline and optionally other list item marker characters.
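
A minimal Python sketch of reading such a `>4*` header follows, assuming a single number followed by at most one marker character. The function name `parse_comment_header` and the exact grammar are my own assumptions; the proposal itself leaves these details open.

```python
import re

# Hypothetical grammar: ">" + indentation level + optional style marker,
# then optional trailing spaces and an optional newline.
HEADER = re.compile(r"^>(\d+)([*#:;]?)[ \t]*\n?")


def parse_comment_header(text):
    """Return (level, style, body) for a comment with a >N header, else None."""
    m = HEADER.match(text)
    if m is None:
        return None
    level = int(m.group(1))
    style = m.group(2)  # empty string when no marker is given
    body = text[m.end():]
    return level, style, body
```

Because the optional newline is consumed together with the header, the body begins at the start of a line, so a comment that opens with `* item` can still be recognized as a wikitext list.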

From a usability perspective, this sort of syntax style is what I would prefer if I were to force myself to use the existing source editor to write comments.