Page MenuHomePhabricator

Merge Poem extension into MediaWiki core
Open, LowPublic

Description

The Poem extension has almost as much extension setup as it has functionality. Bundling it with MediaWiki is silly, it should simply be merged into core.


Version: 1.22.0
Severity: enhancement
Whiteboard: migrate Poem extension bugs to core post-resolution

Details

Reference
bz52061

Event Timeline

bzimport raised the priority of this task from to Normal.Nov 22 2014, 1:49 AM
bzimport added a project: MediaWiki-General.
bzimport set Reference to bz52061.

I'm cool with merging it into core, although I probably wouldn't call it 'easy'. Anytime an extension is created or deprecated it's a logistical pain in the ass. Also, I would prefer to have this idea vetted on Wikitech-l before someone actually goes through the trouble of doing it.

(In reply to comment #1)

Anytime an extension is created or deprecated it's a logistical pain in the
ass.

Can you elaborate on this? It seems pretty easy to me.

Steps to deprecate an extension (apart from actual code changes):

  • Turn off extension for testwiki in InitialiseSettings.php and make sure nothing breaks (1 deployment cycle)
  • Turn off extension for all wikis in InitialiseSettings.php and make sure nothing breaks (1 deployment cycle)
  • Remove extension from CommonSettings.php, wmf-config/extension-list, and make-wmf-branch/default.conf (1 deployment cycle)
  • Move/deprecate MediaWiki documentation
  • Get component removed from Bugzilla, migrate any existing bugs
  • Have project removed from Gerrit (not immediately of course)

Poem was merged into core back in 2008, but subsequently reverted; see http://www.mediawiki.org/wiki/Special:Code/MediaWiki/43520.
I must say, however, that I totally agree with the original report and Poem (along with some obvious stuff like CheckUser, RenameUser, etc.) should be in core; maybe it should've been there from day one, even.

demon added a comment.Jul 26 2013, 2:42 PM

If you merge it to core, please call it something other than <poem>. Other than that I couldn't care less :D

(In reply to comment #5)

If you merge it to core, please call it something other than <poem>. Other
than that I couldn't care less :D

I think we must keep the name "<poem>" for backward-compatibility, but it should definitely have a saner alias ("<nobreak>" or something).

(In reply to comment #6)

(In reply to comment #5)

If you merge it to core, please call it something other than <poem>. Other
than that I couldn't care less :D

I think we must keep the name "<poem>" for backward-compatibility, but it
should definitely have a saner alias ("<nobreak>" or something).

The main name of the feature should be something else, <poem> can be an alias for backwards-compatibility.

TTO added a comment.Dec 28 2013, 8:53 AM

OK, I'm taking a look at this. The merge looks easy enough, but I need advice.

Poem has two major nomenclature issues:

  1. The tag should have another name besides <poem>. In comment 6 <nobreak> was suggested, but Poem does the reverse - it preserves line breaks. I am thinking something like <verbatim>, <asis> ("as is"), <nocollapse>, <pretext>/<pre-text> (as in <pre> but for text, not code), <keepbreaks> ......... ?? Any better ideas?
  1. The attribute "compact" seems to actually make the display less compact. See the testcase at [[mw:Extension:Poem]]. I think we need a new name/alias for this too. However, I have no idea what it is actually meant to be used for...

(In reply to comment #8)

I am thinking something like <verbatim>, <asis> ("as is"), <nocollapse>,
<pretext>/<pre-text> (as in <pre> but for text, not code), <keepbreaks>
[...]

Okay, I'll paint this bikeshed. I like verbatim or pretext. How about one of those?

  1. The attribute "compact" seems to actually make the display less compact.

See the testcase at [[mw:Extension:Poem]]. I think we need a new name/alias for
this too. However, I have no idea what it is actually meant to be used for...

I'm not sure "compact" is needed... do people use it? If so, are there examples of it in the wild?

The compact feature has always seemed quite buggy (or at least bizarrely unintuitive) and I wonder if we would really hurt much by killing it altogether.

Regarding the tag name, is <poem> ever actually used for things besides poems? I usually use <pre> for other cases.

Change 106861 had a related patch set uploaded by TTO:
Merge Poem extension into core

https://gerrit.wikimedia.org/r/106861

TTO added a comment.Jan 11 2014, 9:21 AM

I'm wondering if people would prefer <verbatim> as the name of this tag?

I'd also prefer <verbatim> as the name.

Sure, let's go with verbatim.

Siebrand also noted on Gerrit change 106861 that if/when this gets merged, we'll need to migrate the Poem extension bugs to core.

(In reply to comment #10)

Regarding the tag name, is <poem> ever actually used for things besides
poems?

I looked at some examples in de.wikipedia. Apart from poems and poem-like texts (prayers etc.) there are things like inscriptions and dialogs, which use <poem>. And of course, as it is a wiki, some stupid things like quotes even without any line break in them.

<verbatim> is a misleading name since it still parses wikitext in its content. It is only verbatim for text-content.

From what I can tell, looking at output, this is "indent-pre" - "indent" + CSS-class. Given that indent-pre provides all the formatting functionality, I presume the missing ability of adding CSS-classes onto indent-pre is the primary reason for this new tag? Is there another way of achieving this without adding a new tag?

(In reply to comment #16)

From what I can tell, looking at output, this is "indent-pre" - "indent" +
CSS-class. Given that indent-pre provides all the formatting functionality, I
presume the missing ability of adding CSS-classes onto indent-pre is the
primary reason for this new tag? Is there another way of achieving this
without adding a new tag?

I don't know what you mean by indent-pre.

I created a test page at [[testwiki:bug 52061]] to demonstrate the poem tag's functionality.

Please, don't use <verbatim>!!. It's already been in use by a Wikia extension [1]


[1] http://community.wikia.com/wiki/Help:Verbatim_tags

(In reply to comment #17)

I don't know what you mean by indent-pre.

Subbu clarified that this means the preceding space trick. I updated [[testwiki:bug 52061]].

I further updated [[testwiki:bug 52061]] to show the difference between <poem> and <pre class="poem">. IMO the <poem> tag should be rendered as <pre class="poem"> and it has exactly the behavior of indent-pre (with an extra HTML class and without the necessity to write the leading indentation).

I agree reg. output being <pre class='poem'> .. </pre>

However, maybe make it possible for the editor to provide a css-class rather than hardcode class="poem" into the tag?

For standard <poem> content,
<pre-text>
...
</pre-text>

And for other uses,
<pre-text class="fancy-css">
...
</pre-text>

TTO added a comment.Jan 25 2014, 12:04 AM

(In reply to comment #16)

<verbatim> is a misleading name since it still parses wikitext in its
content.
It is only verbatim for text-content.

I do see what you mean here. The problem is, we have so far been unable to come up with a better name! <poem> is inappropriate as too specific/narrow.

Is there another way of achieving this
without adding a new tag?

The <poem> tag is already in wide use. I don't see the problem with adding a tag for this functionality.

(In reply to comment #20)

IMO the <poem> tag should be rendered as <pre
class="poem"> and it has exactly the behavior of indent-pre (with an extra
HTML
class and without the necessity to write the leading indentation).

A few questions about this statement:

  • The gray box/monospaced font of <pre>/indent-pre would have to be removed for <poem> blocks.
  • What are the existing differences between <poem> and indent-pre?
  • <pre> does not render identically to indent-pre, so could you explain how rendering poems using <pre class="poem"> would help?

(In reply to comment #22)

(In reply to comment #20)

IMO the <poem> tag should be rendered as <pre
class="poem"> and it has exactly the behavior of indent-pre (with an extra
HTML
class and without the necessity to write the leading indentation).

A few questions about this statement:

  • The gray box/monospaced font of <pre>/indent-pre would have to be removed

for

<poem> blocks.
  • What are the existing differences between <poem> and indent-pre?
  • <pre> does not render identically to indent-pre, so could you explain how rendering poems using <pre class="poem"> would help?

We weren't very clear about what we were discussing there.

What Scott meant is that the patch should generate <pre class="poem"> .. </pre> for the HTML of the <verbatim>..</verbatim> (or <pre-text> or whatever tag name is picked) instead of generating <div class="poem">..</div>. It is cleaner and the ouput is semantically closer to the intention of the <verbatim> tag.

TTO added a comment.Jan 25 2014, 12:22 AM

What you are saying makes sense. Presumably the CSS for .poem would have to "neutralise" the monospace/gray formatting normally applied to pre tags.

My main question is, if we output a <pre> tag from the tag hook, will wikitext inside that tag still get parsed, as it should be?

That is right reg. CSS.

Since this is going to be merged into core/php-parser, presumably, you control this behavior and where in the parser pipeline the outer wrapping <pre> tag is generated? I haven't looked at the parser changes to answer that or how easy this is since I haven't really hacked the php parser. Maybe Scott can answer that more readily.

At least with Parsoid, that is what we will consider outputting for this new tag unless VE folks find a <div> tag simpler than a marked up <pre> tag (seems unlikely).

TTO added a comment.Jan 25 2014, 12:50 AM

(In reply to comment #25)

Since this is going to be merged into core/php-parser, presumably, you
control
this behavior and where in the parser pipeline the outer wrapping <pre> tag
is
generated? I haven't looked at the parser changes to answer that or how easy
this is since I haven't really hacked the php parser. Maybe Scott can answer
that more readily.

I think I've got this part sorted.

(In reply to comment #22)

(In reply to comment #16)

<verbatim> is a misleading name since it still parses wikitext in its
content.
It is only verbatim for text-content.

I do see what you mean here. The problem is, we have so far been unable to
come
up with a better name! <poem> is inappropriate as too specific/narrow.

What about <lines>? The tag just just preserves the lines in its content, and it seems generic enough for what is used.

TTO added a comment.Jan 25 2014, 11:13 AM

I like <lines>, the name puts the focus on the individual "lines" of the content, which is also what the tag is doing. Nice lateral thinking, Michael :)

lines sounds good, or <preservelines>

That said, maybe it would be more user friendly to use a mediawiki.org page to discuss the tag name, so ideas can be better exposed and have more opinions (remember one have to explicitly register an account here to post a comment).

About <poem> itself, I don't know why is a different name really needed. What are other use cases for this? I've just only used it for poem and lyrics (which is basically the same).

Regarding my use of "indent-pre" above -- I'm looking at this mostly from the perspective of parser maintenance. As [[testwiki:bug 52061]] demonstrates, the desired behavior of <poem>/<lines> is the same as the existing behavior of 'indent-pre', although the implementation in the original patch was very different. I would like to try to avoid adding a bunch of new hair (and crazy interactions) with the new feature, and instead try to reuse the existing indent-pre handling code so that all the corner cases were consistent.

In fact, assuming this gets merged, it might be cleaner in the future to document the "indent pre" behavior as "an implicit <lines> surrounding the indented region". This would reduce confusion because "indent pre" and "<pre>" are actually quite different in their handling of embedded wikitext, etc.

I like the name <lines>, btw.

I think I just saved this patch from a premature +2.

I'd like to summarize what still needs to be done here:

  1. In comment 29, Jesús Martínez Novo made a good suggestion to more widely publicize the new <lines> name before it gets committed. We almost inadvertently introduced a conflict with <verbatim>, I'd like some greater certainty that the new <lines> name proposal has been seen by a larger number of editors/extension authors. A mailing list thread might be a good way to kick off this discussion.
  1. In the comments on the patch, '----' was brought up as the last remaining way that the current <lines> implementation differs from 'indent-pre' handling. I'd like to resolve this discrepancy: I really really *really* don't want to complicate the parser more with Yet Another Slightly Different Way to handle preformatted content.
  1. I'd also like to review the code coverage issues a little more, to ensure that we're sharing code (and CSS rules) to the maximum extent possible. This is to ensure that point #2 continues to hold in the future, and we don't accidentally introduce subtle incompatibilities by changing only one side of a code path.

Does anyone else have any remaining issues they'd like to see resolved before this is merged?

I added a few more test cases to [[testwiki:bug 52061]] -- leading colons are also parsed differently inside <poem>.

From chat:

(07:32:52 PM) subbu: i just checked the updated wiki page .. and <hr> and colon-indents seem different in <poem> vs. indent-pre .. probably all wikitext elements that occur in SOL position but are strict about not having any leading ws.
(07:33:16 PM) cscott-free: how many of those are there?
(07:33:36 PM) subbu: i suppose <poem> just calls regular parse on embedded content and then wraps it in wrapper with the right css classes
(07:33:43 PM) subbu: tables are fine with leading space.
(07:34:05 PM) subbu: headings, lists, hr dont work if there is leading WS, iirc.
(07:34:41 PM) subbu: so, we were wrong .. <poem> is not indent-pre with classes then.
(07:35:10 PM) cscott-free: but should it be? i don't like the idea of adding a bunch of new special cases to the parser.
(07:35:48 PM) subbu: cscott, it cannot be without breaking <poem> behavior OR breaking existing pages that have leading space before :*;#=- chars
(07:35:59 PM) cscott-free: if we can at least parameterize it, such that indent-pre has SOL markers turned off, and <poem> has SOL markers turned on, that would make me happier.
(07:36:14 PM) cscott-free: rather than poem just being gratuitously different in random ways.
(07:36:24 PM) subbu: ya, you are talking of refactoring code in the parser.
(07:36:30 PM) subbu: and reusing it.
(07:36:56 PM) cscott-free: subbu: i'm certainly talking about reusing more parser code, rather than implementing <poem> as a brand new crazy thing.
(07:37:25 PM) cscott-free: i'm tired of having weird corner cases everywhere.
(07:37:50 PM) cscott-free: there should be some principled way we can describe how indent-pre and <poem> are similar.
(07:37:57 PM) subbu: maybe spend some time, how <poem>/<lines> would fit in parsoid land and that can inform the discussion of how the core code can be restructured .. but refactoring/restructure that seems like a lot of work as well.
(07:38:33 PM) cscott-free: in a nutshell that's why i don't want <poem> merged too quickly. i'd like to understand better how it all fits in.
(07:38:53 PM) cscott-free: and probably to do some dumpGrepping to figure out how much we can change indent-pre and/or <poem> without breaking too much stuff.

TTO added a comment.Jan 28 2014, 2:18 AM

(In reply to comment #31)

I'd like to resolve this discrepancy: I really really *really* don't want to
complicate the parser more with Yet Another Slightly Different Way to handle
preformatted content.

There's nothing really new here :) We're just sending a <pre> tag to the parser, after all. The change to the parser itself is minuscule, and arguably it is a fix for a long-standing bug that was never exposed until now.

(In reply to comment #33)

(07:34:41 PM) subbu: so, we were wrong .. <poem> is not indent-pre with
classes then.

I'm glad you're aware of this! As I said in Gerrit, the similarities are purely coincidental.

Instead of finding similarities between poem and indent-pre, which is moot since the indentation already breaks all markup that depends on being on the first column of the line, we may look at similarities between poem and normal wikitext.

It's practically the same. The only difference is that poem treats newlines as <br/>, while normal text simply ignore newlines or convert them into paragraphs when there's a combination of newlines.

(In reply to comment #31)

  1. In comment 29, Jesús Martínez Novo made a good suggestion to more widely

publicize the new <lines> name before it gets committed. We almost
inadvertently introduced a conflict with <verbatim>, I'd like some greater
certainty that the new <lines> name proposal has been seen by a larger number
of editors/extension authors. A mailing list thread might be a good way to
kick off this discussion.

Will a post to wikitech-l suffice? Or is more needed in your opinion? I can probably handle the communication portion of this, just please be explicit with what you would consider sufficient.

  1. In the comments on the patch, '----' was brought up as the last remaining

way that the current <lines> implementation differs from 'indent-pre'
handling.

Okay, I figured out what you mean here. At [[testwiki:Bug 52061#Indent-Pre ("Preceding space")]] a "----", the wikitext equivalent of a horizontal rule (<hr>), is not current expanded when there's a leading space.

In my opinion, this behavior (preceding space + unexpanded horizontal rule) is a separate issue that should be filed as a separate bug report (if one doesn't exist already) and is not a blocker to merging Gerrit change 106861. If you feel differently, can you please elaborate? (Also, can you please search for a relevant bug report and/or file one?)

  1. I'd also like to review the code coverage issues a little more, to ensure

that we're sharing code (and CSS rules) to the maximum extent possible. This
is to ensure that point #2 continues to hold in the future, and we don't
accidentally introduce subtle incompatibilities by changing only one side of
a code path.

Sure, code review is ongoing at Gerrit change 106861. If you have additional suggestions for technical changes, feel free to leave them on Gerrit. :-)

TTO added a comment.Feb 8 2014, 4:50 AM

(In reply to comment #31)

  1. In comment 29, Jesús Martínez Novo made a good suggestion to more widely

publicize the new <lines> name before it gets committed.
A mailing list thread might be a good way to
kick off this discussion.

I'll post one.

  1. In the comments on the patch, '----' was brought up as the last remaining

way that the current <lines> implementation differs from 'indent-pre'
handling.
I'd like to resolve this discrepancy: I really really *really* don't want to
complicate the parser more with Yet Another Slightly Different Way to handle
preformatted content.

Let me expand a bit here: <poem> is designed for the treatment of poetry. I don't know what indent-pre was designed for, but one assumes it was intended for general-purpose, preformatted text, such as command line outputs and plain-text tables. The two use-cases are somewhat different.

I went through and found the following differences. There are probably many others.

  • For some reason, indent-pre doesn't handle ----. Presumably this is so preformatted tables, etc (e.g. those copied out of the MySQL monitor) still appear correctly. This consideration is not relevant for <poem> and so ---- is parsed inside these tags. Indeed, ---- within <poem> is quite common at Wikisource.
  • Indents using : are ignored within indent-pre (presumably because they don't make a whole lot of sense in a preformatted context). In <poem>, indents are handled specially (using a fixed-width <span> to give 1em indentation per : character), and line-initial spaces are transformed to &nbsp;, to cope with the special formatting of some poetry.
  • Wikitext {| |} tables are available within <poem>, whereas indent-pre is aborted as soon as a table is found.
  • Indent-pre formatting is aborted whenever a block-level tag is encountered on the line. Block-level formatting is, however, available inside <poem> (although this is only useful for stuff like <table>, since the appearance of tags like <p> is affected quite badly by the <pre> environment).
  • Indent-pre is not available inside <blockquote>, for unclear reasons. <poem> works fine inside <blockquote>, just as anywhere else.
  • <poem> allows the application of a custom class/ID/style. The use of a tag also allows extra attributes to be used - I have a patch in the works that adds line numbering support to <poem>, which makes use of several additional tag attributes (see bug 13644).

I don't think it is correct to speak of "resolving" these discrepancies. As you can see, most of them seem to exist for a reason, and most involve idiosyncrasies in the existing indent-pre behaviour, any changes to which would disrupt existing wikitext.

(Note, when I talk about <poem> above, I'm talking about the new implementation in Gerrit. The old extension sometimes does things differently.)

  1. I'd also like to review the code coverage issues a little more, to ensure

that we're sharing code (and CSS rules) to the maximum extent possible.

I think this is up to you, Scott, and others in the know.

555 added a comment.Feb 10 2014, 4:14 PM

A Global Message Delivery to all Wikisource Scriptoriums/Village pump is higly recommended, since this extension was firtly developed to Wikisource wikis and latter enabled on all Wikimedia wikis [1] [2].

In the time that <poem/> tag arised on Wikisources, we have also asked for a <prose/> tag (See bug 6810).

Unfortunately at that time the developers picked what was most easy to then instead of what is more user intuitive (sorry, but <poem>foo</poem> and {{prose start}}foo{{prose end}} don't makes sense to newcomers (#c6).

As you can see on [3] and [4], the formatting isn't the same currently adopted in <poem/> tag, so the current consensus to set <poem/> as an alias for a new tag name isn't viable.

[1] - https://wikitech.wikimedia.org/wiki/Server_admin_log/Archive_7#June_23

[2] - https://wikitech.wikimedia.org/wiki/Server_admin_log/Archive_10#February_7

[3] - https://en.wikisource.org/wiki/Template:Prose

[4] - https://fr.wikisource.org/w/index.php?title=MediaWiki:Common.css (classe prose : PERIMÉE, utiliser text )

[[:m:User:555]]

TTO added a comment.Feb 10 2014, 11:54 PM

(In reply to comment #38)

In the time that <poem/> tag arised on Wikisources, we have also asked for a
<prose/> tag (See bug 6810).

That is an entirely separate issue. The presence or absence of a <prose> tag is not going to affect the resolution of this bug.

As you can see on [3] and [4], the formatting isn't the same currently
adopted
in <poem/> tag, so the current consensus to set <poem/> as an alias for a new
tag name isn't viable.

I don't understand your point here. What has this got to do with the <poem> tag?

555 added a comment.Feb 11 2014, 12:19 AM

The <poem/> tag usage on texts that does not contains poems, only the need to parse line breaks in a different behaviour, isn't the feature itself provided by the extension, only a very interesting side effect.

It looks to me like in very old MediaWiki versions without file undeletion support: media deletion at that time resulted in less disk space being used. But that aren't the feature, only a side effect.

IMHO before merging Extension:Poem into core it firstly needs to addresses all expected behaviours in their original expected feature, ie to provide quickly and in easy ways alternatives to format text portions based in a given form of language.

TTO added a comment.Feb 11 2014, 12:40 AM

(In reply to comment #40)

The <poem/> tag usage on texts that does not contains poems, only the need to
parse line breaks in a different behaviour, isn't the feature itself provided
by the extension, only a very interesting side effect.

Please read the preceding comments in this bug report. The reason why this tag is going to be made available under the name <lines> is to emphasise that is not only used for poems, but for other texts that require the preservation of line breaks. This is not merely a "side effect", but the key feature of this tag.

IMHO before merging Extension:Poem into core it firstly needs to addresses
all
expected behaviours in their original expected feature,

A merge of an extension into core, means taking the extension's existing functionality, and placing it in MediaWiki core. It does not mean we add extra features at the same time.

The Poem extension only provides poem-related (line break preservation) functionality, so that is all that core is getting in this merge. This is not the place to discuss other "original expected features".

TTO added a comment.Feb 14 2014, 7:34 AM

(In reply to comment #37)

(In reply to comment #31)

  1. In comment 29, Jesús Martínez Novo made a good suggestion to more widely

publicize the new <lines> name before it gets committed.
A mailing list thread might be a good way to
kick off this discussion.

I'll post one.

http://lists.wikimedia.org/pipermail/wikitech-l/2014-February/074337.html

No replies, so I assume no-one is concerned.

(In reply to comment #38)

A Global Message Delivery to all Wikisource Scriptoriums/Village pump is
higly recommended

MZMcBride, could you advise how feasible this would be?

Tpt added a comment.Feb 14 2014, 8:37 AM

As I believe that <poem> tag will be kept as an alias forever, it may be interesting to make <poem> and <lines> semantic a little bit different:

  • <lines> would be presentational only and the output will have the "lines" class as in <pre class="lines">
  • <poem> has the same presentational effect as <lines> but use the "poem" class to specify that the content is a poem as in <pre class="lines poem">

It won't break sites CSS because the current pages only use the poem tags. If wikis wants to use their custom CSS for <poem> tag with all <lines> tags, they will only have to change "poem" by "lines" in their rules.

TTO added a comment.Feb 14 2014, 9:01 AM

That's a very good suggestion, Tpt. I would be inclined to call the CSS class something more like "mw-lines" or similar, to comply with modern class naming standards.

However I wonder if it is really a good idea to provide differing functionality, however minor, between <poem> and <lines>?

Tpt added a comment.Feb 14 2014, 9:09 AM

Yes, you have right with "mw-lines".

I understand your concern with differing functionality that would require two buttons in the editing interface (it's the major concern I see). I think we should have feedback from the VisualEditor team.

Change 106861 abandoned by Krinkle:
Merge Poem extension into core

Reason:
Closing for now to clear pending review dashboard. Feel free to re-open when you can work on it.

https://gerrit.wikimedia.org/r/106861

Change 106861 restored by Legoktm:
Merge Poem extension into core

Reason:
I think this is a valuable changeset that should be pursued.

https://gerrit.wikimedia.org/r/106861

(In reply to Tpt from comment #45)

Yes, you have right with "mw-lines".
I understand your concern with differing functionality that would require
two buttons in the editing interface (it's the major concern I see). I think
we should have feedback from the VisualEditor team.

James, do you/your team have an opinion?

(In reply to Kunal Mehta (Legoktm) from comment #48)

(In reply to Tpt from comment #45)

Yes, you have right with "mw-lines".
I understand your concern with differing functionality that would require
two buttons in the editing interface (it's the major concern I see). I think
we should have feedback from the VisualEditor team.

James, do you/your team have an opinion?

I don't particularly mind it being merged into core from an "editing" POV, but it is "yet another feature" in MediaWiki core which we're meant to be making leaner, faster and more flexible, which this roughly feels like it goes against.

The Parsoid-land changes aren't too hard to handle, I believe, so that shouldn't block this.

(In reply to James Forrester from comment #49)

I don't particularly mind it being merged into core from an "editing" POV,
but it is "yet another feature" in MediaWiki core which we're meant to be
making leaner, faster and more flexible, which this roughly feels like it
goes against.
The Parsoid-land changes aren't too hard to handle, I believe, so that
shouldn't block this.

It's also another one of those features that will come in at least slightly handy on almost every wiki, though. It's deployed on a significant percentage of wikis, and on those on which it isn't deployed, the users probably wish it were deployed when they run into situations that call for its use. https://wikiapiary.com/wiki/Extension:Poem

Admittedly, it's not as important as, say, ParserFunctions or Cite (both of which should probably be merged to the core, by the way). But it's not very harmful, either.

(In reply to comment #42)

(In reply to comment #38)

A Global Message Delivery to all Wikisource Scriptoriums/Village pump is
higly recommended

MZMcBride, could you advise how feasible this would be?

I think a mention in [[m:Tech/News]] is sufficient here. Nothing is breaking, as I understand it, so this news is purely informational: "In addition to <poem>, you can also now use <some other tag name>."

If you really want to send a global message, you can use [[m:MassMessage]].

Change 106861 had a related patch set uploaded (by TTO):
Merge Poem extension into core

https://gerrit.wikimedia.org/r/106861

Patch-For-Review

GOIII added a subscriber: GOIII.EditedJan 5 2015, 10:25 AM

Merge Poem extension into MediaWiki core

Well now.... I've found the most interesting patch to watch for 2015. If $paragraphStack is going to have a [greater] role when it comes to Poem's application - I'm sure no harm can ever come of it.
</sarcasm off>

Seriously - I hope whenever this move goes through, its given a week of vetting on test/test2 before its deployed to the usual suspects in a formal build. And, while I'm of like mind in the assessment it should have never existed in the manner that it has to date & the Lines alternative would be the best route to follow - there are sooo many people that depend on it on an almost daily basis just on Wikisource alone that I would not have the nuts to screw around with it like this.

Truthfully, the issues with the tag to date have mostly been over the User:s response on occasion to the parser's <p><br /></p> effect compounded by the esoteric formatting typically done via template are both really where the ''blame'' should go than anything found in the extension itself. Admittedly, the choice not to force the opening and closing poem tags to always reside on their own lines upon save & rendering starting on day-one with it's introduction sure hasn't helped matters either.

I implore you to mirror .mw-bodys definition of P rather than the one(s) found in Elements.css -- those values are all long usurped by the time the page finishes loading its 1st click-in of the session. You'll find that 0.1em less between the two definitions in margin-top is hard to catch initially... until the 1.5em line-height setting is replaced [again?] -- setting it [back?] to inherit that is.

The last thing that comes mind would be more of an enhancement (like in T15644) than a bug . Had Poem been crafted to force it's tags to their own line/line-starts, the closing tag of the previous stanza would ''land'' right above the opening tag of the stanza that followed more often than not. This would be desirable because the ''gap'' between verses or stanzas are rarely as small as the predefined paragraph's are. Contributor's frequently came up with all sorts of ways to use empty inline-blocks to ''handle'' not only stanza spacing; but indentation and eol padding as well. Making attribute manipulation easier through the opening (margin-top) or closing (margin-bottom) tags of Line itself could dramatically cut down on stuff like that.

Krinkle removed a subscriber: Krinkle.Jan 5 2015, 4:54 PM
Paladox set Security to None.Sep 11 2015, 8:54 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 11 2015, 8:54 PM

From Parsoid's perspective, the biggest problem with <lines> is unclear semantics.

I suggest that this patch go through the RFC process first. Let's write up a document which describes exactly how we want <lines> to behave. We should also discuss deployment strategy in the RFC -- should we turn off support in core until Parsoid and VE also support it? Or implement stopgap support which doesn't allow editing sections which contain <lines> (treating it an an "unknown extension"), or what?

And what is our strategy for migrating existing <poem> content to <lines>? Is there an automatic translation? Or are some details of the semantics going to trip us up when we try to convert existing content?

As @GOIII states, our paragraph handling in the wikitext parser is already extremely complex, and there are a lot of existing workarounds which depend on details of <poem> in order to obtain specific effects in poetry. If we're going to create a new feature, let's make sure it can actually handle all the things that folks want it to do, without crazy hacks.

In particular, I'm interested in hearing more from @GOIII about:

force the opening and closing poem tags to always reside on their own lines

and

ways to use empty inline-blocks to ''handle'' not only stanza spacing; but indentation and eol padding as well. Making attribute manipulation easier through the opening (margin-top) or closing (margin-bottom) tags of Line itself could dramatically cut down on stuff like that.

Lets get a spec for <lines> written down in such a way that we can discuss issues like this:

  • Should the tags be forced to be on their own lines? (The current implementation doesn't force this.)
  • Can you write <lines class="foo">? Does this override the default "poem" and "mw-lines" classes, or add to them? Is this sufficient for the different styling @GOIII describes? (The current implementation doesn't allow extra classes.)
  • What should the Parsoid representation be?
  • What if somebody writes <pre class="poem mw-lines"> in wikitext? That's valid wikitext. Should Parsoid try to re-serialize this as a <lines> tag?
  • Generally styles hard-coded by mediawiki core are prefixed. The poem class is not.
  • If <poem> is aliased to <lines>, are we going to break all of wikisource overnight?
  • Is a recursive parse strategy really appropriate? It means that <lines> and <pre> are going to grow additional inconsistencies on the edge cases. For example, '''foo<lines>bar'''bat could get rather confused about its <b>s.
GOIII added a comment.Sep 29 2015, 1:24 AM

In particular, I'm interested in hearing more from @GOIII about:

. . .

Lets get a spec for <lines> written down in such a way that we can discuss issues like this:

  • Should the tags be forced to be on their own lines? (The current implementation doesn't force this.)

Depends. Will the new tag mimic the standard behavior of a <DIV> attribute's display:block or the behavior of a <P> attribute's display:block under the wiki mark-up/parser/parsoid?

The whole damnblasted issue comes back first and formost to the auto insertion of an opening <P> tag at every seemingly first detected line start. That's the current quirk; whether or not content placement starts after or under the poem tag. In other terms using the basic underlying html when it comes to the <poem> tag, it's ....

<poem>First line
<p>second line</br>
third line</br>
last line</p></poem>

... versus ...

<poem>
<p>First line</br>
second line</br>
third line</br>
last line</p>
</poem>

The 2nd version is closer to reflecting normal/expected behavior based on the "substitution" of <poem> for <div> but its still not quite right because of the attributes <p> automatically follows as defined elsewhere by some number of .css files 'handed down' by mediawiki by default. The optimal "underlying" html should behave as a true DIV tag (display:block) should and there should be no <P> element or behavior anywhere in play here.

Taking the same two possible placements of content start (immediately after or immediately under) as above but substituting <div> for <poem>, the "underlying" html should go something like this ...

<div class=poem>First line</br>
second line</br>
third line</br>
last line</br></div>

... versus ...

<div class=poem>
First line</br>
second line</br>
third line</br>
last line</br>
</div>

Obviously, consensus still needs to be reached whether or not there should be a </br> at the end of the 'last line' found immediately before the closing tag, but regardless of after or under placement by the contributor, both render the same. The takeaway here should be killing the presence and/or behavior of that mark-up induced opening <p> tag for any of this to truly be worthwhile imho.

  • Can you write <lines class="foo">? Does this override the default "poem" and "mw-lines" classes, or add to them? Is this sufficient for the different styling @GOIII describes? (The current implementation doesn't allow extra classes.)

Imho, manually adding a class= statement should add or enhance the default.

  • What should the Parsoid representation be?

Not sure what exactly you're looking for here

  • What if somebody writes <pre class="poem mw-lines"> in wikitext? That's valid wikitext. Should Parsoid try to re-serialize this as a <lines> tag?

Probably not as the <pre> tag introduces background-color and borders which is not likely to be in high demand as far as Wikisource goes. I can't speak for other projects however.

I'm not even sure the current <pre> scheme follows the "new" css3 properties of white-space: pre-wrap or white-space: pre-line for that matter.

  • Generally styles hard-coded by mediawiki core are prefixed. The poem class is not.

I would hope the accepted practices and/or appropriate procedures would dictate this.

  • If <poem> is aliased to <lines>, are we going to break all of wikisource overnight?

Building an all encompassing testbed on test2.wikipedia.org to experiment against prior to roll out seems like the logical way to address such concerns in my view.

  • Is a recursive parse strategy really appropriate? It means that <lines> and <pre> are going to grow additional inconsistencies on the edge cases. For example, '''foo<lines>bar'''bat could get rather confused about its <b>s.

I'd need more clarity re: the direction ultimately taken regarding the earlier points to reply here

Restricted Application added a subscriber: Luke081515. · View Herald TranscriptJan 20 2016, 8:20 PM
Meno25 removed a subscriber: Meno25.Feb 19 2016, 5:41 PM
TTO removed TTO as the assignee of this task.Dec 23 2016, 6:24 AM
TTO lowered the priority of this task from Normal to Low.

I'm not really interested in this anymore. Instead, I might try getting the automatic line numbering functionality requested in T15644 into the Poem extension, which would make the Poem extension a bit bigger and more worthwhile.

Change 106861 abandoned by TTO:
Merge Poem extension into core

Reason:
I don't intend to pursue this anymore. However, reviewers here might be interested in https://gerrit.wikimedia.org/r/328887 and https://gerrit.wikimedia.org/r/328888 .

https://gerrit.wikimedia.org/r/106861

This zombie bug has lurched to life again because of T188478: Parsoid needs a native implementation of <poem>.

Any chance we can come up with a sane semantics for <poem>/<lines> so that we aren't plunged into eternal darkness by trying to make a Parsoid extension bug-for-bug compatible with <poem>? We have fancy new linting tools that (in theory) could help auto-migrate uses of <poem> if we can agree on a better replacement.

TTO added a comment.EditedMar 17 2018, 12:02 PM

I think https://gerrit.wikimedia.org/r/106861 is probably still relevant today (or if there is no appetite for merging into core, then the changes can be rebased on top of the extension, as seen in https://gerrit.wikimedia.org/r/#/c/328887/). I seem to recall that it fixes a number of bugs in the Poem implementation, not all of which are tracked in Phabricator. Although this was 4 years ago, so I might be wrong.

TTO added a comment.Mar 17 2018, 12:05 PM

Also, it would be nice if we could have some more attention on https://gerrit.wikimedia.org/r/330896/. This is an obscure bug in the PHP parser that is only triggered by certain wikitext content generated by a tag extension (it can't be triggered by user-entered wikitext), and it blocks the re-implementation of <poem> using the <pre> tag.

cscott added a comment.Apr 4 2018, 7:53 PM

So, looking at https://gerrit.wikimedia.org/r/328887 it seems that, if that patch is merged, the primary difference between <poem> and <pre> handling in core is:

  • <poem> generates an empty span with margin-left:1em for wikitext like : foo. But I think we could probably tweak the CSS so that the rendering of the standard parser output for :foo (aka, <dl><dd>foo</dd></dl> is the same.
  • <poem> respects leading whitespace, but compresses intraline whitespace. This is currently being done by a combination of CSS and manually substituting leading whitespace with &nbsp;. In order to clean this up, the easiest thing would be to stop compressing intraline whitespace. We'd either do that for a new <lines> tag (and merge the <lines> semantics into core) or else do a wikilint type project to simply removing inadvertent intraline whitespace. Depending on how prevalent it is, the latter might not be too hard.

What else am I missing?

TTO added a comment.Apr 5 2018, 8:22 PM

We had an IRC discussion in #mediawiki-parsoid about this.

  • We'll get rid of the special handling of :, but preserve the existing appearance by using CSS. So end users won't notice a change, unless they have custom CSS rules with selectors like div.poem > span or .mw-poem-indented, which will not work anymore. This will address T123216.
  • The compact attribute is going to be killed. (We didn't actually discuss this, but everyone appears to agree that it is useless.) This will address T16910.
  • We'll try to solve the T124144 <hr> issue as well. Existing patches for that task will be insufficient to deal with the problem when <pre> is used.
  • Once those things are done, @ssastry suggested to follow this plan:
    • get feedback from Wikisource
    • do visual diffs if required as part of that
    • merge the patch https://gerrit.wikimedia.org/r/#/c/328887/
    • evaluate if there is anything we want to deprecate via linter (e.g. intra-line whitespace)
    • reevaluate whether we want to merge this into core as <lines> as the original proposal stood

This is arguably the wrong task for this work, since we are not actually merging the extension (yet) - the current flurry of work is a precursor to that.