Page MenuHomePhabricator

Sanitizer does not replace non-open closing td and tr tags.
Closed, ResolvedPublic

Description

Author: netocrat

Description:
Sanitizer::removeHTMLtags('Table not started</td></tr></table>')
returns:
Table not started</td></tr>&lt;/table&gt;

Expected:
Table not started&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

After applying the attached patch the function output is as expected.

[the above itself may be garbled due to the bug - I don't know if it's present
in bugzilla and if output of this description depends on it. To clarify if
that's the case: the pre-patch return only escapes the < and > delimiters for
the closing table tag; post-patch they are also escaped for the closing td and
tr tags]


Version: 1.6.x
Severity: trivial

Details

Reference
bz4373

Event Timeline

bzimport raised the priority of this task from to Low.Nov 21 2014, 9:00 PM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz4373.
bzimport added a subscriber: Unknown Object (MLST).

netocrat wrote:

Escape non-open td and tr tags in the sanitizer.

attachment sanitiser.td.tr.patch ignored as obsolete

netocrat wrote:

Comment on attachment 1231
Escape non-open td and tr tags in the sanitizer.

The check for a tag of "single" type is mistaken since even single types cannot
be closed as </singletag> when not already open.

netocrat wrote:

Correction to prevent the escaping of closing td and tr tags when legitimately open

This corrects the previous patch by allowing opening tr, td, li, dt and dd tags
to be added to the tagstack and thereby preventing their legitimate closing
tags from being escaped (an unintended side-effect of the previous patch).

In making this correction I found that all functional use of the $htmlsingle
array has been removed, so the behaviour it was supporting - allowing closing
tags for non-open td, tr, li, dt and dd elements to pass through unescaped -
appears to be by design. I can't see why that would be desirable but there may
be a reason that escapes me.

Attached:

avarab wrote:

FIXED in CVS HEAD, thanks for the patch

There's a parsertest for this bug called "Sanitizer: Closing of closed but not
open table tags"

netocrat wrote:

Cleans up the element arrays; apply over the previous (now accepted) patch

Since the corrective patch was accepted, here's a patch to clean things up on
top of it:

  • remove the now unused $htmlsingle array
  • use array_merge instead of repeating elements in arrays.

Btw, is normal process to submit a new bug, reopen this bug or submit the patch
by some other method? I've left this bug as closed since this patch doesn't
actually correct any visible behaviour or add new features.

Attached:

netocrat wrote:

Reopened and keywords need-review and patch added as I suspect the cleanup patch
will get lost in the woodwork without doing that.

Duping to bug 5497, where I've got a complete rewrite of the nesting handling.

  • This bug has been marked as a duplicate of 5497 ***