
problem with regex in BlockLevelPass::doBlockLevels
Closed, Resolved (Public)

Description

I recently wrote my own MediaWiki extension that sometimes returns inline SVG elements to be included in the page.

I noticed that in certain cases this would cause MediaWiki to stop generating <p> tags from that point onward in the document, and I tracked it down to a problem with a regular expression. Specifically, whenever the inline SVG contained a <path> element, a regex in the doBlockLevels function would apparently treat it the same as a <p> tag, because the p alternative in the pattern also matches the first letter of path.

I was able to fix the issue by making the regex a little more stringent. The following patch against latest git should do the trick:

--- BlockLevelPass.php	2019-11-26 11:31:34.000000000 -0700
+++ BlockLevelPass.php	2019-11-26 17:01:13.769113805 -0700
@@ -309,7 +309,7 @@
 				# @todo consider using a stack for nestable elements like span, table and div
 
 				// P-wrapping and indent-pre are suppressed inside, not outside
-				$blockElems = 'table|h1|h2|h3|h4|h5|h6|pre|p|ul|ol|dl';
+				$blockElems = 'table|h1|h2|h3|h4|h5|h6|pre|p[^a-z]|ul|ol|dl';
 				// P-wrapping and indent-pre are suppressed outside, not inside
 				$antiBlockElems = 'td|th';

I say "should" only because I'm working with version 1.27 and the code has moved around a little since then (and I haven't been able to test with latest git yet), but I'm pretty confident.

Here's a minimal SVG example for testing:

<svg height="200" width="200"><path d="M100 0 L0 200 L200 200 Z" /></svg>

Without my custom extension, I guess raw HTML would have to be enabled to see the bug, which is why I haven't been able to reproduce it on test2.wikipedia.org. In any case, the buggy behavior should show up if you add some paragraphs after the SVG element: they'll all run together without any <p> tags between them.
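The prefix match described above can be reproduced outside MediaWiki. The following is only a rough sketch in Python rather than PHP. The two alternation strings are copied from the patch. The surrounding pattern (a literal < followed by the alternation, matched case-insensitively) and the helper name opens_block are assumptions made for illustration, not the actual doBlockLevels code.

```python
import re

# Alternation strings copied from the patch above.
ORIGINAL = "table|h1|h2|h3|h4|h5|h6|pre|p|ul|ol|dl"
PATCHED = "table|h1|h2|h3|h4|h5|h6|pre|p[^a-z]|ul|ol|dl"


def opens_block(alternation: str, text: str) -> bool:
    """Hypothetical helper: True if text contains '<' followed by one of the
    alternatives. The real pattern in MediaWiki may be built differently."""
    return re.search("<(?:" + alternation + ")", text, re.IGNORECASE) is not None


svg = '<svg height="200" width="200"><path d="M100 0 L0 200 L200 200 Z" /></svg>'

print(opens_block(ORIGINAL, svg))            # True: "<p" matches the start of "<path"
print(opens_block(PATCHED, svg))             # False: "a" is rejected by [^a-z]
print(opens_block(PATCHED, "<p>text</p>"))   # True: ">" satisfies [^a-z]
```

Under these assumptions, the unpatched alternation reports a block-level opening tag inside the SVG, which is consistent with paragraph wrapping being suppressed afterwards.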

Event Timeline

Umherirrender subscribed.

With T165817, a \b was added, which should avoid your issue. It is part of REL1_30.

Ah, thanks. Feeling dumb for not having spotted that (and of course it's a more comprehensive and proper fix).
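To illustrate why a trailing \b is the more comprehensive fix, here is a small sketch in Python rather than PHP. The exact pattern introduced by T165817 is not quoted in this task, so the pattern below (the unpatched alternation followed by \b) and the helper name opens_block_with_boundary are assumptions. The word boundary applies to every alternative, not only p, and it also accepts a tag name at the very end of the input, which p[^a-z] would not.

```python
import re

# Unpatched alternation from the diff, with an assumed trailing word boundary.
BLOCK_ELEMS = "table|h1|h2|h3|h4|h5|h6|pre|p|ul|ol|dl"
WITH_BOUNDARY = re.compile(r"<(?:" + BLOCK_ELEMS + r")\b", re.IGNORECASE)


def opens_block_with_boundary(text: str) -> bool:
    """Hypothetical helper: True if text contains a complete block-level tag name."""
    return WITH_BOUNDARY.search(text) is not None


print(opens_block_with_boundary('<path d="M100 0 L0 200 L200 200 Z" />'))  # False
print(opens_block_with_boundary("<polygon />"))                            # False
print(opens_block_with_boundary("<p>"))                                    # True
print(opens_block_with_boundary("<pre>"))                                  # True
print(opens_block_with_boundary("<p"))                                     # True (end of input)
```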