Page MenuHomePhabricator

NULL in stack frame causes SVG to be unreadable
Closed, ResolvedPublic

Description

Trying to view https://performance.wikimedia.org/arclamp/svgs/hourly/2021-02-02_06.excimer-buster.api.svgz, Chromium reports:

This page contains the following errors:
error on line 2012 at column 23: Char 0x0 out of allowed range
Below is a rendering of the page up to the first error.

(It similarly fails to render on Firefox.)

The offending value is:

class@anonymous\0/srv/mediawiki/php-1.36.0-wmf.27/extensions/CirrusSearch/includes/CirrusSearch.php0x7fbdf64f0551::parse

The NUL separator between class@anonymous and the path is the problem, as that's not valid inside XML text.

It's not yet clear to me whether the NUL is supposed to be there or not. If it is, our options for fixing this are:

  1. Have Excimer convert it to a different value when emitting the frame.
  2. Have arclamp-log convert it to a different value when writing the log entry.
  3. Have flamegraph.pl escape it (possibly as <![CDATA[) when generating the SVG.

I'm leaning towards a combination of 1 and 2, assuming this isn't a bug in Excimer and/or PHP.

Event Timeline

Related, but not quite the same issue: T245295

Looks like the NUL is part of the name (according to PHP), and it also confuses things like exception stack traces: https://bugs.php.net/bug.php?id=74999

Excimer already has a special case for closures: https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/php/excimer/+/refs/heads/master/excimer_log.c#319

...so I think it should probably do something similar for anonymous classes, on the assumption that non-ArcLamp users of Excimer will likely run into this issue as well.

Deploying a fix to either sanitize the arclamp-log output and/or quote the flamegraph.pl output is probably also needed, and will be a quicker/easier intermediate fix.

Possibly noteworthy that Speedscope seems to be able to render this OK: https://www.speedscope.app/#profileURL=https%3A%2F%2Fperformance.wikimedia.org%2Farclamp%2Flogs%2Fhourly%2F2021-02-02_06.excimer-buster.api.log

Ack, hot fix for arclamp.pl makes sense, to be upstreamed as well. Seems generally a good last place to tolerate/defend against it. Preferably tolerating as DWIM, but I can also see upstream preferring to reject/fail. But both are better than what it does now, which is basically invalid XML/SVG.

Change 664600 had a related patch set uploaded (by Dave Pifke; owner: Dave Pifke):
[performance/arc-lamp@master] Replace non-printable characters with '?'

https://gerrit.wikimedia.org/r/664600

Krinkle renamed this task from NUL in stack frame causes SVG to be unreadable to NULL in stack frame causes SVG to be unreadable.Feb 16 2021, 5:29 PM

Change 664600 merged by jenkins-bot:
[performance/arc-lamp@master] arclamp-log: Replace non-printable characters with '?'

https://gerrit.wikimedia.org/r/664600

Mentioned in SAL (#wikimedia-operations) [2021-02-22T16:26:28Z] <dpifke@deploy1001> Started deploy [performance/arc-lamp@1f3bce1]: Deploy ArcLamp fixes for T273565 and T273640

Mentioned in SAL (#wikimedia-operations) [2021-02-22T16:26:35Z] <dpifke@deploy1001> Finished deploy [performance/arc-lamp@1f3bce1]: Deploy ArcLamp fixes for T273565 and T273640 (duration: 00m 05s)

Krinkle triaged this task as Medium priority.Mar 18 2021, 12:18 AM

Remaining here, I think, is to submit a PR upstream?

Change 716119 had a related patch set uploaded (by Krinkle; author: Tim Starling):

[mediawiki/php/excimer@master] Filter null bytes out of the collapsed output

https://gerrit.wikimedia.org/r/716119

Change 716119 merged by jenkins-bot:

[mediawiki/php/excimer@master] Filter null bytes out of the collapsed output

https://gerrit.wikimedia.org/r/716119

Krinkle reassigned this task from dpifke to tstarling.

Change 664600 merged by jenkins-bot:
[performance/arc-lamp@master] arclamp-log: Replace non-printable characters with '?'

https://gerrit.wikimedia.org/r/664600

Reverted in https://gerrit.wikimedia.org/r/c/performance/arc-lamp/+/731130.

Closing for now as it seems like we don't need a change in FlameGraphs upstream right now. I've reported it for upstream anyway in case they want to support it: https://github.com/brendangregg/FlameGraph/issues/280.