
Remex doesn't uppercase tag/node names
Open, Needs Triage, Public

Description

According to the DOM standard (e.g. https://dom.spec.whatwg.org/#dom-element-tagname and https://developer.mozilla.org/en-US/docs/Web/API/Element/tagName), Element#tagName (and thus also Node#nodeName) should be uppercase for HTML elements:

The tagName attribute’s getter must return the context object’s HTML-uppercased qualified name.
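As a rough illustration of what "HTML-uppercased qualified name" means in the DOM standard, the sketch below uppercases the name only when the element is in the HTML namespace and belongs to an HTML document, and uses ASCII-only case mapping. The function name `htmlUppercasedQualifiedName()`, its parameters, and the `HTML_NAMESPACE` constant are hypothetical names chosen for this example; they are not part of Remex or PHP's DOM extension.

```php
<?php
// Sketch of the DOM standard's "HTML-uppercased qualified name" rule.
// All names here are illustrative assumptions, not an existing API.

const HTML_NAMESPACE = 'http://www.w3.org/1999/xhtml';

function htmlUppercasedQualifiedName(
    string $qualifiedName,
    ?string $namespaceURI,
    bool $isHtmlDocument
): string {
    // Only HTML-namespace elements in an HTML document are uppercased.
    if ( $namespaceURI === HTML_NAMESPACE && $isHtmlDocument ) {
        // ASCII-only uppercase; strtr() avoids locale-dependent case mapping.
        return strtr(
            $qualifiedName,
            'abcdefghijklmnopqrstuvwxyz',
            'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
        );
    }
    // Everything else (SVG, MathML, XML documents) keeps its original case.
    return $qualifiedName;
}

echo htmlUppercasedQualifiedName( 'head', HTML_NAMESPACE, true ), "\n";
echo htmlUppercasedQualifiedName( 'svg:path', 'http://www.w3.org/2000/svg', true ), "\n";
```

Under this rule a browser reports "HEAD" for the element that the transcripts in this task show as "head".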

Remex, like the standard PHP DOMDocument#loadHTML method, uses lowercase tag and node names:

$ psysh
Psy Shell v0.9.9 (PHP 7.3.2-3 — cli) by Justin Hileman
>>> require 'vendor/autoload.php';
=> Composer\Autoload\ClassLoader {#2}
>>> ($html = file_get_contents( 'obama.html' )) || true;
=> true
>>> ($doc = new DOMDocument) || true;
=> true
>>> $doc->loadHTML($html);
>>> require('./tests/ZestTest.php')
=> 1
>>> $doc2 = \Wikimedia\Zest\Tests\ZestTest::loadHtml("./obama.html"); /* uses remex */
>>> $doc->documentElement->firstChild->tagName;
=> "head"
>>> $doc2->documentElement->firstChild->tagName;
=> "head"
>>> $doc->documentElement->firstChild->nodeName;
=> "head"
>>> $doc2->documentElement->firstChild->nodeName;
=> "head"

The PHP DOM implementation preserves the case of the name passed to createElement() (which, for an HTML document, it actually shouldn't):

>>> $doc2->createElement('p')->nodeName;
=> "p"
>>> $doc->createElement('p')->nodeName;
=> "p"
>>> $doc2->createElement('P')->nodeName;
=> "P"
>>> $doc->createElement('P')->nodeName;
=> "P"

Compare to JS in the browser:

> document.createElement('p').tagName
"P"

Remex should probably:

  1. provide an option to uppercase HTML tag names prior to passing them to createElement(), and/or
  2. allow passing in a different DOMImplementation to RemexHtml\DOM\DOMBuilder to provide proper behavior for html/

Option 1 would, in the short term, allow Parsoid to continue to use uppercase when comparing tagName strings; it would have to take care to always use uppercase when calling createElement though. This would be a bridge to option 2, once we have a proper spec-compliant DOM implementation (T215000).
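A minimal sketch of the idea behind option 1, using only the stock DOMDocument: uppercase the tag name before it reaches createElement(), so that tagName comparisons against uppercase strings keep working. The helpers `asciiUppercase()` and `createUppercasedElement()` are hypothetical names for this example; this is not how Remex is implemented, and no such option is assumed to exist in RemexHtml\DOM\DOMBuilder.

```php
<?php
// Sketch of option 1: normalize tag names to ASCII uppercase before
// handing them to DOMDocument::createElement(), which preserves case.
// Both helper functions are illustrative, not part of Remex or PHP.

function asciiUppercase( string $name ): string {
    // ASCII-only mapping, independent of the current locale.
    return strtr(
        $name,
        'abcdefghijklmnopqrstuvwxyz',
        'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    );
}

function createUppercasedElement( DOMDocument $doc, string $tagName ): DOMElement {
    return $doc->createElement( asciiUppercase( $tagName ) );
}

$doc = new DOMDocument;
$p = createUppercasedElement( $doc, 'p' );
echo $p->tagName, "\n"; // P

// Comparisons against uppercase names now behave as they do in a browser.
var_dump( $p->tagName === 'P' );
```

Every call site that creates elements would have to go through such a wrapper, which is the "take care to always use uppercase" caveat noted above.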

Event Timeline

cscott created this task. Mar 5 2019, 7:56 PM
Restricted Application added a subscriber: Aklapper. Mar 5 2019, 7:56 PM
cscott updated the task description. Mar 5 2019, 8:08 PM