Page MenuHomePhabricator

flattenIfArray should just accept null/undefined result
Open, LowPublic

Description

parsoid/lib/wt2html/tokenizer.utils.js has an annoying throw statement

			for (var i = 0; i < e.length; i++) {
				var v = e[i];
				if (Array.isArray(v)) {
					// Change in assumption from a shallow array to a nested array.
					if (res === null) { res = e.slice(0, i); }
					internalFlatten(v, res);
				} else if (v !== null && v !== undefined) {
					if (res !== null) {
						res.push(v);
					}
				} else {
					throw new Error("falsy " + e);
				}
			}

Because of it, pegTokenizer.pegjs has a bunch of r:something { return r; }, making the compiled WikiPEG code inefficient.

We should just comment out the throw statement and remove { return r; } or similar code in pegTokenizer.pegjs if return tu.flattenStringlist(r); is called in downstream code.

Event Timeline

Change 508176 had a related patch set uploaded (by Dan1wang; owner: Dan1wang):
[mediawiki/services/parsoid@master] All NULL in flattenIfArray and remove superfluous return statements (T222560)

https://gerrit.wikimedia.org/r/508176

Change 599457 had a related patch set uploaded (by Arlolra; owner: Arlolra):
[mediawiki/services/parsoid@master] Cleanup some superfluous actions in the grammar

https://gerrit.wikimedia.org/r/599457

The history of that throw is that the negative predicate return values were leaking into the token stream.

https://github.com/wikimedia/parsoid/commit/725aadfbf874ca64e831664367d48e8fa5d492cd
https://github.com/wikimedia/parsoid/commit/724749aac29814b97ce29081ea17e7a69d626e32

If we're going to rely on TokenizerUtils::flattenIfArray to remove them, we need to be more explicit about it and audit these changes carefully.

Change 599457 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Cleanup some superfluous actions in the grammar

https://gerrit.wikimedia.org/r/599457

Change 603571 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/vendor@master] Bump Parsoid to v0.12.0-a16

https://gerrit.wikimedia.org/r/603571

Change 603571 merged by jenkins-bot:
[mediawiki/vendor@master] Bump Parsoid to v0.12.0-a16

https://gerrit.wikimedia.org/r/603571

Pppery subscribed.

All patches appear to have been merged. Can this be closed as resolved?

All patches appear to have been merged. Can this be closed as resolved?

I don't think so, see T222560#6174806