cscott (C. Scott Ananian)
Parser whisperer

Projects (25)

User Details

User Since
Oct 21 2014, 6:47 PM (598 w, 6 d)
Availability
Available
IRC Nick
cscott
LDAP User
C. Scott Ananian
MediaWiki User
Cscott

Editor since 2005; WMF developer since 2013. I work on Parsoid and OCG, and dabble with VE, real-time collaboration, and OOjs.

On github: https://github.com/cscott

See https://en.wikipedia.org/wiki/User:cscott for more.

Recent Activity

Today

cscott closed T415068: PHP Deprecated: substr_count(): Passing null to parameter #1 ($haystack) of type string is deprecated as Resolved.
Mon, Apr 13, 3:37 PM · MW-1.46-notes (1.46.0-wmf.19; 2026-03-10), Essential-Work, Content-Transform-Team (Work In Progress), PHP 8.1 support, MediaWiki-Parser, Wikimedia-production-error
cscott closed T415068: PHP Deprecated: substr_count(): Passing null to parameter #1 ($haystack) of type string is deprecated, a subtask of T421693: Bad UTF-8 errors, as Resolved.
Mon, Apr 13, 3:37 PM · Content-Transform-Team (Work In Progress)
cscott moved T36514: The language and the direction of the title in first heading should depend on page content language instead of user interface language from To Deploy to To Verify on the Content-Transform-Team (Work In Progress) board.
Mon, Apr 13, 3:34 PM · MW-1.46-notes (1.46.0-wmf.23; 2026-04-07), Content-Transform-Team (Work In Progress), Essential-Work, Patch-For-Review, RTL, I18n, MediaWiki-Internationalization
cscott moved T420051: plwiki: Missing coordinates at top of page from Q3 FY25-26 to In Progress on the Content-Transform-Team (Work In Progress) board.
Mon, Apr 13, 3:26 PM · Parsoid-Read-Views (Small Size Wikipedias), Content-Transform-Team (Work In Progress)
cscott moved T411280: Links in <mapframe> broken when using Parsoid from Q3 FY25-26 to In Progress on the Content-Transform-Team (Work In Progress) board.
Mon, Apr 13, 3:26 PM · Parsoid-Read-Views (Small Size Wikipedias), Content-Transform-Team (Work In Progress), Parsoid, Maps (Kartographer)
cscott placed T417514: Remove `thumbsize` from ParserOptions up for grabs.
Mon, Apr 13, 3:25 PM · Readers Essential Work, Patch-For-Review, OKR-Work, Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Small Size Wikipedias)
cscott moved T394836: Use refactored grammar for `{{....}}` constructs from Backlog to Code Review on the Content-Transform-Team (Work In Progress) board.
Mon, Apr 13, 3:20 PM · Content-Transform-Team (Work In Progress), Patch-For-Review, Parsoid-Read-Views (Performance), Parsoid
cscott moved T422962: Parsoid doesn't look for red links inside language conversion blocks from Backlog to Code Review on the Content-Transform-Team (Work In Progress) board.
Mon, Apr 13, 3:07 PM · Patch-For-Review, Parsoid-Read-Views (Language Converter Support), Content-Transform-Team (Work In Progress)
cscott added a comment to T200517: Emit lint error or category when a page uses duplicate HTML IDs.

I think it would also be helpful to determine why the template in question (assuming it is a template) is adding id attributes. These are supposed to be unique by the HTML spec. Perhaps id is not the right attribute to generate, and these should either be removed or changed to something like data-mw-.., which don't have to be unique.

Mon, Apr 13, 2:21 PM · MW-1.43-notes (1.43.0-wmf.28; 2024-10-22), Essential-Work, Content-Transform-Team-WIP, Parsoid, MediaWiki-extensions-Linter

Sat, Apr 11

cscott renamed T419328: Legacy LanguageConverter uses top-level ::guessVariant on srwiki from Parsoid LanguageConverter implementation doesn't support ::guessVariant on srwiki to Legacy LanguageConverter uses top-level ::guessVariant on srwiki.
Sat, Apr 11, 5:14 AM · Patch-For-Review, Parsoid-Read-Views (Language Converter Support), MediaWiki-Language-converter, Content-Transform-Team (Work In Progress)
cscott merged T421738: Legacy LanguageConverter seems not to convert certain pages into T419328: Legacy LanguageConverter uses top-level ::guessVariant on srwiki.
Sat, Apr 11, 5:09 AM · Patch-For-Review, Parsoid-Read-Views (Language Converter Support), MediaWiki-Language-converter, Content-Transform-Team (Work In Progress)
cscott merged task T421738: Legacy LanguageConverter seems not to convert certain pages into T419328: Legacy LanguageConverter uses top-level ::guessVariant on srwiki.
Sat, Apr 11, 5:08 AM · Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Language Converter Support)
cscott added a comment to T419328: Legacy LanguageConverter uses top-level ::guessVariant on srwiki.

Ok, on investigation, Parsoid does invoke guessVariant(), but the legacy LanguageConverter invokes it *twice*: once on the overall text of any string to be converted (including the embedded HTML tags and attributes) and then again on the text substrings between tags. That seems to be a bug: if the topmost 'guess' returns false, then nothing on the page will be converted at all. It seems like the intended behavior is for the individual strings / paragraphs / etc to be the proper subjects of "guessing".

Sat, Apr 11, 3:24 AM · Patch-For-Review, Parsoid-Read-Views (Language Converter Support), MediaWiki-Language-converter, Content-Transform-Team (Work In Progress)

Fri, Apr 10

cscott renamed T422962: Parsoid doesn't look for red links inside language conversion blocks from Parsoid LanguageConverter doesn't look for red links inside language conversion blocks to Parsoid doesn't look for red links inside language conversion blocks.
Fri, Apr 10, 10:59 PM · Patch-For-Review, Parsoid-Read-Views (Language Converter Support), Content-Transform-Team (Work In Progress)
cscott added a comment to T422960: Parsoid seems to be mishandling &shy; entity.

It's possible dewiki can work around the Firefox bug with some client-side JS using jQuery like:

$('span[typeof="mw:Entity"]:contains("\u00AD")').css('word-break', 'break-all');

I haven't tested that, it's just a gesture in the general direction of a possible workaround for Firefox users.

Fri, Apr 10, 6:36 PM · Content-Transform-Team, Parsoid-Read-Views (Large Wikipedias), Parsoid
cscott added a comment to T422960: Parsoid seems to be mishandling &shy; entity.

Oh, it appears to be a Firefox bug, and I'm testing on Chrome. Not a parser bug, I don't think, just a Firefox bug.

Fri, Apr 10, 6:24 PM · Content-Transform-Team, Parsoid-Read-Views (Large Wikipedias), Parsoid
cscott added a comment to T422960: Parsoid seems to be mishandling &shy; entity.

Actually that works fine. I'm confused about what the reporter sees as a problem here:

image.png (521×730 px, 116 KB)

Fri, Apr 10, 6:23 PM · Content-Transform-Team, Parsoid-Read-Views (Large Wikipedias), Parsoid
cscott added a comment to T422960: Parsoid seems to be mishandling &shy; entity.

I think the issue is that the browser doesn't choose to break at the &shy; if it is inside a <span>. The HTML for the table in the report definitely has the entity inside:

<th id="mwBDM">Frühstücksdirektorenkonferenz<span typeof="mw:Entity" id="mwBDQ">&shy;</span>tagesordnungspunkt</th>

but the browser doesn't break the word there. Someone's going to have to study the unicode-word-break-algorithm-as-implemented-in-browsers to figure out why and if there's a fix. Maybe just some css would do the trick.

Fri, Apr 10, 6:22 PM · Content-Transform-Team, Parsoid-Read-Views (Large Wikipedias), Parsoid
cscott added a comment to T422960: Parsoid seems to be mishandling &shy; entity.

The character is certainly present in:

$ echo 'a&shy;b' | php bin/parse.php 
<p data-parsoid='{"dsr":[0,7,0,0]}'>a<span typeof="mw:Entity" data-parsoid='{"src":"&amp;shy;","srcContent":"­","dsr":[1,6,null,null]}'>­</span>b</p>

and I see it in the output of

<section data-mw-section-id="0" id="mwAQ"><p id="mwAg">a<span typeof="mw:Entity" id="mwAw">­</span>b</p>

as well. I think your terminal just didn't choose to display it, since technically it's an 'optional hyphen'.

Fri, Apr 10, 6:17 PM · Content-Transform-Team, Parsoid-Read-Views (Large Wikipedias), Parsoid
cscott added a comment to T422960: Parsoid seems to be mishandling &shy; entity.

That's a soft hyphen, your terminal might not display it.

Fri, Apr 10, 6:15 PM · Content-Transform-Team, Parsoid-Read-Views (Large Wikipedias), Parsoid
cscott created T422966: Parsoid tokenizer doesn't allow table markup inside language conversion brackets.
Fri, Apr 10, 6:10 PM · Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Language Converter Support)
cscott added projects to T422965: Parsoid LanguageConverter prepends 'alt=' on broken media: Parsoid-Read-Views (Language Converter Support), Content-Transform-Team (Work In Progress).
Fri, Apr 10, 6:02 PM · Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Language Converter Support)
cscott created T422965: Parsoid LanguageConverter prepends 'alt=' on broken media.
Fri, Apr 10, 6:01 PM · Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Language Converter Support)
cscott created T422962: Parsoid doesn't look for red links inside language conversion blocks.
Fri, Apr 10, 5:58 PM · Patch-For-Review, Parsoid-Read-Views (Language Converter Support), Content-Transform-Team (Work In Progress)
cscott created T422961: LanguageConverter doesn't convert inside <indicator>.
Fri, Apr 10, 5:55 PM · Patch-For-Review, Parsoid-Read-Views (Language Converter Support), Content-Transform-Team (Work In Progress)
cscott updated the task description for T421436: Deploy Produnto extension to production.
Fri, Apr 10, 3:32 AM · Wikimedia-extension-review-queue, Wikimedia-Extension-setup, Produnto

Thu, Apr 9

cscott closed T415435: Add temporary URL request parameter to opt-in to the new Parsoid LanguageConverter implementation as Resolved.
Thu, Apr 9, 10:49 PM · Patch-For-Review, Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Language Converter Support), OKR-Work
cscott closed T415435: Add temporary URL request parameter to opt-in to the new Parsoid LanguageConverter implementation, a subtask of T380517: Make Parsoid language conversion into an OutputTransform pass, as Resolved.
Thu, Apr 9, 10:49 PM · MW-1.46-notes (1.46.0-wmf.15; 2026-02-10), Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Language Converter Support), OKR-Work
cscott closed T421194: TypeError: MediaWiki\Parser\Parser::localizeTOC(): Argument #2 ($lang) must be of type MediaWiki\Language\Language, null given, called in /srv/mediawiki/php-1.46.0-wmf.20/includes/OutputTransform/Stages/ParsoidLanguageConverter, a subtask of T407379: Numbering in TOC is not localized when using Parsoid rendering, as Resolved.
Thu, Apr 9, 10:49 PM · Parsoid-Read-Views (Language Converter Support), OKR-Work, Content-Transform-Team (Work In Progress), I18n, Parsoid
cscott closed T421194: TypeError: MediaWiki\Parser\Parser::localizeTOC(): Argument #2 ($lang) must be of type MediaWiki\Language\Language, null given, called in /srv/mediawiki/php-1.46.0-wmf.20/includes/OutputTransform/Stages/ParsoidLanguageConverter as Resolved.
Thu, Apr 9, 10:49 PM · MW-1.46-notes (1.46.0-wmf.22; 2026-03-31), Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Language Converter Support), Wikimedia-production-error
cscott closed T419264: Title character error on zhwiki as Resolved.

This appears to be fixed, likely via 1a304c1526c8caf8f70a86ee929f69e4fcaa8e7b.

Thu, Apr 9, 10:47 PM · Chinese-Sites, Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Language Converter Support)
cscott added a comment to T419826: Use a bloom filter for looking up disambig pages.

I don't think you need to resort to a maintenance script for this. If you're able to maintain a bit more storage, you can just represent each entry in the bloom filter with a few-bit counter instead of an int, and update the counter value whenever a new disambiguation page is created/removed (aka, whenever DISAMBIG is added/removed from a page). Edits which create/remove disambiguation pages should be rare.
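
The counter idea above can be sketched as a small counting Bloom filter. This is only an illustration under stated assumptions: the class name, the hash function, the slot count, and the 8-bit counter width are all hypothetical and are not taken from the Disambiguator extension or from any patch on this task.

```javascript
// Hypothetical sketch of a counting Bloom filter; not the extension's code.
class CountingBloomFilter {
  constructor(size = 1024, hashCount = 3) {
    this.size = size;
    this.hashCount = hashCount;
    // One small counter per slot instead of a single bit.
    // Uint8Array caps each counter at 255.
    this.counters = new Uint8Array(size);
  }

  // FNV-1a style string hash; the seed varies it to give several hash functions.
  hash(key, seed) {
    let h = (2166136261 ^ seed) >>> 0;
    for (let i = 0; i < key.length; i++) {
      h ^= key.charCodeAt(i);
      h = Math.imul(h, 16777619) >>> 0;
    }
    return h % this.size;
  }

  slots(key) {
    const out = [];
    for (let seed = 0; seed < this.hashCount; seed++) {
      out.push(this.hash(key, seed));
    }
    return out;
  }

  // Call when a page becomes a disambiguation page.
  add(key) {
    for (const s of this.slots(key)) {
      if (this.counters[s] < 255) this.counters[s]++;
    }
  }

  // Call when a page stops being a disambiguation page.
  // Only valid for keys that were previously added.
  remove(key) {
    for (const s of this.slots(key)) {
      // A saturated counter is left alone so it can never cause a false negative.
      if (this.counters[s] > 0 && this.counters[s] < 255) this.counters[s]--;
    }
  }

  // false means "definitely absent"; true means "possibly present".
  mightContain(key) {
    return this.slots(key).every((s) => this.counters[s] > 0);
  }
}
```

Because removal only decrements the slots that the matching add incremented, the filter stays in sync with page edits without a periodic rebuild, at the cost of a few bits per slot.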

Thu, Apr 9, 10:47 PM · Content-Transform-Team, DBA, MediaWiki-extensions-Disambiguator, Performance Issue
cscott closed T421629: TOC missing with Parsoid on some wikis (except for Vector 2022) as Resolved.

Tested & verified fixed. You might have to purge the cache in some cases, but most pages shouldn't need this.

Thu, Apr 9, 7:32 PM · MW-1.46-notes (1.46.0-wmf.22; 2026-03-31), FlaggedRevs, Content-Transform-Team (Work In Progress), Parsoid-Read-Views, Timeless, MonoBook, Hungarian-Sites, Vector (legacy skin)
cscott added a comment to T345481: Migrate Parser and extension tests away from deprecated PHPUnit TestSuite subclassing.

I created a subtask, T422866: Migrate parser tests to new phpunit:config mechanism, for the parser test-related piece of this. CTT isn't likely to get to this before next quarter though.

Thu, Apr 9, 6:46 PM · MW-1.46-notes (1.46.0-wmf.19; 2026-03-10), Patch-For-Review, Content-Transform-Team, User-Daimona, Essential-Work, MediaWiki-Parser, MediaWiki-Core-Tests
cscott created T422866: Migrate parser tests to new phpunit:config mechanism.
Thu, Apr 9, 6:45 PM · Content-Transform-Team, Essential-Work, MediaWiki-Parser, MediaWiki-Core-Tests
cscott closed T320894: The NAMESPACE magic word is not normalized for female users any more as Declined.

This behavior has been in place for 3 years now; hopefully all of the pages involved have been fixed.

Thu, Apr 9, 3:21 PM · Regression, MediaWiki-Parser
cscott added a comment to T420323: Allow to disable a wikitext lint error locally.

Somewhat related to T17941: create magic word __NOCATEGORY__/T204370#11805009 which similarly wanted metadata to apply "to pages the template is included on" not "on the template page itself". Often <includeonly> and friends are used to manage this in the template context. Presumably you'd want to suppress the lint only for the template page, not for the pages it was included on.

Thu, Apr 9, 3:14 PM · Parsoid
cscott added a comment to T204370: Behavior switch/magic word uniformity.

T17941: create magic word __NOCATEGORY__ could be solved by using {{#category:...}} syntax for [[Category:...]], which would allow additional options of the sort requested there to be added.

Thu, Apr 9, 3:12 PM · MW-1.40-notes (1.40.0-wmf.25; 2023-02-27), MediaWiki-Parser, Parsoid
cscott closed T422155: ptwiki: Collapsed whitespace report on a page as Declined.

It looks like the display:flex in the style attribute on the wrapper is at fault. Parsoid generates <span> wrappers around the HTML entities in the wikitext, by design. The whitespace is present in the HTML, so that's not a fault of Parsoid.

Thu, Apr 9, 3:07 PM · Parsoid-Read-Views
cscott added a comment to T422302: Preview warnings are not displayed in any way when using the visual editor..

I think this is entirely on the VE side. Parsoid generates a ParserOutput, and the warnings should be present in ParserOutput::getWarningMsgs(). It's likely that VE's preview API doesn't actually return these to VE and/or VE doesn't have UX to display it, but I believe all the necessary information *should* be present from Parsoid.

Thu, Apr 9, 2:29 PM · VisualEditor
cscott added a comment to T422529: personal number format, and units.

Aren't there already functions to customize number formatting? We have Language::getNumberFormatter(), Scribunto has number formatting instructions, etc. Technically I guess what's being asked is to surround any numbers in the output with a <span class="mw-number" data-mw-value="111"> so that client-side JavaScript can replace the number with the user's preferred format. You can already do this with templates in MediaWiki and a user gadget. It requires buy-in from editors to add all this markup, which seems like the heavy lift here.
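
As a rough sketch of the gadget side of that idea, assuming editors (or a template) emit the span markup mentioned above: the function name localizeNumbers and the choice of Intl.NumberFormat are assumptions made for illustration, not an existing MediaWiki or gadget API.

```javascript
// Hypothetical gadget helper: rewrite the text of every
// <span class="mw-number" data-mw-value="..."> under `root`
// using the reader's preferred locale.
function localizeNumbers(root, locale) {
  const fmt = new Intl.NumberFormat(locale);
  for (const el of root.querySelectorAll('span.mw-number[data-mw-value]')) {
    const raw = el.getAttribute('data-mw-value');
    if (raw === null || raw.trim() === '') continue;
    const n = Number(raw);
    // Leave the original text alone if the value is not a plain number.
    if (Number.isFinite(n)) el.textContent = fmt.format(n);
  }
}
```

In a browser this would be called as something like localizeNumbers(document, navigator.language) once the page content is ready; the locale source is again an assumption.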

Thu, Apr 9, 2:13 PM · Content-Transform-Team, Reader Experience Team, VisualEditor, VisualEditor-MediaWiki-2017WikitextEditor
cscott merged T418569: MediaWiki\Context\RequestContext::getTitle called with no title set. into T422780: Production error: MediaWiki\Context\RequestContext::getTitle called with no title set..
Thu, Apr 9, 1:45 PM · Content-Transform-Team (Work In Progress), Patch-For-Review, Parsoid
cscott merged task T418569: MediaWiki\Context\RequestContext::getTitle called with no title set. into T422780: Production error: MediaWiki\Context\RequestContext::getTitle called with no title set..
Thu, Apr 9, 1:45 PM · Content-Transform-Team (Work In Progress), Essential-Work, MediaWiki-Special-pages, MediaWiki-Parser, Wikimedia-production-error
cscott added a comment to T422780: Production error: MediaWiki\Context\RequestContext::getTitle called with no title set..

Parsing without a title has been deprecated since MW 1.34 (T245129). I'm surprised this hasn't shown up before, but I don't think this is related to this week's roll out: it's probably just that some spider has decided to hit some special page (?) which triggers this error.

Thu, Apr 9, 1:23 PM · Content-Transform-Team (Work In Progress), Patch-For-Review, Parsoid
cscott added a comment to T418569: MediaWiki\Context\RequestContext::getTitle called with no title set..

Parsing without a title has been deprecated since MW 1.34 (T245129). I'm surprised this hasn't shown up before, but I don't think this is related to this week's roll out: it's probably just that some spider has decided to hit some special page (?) which triggers this error.

Thu, Apr 9, 1:22 PM · Content-Transform-Team (Work In Progress), Essential-Work, MediaWiki-Special-pages, MediaWiki-Parser, Wikimedia-production-error

Wed, Apr 8

cscott merged T422590: Pages sometimes parsed with seemingly mobile output format into T421629: TOC missing with Parsoid on some wikis (except for Vector 2022).
Wed, Apr 8, 4:19 PM · MW-1.46-notes (1.46.0-wmf.22; 2026-03-31), FlaggedRevs, Content-Transform-Team (Work In Progress), Parsoid-Read-Views, Timeless, MonoBook, Hungarian-Sites, Vector (legacy skin)
cscott merged task T422590: Pages sometimes parsed with seemingly mobile output format into T421629: TOC missing with Parsoid on some wikis (except for Vector 2022).
Wed, Apr 8, 4:19 PM · MediaWiki-Parser, Content-Transform-Team
cscott closed T419897: Duplicated ToC in Vector 2022 as Resolved.

Resolved via https://gerrit.wikimedia.org/r/c/mediawiki/extensions/FlaggedRevs/+/1251173

Wed, Apr 8, 4:18 PM · Content-Transform-Team (Work In Progress), Vector 2022 (Tracking), FlaggedRevs
cscott added a comment to T421629: TOC missing with Parsoid on some wikis (except for Vector 2022).

Should be fixed when wmf.23 rolls out tomorrow (Apr 9).

Wed, Apr 8, 4:16 PM · MW-1.46-notes (1.46.0-wmf.22; 2026-03-31), FlaggedRevs, Content-Transform-Team (Work In Progress), Parsoid-Read-Views, Timeless, MonoBook, Hungarian-Sites, Vector (legacy skin)

Tue, Apr 7

cscott updated the task description for T422524: Parsoid Read Views to deploy ~2026-04-07.
Tue, Apr 7, 10:21 PM · OKR-Work, Content-Transform-Team (Work In Progress)
cscott added a comment to T420481: 1.46.0-wmf.23 deployment blockers.

Just for info: Move createwithcontentmodel to autoconfirmed (1268225) was deployed just now as a config change, which adds a new permission which is not present in wmf.22 but is present in wmf.23: Add createwithcontentmodel permission (1222750).

Tue, Apr 7, 8:56 PM · Release-Engineering-Team (Priority Backlog 📥), Essential-Work, Release, Train Deployments
cscott added a comment to T421529: Crash in DOMImplementation with php-dom installed: "ValueError: class_alias(): Argument #1 ($class) must be a user-defined class name, internal class name given" using MediaWiki 1.45.1 and PHP 8.2 / 8.3.

Also, aliasing a built-in class is legal in PHP 8.3, so this error (if reproducible) would only be applicable to PHP 8.2.

Tue, Apr 7, 8:10 PM · Parsoid
cscott added a comment to T421529: Crash in DOMImplementation with php-dom installed: "ValueError: class_alias(): Argument #1 ($class) must be a user-defined class name, internal class name given" using MediaWiki 1.45.1 and PHP 8.2 / 8.3.

Parsoid in MW 1.45 should require wikimedia/remex-html ^5.1.0 and core requires *exactly* 5.1.0. That version of remex doesn't seem to correspond to the line numbers you've given here, and in any case it shouldn't be trying to instantiate Parsoid's DOMImplementation but instead Remex's:
https://github.com/wikimedia/mediawiki-libs-RemexHtml/blob/c75f653afdfc42040e27d311236e7856ebedaa25/src/DOM/DOMBuilder.php#L121

	public function __construct( $options = [] ) {
		$options += [
			'errorCallback' => null,
			'domImplementation' => null,
			'domExceptionClass' => null,
		] + ( class_exists( '\Dom\Document' ) ? [
			'domImplementationClass' => '\Dom\Implementation',
		] : [
			'domImplementationClass' => \DOMImplementation::class,
		] );
		$this->errorCallback = $options['errorCallback'];
		$this->domImplementation = $options['domImplementation'] ??
			new $options['domImplementationClass'];

And Parsoid's DOMFragmentBuilder shouldn't be mentioning a DOMImplementation class at all:
https://github.com/wikimedia/mediawiki-services-parsoid/blob/REL1_45/src/Wt2Html/TreeBuilder/ParsoidDOMFragmentBuilder.php#L20

	/** @param Document $ownerDocument */
	public function __construct( $ownerDocument ) {
		'@phan-var \DOMDocument $ownerDocument'; // Remex pretends everything is \DOM
		parent::__construct( $ownerDocument, [
			'suppressIdAttribute' => DOMCompat::isUsingDodo(),
		] );
	}
Tue, Apr 7, 8:10 PM · Parsoid
cscott updated the task description for T422524: Parsoid Read Views to deploy ~2026-04-07.
Tue, Apr 7, 7:16 PM · OKR-Work, Content-Transform-Team (Work In Progress)
cscott added projects to T422543: Deploy Parsoid Read Views to MobileFrontEnd readers on enwiki: Content-Transform-Team (Work In Progress), OKR-Work.
Tue, Apr 7, 6:45 PM · MW-1.46-notes (1.46.0-wmf.24; 2026-04-14), OKR-Work, Content-Transform-Team (Work In Progress)
cscott created T422543: Deploy Parsoid Read Views to MobileFrontEnd readers on enwiki.
Tue, Apr 7, 6:45 PM · MW-1.46-notes (1.46.0-wmf.24; 2026-04-14), OKR-Work, Content-Transform-Team (Work In Progress)
cscott updated the task description for T422524: Parsoid Read Views to deploy ~2026-04-07.
Tue, Apr 7, 3:50 PM · OKR-Work, Content-Transform-Team (Work In Progress)
cscott created T422524: Parsoid Read Views to deploy ~2026-04-07.
Tue, Apr 7, 3:49 PM · OKR-Work, Content-Transform-Team (Work In Progress)
cscott moved T413545: Tag new parsoid releases for existing REL1_XX branches from To Verify to In Progress on the Content-Transform-Team (Work In Progress) board.
Tue, Apr 7, 2:51 PM · Patch-For-Review, MW-1.43-notes, MW-1.44-notes, MW-1.45-notes, Parsoid (Tracking), Content-Transform-Team (Work In Progress), Essential-Work, PHP 8.5 support, MW-1.45-release, MW-1.44-release, MW-1.43-release, Release
cscott moved T36514: The language and the direction of the title in first heading should depend on page content language instead of user interface language from Code Review to To Deploy on the Content-Transform-Team (Work In Progress) board.
Tue, Apr 7, 2:50 PM · MW-1.46-notes (1.46.0-wmf.23; 2026-04-07), Content-Transform-Team (Work In Progress), Essential-Work, Patch-For-Review, RTL, I18n, MediaWiki-Internationalization
cscott moved T409751: Lazy loading of data-mw and data-parsoid from In Progress to Code Review on the Content-Transform-Team (Work In Progress) board.
Tue, Apr 7, 2:46 PM · Patch-For-Review, OKR-Work, Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Performance)
cscott moved T419328: Legacy LanguageConverter uses top-level ::guessVariant on srwiki from Backlog to Needs Investigation on the Content-Transform-Team (Work In Progress) board.
Tue, Apr 7, 2:43 PM · Patch-For-Review, Parsoid-Read-Views (Language Converter Support), MediaWiki-Language-converter, Content-Transform-Team (Work In Progress)
cscott moved T421194: TypeError: MediaWiki\Parser\Parser::localizeTOC(): Argument #2 ($lang) must be of type MediaWiki\Language\Language, null given, called in /srv/mediawiki/php-1.46.0-wmf.20/includes/OutputTransform/Stages/ParsoidLanguageConverter from Backlog to To Verify on the Content-Transform-Team (Work In Progress) board.
Tue, Apr 7, 2:43 PM · MW-1.46-notes (1.46.0-wmf.22; 2026-03-31), Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Language Converter Support), Wikimedia-production-error
cscott moved T267067: Make language variant a parser option from Q3 FY25-26 to In Progress on the Content-Transform-Team (Work In Progress) board.
Tue, Apr 7, 2:43 PM · Parsoid-Read-Views (Language Converter Support), Content-Transform-Team (Work In Progress), MediaWiki-Language-converter, Parsoid, MediaWiki-Parser
cscott moved T407379: Numbering in TOC is not localized when using Parsoid rendering from Code Review to To Verify on the Content-Transform-Team (Work In Progress) board.
Tue, Apr 7, 2:42 PM · Parsoid-Read-Views (Language Converter Support), OKR-Work, Content-Transform-Team (Work In Progress), I18n, Parsoid

Fri, Apr 3

cscott created T422265: Inject dependencies into Parser's CoreTagHooks.
Fri, Apr 3, 5:26 PM · MW-1.46-notes (1.46.0-wmf.23; 2026-04-07), MediaWiki-Parser, Essential-Work, Dependency injection
cscott added a comment to T314399: OutputPage::getUnprefixedDisplayTitle() is unreliable.

The display title should probably include a language component as well, so we can properly set lang/dir attributes: T36514: The language and the direction of the title in first heading should depend on page content language instead of user interface language.

Fri, Apr 3, 3:13 AM · Patch-Needs-Improvement, MediaWiki-General

Thu, Apr 2

cscott added a comment to T421629: TOC missing with Parsoid on some wikis (except for Vector 2022).

Two separate bugs: first, the code in ParserOutputAccess which dumps the parser cache key shows the key as it would be used in the primary (latest revision) cache, but in this case output was coming from the secondary (old revision) cache so the dumped key was misleading.  The old revision cache was actually omitting all the postprocessing options from the key, which caused it to fetch output for the "wrong" skin from the cache.  Fixed in Ensure RevisionOutputCache uses post-processing options where appropriate (1267124).
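
The second problem can be illustrated with a toy key builder. This is a simplified sketch, not the RevisionOutputCache code: the cacheKey function and the two option-name lists are hypothetical, and only the key layout (name=value pairs joined by "!") is modeled on the keys quoted in the limit report on this task.

```javascript
// Hypothetical key builder: only the option names listed in `usedNames`
// contribute to the key, so an option missing from that list cannot
// distinguish two cache entries.
function cacheKey(prefix, id, options, usedNames) {
  const parts = [...usedNames]
    .sort()
    .map((name) => `${name}=${options[name]}`);
  return `${prefix}:${id}:${parts.join('!')}`;
}

// Assumed option lists, for illustration only.
const BASE_NAMES = ['dateformat', 'useParsoid', 'userlang'];
const POSTPROC_NAMES = ['injectTOC', 'skin'];

const monobook = {
  dateformat: 'default', useParsoid: 1, userlang: 'en',
  injectTOC: 0, skin: 'monobook',
};
const vector = { ...monobook, skin: 'vector-2022' };

// Buggy variant: post-processing options are omitted, so both skins
// map to the same key and one can be served the other's output.
const buggyA = cacheKey('alswiki:parsoid-rcache', 1074722, monobook, BASE_NAMES);
const buggyB = cacheKey('alswiki:parsoid-rcache', 1074722, vector, BASE_NAMES);

// Fixed variant: post-processing options are part of the key.
const ALL_NAMES = [...BASE_NAMES, ...POSTPROC_NAMES];
const fixedA = cacheKey('alswiki:parsoid-rcache', 1074722, monobook, ALL_NAMES);
const fixedB = cacheKey('alswiki:parsoid-rcache', 1074722, vector, ALL_NAMES);

console.log(buggyA === buggyB, fixedA === fixedB);
```

With the post-processing names left out, the monobook and vector-2022 requests collide on one key, which matches the "wrong skin" symptom described above.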

Thu, Apr 2, 5:12 PM · MW-1.46-notes (1.46.0-wmf.22; 2026-03-31), FlaggedRevs, Content-Transform-Team (Work In Progress), Parsoid-Read-Views, Timeless, MonoBook, Hungarian-Sites, Vector (legacy skin)
cscott added a comment to T421629: TOC missing with Parsoid on some wikis (except for Vector 2022).

Looking at, eg, https://als.wikipedia.org/w/index.php?title=Photosynthese&oldid=1074722&useskin=monobook the limit report says:

<!-- 
NewPP limit report
Parsed by mw‐web.eqiad.main‐6dbf997859‐jj4pp
Cached time: 20260402143044
Cache expiry: 2592000
Reduced expiry: false
Complications: [show‐toc, use‐parsoid]
CPU time usage: 0.942 seconds
Real time usage: 1.948 seconds
Preprocessor visited node count: 836/1000000
Revision size: 79883/2097152 bytes
Post‐expand include size: 10979/2097152 bytes
Template argument size: 2186/2097152 bytes
Highest expansion depth: 15/100
Expensive parser function count: 8/500
Unstrip recursion depth: 0/20
Unstrip post‐expand size: 1152/5000000 bytes
Lua time usage: 0.093/10.000 seconds
Lua memory usage: 4417372/52428800 bytes
Number of Wikibase entities loaded: 1/500
-->
<!-- Saved in RevisionOutputCache with key alswiki:parsoid-rcache:1074722:dateformat=default!useParsoid=1!userlang=en and timestamp 20260402143044 and revision id 1074722.
 -->
<!--
Post‐processing cache key alswiki:postproc‐parsoid‐pcache:44770:|#|:idhash:enableSectionEditLinks=0!injectTOC=0!postproc=1!skin=vector‐2022!useParsoid=1, generated at 20260402143045
-->

Note that the skin in the postproc cache key is vector-2022 despite useskin=monobook in the URL. This suggests that the way we are fetching the skin here isn't working. I'd expect the result would be "vector-2022 style" section edit links, even though we're supposed to be using monobook, in addition to the TOC differences.

Thu, Apr 2, 2:37 PM · MW-1.46-notes (1.46.0-wmf.22; 2026-03-31), FlaggedRevs, Content-Transform-Team (Work In Progress), Parsoid-Read-Views, Timeless, MonoBook, Hungarian-Sites, Vector (legacy skin)

Wed, Apr 1

cscott added a comment to T419726: Parsoid fails to expand transclusion generated by a Lua module on mnwiki.

Here's another interesting test case:

Wed, Apr 1, 2:36 PM · MW-1.46-notes (1.46.0-wmf.23; 2026-04-07), Parsoid-Read-Views (Large Wikipedias), Content-Transform-Team (Work In Progress), Parsoid

Tue, Mar 31

cscott added a comment to T8104: Wrap each wiki page section contents in a container.

T55784 is marching along; we're shortly going to be able to close this 20-year-old feature request.

Tue, Mar 31, 7:48 PM · MediaWiki-Parser, Accessibility, CSS, MediaWiki-User-Interface
cscott added a comment to T421859: frwiki rendering difference with Parsoid.

This might be fixed by Arlo's patch this week (Html headings aren't section wrapped (1244822)), since it appears this is styling around raw <h2> tags generated by
https://fr.wikipedia.org/w/index.php?title=Template:Section%20d%C3%A9roulante%20d%C3%A9but&action=edit

Tue, Mar 31, 3:51 PM · Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Large Wikipedias), Parsoid
cscott added a comment to T393376: Automatically abandon old changes after some time.

Some suggestions for reducing load:

  1. If possible, add a new state that exempts patches from this merge conflict testing. You could add this state via a tag, like mw-archived. For bonus points remove that particular tag on any activity.
  2. If that's too much work/upstream divergence, at least add a hashtag (like archived) to the patch when abandoning it, and a comment about why it was abandoned and how to restore it. For bonus points, the default "my patches" dashboard can include "owner:self tag:archived" for "my old patches" to make it more obvious they haven't been lost. For bonus bonus points, remove the strike through styling for these patches, which makes them harder to read.
Tue, Mar 31, 12:03 PM · Gerrit, Release-Engineering-Team, collaboration-services

Mon, Mar 30

cscott added a comment to T417705: Difference in table row closing when switching from html to wikitext syntax cells.

Maybe add a lint for "mixing wikitext and table syntax" to make it clear this is considered not a good thing to do?

Mon, Mar 30, 3:43 PM · Parsoid, Parsoid-Read-Views (Large Wikipedias)
cscott updated the task description for T421738: Legacy LanguageConverter seems not to convert certain pages.
Mon, Mar 30, 3:25 PM · Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Language Converter Support)
cscott moved T421738: Legacy LanguageConverter seems not to convert certain pages from Backlog to In Progress on the Content-Transform-Team (Work In Progress) board.
Mon, Mar 30, 3:09 PM · Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Language Converter Support)
cscott created T421738: Legacy LanguageConverter seems not to convert certain pages.
Mon, Mar 30, 3:09 PM · Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Language Converter Support)

Fri, Mar 27

cscott added a comment to T421529: Crash in DOMImplementation with php-dom installed: "ValueError: class_alias(): Argument #1 ($class) must be a user-defined class name, internal class name given" using MediaWiki 1.45.1 and PHP 8.2 / 8.3.

What version of PHP were you using? I suspect it was not PHP 8.

Fri, Mar 27, 10:01 PM · Parsoid

Thu, Mar 26

cscott created T421436: Deploy Produnto extension to production.
Thu, Mar 26, 7:43 PM · Wikimedia-extension-review-queue, Wikimedia-Extension-setup, Produnto
cscott added a comment to T420832: DisamAssist's "link here" button won't show while using Parsoid.

The fix was posted a few months ago in https://es.wikipedia.org/wiki/Usuario_discusi%C3%B3n:Qwertyytrewqqwerty/DisamAssist-core.js#c-Cscott-20251218143100-Updating_for_Parsoid_read_views but hasn't been applied by the gadget author yet.

Thu, Mar 26, 3:59 PM · Content-Transform-Team
cscott added a comment to T420832: DisamAssist's "link here" button won't show while using Parsoid.

This is most likely a bug in the gadget, and not an issue in Parsoid per se.

Thu, Mar 26, 2:46 PM · Content-Transform-Team
cscott updated the task description for T421036: Section editing of content transcluded through a template is broken in parsoid.
Thu, Mar 26, 2:44 PM · Content-Transform-Team (Work In Progress), Parsoid
cscott added a parent task for T394836: Use refactored grammar for `{{....}}` constructs: T420060: Tokenizer: Bad markup in template arg prevents entire transclusion from being recognized.
Thu, Mar 26, 2:42 PM · Content-Transform-Team (Work In Progress), Patch-For-Review, Parsoid-Read-Views (Performance), Parsoid
cscott added a subtask for T420060: Tokenizer: Bad markup in template arg prevents entire transclusion from being recognized: T394836: Use refactored grammar for `{{....}}` constructs.
Thu, Mar 26, 2:42 PM · Parsoid
cscott closed T385806: Replace `{{#parsoid-fragment}}` with dedicated token as Resolved.
Thu, Mar 26, 2:36 PM · Patch-For-Review, Parsoid
cscott closed T385806: Replace `{{#parsoid-fragment}}` with dedicated token, a subtask of T388786: Follow up from Parsoid Fragment support, as Resolved.
Thu, Mar 26, 2:36 PM · Essential-Work, Content-Transform-Team (Work In Progress), Parsoid
cscott added a comment to T420060: Tokenizer: Bad markup in template arg prevents entire transclusion from being recognized.

The Gerrit change "WIP: Use template3 tokenization for native parser functions/template expansion" (1189554) uses template3 tokenization for this, which should fix the problem eventually.

Thu, Mar 26, 2:32 PM · Parsoid
cscott updated the task description for T420357: Update Parsoid HTML spec with changes to Cite refs.
Thu, Mar 26, 2:29 PM · Cite, Content-Transform-Team, WMDE-TechWish-Maintenance
cscott added a comment to T420357: Update Parsoid HTML spec with changes to Cite refs.

Shall we do this collaboratively, or does WMDE want to make some edits first and ping Content-Transform-Team for review, or what? This should be documented at https://www.mediawiki.org/wiki/Specs/HTML/2.8.0/Extensions/Cite .

Thu, Mar 26, 2:29 PM · Cite, Content-Transform-Team, WMDE-TechWish-Maintenance
cscott added a comment to T420876: TemplateData format and paramOrder are not applied to templates inside <ref> tags in VisualEditor.

Parsoid uses TemplateData and/or the pre-existing order (preserved in data-parsoid) to order template arguments. @thiemowmde is correct that Visual Editor plays little role in this.

Thu, Mar 26, 2:24 PM · Cite, Content-Transform-Team (Work In Progress), Patch-For-Review, TemplateData, VisualEditor
cscott moved T421371: JsonConfig logs warnings with a translated http-bad-status from Backlog to Code Review on the Content-Transform-Team (Work In Progress) board.
Thu, Mar 26, 2:12 PM · MW-1.46-notes (1.46.0-wmf.22; 2026-03-31), Content-Transform-Team (Work In Progress), JsonConfig

Wed, Mar 25

cscott added a comment to T393925: Parsoid should generate the <head> on the core side, from the ParserOutput metadata.

Request from @Ottomata is to include limit report data, including cache key info, as well. Basically a version of the RenderDebugInfo stage, but putting it into a <script> tag in the <head> or something like that.

Wed, Mar 25, 4:22 PM · MW-1.43-notes (1.43.0-wmf.23; 2024-09-17), Parsoid-Read-Views (Phase 4 - Parsoid generates metadata needed by core), Parsoid
cscott added a comment to T139159: Pre-save transforms aren't executed on VE<->WT switch, so e.g. signatures are nowiki'ed, pipe trick links are left unresolved.

This is a known "missing feature" but not a priority for either editing or Content-Transform-Team at this time.

Wed, Mar 25, 4:15 PM · VisualEditor-MediaWiki, Parsoid, VisualEditor
cscott added a comment to T139159: Pre-save transforms aren't executed on VE<->WT switch, so e.g. signatures are nowiki'ed, pipe trick links are left unresolved.

VE represents this as a <mw:signature> node, and VE tricks Parsoid into accepting this as an "inline transclusion" with the contents "~~~~", and apparently that works during html2wt. So this is fixed (or works) in the VE-to-wikitext transition; what's apparently not working is the wikitext-to-VE transition.

Wed, Mar 25, 4:14 PM · VisualEditor-MediaWiki, Parsoid, VisualEditor
cscott added a comment to T417819: PHP Deprecated: Use of MediaWiki\Parser\ParserOutput::setOutputFlag with non-standard flag was deprecated in MediaWiki 1.45. [Called from MediaWiki\Parser\ParserOutput::initFromJson].

Oh, @hashar says in T421206: PHP Deprecated: Use of MediaWiki\Parser\ParserOutput::setOutputFlag with non-standard flag was deprecated in MediaWiki 1.45. [Called from MediaWiki\Parser\ParserOutput::initFromJson]:

There were roughly 85 of them happening while I was promoting group 1, and they stopped once the train command had completed. I can imagine they are entries cached by newly promoted wmf.21 which are read by old wmf.20 processes?

Wed, Mar 25, 2:01 PM · MW-1.43-notes, MW-1.44-notes, MW-1.46-notes (1.46.0-wmf.22; 2026-03-31), MW-1.45-notes, MediaWiki-Parser, Wikimedia-production-error
cscott added a comment to T417819: PHP Deprecated: Use of MediaWiki\Parser\ParserOutput::setOutputFlag with non-standard flag was deprecated in MediaWiki 1.45. [Called from MediaWiki\Parser\ParserOutput::initFromJson].

Hm. The deprecation notice was added in I6363016b8bf1a09f104e475bfd949697d0df9a5c in Sep 2025. The warning should be triggered whenever any entry is added to the cache, which hasn't been happening for months now, far longer than a cache expiration time. And we've never deprecated or removed an existing parser output flag. So I'd understand if this happened during a roll-back, when a flag existed in the "new" version that the "old" version didn't know about once it was rolled back. But as far as I know, this shouldn't ever be generated by a roll-forward.
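
The roll-back versus roll-forward asymmetry can be sketched with a toy model. This is only an illustration under stated assumptions, not MediaWiki code: the flag names, the two flag sets, and the helpers `init_from_json` and `count_warnings` are all hypothetical, and the only behavior borrowed from the task is that reading a cached entry re-sets each stored flag and warns on any flag the reader does not recognize.

```python
import warnings

# Hypothetical flag sets; the real flag names are not given in the task.
FLAGS_OLD = {"flag-a", "flag-b"}
FLAGS_NEW = FLAGS_OLD | {"flag-c"}  # assumption: the "new" version only adds flags


def init_from_json(cached_flags, known_flags):
    """Toy stand-in for reading a cached entry: re-set every stored flag."""
    accepted = []
    for flag in cached_flags:
        if flag not in known_flags:
            # Models the "non-standard flag" deprecation warning.
            warnings.warn(f"non-standard flag: {flag}", DeprecationWarning)
        accepted.append(flag)
    return accepted


def count_warnings(writer_flags, reader_flags):
    """Cache an entry carrying all of the writer's flags, then read it back."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        init_from_json(sorted(writer_flags), reader_flags)
    return len(caught)


# Roll-forward: entries written by the old version, read by the new one.
roll_forward = count_warnings(FLAGS_OLD, FLAGS_NEW)
# Roll-back (or old processes still running): new entries read by old code.
roll_back = count_warnings(FLAGS_NEW, FLAGS_OLD)
```

In this model a warning can only appear when the reader is older than the writer, which holds as long as flags are only ever added and never removed.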

Wed, Mar 25, 1:58 PM · MW-1.43-notes, MW-1.44-notes, MW-1.46-notes (1.46.0-wmf.22; 2026-03-31), MW-1.45-notes, MediaWiki-Parser, Wikimedia-production-error
cscott added a project to T419328: Legacy LanguageConverter uses top-level ::guessVariant on srwiki: Parsoid-Read-Views (Language Converter Support).
Wed, Mar 25, 4:21 AM · Patch-For-Review, Parsoid-Read-Views (Language Converter Support), MediaWiki-Language-converter, Content-Transform-Team (Work In Progress)
cscott added a comment to T419328: Legacy LanguageConverter uses top-level ::guessVariant on srwiki.

We should drop guessVariant and instead decide on a way to set a different source language for the wikitext.

Wed, Mar 25, 4:20 AM · Patch-For-Review, Parsoid-Read-Views (Language Converter Support), MediaWiki-Language-converter, Content-Transform-Team (Work In Progress)
cscott changed the status of T407379: Numbering in TOC is not localized when using Parsoid rendering from Open to In Progress.

(^ re the move to 'To Verify') I can still personally repro this from the instructions in T407379#11287964.

Wed, Mar 25, 3:53 AM · Parsoid-Read-Views (Language Converter Support), OKR-Work, Content-Transform-Team (Work In Progress), I18n, Parsoid
cscott added a subtask for T407379: Numbering in TOC is not localized when using Parsoid rendering: T421194: TypeError: MediaWiki\Parser\Parser::localizeTOC(): Argument #2 ($lang) must be of type MediaWiki\Language\Language, null given, called in /srv/mediawiki/php-1.46.0-wmf.20/includes/OutputTransform/Stages/ParsoidLanguageConverter.
Wed, Mar 25, 3:52 AM · Parsoid-Read-Views (Language Converter Support), OKR-Work, Content-Transform-Team (Work In Progress), I18n, Parsoid