cscott (C. Scott Ananian)
Parser whisperer

Projects (18)

User Details

User Since
Oct 21 2014, 6:47 PM (325 w, 6 d)
Availability
Available
IRC Nick
cscott
LDAP User
Unknown
MediaWiki User
Cscott [ Global Accounts ]

Editor since 2005; WMF developer since 2013. I work on Parsoid and OCG, and dabble with VE, real-time collaboration, and OOjs.

On github: https://github.com/cscott

See https://en.wikipedia.org/wiki/User:cscott for more.

Recent Activity

Thu, Jan 14

cscott added a comment to T9356: User-specified HTML IDs can be the same as interface IDs.

I think it's high time that the Parser/Linker maintain a list of interface-reserved prefixes (like n-, p-, and mw-), as well as a (short) list of legacy IDs (such as footer), that are automatically mapped to a different name to avoid clashes with interface styles.

For example, by prepending it with h- for heading, or something like that. For compatibility this would of course be limited only to where it is causing potential conflicts. Doing this for the other 99.9% of headings is out of scope for this task.
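
A rough sketch of that remapping, written in Python for brevity rather than the parser's PHP. The prefixes and the legacy ID come from the comment above; the function name `remap_heading_id` and the exact `h-` scheme are illustrative assumptions, not existing MediaWiki code.

```python
# Hypothetical sketch: not MediaWiki's actual Parser/Linker code.
RESERVED_PREFIXES = ("n-", "p-", "mw-")  # interface-reserved prefixes named above
LEGACY_IDS = {"footer"}                  # short list of legacy interface IDs


def remap_heading_id(heading_id: str) -> str:
    """Prepend "h-" only when the ID could clash with an interface ID."""
    if heading_id.startswith(RESERVED_PREFIXES) or heading_id in LEGACY_IDS:
        return "h-" + heading_id
    # The other 99.9% of headings are left untouched for compatibility.
    return heading_id


for candidate in ("footer", "p-search", "History"):
    print(candidate, "->", remap_heading_id(candidate))
```

Only IDs that match a reserved prefix or a legacy name change, which keeps existing links to ordinary headings working.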

Thu, Jan 14, 3:15 PM · MediaWiki-Parser

Tue, Jan 12

cscott created T271863: Quibble runs core *integration* tests against Parsoid-as-an-extension, not *unit* tests.
Tue, Jan 12, 9:16 PM · Patch-For-Review, Quibble, Parsoid
cscott committed rEDISf9e5a56bf8d4: Move Parsoid disambiguator parser tests to Extension:Disambiguator (authored by cscott).
Move Parsoid disambiguator parser tests to Extension:Disambiguator
Tue, Jan 12, 8:31 PM

Mon, Jan 11

cscott updated the task description for T271724: Document WebIDL binding for PHP.
Mon, Jan 11, 5:58 PM · Parsoid (Dodo)
cscott renamed T271724: Document WebIDL binding for PHP from Write WebIDL binding for PHP to Document WebIDL binding for PHP.
Mon, Jan 11, 5:56 PM · Parsoid (Dodo)
cscott updated the task description for T269254: Complete porting of Dodo (PHP port of Domino node.js DOM library).
Mon, Jan 11, 5:06 PM · Parsoid (Dodo)
cscott added a subtask for T269254: Complete porting of Dodo (PHP port of Domino node.js DOM library): T271730: Fix enough bugs in Dodo that Parsoid's parser tests run cleanly.
Mon, Jan 11, 5:04 PM · Parsoid (Dodo)
cscott added a parent task for T271730: Fix enough bugs in Dodo that Parsoid's parser tests run cleanly: T269254: Complete porting of Dodo (PHP port of Domino node.js DOM library).
Mon, Jan 11, 5:04 PM · Parsoid (Dodo)
cscott created T271730: Fix enough bugs in Dodo that Parsoid's parser tests run cleanly.
Mon, Jan 11, 5:04 PM · Parsoid (Dodo)
cscott updated the task description for T269270: Code generation of HTML*Element DOM classes.
Mon, Jan 11, 5:02 PM · Parsoid (Dodo)
cscott updated the task description for T269254: Complete porting of Dodo (PHP port of Domino node.js DOM library).
Mon, Jan 11, 4:53 PM · Parsoid (Dodo)
cscott updated the task description for T269707: Integrate the DOM library with Zest.
Mon, Jan 11, 4:49 PM · Parsoid (Dodo)
cscott updated the task description for T269259: Set up test infrastructure for testing the library against standard spec test suites.
Mon, Jan 11, 4:44 PM · Parsoid (Dodo)
cscott updated the task description for T269262: Integrate the DOM library with RemexHtml.
Mon, Jan 11, 4:41 PM · Parsoid (Dodo)
cscott updated the task description for T269254: Complete porting of Dodo (PHP port of Domino node.js DOM library).
Mon, Jan 11, 4:38 PM · Parsoid (Dodo)
cscott updated the task description for T269254: Complete porting of Dodo (PHP port of Domino node.js DOM library).
Mon, Jan 11, 4:37 PM · Parsoid (Dodo)
cscott added a subtask for T217867: Port domino (or another spec-compliant DOM library) to PHP: T271728: Migration strategy from DOMDocument to Dodo.
Mon, Jan 11, 4:33 PM · Parsoid
cscott added a parent task for T271728: Migration strategy from DOMDocument to Dodo: T217867: Port domino (or another spec-compliant DOM library) to PHP.
Mon, Jan 11, 4:33 PM · Parsoid (Dodo)
cscott created T271728: Migration strategy from DOMDocument to Dodo.
Mon, Jan 11, 4:33 PM · Parsoid (Dodo)
cscott updated the task description for T269254: Complete porting of Dodo (PHP port of Domino node.js DOM library).
Mon, Jan 11, 4:22 PM · Parsoid (Dodo)
cscott added a subtask for T269259: Set up test infrastructure for testing the library against standard spec test suites: T271724: Document WebIDL binding for PHP.
Mon, Jan 11, 4:19 PM · Parsoid (Dodo)
cscott added a parent task for T271724: Document WebIDL binding for PHP: T269259: Set up test infrastructure for testing the library against standard spec test suites.
Mon, Jan 11, 4:19 PM · Parsoid (Dodo)
cscott updated the task description for T269259: Set up test infrastructure for testing the library against standard spec test suites.
Mon, Jan 11, 4:19 PM · Parsoid (Dodo)
cscott added a subtask for T269254: Complete porting of Dodo (PHP port of Domino node.js DOM library): T271724: Document WebIDL binding for PHP.
Mon, Jan 11, 4:18 PM · Parsoid (Dodo)
cscott added a parent task for T271724: Document WebIDL binding for PHP: T269254: Complete porting of Dodo (PHP port of Domino node.js DOM library).
Mon, Jan 11, 4:18 PM · Parsoid (Dodo)
cscott created T271724: Document WebIDL binding for PHP.
Mon, Jan 11, 4:17 PM · Parsoid (Dodo)
cscott added a comment to T269271: Settle on a suitable name for the DOM library.

\Wikimedia\DoDo maybe? Makes "DOm DOcument" clearer? OTOH, maybe reads as the repeated imperative "do do" instead of the bird.

Mon, Jan 11, 4:16 PM · Parsoid
cscott updated the task description for T269259: Set up test infrastructure for testing the library against standard spec test suites.
Mon, Jan 11, 3:55 PM · Parsoid (Dodo)
cscott updated the task description for T269259: Set up test infrastructure for testing the library against standard spec test suites.
Mon, Jan 11, 3:51 PM · Parsoid (Dodo)
cscott updated the task description for T269254: Complete porting of Dodo (PHP port of Domino node.js DOM library).
Mon, Jan 11, 3:28 PM · Parsoid (Dodo)

Sun, Jan 10

cscott added a comment to T263082: add <langconvert> parser tag.

Can we open a new phab task for this? I apologize for not noticing/flagging this earlier. There are a number of tasks already in phab to deprecate and remove the old mediawiki codes (including sr-ec, sr-el, etc) and it would be a significant step backwards to have the old names written into article wikitext, which would require manually updating all that wikitext in the future.

Sun, Jan 10, 3:21 PM · User-notice, MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Parsoid (Tracking), MediaWiki-Language-converter, Chinese-Sites, MediaWiki-Parser, Patch-For-Review

Fri, Jan 8

cscott created T271562: Remove special parsoidsvc-parsertests-docker job.
Fri, Jan 8, 5:13 PM · Patch-For-Review, Continuous-Integration-Config, Parsoid
cscott claimed T69486: Links: Add support for self-links to Parsoid.

Not necessarily going to work on this immediately (I've got higher-priority parser tests tasks) but since I added the GetLinkColors hook to core/Parsoid I'll provisionally claim this task.

Fri, Jan 8, 4:17 PM · Parsoid-Rendering, Growth-Team, Collaboration-Team-Triage, StructuredDiscussions, Parsoid
cscott added a comment to T69486: Links: Add support for self-links to Parsoid.

@GWicke's idea about putting the "document identity" in the CSS is interesting, so that a link could be styled as a self-link (or not) depending on the CSS that is applied to it.

Fri, Jan 8, 3:43 PM · Parsoid-Rendering, Growth-Team, Collaboration-Team-Triage, StructuredDiscussions, Parsoid

Wed, Jan 6

cscott added a comment to T265033: MediaWikiIntegrationTestCase does not clear tablesUsed before first test.

I think addDBDataOnce is more fundamentally broken, and shouldn't be used.

Wed, Jan 6, 9:02 PM · MediaWiki-Core-Testing
cscott added a comment to T271287: Parsoid CI broken by Rest\Handler\LanguageLinksHandlerTest.

Related Q: how can we make core CI run your test suite so that it doesn't just break Parsoid CI? Core CI *does* run some tests in a mode where Parsoid is installed -- can we add your tests to that group?

Wed, Jan 6, 6:33 PM · Platform Team Workboards (Clinic Duty Team), MW-1.36-notes (1.36.0-wmf.26; 2021-01-12), Patch-For-Review, Parsoid
cscott added a comment to T271287: Parsoid CI broken by Rest\Handler\LanguageLinksHandlerTest.

Yes, the Parser test runner setup creates its own interwiki table (using wgInterwikiCache) so that test results are not dependent on the host wiki configuration.

Wed, Jan 6, 6:32 PM · Platform Team Workboards (Clinic Duty Team), MW-1.36-notes (1.36.0-wmf.26; 2021-01-12), Patch-For-Review, Parsoid

Wed, Dec 23

cscott created T270777: Wikibase Client won't let ParserTests delete articles during cleanup.
Wed, Dec 23, 4:44 PM · MW-1.36-notes (1.36.0-wmf.26; 2021-01-12), Patch-For-Review, Wikidata, MediaWiki-extensions-WikibaseClient, Parsoid

Tue, Dec 22

cscott added a comment to T270444: Parsoid needs a bidirectional interwiki map (and hooks).

The local/global/site interwiki tables are implemented in the CDB caching; that's not expected to change.

Tue, Dec 22, 9:34 PM · MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Patch-For-Review, MediaWiki-Interwiki, MediaWiki-Site-system, SiteMatrix, MediaWiki-extensions-InterwikiExtracts, Parsoid
cscott added a comment to T47096: Add a way to transclude template or other page in the correct language.

Some comments left on https://gerrit.wikimedia.org/r/c/mediawiki/core/+/617294 -- see if you can determine if the $deps array is correct or not.

Tue, Dec 22, 8:20 PM · Language-Team (Language-2021-January-March), Platform Team Workboards (External Code Reviews), Parsoid, Patch-For-Review, MediaWiki-extensions-Translate

Mon, Dec 21

cscott added a comment to T266140: HTML entity replaced by the Unicode character in an edit.

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/VisualEditor/+/649755 is my recommended fix here. It's been waiting for review for a while.

Mon, Dec 21, 9:55 PM · Editing-team (FY2020-21 Kanban Board), MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Parsoid, DiscussionTools
cscott added a comment to T269750: PegTokenizer: UTF-8 errors.

Yeah, this is a bug in the lua code. I've attempted to contact the author: https://fr.wikipedia.org/w/index.php?title=Discussion_module%3ACoordinates&type=revision&diff=177884216&oldid=173976505

Mon, Dec 21, 3:40 PM · Parsoid, Wikimedia-production-error
cscott added a comment to T269750: PegTokenizer: UTF-8 errors.

I strongly suspect that someone is converting "-71.3" degrees to "71.3 S" by chopping off the first *byte*, instead of the first *character*.
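
To illustrate the suspected failure mode (not the Lua module's actual code): a small Python sketch, assuming the "unicode minus sign" is U+2212, which takes three bytes in UTF-8. Removing one character leaves valid text; removing one byte leaves two stray continuation bytes.

```python
# U+2212 MINUS SIGN (assumed here); encoded as 3 bytes in UTF-8.
text = "\u221271.3"
raw = text.encode("utf-8")

# Dropping the first *character* leaves valid text.
by_char = text[1:]

# Dropping the first *byte* leaves two orphaned continuation bytes.
by_byte = raw[1:]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(by_char)
print(by_byte, is_valid_utf8(by_byte))
```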

Mon, Dec 21, 3:25 PM · Parsoid, Wikimedia-production-error
cscott added a comment to T269750: PegTokenizer: UTF-8 errors.

The unicode minus sign is from formatnum -- it shouldn't be getting chopped up into bad UTF-8, unless someone somewhere is doing a naive substr(1, ...) or something like that. I'll look.

Mon, Dec 21, 2:57 PM · Parsoid, Wikimedia-production-error
cscott added a comment to T134469: doBlockLevels() inserts <p> and </p> randomly with no regard for HTML validity.

I bet something like __NO_P_WRAP__ would be fairly easy to support. Would it get enough adoption to get us closer to our goal of turning it off by default?

Mon, Dec 21, 2:54 PM · MediaWiki-Parser
cscott added a comment to T259832: mediawiki-vendor submodule doesn't get automatically bumped on release branches.

But it looks like this isn't necessary, it happens already (as long as you don't hit the window between branch cut and branch commit merge). I don't know why/how, but it doesn't look like 618808 is required.

Mon, Dec 21, 2:52 PM · Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)), User-brennen, Release-Engineering-Team (Deployment services), Patch-For-Review, Parsoid

Dec 19 2020

cscott added a comment to T270555: Obscure generator meta for improved security.

Oh, and Special:Version also gives credit to all of our active developers, which has a social importance which shouldn't be underestimated.

Dec 19 2020, 6:27 PM · MediaWiki-General
cscott added a comment to T270555: Obscure generator meta for improved security.

I use Special:Version as an active mediawiki developer all the time. Just sayin'.

Dec 19 2020, 6:27 PM · MediaWiki-General

Dec 18 2020

cscott updated the task description for T270444: Parsoid needs a bidirectional interwiki map (and hooks).
Dec 18 2020, 2:28 PM · MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Patch-For-Review, MediaWiki-Interwiki, MediaWiki-Site-system, SiteMatrix, MediaWiki-extensions-InterwikiExtracts, Parsoid
cscott updated the task description for T270444: Parsoid needs a bidirectional interwiki map (and hooks).
Dec 18 2020, 2:26 PM · MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Patch-For-Review, MediaWiki-Interwiki, MediaWiki-Site-system, SiteMatrix, MediaWiki-extensions-InterwikiExtracts, Parsoid

Dec 17 2020

cscott updated the task description for T270444: Parsoid needs a bidirectional interwiki map (and hooks).
Dec 17 2020, 11:06 PM · MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Patch-For-Review, MediaWiki-Interwiki, MediaWiki-Site-system, SiteMatrix, MediaWiki-extensions-InterwikiExtracts, Parsoid
cscott added a comment to T41199: Recursive interwiki link handling should be possible.

See T270444: Parsoid needs a bidirectional interwiki map (and hooks) -- this mapping would have to be bidirectional to support Parsoid.

Dec 17 2020, 10:02 PM · I18n, MediaWiki-General
cscott added a comment to T113034: RFC: Overhaul Interwiki map, unify with Sites and WikiMap.

Any chance this is going to be taken up again?

Dec 17 2020, 10:00 PM · Platform Engineering Roadmap Decision Making, Platform Engineering, Wikidata-Ministry-Of-Magic-Tech-Debt, TechCom-RFC, User-Daniel, Proposal, MW-1.27-release (WMF-deploy-2016-05-03_(1.27.0-wmf.23)), MW-1.27-release-notes, MediaWiki-Interwiki, Wikidata, MediaWiki-Site-system, SiteMatrix, MediaWiki-extensions-Interwiki
cscott created T270444: Parsoid needs a bidirectional interwiki map (and hooks).
Dec 17 2020, 9:59 PM · MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Patch-For-Review, MediaWiki-Interwiki, MediaWiki-Site-system, SiteMatrix, MediaWiki-extensions-InterwikiExtracts, Parsoid
cscott merged task T18715: All parser functions and magic words should allow # preceding into T204370: Behavior switch/magic word uniformity.
Dec 17 2020, 4:56 PM · MediaWiki-Parser, WorkType-NewFunctionality, MediaWiki-General
cscott merged T18715: All parser functions and magic words should allow # preceding into T204370: Behavior switch/magic word uniformity.
Dec 17 2020, 4:56 PM · MediaWiki-Parser, Parsoid
cscott added a comment to T18715: All parser functions and magic words should allow # preceding.

Agreed!

Dec 17 2020, 4:56 PM · MediaWiki-Parser, WorkType-NewFunctionality, MediaWiki-General

Dec 16 2020

cscott added a comment to T250230: Cache (expensive) Parsoid config properties in APC and/or memcache.

Note that core already does a lot of 'expensive' startup work wrt loading extensions/etc on every request. So it doesn't really make sense to super-optimize this when we're still sitting behind core startup. Although latency is certainly additive, we should get a quantitative sense for what percentage of the request startup time Parsoid is responsible for.

Dec 16 2020, 8:49 PM · Performance Issue, Parsoid
cscott created T270312: Parsoid's integrated test runner doesn't support all core parser test features.
Dec 16 2020, 4:48 PM · Parsoid
cscott created T270311: Parsoid's integrated test runner in core doesn't support modes other than wt2html.
Dec 16 2020, 4:45 PM · Parsoid
cscott created T270310: Parsoid PageConfigFactory should accept a revision record, not wikitextOverride.
Dec 16 2020, 4:31 PM · Parsoid
cscott created T270307: Parsoid needs a way to unregister extension modules for testing.
Dec 16 2020, 4:23 PM · MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Parsoid
cscott added a comment to T266140: HTML entity replaced by the Unicode character in an edit.

I don't particularly like option 4, because I think it's a little too 'magical'. It covers up the NFC normalization under selser, which makes it less likely to cause dirty diffs (good!) but more surprising when the same bug creeps into edited HTML. But maybe defense in depth is warranted. I think option 2 is necessary because I think there are plenty of cases where the action API should *not* be trying to normalize the input string -- just immediately adjacent to the area changed in option 1 we see an attempt to pass *compressed* HTML to ApiVisualEditorEdit. I'm sure that ran into all sorts of mysterious problems because the binary deflated string was being NFC normalized...

Dec 16 2020, 3:54 AM · Editing-team (FY2020-21 Kanban Board), MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Parsoid, DiscussionTools
cscott added a comment to T266140: HTML entity replaced by the Unicode character in an edit.

^ this pair of patches implements "option 2" above.

Dec 16 2020, 12:02 AM · Editing-team (FY2020-21 Kanban Board), MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Parsoid, DiscussionTools

Dec 15 2020

cscott added a comment to T266140: HTML entity replaced by the Unicode character in an edit.

Option 1: ^ the above is one possible fix here.

Dec 15 2020, 11:37 PM · Editing-team (FY2020-21 Kanban Board), MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Parsoid, DiscussionTools
cscott added a comment to T266140: HTML entity replaced by the Unicode character in an edit.

This is a very strange bug:

>>> json_encode(Validator::cleanUp("abc\u{2001}\u{2003}"))
=> ""abc\u2003\u2003""
>>> json_encode(Validator::NFD("abc\u{2001}\u{2003}"))
=> ""abc\u2003\u2003""
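
The same mapping can be reproduced outside PHP. A small Python check, assuming the standard `unicodedata` tables agree with the PHP normalizer here: U+2001 (EM QUAD) has a singleton canonical decomposition to U+2003 (EM SPACE), so both NFC and NFD rewrite it. The UTF-8 bytes are the \342\200\201 and \342\200\203 sequences seen in the srcContent and span contents.

```python
import unicodedata

s = "abc\u2001\u2003"  # U+2001 EM QUAD followed by U+2003 EM SPACE

nfc = unicodedata.normalize("NFC", s)
nfd = unicodedata.normalize("NFD", s)

# ascii() escapes the invisible characters so the change is visible.
print(ascii(nfc))
print(ascii(nfd))

# UTF-8 bytes of the two characters, for comparison with the octal dumps.
print("\u2001".encode("utf-8"), "\u2003".encode("utf-8"))
```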
Dec 15 2020, 11:04 PM · Editing-team (FY2020-21 Kanban Board), MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Parsoid, DiscussionTools
cscott added a comment to T266140: HTML entity replaced by the Unicode character in an edit.

And of course since those are invisible characters, I need to look *real close* to see where that happened...
(but not all of them are invisible characters, I thought ed's test case had an omega...)

Dec 15 2020, 10:14 PM · Editing-team (FY2020-21 Kanban Board), MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Parsoid, DiscussionTools
cscott added a comment to T266140: HTML entity replaced by the Unicode character in an edit.

Oh, and indeed it has changed: the srcContent is \342\200\201 but the actual span contents are \342\200\203. How did *that* happen, I wonder?

Dec 15 2020, 10:13 PM · Editing-team (FY2020-21 Kanban Board), MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Parsoid, DiscussionTools
cscott added a comment to T266140: HTML entity replaced by the Unicode character in an edit.

Getting there (slowly):
OLD DOM:

<p id="mwAw" data-parsoid='{"dsr":[17,343,0,0]}'>Vi kan förvänta oss att bilden är komplicerad när det gäller huruvida individer från Göteborg har ett tionde fonem, och i vilka ord de i så fall uttalar med ett tionde fonem. Det kan finnas infödda människor med arbetaryrken som uttalar många typiska <i id="mwBA" data-parsoid='{"dsr":[279,285,2,2]}'>ô</i><span typeof="mw:Entity" id="mwBQ" data-parsoid='{"src":"&amp;#x2011;","srcContent":"‑","dsr":[285,293,null,null]}'>‑</span>ord med en regional form av <i id="mwBg" data-parsoid='{"dsr":[321,327,2,2]}'>å</i><span typeof="mw:Entity" id="mwBw" data-parsoid='{"src":"&amp;#x2001;","srcContent":" ","dsr":[327,335,null,null]}'> </span>fonemet.</p>

After DOM diff:

<p id="mwAw" data-parsoid='{"dsr":[17,343,0,0]}' data-parsoid-diff='{"id":4946,"diff":["subtree-changed"]}'>Vi kan förvänta oss att bilden är komplicerad när det gäller huruvida individer från Göteborg har ett tionde fonem, och i vilka ord de i så fall uttalar med ett tionde fonem. Det kan finnas infödda människor med arbetaryrken som uttalar många typiska <i id="mwBA" data-parsoid='{"dsr":[279,285,2,2]}'>ô</i><span typeof="mw:Entity" id="mwBQ" data-parsoid='{"src":"&amp;#x2011;","srcContent":"‑","dsr":[285,293,null,null]}'>‑</span>ord med en regional form av <i id="mwBg" data-parsoid='{"dsr":[321,327,2,2]}'>å</i><span typeof="mw:Entity" id="mwBw" data-parsoid='{"src":"&amp;#x2001;","srcContent":" ","dsr":[327,335,null,null]}' data-parsoid-diff='{"id":4946,"diff":["children-changed","subtree-changed"]}'><meta typeof="mw:DiffMarker/deleted" data-parsoid="{}"/> </span>fonemet.</p>

Everything looks good, but selser is marking the entity as deleted for some reason.

Dec 15 2020, 10:04 PM · Editing-team (FY2020-21 Kanban Board), MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Parsoid, DiscussionTools
cscott added a comment to T266140: HTML entity replaced by the Unicode character in an edit.

In my local test w/ RESTBase, I got this:


This looks like what I'd expect for missing data-parsoid -- the entity is still there, it's just been normalized.

Dec 15 2020, 8:18 PM · Editing-team (FY2020-21 Kanban Board), MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Parsoid, DiscussionTools
cscott updated the task description for T270199: Table of contents in Parsoid output.
Dec 15 2020, 6:05 PM · Parsoid-Rendering, Parsoid
cscott updated the task description for T269630: Parsoid should support section editing links.
Dec 15 2020, 6:04 PM · Parsoid
cscott created T270199: Table of contents in Parsoid output.
Dec 15 2020, 6:03 PM · Parsoid-Rendering, Parsoid
cscott added a comment to T268953: MW 1.35.0 "Error contacting the Parsoid/RESTBase server: http-bad-status" when editing a subpage.

I can confirm this is necessary to edit page titles containing slashes (whether they are subpages or not). I've added the apache information to the main VE configuration section: https://www.mediawiki.org/w/index.php?title=Extension:VisualEditor&type=revision&diff=4285160&oldid=4258839&diffmode=source

Dec 15 2020, 5:11 PM · Parsoid (Third-party), MW-1.35-release, VisualEditor
cscott added a comment to T47096: Add a way to transclude template or other page in the correct language.

Thanks for your work, left a comment on the patch.

Dec 15 2020, 3:38 PM · Language-Team (Language-2021-January-March), Platform Team Workboards (External Code Reviews), Parsoid, Patch-For-Review, MediaWiki-extensions-Translate
cscott added a comment to T51097: Use figure and figcaption HTML5 elements when possible.

From an accessibility standpoint, there may be reasons to emit a descriptive <figcaption> even if it is not visible to a sighted user.

Dec 15 2020, 3:33 PM · Parsoid, Patch-For-Review, MediaWiki-Parser, Parsing-Team--ARCHIVED, Accessibility, MediaWiki-Interface

Dec 14 2020

cscott added a comment to T104770: Add HTML5 <aside> to the parser whitelist.

T118517: [RFC] Use <figure> for media, coming soon to a wiki near you.

Dec 14 2020, 11:08 PM · MediaWiki-Parser
cscott added a comment to T269704: Default horizontal alignment of thumbnails should depend on content language, not the UI.

Opened T270116: Figures should support `inline-start` and `inline-end` alignments in addition to `left` and `right`. for the general issue of supporting start and end as image alignment options.

Dec 14 2020, 5:41 PM · MW-1.36-notes (1.36.0-wmf.26; 2021-01-12), Parsoid
cscott updated the task description for T270116: Figures should support `inline-start` and `inline-end` alignments in addition to `left` and `right`..
Dec 14 2020, 5:40 PM · Parsoid, MediaWiki-Parser
cscott created T270116: Figures should support `inline-start` and `inline-end` alignments in addition to `left` and `right`..
Dec 14 2020, 5:40 PM · Parsoid, MediaWiki-Parser
cscott added a comment to T269704: Default horizontal alignment of thumbnails should depend on content language, not the UI.

I *think* what we should be doing is adding a class like mw-align-start instead of choosing left or right in the Linker. That would be float: inline-start, which could be simulated with:

body[dir=ltr] .mw-align-start { float: left }
body[dir=rtl] .mw-align-start { float: right }

https://developer.mozilla.org/en-US/docs/Web/CSS/float

Dec 14 2020, 5:38 PM · MW-1.36-notes (1.36.0-wmf.26; 2021-01-12), Parsoid

Dec 10 2020

cscott added a comment to T237538: Merge Disambiguation in core or add hook.

We already have the GetLinkColors hook, called from LinkHolderArray, which Disambiguator uses to add the appropriate class.
Probably that hook is sufficient, we just need to restructure how the Parsoid DataAccess works. This would still require Disambiguator-specific information in the Parsoid 'API' backend, but that's probably reasonable.

Dec 10 2020, 3:23 PM · MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Patch-For-Review, Parsoid (Tracking), MediaWiki-extensions-Disambiguator
cscott added a comment to T104770: Add HTML5 <aside> to the parser whitelist.

We're probably getting to the same place from different directions: you're adding the media options to LST, I'm adding LST-like transclusion abilities to media. But yeah, that's the basic idea one way or the other. Key point is to specify the semantics rather than just add HTML tags onto the whitelist.

Dec 10 2020, 1:45 PM · MediaWiki-Parser
cscott added a comment to T104770: Add HTML5 <aside> to the parser whitelist.

Here's a strawman proposal, just to wrap up the discussion for the moment: we have a float and size mechanism for media, which uses <figure>. I'd be interested in thinking about how we might add 'text' as a different sort of 'media'. You could imagine syntax like: {{Text:/Foo|aside|left}} (which maybe would include text from PageName/Foo) which would set the proper wrapper tag (<aside>), role, and styling.

Dec 10 2020, 1:30 PM · MediaWiki-Parser
cscott added a comment to T104770: Add HTML5 <aside> to the parser whitelist.

I think <aside> like <section> is arguably part of the skin / meta-layout, not part of the article content. I've added lots of HTML5 elements to the whitelist, but I'd lean towards declining this one for now -- wikitext doesn't have a good page layout mechanism (although there are phab tasks for this, eg T90914: Provide semantic wiki-configurable styles for media display). It seems like a future page layout mechanism might want to generate <aside> itself, which would be complicated if we allowed wikitext to contain those tags directly.

Dec 10 2020, 3:12 AM · MediaWiki-Parser

Dec 9 2020

cscott added a comment to T47096: Add a way to transclude template or other page in the correct language.

@abi_ the latest version of https://gerrit.wikimedia.org/r/c/mediawiki/core/+/617294 should be ready to go; can you verify that it will work for your patch set?

Dec 9 2020, 7:33 PM · Language-Team (Language-2021-January-March), Platform Team Workboards (External Code Reviews), Parsoid, Patch-For-Review, MediaWiki-extensions-Translate

Dec 8 2020

cscott added a comment to T233736: Testing the REST API in CI.

https://www.mediawiki.org/wiki/MediaWiki_API_integration_tests#Resetting_the_target_wiki

Dec 8 2020, 8:13 PM · Patch-For-Review, Parsoid
cscott added a comment to T233736: Testing the REST API in CI.

Currently if you comment check experimental on a gerrit patch it will run npm run api-testing.

Dec 8 2020, 7:55 PM · Patch-For-Review, Parsoid
cscott added a comment to T269685: /page/html endpoint broken when requesting language variants affecting /page/summary.

So we'll deploy -a19 to group0 with the usual train at 2000 UTC, and verify that Parsoid -a19 at least doesn't crash and burn and break -group0 before we then backport -a19 early to group1 and group2 in the backport window 2 hrs later at 0000 UTC. Does that timing work? If not, we can do the backport immediately after the train deploy, but we would like to see -a19 live on group0 at least for smoke testing before we go ahead and push it to all prod machines.

Dec 8 2020, 7:40 PM · Wikimedia-production-error, Parsoid, Wikipedia-iOS-App-Backlog, RESTBase, Platform Engineering, Product-Infrastructure-Team-Backlog, Wikipedia-Android-App-Backlog
cscott added a comment to T269685: /page/html endpoint broken when requesting language variants affecting /page/summary.

Ok, adding a patch to tonight's backport window which should resolve the issue (by early-deploying Parsoid -a19).

Dec 8 2020, 7:34 PM · Wikimedia-production-error, Parsoid, Wikipedia-iOS-App-Backlog, RESTBase, Platform Engineering, Product-Infrastructure-Team-Backlog, Wikipedia-Android-App-Backlog
cscott closed T259832: mediawiki-vendor submodule doesn't get automatically bumped on release branches as Resolved.

Not sure why this didn't work for wmf.4 -- maybe it was another case where our cherry-pick landed between the time the branch was cut and the branch commit merged, and so the automatic update didn't work properly.

Dec 8 2020, 5:12 PM · Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)), User-brennen, Release-Engineering-Team (Deployment services), Patch-For-Review, Parsoid

Dec 7 2020

cscott created T269630: Parsoid should support section editing links.
Dec 7 2020, 9:29 PM · Parsoid
cscott added a comment to T269271: Settle on a suitable name for the DOM library.

\Wikimedia\DoDo\Document ? Or \Wikimedia\DODO\Document?

Dec 7 2020, 9:18 PM · Parsoid
cscott added a comment to T269508: VisualEditor gives 401 when behind basic auth.

Alternatively, maybe it's already possible to do this if you hard-code the HTTP username and password into the URL configured in $wgVirtualRestConfig['modules']['parsoid']['url']?

Dec 7 2020, 9:18 PM · Parsoid (Tracking), VisualEditor
cscott created T269629: Generalize rules about rendering-transparent content before sections.
Dec 7 2020, 9:14 PM · Parsoid
cscott updated subscribers of T269271: Settle on a suitable name for the DOM library.

I'm boring. I suggested \Wikimedia\DOM\Document, and calling it just the "Wikimedia DOM library" or something like that.

Dec 7 2020, 6:09 PM · Parsoid

Dec 1 2020

cscott added a comment to T269036: Category template separated from comment on zh.wiki.

In Parsoid this is WTUtils::isRenderingTransparentNode(), which seems to include:

  • HTML comments
  • "SOL-transparent links": <link> tags with rel attributes matching /(?:^|\s)mw:PageProp\/(?:Category|redirect|Language)(?=$|\s)/
  • <meta> tags which don't have `typeof="mw:StartTag"` or `"mw:EndTag"` (these are orphaned literal HTML tags; they might be stripped before Parsoid emits its final HTML)
  • Fallback ID spans: <span typeof="mw:FallbackId"> (found inside headings, you can probably ignore these)
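
As a quick way to experiment with the "SOL-transparent link" rule, here is the quoted pattern ported to Python. The helper name `is_sol_transparent_rel` is made up for this sketch; it only tests a rel attribute value and is not Parsoid's implementation.

```python
import re

# Python port of the pattern quoted above (the "/" needs no escaping here).
SOL_TRANSPARENT_REL = re.compile(
    r"(?:^|\s)mw:PageProp/(?:Category|redirect|Language)(?=$|\s)"
)


def is_sol_transparent_rel(rel: str) -> bool:
    """True if a <link> rel attribute value matches the pattern."""
    return SOL_TRANSPARENT_REL.search(rel) is not None


for rel in (
    "mw:PageProp/Category",
    "nofollow mw:PageProp/redirect",
    "mw:PageProp/Categoryx",
    "mw:WikiLink",
):
    print(rel, is_sol_transparent_rel(rel))
```

The leading `(?:^|\s)` and trailing lookahead make the match token-wise, so a rel value with several space-separated tokens still matches while a longer token that merely starts with `mw:PageProp/Category` does not.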
Dec 1 2020, 6:11 PM · User-Ryasmeen, Verified, MW-1.36-notes (1.36.0-wmf.22; 2020-12-15), Editing-team (FY2020-21 Kanban Board), DiscussionTools
cscott added a comment to T266140: HTML entity replaced by the Unicode character in an edit.

@Esanders suspects that when they parse the page using Remex they are somehow losing the entity. I'm not convinced, but ed's going to try to trace the html into and out of discussion tools to figure out more precisely what's going on.

Dec 1 2020, 6:05 PM · Editing-team (FY2020-21 Kanban Board), MW-1.36-notes (1.36.0-wmf.25; 2021-01-05), Parsoid, DiscussionTools
cscott added a comment to T269036: Category template separated from comment on zh.wiki.

Foo {{category template}} becomes <p>Foo</p><p><meta....></p> but in a comment :Foo {{category template}} puts the <meta> tag inside the list item as expected?

Dec 1 2020, 5:24 PM · User-Ryasmeen, Verified, MW-1.36-notes (1.36.0-wmf.22; 2020-12-15), Editing-team (FY2020-21 Kanban Board), DiscussionTools

Nov 30 2020

cscott added a comment to T47096: Add a way to transclude template or other page in the correct language.

@tstarling left a comment saying he was fine with my approach on https://gerrit.wikimedia.org/r/c/mediawiki/core/+/617294 so it looks like I just have to update that patch and get it merged.

Nov 30 2020, 6:11 PM · Language-Team (Language-2021-January-March), Platform Team Workboards (External Code Reviews), Parsoid, Patch-For-Review, MediaWiki-extensions-Translate