ssastry (Subramanya Sastry)
User

User Details

User Since
Oct 7 2014, 5:34 AM (232 w, 6 d)
Availability
Available
LDAP User
Subramanya Sastry
MediaWiki User
SSastry (WMF) [ Global Accounts ]

Recent Activity

Sat, Mar 23

ssastry raised the priority of T218112: Create an implementation of Env etc for testing by accessing a MediaWiki installation via the Action API from Normal to High.

As we port more and more components and want to run large-scale tests against production pages, we will run into this requirement quickly, since we don't yet have a way to test it with core. So, it is good to get this done sooner rather than later.

Sat, Mar 23, 2:25 PM · Parsoid-PHP
ssastry closed T215000: Fill gaps in PHP DOM's functionality as Resolved.

This functionality has now been merged into the Parsoid codebase and we are starting to use it. Thanks @Tgr and @cscott. Exposing this to users outside Parsoid would be a separate task.

Sat, Mar 23, 2:20 PM · Patch-For-Review, Parsoid-PHP
ssastry closed T216102: Determine which PHP version to target with Parsoid as Resolved.

We are targeting PHP 7.2.

Sat, Mar 23, 2:18 PM · Patch-For-Review, Parsoid-PHP
ssastry moved T217867: Port domino (or another spec-compliant DOM library) to PHP from Backlog to Non-Porting Tasks on the Parsoid-PHP board.
Sat, Mar 23, 2:18 PM · Core Platform Team Backlog (Attic), Parsoid-PHP
ssastry moved T218183: Audit uses of PHP DOM in Wikimedia software from Backlog to Non-Porting Tasks on the Parsoid-PHP board.
Sat, Mar 23, 2:18 PM · TechCom, MediaWiki-General-or-Unknown, Parsoid-PHP
ssastry moved T219069: Reconcile byte offsets coming from Tokenizer with unicode char offsets used by rest of ported code from Backlog to Porting on the Parsoid-PHP board.
Sat, Mar 23, 2:16 PM · Parsoid-PHP
ssastry moved T219071: Set up hybrid JS testing runs in CI from Backlog to Testing / QA / Deployment on the Parsoid-PHP board.
Sat, Mar 23, 2:16 PM · Parsoid-PHP
ssastry moved T219072: Extend JS/PHP hybrid testing to other Parsoid components from Backlog to Testing / QA / Deployment on the Parsoid-PHP board.
Sat, Mar 23, 2:16 PM · Parsoid-PHP
ssastry triaged T219072: Extend JS/PHP hybrid testing to other Parsoid components as Normal priority.
Sat, Mar 23, 2:16 PM · Parsoid-PHP
ssastry triaged T219071: Set up hybrid JS testing runs in CI as High priority.
Sat, Mar 23, 2:12 PM · Parsoid-PHP
ssastry created T219071: Set up hybrid JS testing runs in CI.
Sat, Mar 23, 2:12 PM · Parsoid-PHP
ssastry renamed T219069: Reconcile byte offsets coming from Tokenizer with unicode char offsets used by rest of ported code from Reconcile byte offsets coming from Tokenizer with unicode char offsets used by rest of porte code to Reconcile byte offsets coming from Tokenizer with unicode char offsets used by rest of ported code.
Sat, Mar 23, 2:09 PM · Parsoid-PHP
ssastry created T219069: Reconcile byte offsets coming from Tokenizer with unicode char offsets used by rest of ported code.
Sat, Mar 23, 2:08 PM · Parsoid-PHP

Fri, Mar 22

ssastry added a comment to T213493: Install PHP7 on scandium.

Thanks! :-)

Fri, Mar 22, 3:30 PM · Patch-For-Review, Operations, Parsoid-PHP

Thu, Mar 21

ssastry updated subscribers of T216584: Consider deprecating and removing public data-parsoid REST endpoint.
Thu, Mar 21, 3:48 PM · Core Platform Team Backlog (Later), Services (designing), RESTBase
ssastry added a comment to T216584: Consider deprecating and removing public data-parsoid REST endpoint.

Google had asked us what dsr information means because they were using it for some reason, and I added https://www.mediawiki.org/wiki/Parsoid/Internals/data-parsoid to document it, with lots of caveats that we can and will change internals without warning. But it is worth at least giving them a heads-up by tagging the relevant phab accounts on this ticket.

Mind tagging them @ssastry ? Thnx!

Thu, Mar 21, 3:44 PM · Core Platform Team Backlog (Later), Services (designing), RESTBase
ssastry added a comment to T213493: Install PHP7 on scandium.

@ssastry app servers like mwdebug have 7.0 and 7.2 but looking at a prod parsoid server like wtp1025 there is also 7.0 there. It should match wtp servers, right?

Thu, Mar 21, 3:25 PM · Patch-For-Review, Operations, Parsoid-PHP
ssastry added a comment to T213493: Install PHP7 on scandium.

Oh 7.0? Isn't production on PHP 7.1 / 7.2?

Thu, Mar 21, 2:14 PM · Patch-For-Review, Operations, Parsoid-PHP
ssastry added a comment to T213493: Install PHP7 on scandium.

Oh 7.0? Isn't production on PHP 7.1 / 7.2?

Thu, Mar 21, 2:08 PM · Patch-For-Review, Operations, Parsoid-PHP
ssastry added a comment to T216584: Consider deprecating and removing public data-parsoid REST endpoint.

Google had asked us what dsr information means because they were using it for some reason, and I added https://www.mediawiki.org/wiki/Parsoid/Internals/data-parsoid to document it, with lots of caveats that we can and will change internals without warning. But it is worth at least giving them a heads-up by tagging the relevant phab accounts on this ticket.

Thu, Mar 21, 2:44 AM · Core Platform Team Backlog (Later), Services (designing), RESTBase

Wed, Mar 20

ssastry added a comment to T114445: [RFC] Balanced templates.

Moving to backlog as current status is unclear.

If the RFC has a clear desired outcome or problem statement, and resourcing commitment from a team that is interested in wider feedback, input or approval, then move it to the Inbox to let TechCom know :)

Wed, Mar 20, 9:26 PM · Parsing-Team, Patch-For-Review, TechCom-RFC

Tue, Mar 19

ssastry changed the status of T218358: Add data-title attribute to anchors from Open to Stalled.

Thanks! So, I am going to mark this stalled. But, if priorities change and you need this sooner, please update this ticket accordingly.

Tue, Mar 19, 6:31 PM · Readers-Web-Backlog (Tracking), Parsing-Team, MediaWiki-Parser, Internet-Archive, Parsoid, Technical-Debt

Mon, Mar 18

ssastry added a comment to T218358: Add data-title attribute to anchors.

@ssastry, this task is about improving the current implementation in Popups and now Minerva which have JavaScript client-side implementations. As far as I know, both use the PHP parser presently and there's no rush to replace the client-side hacks. I just wanted to track the work since 1) @Krinkle identified this improvement in code review and 2) dropping the client side implementations would hopefully drop a bunch of "guesswork" code.

Mon, Mar 18, 10:54 PM · Readers-Web-Backlog (Tracking), Parsing-Team, MediaWiki-Parser, Internet-Archive, Parsoid, Technical-Debt
ssastry added a comment to T218358: Add data-title attribute to anchors.

My understanding is that if the link is marked with mw:wikilink you can just infer the title from the relative path. Is there a case where this wouldn't work?
VE does this here: https://github.com/wikimedia/mediawiki-extensions-VisualEditor/blob/8846e65e4475fb1942e2d30478d072b5261cd4a7/modules/ve-mw/ve.MWutils.js#L163

I think this would cover the use case in Popups and Minerva. @ssastry, can you confirm that using mw:wikilink anchors with @Esanders' regular expression is safe?
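The inference discussed above can be sketched roughly as follows. This is a minimal illustration, not VisualEditor's actual regular expression: it assumes that wikilink hrefs are relative paths of the form `./Title`, percent-encoded, with underscores standing in for spaces, and the function name `title_from_wikilink_href` is hypothetical. Whether these assumptions hold for every `mw:wikilink` anchor is the open question in this thread.

```python
from typing import Optional
from urllib.parse import unquote


def title_from_wikilink_href(href: str) -> Optional[str]:
    """Guess a page title from a './Title'-style href (assumed format)."""
    # Assumption: wikilink hrefs are relative and start with './'.
    if not href.startswith("./"):
        return None
    # Drop any fragment or query string; a literal '?' or '#' inside a
    # title is assumed to be percent-encoded and so survives this split.
    path = href[2:].split("#", 1)[0].split("?", 1)[0]
    if not path:
        return None
    # Decode percent-escapes and map underscores back to spaces.
    return unquote(path).replace("_", " ")
```

Anything that does not match the assumed shape returns `None`, so a caller can fall back to other logic instead of guessing.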

Mon, Mar 18, 10:37 PM · Readers-Web-Backlog (Tracking), Parsing-Team, MediaWiki-Parser, Internet-Archive, Parsoid, Technical-Debt

Fri, Mar 15

ssastry added a comment to T218358: Add data-title attribute to anchors.

We are heads down right now porting Parsoid to PHP and would prefer not to undertake any development on the JS codebase unrelated to the porting itself. So, clarity around priority would help us figure out how and when to engage with this request.

Fri, Mar 15, 10:07 PM · Readers-Web-Backlog (Tracking), Parsing-Team, MediaWiki-Parser, Internet-Archive, Parsoid, Technical-Debt

Thu, Mar 14

ssastry added a comment to T218330: Table of contents HTML may be unbalanced.

This is flagged for editors via the https://www.mediawiki.org/wiki/Help:Extension:Linter/unclosed-quotes-in-heading category.

Thu, Mar 14, 6:08 PM · Parsing-Team, MediaWiki-Parser
ssastry added a comment to T204595: Evaluate and document performance of RemexHtml vs Domino.

Looks like our numbers are now more in sync after you disabled xdebug. But, here is the script I used.

Thu, Mar 14, 3:30 PM · RemexHtml, Parsoid-PHP

Wed, Mar 13

ssastry added a project to T218183: Audit uses of PHP DOM in Wikimedia software: TechCom.
Wed, Mar 13, 9:46 PM · TechCom, MediaWiki-General-or-Unknown, Parsoid-PHP
ssastry triaged T218183: Audit uses of PHP DOM in Wikimedia software as Normal priority.
Wed, Mar 13, 9:46 PM · TechCom, MediaWiki-General-or-Unknown, Parsoid-PHP
ssastry added a project to T213493: Install PHP7 on scandium: Operations.
Wed, Mar 13, 2:55 AM · Patch-For-Review, Operations, Parsoid-PHP
ssastry triaged T216102: Determine which PHP version to target with Parsoid as High priority.
Wed, Mar 13, 2:55 AM · Patch-For-Review, Parsoid-PHP
ssastry triaged T218112: Create an implementation of Env etc for testing by accessing a MediaWiki installation via the Action API as Normal priority.
Wed, Mar 13, 2:54 AM · Parsoid-PHP
ssastry moved T218112: Create an implementation of Env etc for testing by accessing a MediaWiki installation via the Action API from Backlog to Testing / QA / Deployment on the Parsoid-PHP board.
Wed, Mar 13, 2:54 AM · Parsoid-PHP

Tue, Mar 12

ssastry triaged T218166: Improve Parsoid's understanding of <indicator> extension as Normal priority.
Tue, Mar 12, 10:37 PM · Parsoid-Read-Views

Mon, Mar 11

ssastry removed a project from T18700: Nesting templates lead to excess whitespace: Parsing-Team.
Mon, Mar 11, 5:13 PM · Core Platform Team Backlog (Watching / External), Parser, MediaWiki-Parser
ssastry removed a project from T191516: Parser generates broken lists when closing table (td, th, tr, table) tags are on the same line as the list item: Parsing-Team.
Mon, Mar 11, 5:13 PM · MediaWiki-extensions-Linter, MediaWiki-Parser
ssastry removed a project from T185695: Support an #open-tag and #close-tag parser function to allow for generation of "unbalanced" HTML and pseudo tags in templates: Parsing-Team.
Mon, Mar 11, 5:13 PM · MediaWiki-Parser
ssastry removed a project from T212543: RemexHtml DOM construction performance increases non-linearly wrt HTML size: Parsing-Team.
Mon, Mar 11, 5:12 PM · Patch-For-Review, Performance, RemexHtml
ssastry closed T214119: Tech Talks Proposal 2019: The long and winding road to making Parsoid the default MediaWiki parser as Resolved.
Mon, Mar 11, 5:11 PM · Developer-Advocacy, Parsing-Team
ssastry closed T214119: Tech Talks Proposal 2019: The long and winding road to making Parsoid the default MediaWiki parser, a subtask of T212978: Wikimedia tech talks and learnings 2019, as Resolved.
Mon, Mar 11, 5:11 PM · Developer-Advocacy, Documentation
ssastry triaged T185910: Implement a linter check for "Unfamiliar or unrecognised tag" as Low priority.
Mon, Mar 11, 5:10 PM · MediaWiki-extensions-Linter
ssastry triaged T215999: Linter does not detect invalid "500px500px" as a bogus file option as Normal priority.
Mon, Mar 11, 5:09 PM · Parsoid-Linter, MediaWiki-extensions-Linter
ssastry triaged T210315: Some lint issues are linked to attributed template without the Template: namespace prefix as Normal priority.
Mon, Mar 11, 5:09 PM · MediaWiki-extensions-Linter
ssastry triaged T216003: Linter fails to detect multiple "upright" parameters as a Bogus file option as Normal priority.
Mon, Mar 11, 5:09 PM · Parsoid-Linter, MediaWiki-extensions-Linter
ssastry triaged T185827: End of line (vs End of paragraph) causes "mis-nesting" Linter error when italics <i></i> applied around entire block as Normal priority.
Mon, Mar 11, 5:09 PM · MediaWiki-extensions-Linter
ssastry moved T185827: End of line (vs End of paragraph) causes "mis-nesting" Linter error when italics <i></i> applied around entire block from Backlog to Parsoid on the MediaWiki-extensions-Linter board.
Mon, Mar 11, 5:08 PM · MediaWiki-extensions-Linter
ssastry moved T210315: Some lint issues are linked to attributed template without the Template: namespace prefix from Backlog to Parsoid on the MediaWiki-extensions-Linter board.
Mon, Mar 11, 5:08 PM · MediaWiki-extensions-Linter
ssastry moved T216003: Linter fails to detect multiple "upright" parameters as a Bogus file option from Backlog to Parsoid on the MediaWiki-extensions-Linter board.
Mon, Mar 11, 5:08 PM · Parsoid-Linter, MediaWiki-extensions-Linter
ssastry moved T215999: Linter does not detect invalid "500px500px" as a bogus file option from Backlog to Parsoid on the MediaWiki-extensions-Linter board.
Mon, Mar 11, 5:08 PM · Parsoid-Linter, MediaWiki-extensions-Linter
ssastry moved T185910: Implement a linter check for "Unfamiliar or unrecognised tag" from Backlog to New Linters on the MediaWiki-extensions-Linter board.
Mon, Mar 11, 5:07 PM · MediaWiki-extensions-Linter
ssastry moved T191516: Parser generates broken lists when closing table (td, th, tr, table) tags are on the same line as the list item from Backlog to New Linters on the MediaWiki-extensions-Linter board.
Mon, Mar 11, 5:07 PM · MediaWiki-extensions-Linter, MediaWiki-Parser
ssastry closed T209312: Special:LintErrors on the Hebrew Wikipedia always has the same number for Missing end tag and Obsolete HTML tags as Declined.

See https://www.mediawiki.org/wiki/Topic:Uueiraicppplzvm6 and T194872: Linter : have correct counters for categories populated with only a few errors (or none)

Mon, Mar 11, 5:07 PM · MediaWiki-extensions-Linter
ssastry triaged T194872: Linter : have correct counters for categories populated with only a few errors (or none) as Normal priority.
Mon, Mar 11, 5:05 PM · MediaWiki-extensions-Linter
ssastry triaged T200517: Emit lint error or category when a page uses duplicate HTML IDs as Normal priority.
Mon, Mar 11, 5:04 PM · Patch-For-Review, MediaWiki-extensions-Linter, MediaWiki-Parser
ssastry updated subscribers of T210342: Otrs-wiki lint errors are not being updated.

@Pchelolo is T171788 applicable to otrswiki?

Mon, Mar 11, 4:49 PM · MediaWiki-extensions-Linter
ssastry added a comment to T218042: Linter counts number.

That is correct. This can be closed and discussion can continue on that wiki page.

Mon, Mar 11, 4:47 PM · MediaWiki-extensions-Linter
ssastry closed T202905: Outreach-17 Project: Add a new Linter Category: Links-in-Links as Resolved.
Mon, Mar 11, 4:44 PM · MW-1.33-notes (1.33.0-wmf.19; 2019-02-26), Patch-For-Review, Parsoid-Linter, Outreach-Programs-Projects, Outreachy (Round 17), MediaWiki-extensions-Linter
ssastry closed T183746: Provide link to (visual?) diff caused by lint errors as Resolved.
Mon, Mar 11, 4:43 PM · MW-1.32-notes (WMF-deploy-2018-05-01 (1.32.0-wmf.2)), Patch-For-Review, MediaWiki-extensions-Linter
ssastry added a comment to T218042: Linter counts number.

@Adithyak1997 I already replied on that wiki page where you asked the question.

Mon, Mar 11, 4:35 PM · MediaWiki-extensions-Linter

Sun, Mar 10

ssastry added a comment to T210342: Otrs-wiki lint errors are not being updated.

The discussion on this topic: https://www.mediawiki.org/wiki/Topic:Uueiraicppplzvm6 might help clarify.

Sun, Mar 10, 1:36 PM · MediaWiki-extensions-Linter
ssastry added a comment to T217850: Remex could use some helper/utility classes.

More specifically, it seems to be tied to 'skipPreProcess' => true. I haven't looked at why \r handling varies with the presence of this option (i.e. if this is a bug or a feature).

Feature.

Docs and the normalization performed in code.

Sun, Mar 10, 2:51 AM · RemexHtml
ssastry added a comment to T217850: Remex could use some helper/utility classes.

More specifically, it seems to be tied to 'skipPreProcess' => true. I haven't looked at why \r handling varies with the presence of this option (i.e. if this is a bug or a feature).

Sun, Mar 10, 12:54 AM · RemexHtml

Sat, Mar 9

ssastry added a comment to T217850: Remex could use some helper/utility classes.

I was trying to figure out the reason for the difference reported in failing test #1 in https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/494253/10//COMMIT_MSG.
I suspected one of (a) file reading (b) Remex parsing (c) XML Serializer.

Sat, Mar 9, 10:49 PM · RemexHtml

Thu, Mar 7

ssastry updated the task description for T217867: Port domino (or another spec-compliant DOM library) to PHP.
Thu, Mar 7, 10:12 PM · Core Platform Team Backlog (Attic), Parsoid-PHP
ssastry triaged T217867: Port domino (or another spec-compliant DOM library) to PHP as Low priority.
Thu, Mar 7, 10:11 PM · Core Platform Team Backlog (Attic), Parsoid-PHP
ssastry added a comment to T217766: Flow\Exception\WikitextException: ParseEntityRef: no name.

All this argues for all of us consolidating behind a common solution for HTML parsing and DOM manipulation on the PHP side. That could be Remex + the DOMCompat code that is currently in review ( https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/491892 ). One of the original suggestions @cscott made was to release this DOM compat code as a composer lib, but we just went with it being a Parsoid-internal compat layer for simplicity. But, if there is broader immediate interest, perhaps we could make it a composer lib which everyone can build upon.

Thu, Mar 7, 9:22 PM · MW-1.33-notes (1.33.0-wmf.22; 2019-03-19), Growth-Team (Current Sprint), StructuredDiscussions, Parsoid, Wikimedia-production-error
ssastry added a comment to T211161: Tweaks to genTest option in parse.js.

Bonus points: sync file generation and options between transformTests and domTests .... i.e. both should probably take a common output directory flag and compute the file name from other options (prefix, page, transformer name).

Thu, Mar 7, 8:55 PM · Patch-For-Review, Parsoid-PHP
ssastry reassigned T211161: Tweaks to genTest option in parse.js from Sbailey to Arlolra.
Thu, Mar 7, 8:49 PM · Patch-For-Review, Parsoid-PHP
ssastry added a comment to T213493: Install PHP7 on scandium.

Hi @ssastry We already have ticket T201366 to set up scandium as the replacement for ruthenium. That includes a jessie to stretch upgrade, which means PHP 7.0, 7.1 or 7.2 could be installed.

Thu, Mar 7, 7:51 PM · Patch-For-Review, Operations, Parsoid-PHP
ssastry updated the task description for T213493: Install PHP7 on scandium.
Thu, Mar 7, 7:51 PM · Patch-For-Review, Operations, Parsoid-PHP
ssastry renamed T213493: Install PHP7 on scandium from Install PHP7 on ruthenium to Install PHP7 on scandium.
Thu, Mar 7, 7:50 PM · Patch-For-Review, Operations, Parsoid-PHP
ssastry closed T214725: transformTests CI script isn't aborting on failure as Resolved.
Thu, Mar 7, 5:54 PM · Patch-For-Review, Parsoid

Wed, Mar 6

ssastry added a comment to T217766: Flow\Exception\WikitextException: ParseEntityRef: no name.

This is the consequence of a batshit insane bug in PHP's DOMDocument. It'll drop all entity encoding if your attribute value contains something that looks like an HTML comment:

Wed, Mar 6, 11:01 PM · MW-1.33-notes (1.33.0-wmf.22; 2019-03-19), Growth-Team (Current Sprint), StructuredDiscussions, Parsoid, Wikimedia-production-error

Tue, Mar 5

ssastry added a comment to T212982: Create a ParsingEnvironment class for use with Parsoid/PHP.

A wrapper 'env' class around the different configs (site / wiki, parsoid, page) might still be useful so we can pass one object everywhere where needed.

Tue, Mar 5, 5:40 PM · Patch-For-Review, Parsoid-PHP

Mon, Mar 4

ssastry added a comment to T217458: Decide on which unit testing framework (PHPSpec or PHPUnit) to use for Parsoid in the long term.

Alright, I'll make this decision since it seems we have been going round and round a bit.

Mon, Mar 4, 4:33 PM · Continuous-Integration-Config, Parsoid-PHP
ssastry added a comment to T215000: Fill gaps in PHP DOM's functionality.

@Tgr, @Smalyshev, @ori but you cannot guarantee that all references to the deleted node will be removed before you use findElementById. So, this still is not a solution.

Mon, Mar 4, 2:07 PM · Patch-For-Review, Parsoid-PHP
ssastry added a project to T217540: Mobile-Sections returns missing images: Mobile-Content-Service.
Mon, Mar 4, 1:58 PM · Reading-Infrastructure-Team-Backlog, Mobile-Content-Service, Parsoid
ssastry added a comment to T215000: Fill gaps in PHP DOM's functionality.

@Tgr, @Smalyshev, @ori but you cannot guarantee that all references to the deleted node will be removed before you use findElementById. So, this still is not a solution.

<?php
Mon, Mar 4, 1:56 PM · Patch-For-Review, Parsoid-PHP

Sat, Mar 2

ssastry moved T217458: Decide on which unit testing framework (PHPSpec or PHPUnit) to use for Parsoid in the long term from Backlog to Testing / QA / Deployment on the Parsoid-PHP board.
Sat, Mar 2, 6:08 PM · Continuous-Integration-Config, Parsoid-PHP
ssastry moved T216102: Determine which PHP version to target with Parsoid from Backlog to Testing / QA / Deployment on the Parsoid-PHP board.
Sat, Mar 2, 3:39 PM · Patch-For-Review, Parsoid-PHP

Fri, Mar 1

ssastry added a comment to T215000: Fill gaps in PHP DOM's functionality.

I guess the options are:

  1. wrap getElementById to walk up the parents and see if it is part of the document

    6 seems the least painful, as long as getElementById is not broken in the other direction (where it doesn't return elements it should). Seems like ID handling is broken in more ways than one, though: https://3v4l.org/pi599
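
Option 1 above (wrapping getElementById to walk up the parents) can be sketched with a toy model. This is an illustration only, written in Python rather than against PHP's DOM extension: the `Node` and `Document` classes, the deliberately stale `_id_index`, and the helper names are all hypothetical stand-ins for the behavior reported in this thread.

```python
class Node:
    """Toy tree node; a stand-in for a DOM element."""

    def __init__(self, node_id=None):
        self.node_id = node_id
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def remove(self, child):
        self.children.remove(child)
        child.parent = None


class Document(Node):
    """Toy document whose id index is never cleaned up on removal,
    mimicking the stale id => element map described above."""

    def __init__(self):
        super().__init__()
        self._id_index = {}

    def register(self, node):
        self._id_index[node.node_id] = node

    def raw_get_element_by_id(self, node_id):
        return self._id_index.get(node_id)


def is_attached(node, document):
    """Walk up the parent chain and report whether it reaches the document."""
    while node is not None:
        if node is document:
            return True
        node = node.parent
    return False


def safe_get_element_by_id(document, node_id):
    """Return the indexed node only if it is still part of the document."""
    node = document.raw_get_element_by_id(node_id)
    if node is not None and is_attached(node, document):
        return node
    return None
```

Because the check walks the whole ancestor chain, it also rejects descendants of a removed node. It does nothing for the opposite failure mentioned above, where the lookup fails to return an element it should.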
Fri, Mar 1, 10:48 PM · Patch-For-Review, Parsoid-PHP
ssastry added a comment to T215000: Fill gaps in PHP DOM's functionality.

...actually not, since it also applies to descendants of the removed node (and they'd have to be fixed individually). Filed as https://bugs.php.net/bug.php?id=77686

Fri, Mar 1, 3:23 PM · Patch-For-Review, Parsoid-PHP
ssastry added a comment to T215000: Fill gaps in PHP DOM's functionality.

New sadness: https://3v4l.org/UQYTG
Apparently the id => element map is not always updated when the DOM changes.

Fri, Mar 1, 3:05 PM · Patch-For-Review, Parsoid-PHP

Wed, Feb 27

ssastry added a comment to T215002: New paragraph before section heading becomes line break.

But you do not need <br> in wikitext to emit empty paragraphs in HTML? You just need newlines. I don’t understand why the <br> tags end up in wikitext.

Wed, Feb 27, 2:47 AM · VisualEditor (Current work), Parsoid

Tue, Feb 26

ssastry added a comment to T215002: New paragraph before section heading becomes line break.

I know about that task (I even linked it here). There should be <br /> generated in the HTML output for multiple newlines in wikitext. There must not be any in the wikitext output for multiple empty paragraphs in HTML, though.

Tue, Feb 26, 9:21 PM · VisualEditor (Current work), Parsoid
ssastry added a comment to T215002: New paragraph before section heading becomes line break.

It's reproducible. I just added an empty paragraph between the lead and section heading in this edit: https://en.wikipedia.org/w/index.php?title=User:Matma_Rex/sandbox&diff=885235533&oldid=885235196&diffmode=source

There should be no <br /> in the wikitext output. Only a bunch of newline characters.

Tue, Feb 26, 8:58 PM · VisualEditor (Current work), Parsoid
ssastry added a comment to T215002: New paragraph before section heading becomes line break.

CC @ssastry does this look like a Parsoid issue?

Tue, Feb 26, 8:27 PM · VisualEditor (Current work), Parsoid

Mon, Feb 25

ssastry updated subscribers of T216636: Consider deprecating section editing API in RESTBase.

@Esanders thoughts?

Mon, Feb 25, 10:11 PM · Core Platform Team Backlog (Later), Services (designing), VisualEditor, RESTBase
ssastry added a comment to T217093: Cannot read property 'docId' of undefined.

In any case, @cscott, this is a testing gap currently.

Lots of mocha tests were introduced in,
https://github.com/wikimedia/parsoid/commit/4f298233cc95b823c34d605a08f109cf5b1a4157

but it looks like this only would have been caught if the DOMTraverser was given something with encapsulation,
https://github.com/wikimedia/parsoid/blob/master/lib/utils/DOMTraverser.js#L122

Mon, Feb 25, 10:07 PM · Patch-For-Review, Parsoid
ssastry updated subscribers of T217093: Cannot read property 'docId' of undefined.

So, our pre-deploy testing has a hole for lang variants since rt testing doesn't cover this feature. Presumably, this would have been caught on beta cluster but that is currently readonly. In any case, @cscott, this is a testing gap currently.

Mon, Feb 25, 9:49 PM · Patch-For-Review, Parsoid
ssastry added a comment to T213228: Implement class (name to be decided) to provide access to preprocessor, extensions, and other metadata needed by Parsoid.
  • No need to use promises since all requests will be synchronously satisfied. Is there a reason you see for using promises for potential future async modes in PHP?

The only reason I asked about promises was because not using promises will result in structural changes to the calling code, hopefully just directly calling the result-processing code instead of passing it as a callback to .then(). I wasn't sure of the extent to which we wanted to do that during the port.

Mon, Feb 25, 8:01 PM · Patch-For-Review, Parsoid-PHP

Sun, Feb 24

ssastry added a comment to T213228: Implement class (name to be decided) to provide access to preprocessor, extensions, and other metadata needed by Parsoid.

A few comments:

  • No need to use promises since all requests will be synchronously satisfied. Is there a reason you see for using promises for potential future async modes in PHP?
  • Currently, usePHPPreProcessor is false for parser test runs in Parsoid, which use the Parsoid-native template expansion and parser function code. That code has existed since the beginning, but it had too many edge-case bugs, and its performance wasn't considered good enough to develop it further. Using the MediaWiki API for parser test runs would have been excessive given how frequently they run, and it would have slowed them down greatly. But the question of what to do now with integration with Parsoid remains. I don't see a reason not to use the same core preprocessing code for everything (including parser tests).
  • We need memoization support for sure for perf reasons. But, we probably do not need batching of template and extension requests any more.
  • But, we do need batch query support to avoid hitting the db for every single image/link. The "redlinks" and "mediainfo" dom passes make these bulk metadata requests. So, we need support for these bulk/batch requests.
  • As for linting, yes, you could use any existing interfaces provided by SiteConfig.
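
The memoization and batch-query points above can be combined in one small sketch. It is an illustration under stated assumptions, not Parsoid's design: the class name `BatchedMetadataCache` and the `fetch_many` callback (standing in for whatever bulk lookup the host provides) are hypothetical.

```python
from typing import Callable, Dict, Iterable, List


class BatchedMetadataCache:
    """Memoize per-title metadata and fetch all cache misses in one batch."""

    def __init__(self, fetch_many: Callable[[List[str]], Dict[str, dict]]):
        # fetch_many is an assumed bulk-lookup callback: it takes a list of
        # titles and returns a dict mapping each known title to its metadata.
        self._fetch_many = fetch_many
        self._cache: Dict[str, dict] = {}

    def get_many(self, titles: Iterable[str]) -> Dict[str, dict]:
        # De-duplicate while keeping first-seen order.
        wanted = list(dict.fromkeys(titles))
        missing = [t for t in wanted if t not in self._cache]
        if missing:
            # One bulk request for everything not already memoized.
            self._cache.update(self._fetch_many(missing))
        return {t: self._cache[t] for t in wanted if t in self._cache}
```

A DOM pass would collect all link or image titles first and then call `get_many` once, so the backing store is hit at most once per pass and repeated titles are served from memory.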
Sun, Feb 24, 6:09 PM · Patch-For-Review, Parsoid-PHP

Feb 22 2019

ssastry edited projects for T216850: Discrepancy between documentation and reality, added: MediaWiki-API; removed Parsoid.
Feb 22 2019, 10:17 PM · MediaWiki-API

Feb 21 2019

ssastry added a comment to T215000: Fill gaps in PHP DOM's functionality.
  • attribute name with or without value (with quotes, possibly with case insensitive flag), = / ~=)

If Parsoid needs case-insensitive attribute selectors, and we use css-sanitizer for parsing as you did in your patch, then we'll need to update css-sanitizer to support Selectors Level 4 (it currently only supports Level 3).

We should probably do that anyway at some point, since https://caniuse.com/#feat=css-case-insensitive shows decent browser support. This would just raise the priority of doing that.

Feb 21 2019, 8:11 PM · Patch-For-Review, Parsoid-PHP
ssastry added a comment to T214099: Stress test Parsoid's HTTP API.

Parsoid was in a bit of trouble again today. At 02:44 XioNoX: depool eqsin, so all the mobile traffic for Chinese wiki started hitting RESTBase. Since Chinese has variants, RB started requesting Parsoid to transform HTML into correct variant. As the transformation request rate reached roughly 40 r/s, Parsoid started experiencing troubles, alerting and timing out.

I would like to prioritize this issue. Even though node Parsoid is being replaced, depooling an edge CDN DC should not be able to bring down our infrastructure. Also, 40 r/s for lang variant transforms seems quite low to affect the Parsoid cluster so drastically. I believe there's a bug somewhere.

Feb 21 2019, 4:30 AM · Patch-For-Review, Services (watching), Parsoid-Web-API, Parsoid

Feb 20 2019

ssastry added a comment to T215000: Fill gaps in PHP DOM's functionality.

It would. Object creation is pretty cheap in PHP though.

Feb 20 2019, 6:26 PM · Patch-For-Review, Parsoid-PHP
ssastry added a comment to T215000: Fill gaps in PHP DOM's functionality.

The syntax I'm playing with is along the lines of DomCompat::wrap( $node )->remove(). (Properties would have to be replaced by getters/setters since property getters/setters are a mess in PHP: DomCompat::wrap( $node )->getBody() etc.) That makes the syntax easier to remember/read than static methods.

Feb 20 2019, 6:06 PM · Patch-For-Review, Parsoid-PHP

Feb 19 2019

ssastry added a comment to T215000: Fill gaps in PHP DOM's functionality.

@subbu it's hard to grep for add! These are generally manipulations of Element#classList. See git grep classList which probably pulls up most of them.

Feb 19 2019, 3:59 PM · Patch-For-Review, Parsoid-PHP
ssastry added a comment to T215000: Fill gaps in PHP DOM's functionality.

Here's an inventory of all domino top-level methods (that is, not those called internally w/in the domino implementation) invoked by Parsoid when running npm run parserTests. The first column is the number of dynamic invocations, which isn't so interesting. What would be more interesting is the number of different callsites within Parsoid; I'll try to generate that next.

I've added a third column which lists the equivalent PHP function, if any.

25500	DOMTokenList#add	-
14375	DOMTokenList#contains	-
8028	DOMTokenList#length[get]	-
8	DOMTokenList#remove	-
Feb 19 2019, 3:56 PM · Patch-For-Review, Parsoid-PHP
ssastry closed T208901: TemplateStyles breaks a paragraph if a file is inserted inline as Resolved.

Yes.

Feb 19 2019, 3:32 PM · Patch-For-Review, Core Platform Team Kanban (Done with CPT), MW-1.33-notes (1.33.0-wmf.16; 2019-02-05), Parsoid, TemplateStyles, MediaWiki-Parser