User Details
- User Since: May 5 2020, 11:24 AM (310 w, 6 d)
- Availability: Available
- IRC Nick: nemo-yiannis
- LDAP User: Jgiannelos
- MediaWiki User: JGiannelos (WMF)
Thu, Apr 16
Wed, Apr 15
Thu, Apr 9
This might be related:
Closing this one after: https://phabricator.wikimedia.org/T422394
Wed, Apr 8
I just tested the new env with increased resources, using the same setup and double the concurrency (8 test workers instead of 4), and I am getting a similar number of errors. Maybe we should bump up the resources a bit more. For reference, the current testing env runs with 20 roundtrip testing workers.
Sounds good.
Tue, Apr 7
I am bumping the priority to high since it's blocking code merges.
Mon, Apr 6
Wed, Apr 1
Update on the status of parsoid testreduce for rt-testing:
Tue, Mar 31
For debugging purposes I tried this patch, and it renders the output like legacy because it allows expanding the templates from invoke:
diff --git a/src/Wt2Html/TT/TemplateHandler.php b/src/Wt2Html/TT/TemplateHandler.php
index 531a6d4ae..16fbfca2c 100644
--- a/src/Wt2Html/TT/TemplateHandler.php
+++ b/src/Wt2Html/TT/TemplateHandler.php
@@ -1058,7 +1058,7 @@ class TemplateHandler extends XMLTagBasedHandler {
 [
 // Template like content returned from the
 // preprocessor should not be further expanded
- 'expandTemplates' => false,
+ 'expandTemplates' => true,
 'srcOffsets' => $srcOffsets,
 ] + $this->options
Mon, Mar 30
I did some more debugging here. The reproduction steps for a simpler example are the following:
Fri, Mar 27
I can clean up the stale instances. It might also be a good idea to remove the maps cloudvps project, which is unused.
Thu, Mar 26
We checked with the CTT team and it doesn't look like anyone knows how MW is maintained on beta. Ideally the maintainers of beta can change the kartotherian config to point to the production maps endpoint.
Should be working now. I just deployed the workaround to prod.
This might be the patch causing the issue:
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1255899
The parsing fails when pcs is trying to:
element.style.getPropertyValue("width")
I can reproduce this locally too:
curl localhost:8888/en.wikipedia.org/v1/page/mobile-html/Obama
{"status":500,"type":"internal_error","title":"TypeError","detail":"Cannot read properties of undefined (reading 'type')","method":"GET","uri":"/en.wikipedia.org/v1/page/mobile-html/Obama"}
This is the error I see:
{
"name": "mobileapps",
"hostname": "mobileapps-staging-7db876b6f9-kb6rm",
"pid": 1,
"level": "ERROR",
"message": "500: internal_error",
"stack": "TypeError: Cannot read properties of undefined (reading type)\n at TokenStream.LA (/srv/service/node_modules/domino/lib/cssparser.js:810:30)\n at TokenStream.advance (/srv/service/node_modules/domino/lib/cssparser.js:683:20)\n at Parser._readDeclarations (/srv/service/node_modules/domino/lib/cssparser.js:3402:42)\n at Parser.parseStyleAttribute (/srv/service/node_modules/domino/lib/cssparser.js:3581:22)\n at parseStyles (/srv/service/node_modules/domino/lib/CSSStyleDeclaration.js:23:10)\n at CSSStyleDeclaration.get (/srv/service/node_modules/domino/lib/CSSStyleDeclaration.js:38:28)\n at CSSStyleDeclaration.value (/srv/service/node_modules/domino/lib/CSSStyleDeclaration.js:94:17)\n at Function.value (/srv/service/pagelib/build/wikimedia-page-library-transform.js:1:658)\n at Function.value (/srv/service/pagelib/build/wikimedia-page-library-transform.js:1:1467)\n at Object._e [as convertImageToPlaceholder] (/srv/service/pagelib/build/wikimedia-page-library-transform.js:1:25056)\n at MobileHTML.finalizeStep (/srv/service/lib/mobile/MobileHTML.js:191:13)\n at MobileHTML.finalizeFor (/srv/service/lib/html/DocumentWorker.js:99:15)\n at /srv/service/lib/html/DocumentWorker.js:35:37\n at MobileHTML._doWorkInChunks (/srv/service/lib/html/DocumentWorker.js:126:8)\n at Immediate.<anonymous> (/srv/service/lib/html/DocumentWorker.js:35:10)\n at process.processImmediate (node:internal/timers:485:21)",
"status": 500,
"type": "internal_error",
"detail": "Cannot read properties of undefined (reading type)",
"request_id": "15dae054-9fe3-4ffc-8a11-538b5197c3de",
"request": {
"url": "/en.wikipedia.org/v1/page/mobile-html/Obama",
"headers": {
"user-agent": "curl/7.74.0",
"x-request-id": "15dae054-9fe3-4ffc-8a11-538b5197c3de"
},
"method": "GET",
"params": {
"0": "/en.wikipedia.org/v1/page/mobile-html/Obama"
},
"query": {},
"remoteAddress": "127.0.0.1",
"remotePort": 37636
},
"levelPath": "error/500",
"msg": "500: internal_error",
"time": "2026-03-26T13:03:50.269Z",
"v": 0
}
It doesn't look like a production-specific error, because staging also fails.
jgiannelos@deploy2002:~$ curl -k https://staging.svc.eqiad.wmnet:4102/en.wikipedia.org/v1/page/mobile-html/Obama
{"status":500,"type":"Internal error"}
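The two curl checks above (local and staging) could be repeated in one go with a small script. This is only a minimal sketch: the base URLs and the path are the ones from the comments above, while `fetch_status` and `probe` are hypothetical helper names, and disabling certificate verification simply mirrors `curl -k`.

```python
import ssl
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return (HTTP status, body text) for url without raising on 4xx/5xx."""
    # Mirrors `curl -k`: certificate verification is disabled, which is only
    # reasonable for internal debugging endpoints.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # urllib raises on error statuses; the status is still what we want.
        return err.code, err.read().decode("utf-8", "replace")


def probe(base_urls, path, fetch=fetch_status):
    """Request the same path on every base URL and map base URL -> status."""
    results = {}
    for base in base_urls:
        status, _body = fetch(base.rstrip("/") + path)
        results[base] = status
    return results


if __name__ == "__main__":
    # Hosts and path taken from the curl commands in the comments above.
    bases = ["http://localhost:8888", "https://staging.svc.eqiad.wmnet:4102"]
    statuses = probe(bases, "/en.wikipedia.org/v1/page/mobile-html/Obama")
    for base, status in statuses.items():
        print(base, status)
```

If every environment returns 500 for the same page, that supports the reading that the failure comes from the input or the code rather than from one deployment.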
Wed, Mar 25
Tue, Mar 24
Mon, Mar 23
Mar 20 2026
On a different note (cc @ihurbain): since we barely do any development in kartotherian, there is no need for maps on beta. Can we point MW on beta to use maps in prod?
I think that to bring deployment-prep back up to speed with prod we need:
- A new node for tegola (no need for caching)
- A new node for kartotherian
- A new node for postgres/osm (no need for replication, just an initial bootstrap is enough)
I think the issue here is that kartotherian used to be deployed using puppet on mapsXXXX nodes, but we moved it to k8s, so puppet no longer configures the kartotherian service on the node.
Mar 18 2026
I just deployed the changes.
Feb 19 2026
This should be enabled now.
Feb 17 2026
Here is a minimal example:
{|
|-
| bgcolor={{ #ifeq: false | true | "gold" | "#f9f9f9" }} | example
|}
Feb 16 2026
Regarding 2: I think we need the ability to run maintenance scripts for debugging purposes.
Regarding 4: We can put some instrumentation in our testing script that checks out a specific version of mediawiki, the same way we are doing it for parsoid.
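As a rough illustration of the "check out a specific version" step, here is a minimal sketch. It is not the actual testing script: `checkout_commands` and `checkout` are hypothetical names, the repository path and ref in the usage lines are placeholders, and it assumes the ref is a branch or tag that the remote allows fetching by name.

```python
import subprocess


def checkout_commands(repo_dir, ref, remote="origin"):
    """Build the git commands that pin repo_dir to ref (a branch or tag)."""
    return [
        # Fetch only the requested ref; git records it as FETCH_HEAD.
        ["git", "-C", repo_dir, "fetch", remote, ref],
        # Detach HEAD at the fetched commit so the run is pinned to it.
        ["git", "-C", repo_dir, "checkout", "--detach", "FETCH_HEAD"],
    ]


def checkout(repo_dir, ref, remote="origin", run=subprocess.run):
    """Run the commands in order, stopping at the first failure."""
    for cmd in checkout_commands(repo_dir, ref, remote):
        run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder path and ref, for illustration only.
    for cmd in checkout_commands("/path/to/mediawiki", "some-branch-or-tag"):
        print(" ".join(cmd))
```

Keeping the command list separate from the code that executes it makes it easy to log exactly which version a test run was pinned to.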
Feb 13 2026
Regarding 1: It shouldn't receive any requests from RB because RB is only running mathoid. Is the traffic low enough that it could be healthchecks from x-amples?
Feb 12 2026
I have a scenario that might be interesting: a cascading effect from a combination of parsoid, the REST API and CDN purges, which I am trying to reproduce. I believe that it's worth checking what happens when:
Feb 10 2026
Heads up: We should make sure that we run the rt tests against the same cluster we send requests to. In the new scenario we will have 1 env in eqiad and 1 in codfw.
Heads up: We should run a helmfile deploy early in our rt-test script to bring the env up to date with the latest.
Heads up: We should make sure that devs don't rely on home folders as long-term storage for scripts and other files. Eventually the host node might be recycled.
Feb 9 2026
I just sent a patch. That said, we might end up with stale responses again until the cache is evicted.
Closing this one for now with wikifeeds and mobileapps running node 22. I think that ideally we should spend some time in the future to fine-tune the sizing of the pods, but for now production doesn't look so problematic.
We already found some bugs by comparing legacy and parsoid output using the getTOC maintenance script. Closing this one for now.
Feb 6 2026
This diagram might also be interesting:
https://grafana.wikimedia.org/goto/Aqf_YrHvg?orgId=1
Feb 5 2026
Here are the latest parsoid rollouts. From git log:
Feb 4 2026
In the scope of the node22 upgrade it looks like we are good enough. I think @hnowlan might have some insight on the worker/heap limit sizing from the last time we had the same incident, during the node18 -> node20 upgrade.
Overall I believe that:
Feb 3 2026
@Clement_Goubert Can you help me figure out which endpoint wikifeeds should be pointed to, so we stop using rest-gateway directly?
Feb 2 2026
I can try setting the flags so we reproduce the state of the node 18 env and see how it goes. That said, I believe we should reconsider the sizing with the combination of:
It looks like at the PCS level we add the headers as expected. Should I go ahead and close the ticket, or just untag the content transform team and track the rest of the work with this ticket?
Maybe the second request was cached? Need to check the details.
Jan 29 2026
From what I see in the page/mobile-html output, references are not truncated. Maybe it's an app-rendering-specific thing? @Seddon
I think the cause is merging this patch:
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1208461
