This is a result of how we're treating the storage of Parsoid data now. Older revisions are no longer stored unless you provide the ?stash=true parameter, and a matching html/data-parsoid can only be fetched if you provide a TID.
Sat, May 18
Thu, May 16
Since I have finished refactoring and reshuffling code in the parsoid.js module, this can now be picked up and worked on.
Wed, May 15
Currently we also supply the original HTML and data-parsoid for wikitext/to/html transformations. This caused a bug in RESTBase: the transformation failed when the original was not present and the attempt to fetch it returned a 404. Since the optimization is currently disabled in Parsoid, we will stop fetching the original to save some round trips to Cassandra. If and when the optimization is reintroduced, we will need to start fetching the original again, but make sure we gracefully handle the case when it's not there.
The definitions docs had incorrect referencing, and apparently swagger 3 is not as chill about it as swagger 2. https://github.com/wikimedia/restbase/pull/1133 fixes it.
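A minimal sketch of what "gracefully handle the case when it's not there" could look like. The `fetchOriginal` callback, the `status` property on the error, and the request body shape are all assumptions made for illustration, not the actual RESTBase code:

```javascript
// Hypothetical helper: `fetchOriginal` is assumed to resolve with the stored
// original, or reject with an error carrying an HTTP-like `status` field.
async function getOriginalOrNull(fetchOriginal, title, revision) {
    try {
        return await fetchOriginal(title, revision);
    } catch (e) {
        if (e && e.status === 404) {
            // No stored original: carry on without the optimization.
            return null;
        }
        // Anything else is a real failure and should propagate.
        throw e;
    }
}

// Build the transformation request body, attaching `original` only if present.
// The body shape here is illustrative only.
async function buildTransformRequest(fetchOriginal, title, revision, wikitext) {
    const original = await getOriginalOrNull(fetchOriginal, title, revision);
    const body = { wikitext };
    if (original) {
        body.original = original;
    }
    return body;
}
```

With this shape, only a 404 is treated as "no original"; other errors still fail the request.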
Tue, May 14
Good point. So, we need to replace .length with Buffer.byteLength using the 'utf8' encoding, and provide some tests.
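For illustration, a small sketch of why the two differ (the sample string is made up): `.length` counts UTF-16 code units, while `Buffer.byteLength` counts the bytes the string occupies in the given encoding.

```javascript
// Sample payload with non-ASCII characters (hypothetical, for illustration only).
const body = 'Zürich 東京';

// String.prototype.length counts UTF-16 code units, not bytes.
const charCount = body.length;

// Buffer.byteLength counts the bytes the string occupies in the given encoding.
const byteCount = Buffer.byteLength(body, 'utf8');

console.log({ charCount, byteCount });
```

For pure ASCII input the two values match, which is why the difference only shows up with non-ASCII content.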
I believe https://gerrit.wikimedia.org/r/c/mediawiki/extensions/VisualEditor/+/503651 is the reason.
What is this mediawiki.api-request data ultimately for and why does it require a UUID?
Mon, May 13
The caveat is that we would have to update all dashboards for all services residing on the same host cluster (scb being the problem here) in pretty much the same timeframe, mostly because the statsd configuration is the same for all services on the same host.
The patch has been SWATted, so now VE provides an appropriate query parameter and the responses are not cached. This particular one is done. Resolving.
Thank you for the impressive level of detail :) There are a bunch of other places where we abuse the timing metric within services for exactly this reason, that we needed percentiles, so the decision we make here should probably be adopted elsewhere.
Sun, May 12
We have been using UUID v1 in EventBus events for a while now with no problem, but I guess for other events it did not really matter since they're much lower volume and are created within already very heavy code paths.
Thu, May 9
Clearly, the links are incorrect: /api/rest_v1/ is missing. I think it's related to how we're replacing paths in swaggerUI.js in hyperswitch. @Clarakosi was working on it a lot recently, she should be able to have a look.
Tue, May 7
It is still the case sometimes. It's not severe, but we actually have a hack to insert an artificial meta tag with an article TID into Parsoid HTML in order to work around this issue. It was added as a temporary workaround many years ago and is becoming a permanent one, which is not good.
Mon, May 6
(a) this should actually be documented in the API docs (swagger?) along with the implications of what happens if a client doesn't comply
changeprop has not been moved to k8s, so no, it cannot be marked as resolved.
Thu, May 2
Tue, Apr 30
Mon, Apr 29
After adding the warn logging, there were 15 requests recorded for getSections all in a span of a few minutes, from a browser, so I believe that was someone playing with it from the docs UI. I think we can safely remove the endpoints and the code.
I believe that this can be closed now that we are easily sustaining almost 7k events per second in prod.
Tue, Apr 23
Seems like we have a consensus on the backend. @Esanders could you please confirm that VE will not need this in future?
@Milimetric The entry point for generating the tags change events is located at https://github.com/wikimedia/mediawiki-extensions-EventBus/blob/master/includes/EventBusHooks.php#L533 if you wanna debug it yourself
Apr 18 2019
The use of an invalid title here was intentional, although that was with the idea that it would never make it into the queue. Recent work has removed the mandatory existence of a title parameter. Any jobs passing them through the main signature as before are normalized to set the title as a regular params key.
This is caused by https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/500171/ and specifically https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/500171/14/includes/jobqueue/Job.php where it sets the default title to Title::makeTitle( NS_SPECIAL, '').
Apr 17 2019
Agreed, let's do it!
Apr 15 2019
Let's first execute in deployment-prep, and follow up in production after we merge/deploy https://github.com/wikimedia/restbase/pull/1117 in beta to verify that everything works correctly?
Apr 14 2019
Needed to do it before moving on to the key-value refactoring.
Apr 12 2019
@akosiaris no, spinning up a new worker takes no time; the problem here is actually killing the old worker.
Apr 11 2019
oh yes, it totally is a followup. Thank you.
I've also had an idea to create a limited concurrency consumer for change-prop use-case, eg T206186
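A rough sketch of what a limited concurrency consumer could look like. `messages`, `handler`, and `consumeWithLimit` are hypothetical stand-ins, not the change-prop API, and a real consumer would pull from Kafka rather than from an array:

```javascript
// Minimal sketch: never run more than `limit` handlers at the same time.
async function consumeWithLimit(messages, handler, limit) {
    const queue = messages.slice();
    const results = [];

    // Each worker pulls the next message until the queue is drained.
    async function worker() {
        while (queue.length > 0) {
            const msg = queue.shift();
            results.push(await handler(msg));
        }
    }

    // Start at most `limit` workers; together they bound the concurrency.
    const workers = [];
    for (let i = 0; i < Math.min(limit, messages.length); i++) {
        workers.push(worker());
    }
    await Promise.all(workers);

    // Note: results are in completion order, not input order.
    return results;
}
```

A fixed pool of workers keeps the bound simple to reason about: at most `limit` handlers are ever in flight, regardless of how many messages are pending.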
Apr 10 2019
Hm, the sendmsg that happened before a 5-second delay was for port 9125, which is statsd_exporter.
These 2 heartbeats are more than 10s apart:
We have been splitting the traffic (thus load testing Proton on real traffic) for a long time now, so the switch now only changes which content is actually served to the clients. I do not think it is worth making a multi-stage deployment here.
I agree that enabling jobs in production might be premature; we can probably start experimenting in the beta cluster. However, we'd need to resolve T215339 ASAP.
Jobs don't offer a way to retry when they fail.
Variables can be defined based on the response of a request (how, exactly? specifying a path into a JSON structure?). This can be done within a test case to supply a CSRF token, or a global fixture to supply the ID of a user or page to test cases, etc.
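One possible answer to the "how, exactly?" question, sketched under assumptions: a variable is defined by a dotted path into the parsed JSON response and later substituted into requests via `{{name}}` placeholders. Both the path syntax and the placeholder syntax are made up here for illustration, as are the function names and the sample response.

```javascript
// Look up a dotted path such as 'query.tokens.csrftoken' in a parsed JSON response.
// Returns undefined if any segment is missing.
function extractByPath(response, path) {
    return path.split('.').reduce(
        (node, key) => (node !== null && node !== undefined ? node[key] : undefined),
        response
    );
}

// Replace {{name}} placeholders in a request template with variable values.
// Unknown names are left untouched.
function substitute(template, variables) {
    return template.replace(/\{\{(\w+)\}\}/g, (match, name) =>
        Object.prototype.hasOwnProperty.call(variables, name) ? String(variables[name]) : match
    );
}

// Hypothetical response body and variable definition, for illustration only.
const response = { query: { tokens: { csrftoken: 'abc123' } } };
const variables = { csrf: extractByPath(response, 'query.tokens.csrftoken') };
const requestBody = substitute('token={{csrf}}&title={{page}}', variables);
```

Here `csrf` would come from a request inside the test case, while something like `page` could be filled in the same way from a global fixture.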
Apr 9 2019
Sorry about that. Fixed by above patch.
Apr 5 2019
Oh hell @bmansurov we do have x-amples tests... The patch will enable them.
As a followup - should we enable some x-amples checks for this endpoint?
Apr 4 2019
Apparently graphoid is still using service::node::config and not the config template in the deployment repo. Given that graphoid will be switched to k8s soon, should we just postpone this until the switch, move graphoid to deployment-repo config or do the puppet work to enable rsyslog in service::node::config? What do you think @akosiaris @fgiunchedi @mobrovac ?
ChangeProp and JobQueue ChangeProp have been moved to the new logging infra as well.
Oh, awesome. Thank you!
Apr 3 2019
The patch above should fix it.