The use of an invalid title here was intentional, although that was with the idea that it would never make it into the queue. Recent work has removed the mandatory existence of a title parameter. Any jobs that still pass a title through the main signature, as before, are normalized so that the title is set as a regular params key.
This is caused by https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/500171/ and specifically https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/500171/14/includes/jobqueue/Job.php where it sets the default title to Title::makeTitle( NS_SPECIAL, '').
Wed, Apr 17
Agreed, let's do it!
Mon, Apr 15
Let's first execute in deployment-prep, and follow up in production after we merge/deploy https://github.com/wikimedia/restbase/pull/1117 in beta to verify that all is good and correct?
Sun, Apr 14
Needed to do it before moving to the key-value refactoring.
Fri, Apr 12
@akosiaris no, spinning up a new worker takes no time; the problem here is actually killing the old worker.
Thu, Apr 11
oh yes, it totally is a followup. Thank you.
I've also had an idea to create a limited-concurrency consumer for the change-prop use case, e.g. T206186.
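To make the limited-concurrency idea concrete, here is a minimal sketch in Python using an asyncio semaphore. It is not the change-prop implementation (which is a Node.js service); `consume`, `handler`, and `max_concurrency` are hypothetical names chosen for illustration.

```python
import asyncio


async def consume(messages, handler, max_concurrency: int):
    """Run handler on every message with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(message):
        # Each task waits here until one of the concurrency slots is free.
        async with semaphore:
            return await handler(message)

    # gather() preserves the input order of the results.
    return await asyncio.gather(*(run_one(m) for m in messages))


async def demo():
    in_flight = 0
    peak = 0

    async def handler(message):
        # Track how many handlers run at once to show the cap is respected.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stand-in for real work
        in_flight -= 1
        return message * 2

    results = await consume(range(10), handler, max_concurrency=3)
    return results, peak


results, peak = asyncio.run(demo())
print(results, peak)
```

This sketch schedules every message up front and only limits execution. A real consumer would more likely stop pulling from the queue while all slots are busy.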
Wed, Apr 10
Hm, the sendmsg that happened before a 5-second delay was for port 9125, which is statsd_exporter.
These 2 heartbeats are more than 10s apart:
We have been splitting the traffic (thus load testing Proton on real traffic) for a long time now, so the switch by now is only switching the content that's actually served to the clients. I do not think it is worth making a multi-stage deployment here.
However, I agree that enabling jobs in production might be premature; we can probably start experimenting in the beta cluster. We'd need to resolve T215339 ASAP, though.
Jobs don't offer a way to retry when they fail.
A couple of thoughts
Tue, Apr 9
Sorry about that. Fixed by above patch.
Fri, Apr 5
Oh hell @bmansurov we do have x-amples tests... The patch will enable them.
As a followup - should we enable some x-amples checks for this endpoint?
Thu, Apr 4
Apparently graphoid is still using service::node::config and not the config template in the deployment repo. Given that graphoid will be switched to k8s soon, should we just postpone this until the switch, move graphoid to deployment-repo config or do the puppet work to enable rsyslog in service::node::config? What do you think @akosiaris @fgiunchedi @mobrovac ?
ChangeProp and JobQueue ChangeProp have been moved to the new logging infra as well.
Oh, awesome. Thank you!
Wed, Apr 3
The patch above should fix it.
Tue, Apr 2
I have deployed a new pipeline for RESTBase in production and it all looks great. Next step - convert other services. I will try it out on change-prop and create subtasks for individual services.
If you could verify on Beta prior to deployment, however, that would be helpful.
The new UI has been deployed. Next step here - explore the new features in OpenAPI 3.0, see what we can start using, and convert the specs to 3.0.
https://gerrit.wikimedia.org/r/500363 fixes it. I don't want to self-merge my patch, though.
Sun, Mar 31
Thank you for catching this early and sorry for this.
Fri, Mar 29
If I understand correctly, in order to switch a particular job execution to PHP7 all we need to do is add a Cookie: PHP_ENGINE=php7 header to the request.
I think I failed to explain the reasoning behind this in enough detail, which left a lot of room for confusion. I'll try to fix this mistake.
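As a rough illustration of the header mentioned above, the following Python sketch builds a request that carries the cookie without sending it. The endpoint URL, the JSON body, and the `build_job_request` helper are assumptions made up for this example; only the `Cookie: PHP_ENGINE=php7` header comes from the comment.

```python
import urllib.request


def build_job_request(url: str, body: bytes, use_php7: bool) -> urllib.request.Request:
    """Build a POST request for a job, optionally opting in to PHP7 via the cookie."""
    headers = {"Content-Type": "application/json"}
    if use_php7:
        # Cookie name and value as quoted in the comment above.
        headers["Cookie"] = "PHP_ENGINE=php7"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical endpoint, for illustration only; nothing is sent over the network.
php7_request = build_job_request("http://jobrunner.example.invalid/run-job", b"{}", use_php7=True)
default_request = build_job_request("http://jobrunner.example.invalid/run-job", b"{}", use_php7=False)

print(php7_request.get_header("Cookie"))
print(default_request.get_header("Cookie"))
```

Keeping the switch as a per-request flag would let individual job types be moved over one at a time, assuming the executor decides per job.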
Thu, Mar 28
We have deployed the partitioner for the htmlCacheUpdate job and it's now running in production. We have created some lag in the process, but it should clear out soon.
Wed, Mar 27
Tue, Mar 26
After the patch was deployed we do not have nulls in the recent change schema anymore; however, we still cannot declare victory and get rid of all of the polymorphic types in the schema. The log_params can be either an object or an array and, judging by the code, it can actually be a non-empty array in rare cases. Not sure what to do about that.
Actually, the existing topic needs to be left alone, but 2 new topics with 8 partitions each need to be created:
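To illustrate the polymorphism described above, here is a small Python sketch that classifies the JSON shape of log_params in an event. The sample payloads and the `log_params_shape` helper are invented for illustration and are not taken from real events or from the schema.

```python
import json


def log_params_shape(event_json: str) -> str:
    """Report whether log_params decodes to a JSON object, a JSON array, or something else."""
    value = json.loads(event_json).get("log_params")
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    return "other"


# Made-up sample events covering the cases mentioned in the comment.
samples = [
    '{"log_params": {"target": "Foo"}}',  # object
    '{"log_params": []}',                 # empty array
    '{"log_params": ["Foo", "Bar"]}',     # non-empty array (the rare case)
]
shapes = [log_params_shape(s) for s in samples]
print(shapes)
```

A check like this could be run over a sample of events to estimate how often the non-empty array case occurs before deciding how the schema should handle it.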
Mon, Mar 25
Merged and deployed as a part of SWAT. Resolving.
For step 2 we need to switch hyperswitch to upstream swagger.
@Ottomata yes, but not just yet, we still need to prepare the patches etc.
Fri, Mar 22
Thu, Mar 21
There is already an ability to execute jobs after a delay or at a more-or-less specific time, but it's really not something we want to build on.
Now it's ready - CP tests are independent of both Kafka and Redis.
Mar 20 2019
mediawiki-vagrant should also be updated to support the new proton role.
Oh, no, not resolving yet. Next step - mock redis.
The PR has been merged, resolving
Mar 19 2019
Can this be resolved?
The rev_content_changed has been removed from the schema and after the train we will ensure rev_parent_id is present in all the events. Resolving.
Mar 18 2019
I think that the schema is incorrect here.
After enabling logging over syslog for RESTBase in deployment-prep, we have identified a number of disparities between node services and, for example, mediawiki.
Mar 15 2019
Mar 14 2019
Verified that we can work with swagger-ui 3+ once we make the spec standard-compliant. Let's begin with modifying the specs.
I wonder if we should also use ?hasty=true mode for mediawiki 'analytics' events? This would use a non-ACKed producer and not ever block the MW waiting for a response.
Oh, we have made use of the hot-shots internal childClient method, but forgot there's a debugging LogStatsD. Need to fix this in service-runner and make sure RESTBase starts if we configure metrics.type to 'log'.