Pchelolo
User

Projects (6)

User Details

User Since
Jun 24 2015, 10:23 AM (126 w, 2 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
Pchelolo

Recent Activity

Today

Pchelolo added a comment to T181291: Separate retry and error topics between JobQueue and normal ChangeProp.

Another way to minimize the damage to normal ChangeProp while using the service_name uniformly everywhere is to rename the normal ChangeProp service from changeprop to change-prop. This will break logging and metrics, but those dashboards are easily fixable.

Fri, Nov 24, 11:36 AM · ChangeProp, Services (doing)
Pchelolo updated subscribers of T181291: Separate retry and error topics between JobQueue and normal ChangeProp.

Ideally the consumer group names should be prepended with the service_name as well, just for consistency, but renaming the consumer groups would make us lose all the backlog we have in all the topics without processing it. I don't think consistency is important enough to do that. What do you think, @mobrovac?

Fri, Nov 24, 11:20 AM · ChangeProp, Services (doing)
Pchelolo created T181291: Separate retry and error topics between JobQueue and normal ChangeProp.
Fri, Nov 24, 11:15 AM · ChangeProp, Services (doing)
Pchelolo updated the task description for T175210: Select candidate jobs for transferring to the new infrastucture.
Fri, Nov 24, 10:37 AM · Patch-For-Review, Services (doing), MediaWiki-JobQueue, ChangeProp, Analytics, EventBus, Operations, User-Joe, User-Elukey
Pchelolo updated subscribers of T175210: Select candidate jobs for transferring to the new infrastucture.

The wikibase-UpdateUsagesForPage job sounds like a perfect candidate to be the next one. It averaged ~220 jobs/s over the last month, it was well tested in beta labs, it seems idempotent, and it doesn't seem to use any of the advanced JobQueue features like root job deduplication or delayed execution.

Fri, Nov 24, 10:36 AM · Patch-For-Review, Services (doing), MediaWiki-JobQueue, ChangeProp, Analytics, EventBus, Operations, User-Joe, User-Elukey

Yesterday

Pchelolo created T181221: Prepare and test ChangeProp with multi-partition topics.
Thu, Nov 23, 10:55 AM · Services (doing), ChangeProp
Pchelolo added a comment to T157822: Support multiple partitions per topic in EventBus.

After some discussion with @mobrovac we think that it's better to replicate the sharding mapping in the EventBus service instead of providing it together with the event.

Thu, Nov 23, 10:54 AM · EventBus, Analytics, Services (next)
Pchelolo committed rMSCD8bf33bb2a045: [Config] Increase worker_heartbeat_timeout (authored by Pchelolo).
[Config] Increase worker_heartbeat_timeout
Thu, Nov 23, 10:30 AM
Pchelolo added a comment to T181216: Get rid of pointless EnqueueJob usage.

AFAIK, the new queue would be able to deal with that, though I'm not 100% sure how writes initiated in a different DC perform.

Thu, Nov 23, 10:15 AM · Patch-For-Review, MediaWiki-JobQueue, Services (watching)
Pchelolo updated subscribers of T157822: Support multiple partitions per topic in EventBus.

One issue I've encountered is that I can't find an easy way to determine which shard a domain belongs to from the extension code, or how to provide this configuration to the EventBus service. @Joe do you know if it's possible to get the domain => shard mapping in the MW extension code somehow?

Thu, Nov 23, 10:01 AM · EventBus, Analytics, Services (next)
Pchelolo created T181216: Get rid of pointless EnqueueJob usage.
Thu, Nov 23, 9:48 AM · Patch-For-Review, MediaWiki-JobQueue, Services (watching)
Pchelolo added a comment to T157822: Support multiple partitions per topic in EventBus.

Per discussion on the JobQueue meeting:

Thu, Nov 23, 9:11 AM · EventBus, Analytics, Services (next)

Wed, Nov 22

Pchelolo added a comment to T176785: Add action api counts to graphite-restbase job.

I'm a bit confused here. Is this task about adding a count for the action API, turning an existing restbase-only count into a restbase+action API count, or adding a restbase+action API count in addition to an existing restbase-only count?

Wed, Nov 22, 5:34 PM · Patch-For-Review, Services (watching), Analytics-Kanban
Pchelolo added a comment to T176785: Add action api counts to graphite-restbase job.

Only concern is that it changes the existing metric for restbase. @mobrovac and @Pchelolo, is that a big deal?

Wed, Nov 22, 12:18 PM · Patch-For-Review, Services (watching), Analytics-Kanban

Mon, Nov 20

Pchelolo created T181007: Investigate backlog in RecordLintJob.
Mon, Nov 20, 10:53 PM · MW-1.31-release-notes (WMF-deploy-2017-11-28 (1.31.0-wmf.10)), Patch-For-Review, MediaWiki-JobQueue, ChangeProp, Services (doing), Parsoid
Pchelolo added a comment to T179684: ChangeProp workers die if they can't connect to redis.

@hashar Puppet has been re-enabled on both the change-prop and redis hosts in deployment-prep and a puppet run was made. I will be working more on this at some point, so I will disable it again for a while, but probably not today/tomorrow.

Mon, Nov 20, 9:45 AM · Patch-For-Review, Services (later), ChangeProp

Thu, Nov 16

Pchelolo added a comment to T175210: Select candidate jobs for transferring to the new infrastucture.

Out of the IRC discussion we've got 3 candidates for the next migration:

  • wikibase-UpdateUsagesForPage - super high traffic, well tested on beta, but super easy. TODO talk to Wikidata
  • ORESFetchScoresJob - low traffic, quite problematic
  • recentchangesupdate - decent traffic, very high user-visible effect.
Thu, Nov 16, 5:20 PM · Patch-For-Review, Services (doing), MediaWiki-JobQueue, ChangeProp, Analytics, EventBus, Operations, User-Joe, User-Elukey
Pchelolo added a comment to T180682: Investigate ChangeProp memory growth when a rule hits concurrency limit.

Yup. That's why I'm very suspicious about this particular correlation and wanna wait for the RecordLintJob backlog to get cleaned up naturally.

Thu, Nov 16, 3:17 PM · ChangeProp, Services (doing)
Pchelolo added a comment to T180682: Investigate ChangeProp memory growth when a rule hits concurrency limit.

Local debugging wasn't that fruitful. I propose to postpone deployment of https://gerrit.wikimedia.org/r/#/c/391801/ until the RecordLintJob backlog disappears (it's going down now) to get proof that this is indeed the reason for the memory growth.

Thu, Nov 16, 1:58 PM · ChangeProp, Services (doing)
Pchelolo moved T180626: Firejail (and cpulimit, if feasable) headless chromium processes from Backlog to watching on the Services board.
Thu, Nov 16, 12:48 PM · Services (next), Proton, Electron-PDFs, Readers-Web-Backlog
Pchelolo created T180682: Investigate ChangeProp memory growth when a rule hits concurrency limit.
Thu, Nov 16, 12:43 PM · ChangeProp, Services (doing)
Pchelolo updated subscribers of T179684: ChangeProp workers die if they can't connect to redis.

I've conducted an experiment in deployment-prep. I connected ChangeProp directly to deployment-redis06 and generated some extensive load on ChangeProp. Then I killed a redis instance and observed a bunch of redis connection errors, but a very limited number compared to production. Also, the exception was coming from the redis client on_error handler, not from the code directly using redis. However, during that time CP completely stopped processing incoming Kafka events. The theory was that the redis client waits for a connection and stops the world. I changed the redis retry policy to never retry, and the issue with event processing stopping went away. So we should try to write a custom retry policy for production.

Thu, Nov 16, 12:40 PM · Patch-For-Review, Services (later), ChangeProp
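
The "custom retry policy" mentioned in the comment above could take the shape of a capped backoff schedule that also supports the "never retry" setting used in the experiment. The following is only a minimal sketch in Python for illustration (ChangeProp itself is a Node.js service); the function name, parameters, and default values are hypothetical and are not taken from ChangeProp or from any redis client library.

```python
from typing import Optional


def redis_retry_delay_ms(attempt: int,
                         base_ms: int = 100,
                         cap_ms: int = 5000,
                         max_attempts: Optional[int] = None) -> Optional[int]:
    """Return the delay before reconnect attempt `attempt` (1-based).

    Returns None when the caller should give up instead of waiting.
    All names and defaults here are hypothetical illustration values.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # max_attempts=0 reproduces the "never retry" policy from the experiment.
    if max_attempts is not None and attempt > max_attempts:
        return None
    # Exponential backoff, capped so a long outage never produces huge waits.
    return min(cap_ms, base_ms * 2 ** (attempt - 1))


# First three delays of the default schedule.
schedule = [redis_retry_delay_ms(n) for n in (1, 2, 3)]
```

Whether retries should be bounded at all in production is a design choice that the comment leaves open; the sketch only shows that both behaviours fit in one policy function.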
Pchelolo added a comment to T180017: Timeouts on event delivery to EventBus.

@Ottomata I've reverted your change on kafka1001 as there's some AttributeError: 'NoneType' object has no attribute 'append' in the local logs.

Thu, Nov 16, 7:19 AM · MW-1.31-release-notes (WMF-deploy-2017-11-14 (1.31.0-wmf.8)), Patch-For-Review, Services (next), Analytics, EventBus

Wed, Nov 15

Pchelolo closed T180600: Set up Redis for ChangeProp in deployment-prep as Resolved.

According to @fgiunchedi the 05 and 06 instances are new Redis instances to which MW has not been migrated yet, so we have a few days to conduct experiments and can then reuse the instances for normal operation together with MediaWiki. Resolving.

Wed, Nov 15, 2:51 PM · Services (next), ChangeProp
Pchelolo closed T180600: Set up Redis for ChangeProp in deployment-prep, a subtask of T179684: ChangeProp workers die if they can't connect to redis, as Resolved.
Wed, Nov 15, 2:51 PM · Patch-For-Review, Services (later), ChangeProp
Pchelolo updated subscribers of T180600: Set up Redis for ChangeProp in deployment-prep.

cc @fgiunchedi as he's doing some redis migrations in deployment-prep

Wed, Nov 15, 2:44 PM · Services (next), ChangeProp
Pchelolo added a comment to T180017: Timeouts on event delivery to EventBus.

I've increased request timeout in EventBus extension to 10 seconds to match sync timeout in Kafka, but it did not fix the timeout errors.

Wed, Nov 15, 2:34 PM · MW-1.31-release-notes (WMF-deploy-2017-11-14 (1.31.0-wmf.8)), Patch-For-Review, Services (next), Analytics, EventBus
Pchelolo updated subscribers of T180600: Set up Redis for ChangeProp in deployment-prep.

I can see there's deployment-redis05 and deployment-redis06 in deployment-prep, but I don't see any references to these instances anywhere. @Joe @elukey do you know if these are unused redis instances? I've checked whether they have any keys, and these 2 nodes have completely empty redis installations. Can I just use them?

Wed, Nov 15, 2:06 PM · Services (next), ChangeProp
Pchelolo created T180600: Set up Redis for ChangeProp in deployment-prep.
Wed, Nov 15, 1:46 PM · Services (next), ChangeProp
Pchelolo added a comment to T179684: ChangeProp workers die if they can't connect to redis.

One more piece of info: the page_edit rule stopped processing completely after midnight on Nov 14. There was one more instance of all workers dying at midnight, after first a huge number of redis connection logs were emitted and then KafkaConsumer not connected errors on scb1002.

Wed, Nov 15, 1:18 PM · Patch-For-Review, Services (later), ChangeProp
Pchelolo updated the task description for T180591: Update node-rdkafka to v2.
Wed, Nov 15, 12:43 PM · Analytics, Services (later)
Pchelolo created T180591: Update node-rdkafka to v2.
Wed, Nov 15, 12:42 PM · Analytics, Services (later)
Pchelolo added a comment to T179958: Clean up retry-retry Kafka topics.

Here's the list of topics that should be deleted:

Wed, Nov 15, 12:22 PM · Services (next), EventBus, Analytics
Pchelolo created P6319 Topics to delete from Kafka .
Wed, Nov 15, 12:21 PM
Pchelolo added a comment to T179421: Migrate revisions and restrictions from legacy to new storage.

Script used to create yaml for title_revisions-ng table:

Wed, Nov 15, 11:48 AM · RESTBase-Cassandra, RESTBase, Services (doing)
Pchelolo created P6318 title_revisions-ng yaml.
Wed, Nov 15, 11:47 AM
Pchelolo created P6317 title_revisions-ng script.
Wed, Nov 15, 11:46 AM
Pchelolo closed T179786: Update trending-edits' node-rdkafka to v1.x as Resolved.

Deployed. Resolving.

Wed, Nov 15, 10:11 AM · Patch-For-Review, User-Joe, Operations, Wikimedia-Incident, Reading-Infrastructure-Team-Backlog (Kanban), User-Jdlrobson, Trending-Service, Services (watching)
Pchelolo raised the priority of T130862: Discerning Cassandra instance in Logstash/Kibana from Low to Normal.

Raising the priority since this issue came up again today during the investigation of T180568. If only one instance per node is misbehaving it's possible to use logstash, but if 2 instances per node were misbehaving, logstash would be absolutely useless.

Wed, Nov 15, 9:21 AM · Services (later), Cassandra, RESTBase-Cassandra

Tue, Nov 14

Pchelolo added a comment to T180420: Errors in production after 11/13 mobileapps deployment.

restbase-dev1006 is a part of our dev cluster. We are not currently using it; however, there's a change-prop instance running there, and that's the source of the errors from there. I'll stop change-prop in the dev cluster.

Tue, Nov 14, 2:50 PM · Patch-For-Review, Mobile-Content-Service, Reading-Infrastructure-Team-Backlog
Pchelolo added a comment to T180384: Turn off Trending Service.

I would say having a marker (the "experimental" one, or a similar one) and setting expectations to be "it may disappear or its API may change at any point in time without notice" would be the way to go here.

Tue, Nov 14, 12:23 PM · Operations, Services (designing), Reading-Infrastructure-Team-Backlog (Kanban), Trending-Service
Pchelolo created T180442: Export burrow metrics to prometheus.
Tue, Nov 14, 10:39 AM · User-Elukey, Analytics, ChangeProp, Services (watching), EventBus
Pchelolo added a comment to T180402: Remove custom ordering from ReadingLists extension.

@Tgr we've never had the need for sorting so we've never settled on a convention. I personally like the /lists/?sort=name more than the others.

Tue, Nov 14, 9:14 AM · MW-1.31-release-notes (WMF-deploy-2017-11-28 (1.31.0-wmf.10)), Patch-For-Review, Reading-Infrastructure-Team-Backlog, Reading List Service

Thu, Nov 9

Pchelolo moved T180051: Reduce the number of fields declared in elasticsearch by logstash from Backlog to watching on the Services board.
Thu, Nov 9, 3:08 PM · Patch-For-Review, Services (watching), Operations, Discovery-Search (Current work), Wikimedia-Logstash
Pchelolo added a comment to T180051: Reduce the number of fields declared in elasticsearch by logstash.

In general there are several issues we've observed:

Thu, Nov 9, 2:11 PM · Patch-For-Review, Services (watching), Operations, Discovery-Search (Current work), Wikimedia-Logstash
Pchelolo added a comment to T167433: Switch all projects to the new (and yet to be built) summary-html endpoint for page previews.

As the change will not affect any current users, I think we can go ahead and deploy everywhere.

Thu, Nov 9, 12:38 PM · Readers-Web-Backlog (Tracking), Wikimedia-Site-requests, Page-Previews
Pchelolo added a comment to T180017: Timeouts on event delivery to EventBus.

Seems like we don't specify it in mediawiki-config, so we're using the default of 5 seconds.

Thu, Nov 9, 9:40 AM · MW-1.31-release-notes (WMF-deploy-2017-11-14 (1.31.0-wmf.8)), Patch-For-Review, Services (next), Analytics, EventBus
Pchelolo added a comment to T180017: Timeouts on event delivery to EventBus.

Judging by the logs (on mw-log1001, not currently available in logstash) the timeouts did not disappear. Maybe we could consider increasing request timeout?

Thu, Nov 9, 8:59 AM · MW-1.31-release-notes (WMF-deploy-2017-11-14 (1.31.0-wmf.8)), Patch-For-Review, Services (next), Analytics, EventBus
Pchelolo added a comment to T167433: Switch all projects to the new (and yet to be built) summary-html endpoint for page previews.

If we can deploy for all wikis at the same time that would be easier for the Services team but I think @Pchelolo said that that is doable.

Thu, Nov 9, 8:24 AM · Readers-Web-Backlog (Tracking), Wikimedia-Site-requests, Page-Previews

Wed, Nov 8

Pchelolo added a comment to T179421: Migrate revisions and restrictions from legacy to new storage.

Script used to generate the table creation statements:

Wed, Nov 8, 2:03 PM · RESTBase-Cassandra, RESTBase, Services (doing)
Pchelolo created P6288 page_restrictions create table yaml.
Wed, Nov 8, 2:03 PM
Pchelolo created P6287 page_restriction table creation script.
Wed, Nov 8, 2:02 PM
Pchelolo closed T153029: EventBus logs don't show up in logstash as Resolved.

The EventBus logs were fixed and can now be seen in logstash. Resolving.

Wed, Nov 8, 11:13 AM · Services (done), Analytics, Easy, EventBus
Pchelolo created T180017: Timeouts on event delivery to EventBus.
Wed, Nov 8, 11:11 AM · MW-1.31-release-notes (WMF-deploy-2017-11-14 (1.31.0-wmf.8)), Patch-For-Review, Services (next), Analytics, EventBus
Pchelolo added a comment to T179083: Cassandra schema creation seems unreliable.

Today we've had a semi-outage because of this.

Wed, Nov 8, 10:16 AM · User-Eevans, RESTBase-Cassandra, Services (next)
Pchelolo added a comment to T180005: RESTBASE startup error.

Ok, I know what's happening. You're hitting a bug on our side that I will fix, however you're using the latest master from GitHub which we use for development, so it's unstable and will not work. I'd suggest only using released versions, the last release is https://github.com/wikimedia/restbase/releases/tag/v0.17.0 - that one works perfectly with your config.

Wed, Nov 8, 10:11 AM · Services (done), RESTBase
Pchelolo added a comment to T180005: RESTBASE startup error.

I am not sure which config you are referring to.

Wed, Nov 8, 8:56 AM · Services (done), RESTBase
Pchelolo added a comment to T180005: RESTBASE startup error.

Please provide some more details:

Wed, Nov 8, 8:31 AM · Services (done), RESTBase

Tue, Nov 7

Pchelolo created T179958: Clean up retry-retry Kafka topics.
Tue, Nov 7, 5:29 PM · Services (next), EventBus, Analytics
Pchelolo added a comment to T179420: Migrate definitions storage from the legacy to new strategy.

I guess we can go with https://github.com/wikimedia/restbase/pull/896 right away without creating a proxy. The endpoint is quite low-volume, so it's ok if we just recreate everything.

Tue, Nov 7, 2:50 PM · Services (done), RESTBase-Cassandra, RESTBase
Pchelolo added a comment to T179553: Cookies should not be forwarded to different domains.

In your case, that can be achieved by creating an internal end point (in the /sys/ hierarchy) that calls the MW action API for manipulating lists and then declare the auth filter on that route only.

Thanks, that sounds like a good way to handle it.

Tue, Nov 7, 2:04 PM · Reading-Infrastructure-Team-Backlog, Reading List Service, Services (later), RESTBase
Pchelolo added a comment to T179412: Stop storing feeds in Cassandra.

PR here https://github.com/wikimedia/restbase/pull/908

Tue, Nov 7, 1:29 PM · RESTBase, Services (next)

Mon, Nov 6

Mholloway awarded T172221: Page summary API: Find a sane way to allow clients to select a page image thumb size a Cup of Joe token.
Mon, Nov 6, 10:41 PM · Readers-Web-Backlog (Tracking), Page-Previews, Mobile-Content-Service, Services (designing), Reading-Infrastructure-Team-Backlog

Fri, Nov 3

Pchelolo created T179688: mediawiki-config changes not deployed automatically to deployment-videoscaler01.
Fri, Nov 3, 1:41 PM · Release-Engineering-Team (Kanban), User-greg, Patch-For-Review, Multimedia, Beta-Cluster-Infrastructure, Services (watching)
Pchelolo created T179685: Respawn service-runner workers serially.
Fri, Nov 3, 1:28 PM · Services (later), service-runner
Pchelolo created T179684: ChangeProp workers die if they can't connect to redis.
Fri, Nov 3, 1:24 PM · Patch-For-Review, Services (later), ChangeProp

Thu, Nov 2

Pchelolo created T179579: Cannot read property 'substring' of null.
Thu, Nov 2, 1:22 PM · Patch-For-Review, Services (watching), Parsoid

Wed, Nov 1

Pchelolo added a project to T179113: Endpoints that 404 no longer have the "Access-Control-Allow-Origin" header: Services (watching).
Wed, Nov 1, 5:24 PM · Services (watching), Analytics-Kanban, Pageviews-API

Tue, Oct 31

Pchelolo added a comment to T179417: Migrate Parsoid from legacy to new storage.

We've decided to migrate stashing together with normal parsed tables btw

Tue, Oct 31, 4:53 PM · RESTBase-Cassandra, RESTBase, Services (doing)
Pchelolo created T179412: Stop storing feeds in Cassandra.
Tue, Oct 31, 4:21 PM · RESTBase, Services (next)
Pchelolo added a comment to T174993: Vandalism in "In the news" articles persisting in the app' ?.

It does give us a pretty significant latency improvement for Varnish cache misses, from 500 ms generating it each time to 230 ms using Cassandra. However, Varnish hit rate is quite high on this particular endpoint: out of 28 req/s that hit Varnishes only about 1.5 req/s is a cache miss.

Tue, Oct 31, 4:19 PM · Reading-Infrastructure-Team-Backlog, Services (watching), Mobile, Wikipedia-iOS-App-Backlog, Wikipedia-Android-App-Backlog, iOS-app-Bugs, Android-app-Bugs
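
The figures quoted in the comment above can be turned into a hit rate and an average per-request saving. This is a back-of-the-envelope sketch only: it uses just the four numbers from the comment, and the variable names are illustrative. It assumes that Varnish hits cost the same whether or not Cassandra is used, which the comment does not state.

```python
# Figures quoted in the comment above.
total_rps = 28.0          # requests per second reaching Varnish
miss_rps = 1.5            # of those, cache misses per second
miss_ms_generated = 500.0 # miss latency when the feed is generated each time
miss_ms_cassandra = 230.0 # miss latency when the feed is read from Cassandra

miss_ratio = miss_rps / total_rps
hit_ratio = 1.0 - miss_ratio

# Only misses benefit from Cassandra (assumption: hit latency is unchanged).
saving_per_miss_ms = miss_ms_generated - miss_ms_cassandra
avg_saving_per_request_ms = miss_ratio * saving_per_miss_ms

print(f"hit ratio: {hit_ratio:.1%}")
print(f"saving per miss: {saving_per_miss_ms:.0f} ms")
print(f"average saving per request: {avg_saving_per_request_ms:.1f} ms")
```

Under that assumption, the 270 ms improvement applies to roughly one request in nineteen, which is the trade-off the comment points at.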

Mon, Oct 30

Pchelolo added a comment to T178983: Malformed HTTP message in EventBus logs.

logging this from MediaWiki (even if just the meta) is more appropriate.

Mon, Oct 30, 3:50 PM · Analytics-Kanban, Services (next), EventBus
Pchelolo renamed T179280: PHP out of memory error trying to log big events from PHP put of memory error trying to log big events to PHP out of memory error trying to log big events.
Mon, Oct 30, 12:58 PM · MW-1.31-release-notes (WMF-deploy-2017-10-17 (1.31.0-wmf.4)), Services (done), EventBus, Analytics
Pchelolo added a comment to T178983: Malformed HTTP message in EventBus logs.

@Ottomata pushing > 100 megs into logstash? Don't think that's a good idea.

Mon, Oct 30, 12:30 PM · Analytics-Kanban, Services (next), EventBus
Pchelolo created T179280: PHP out of memory error trying to log big events.
Mon, Oct 30, 12:30 PM · MW-1.31-release-notes (WMF-deploy-2017-10-17 (1.31.0-wmf.4)), Services (done), EventBus, Analytics
Pchelolo created T179270: TTMServerMessageUpdateJob fails in labs.
Mon, Oct 30, 10:36 AM · Patch-For-Review, User-Nikerabbit, Wikimedia-Site-requests

Fri, Oct 27

Pchelolo reopened T178997: AddUsagesForPageJob doesn't really report execution status as "Open".

That gerrit was a workaround, the real issue is that the jobs don't follow the Job contract. Reopening.

Fri, Oct 27, 12:24 PM · EventBus, Analytics, MW-1.31-release-notes (WMF-deploy-2017-11-07 (1.31.0-wmf.7)), Wikibase-Quality-Constraints, Need-volunteer, Services (watching), Wikidata
Pchelolo added a comment to T174993: Vandalism in "In the news" articles persisting in the app' ?.

We've discussed this issue during the Services-Reading meeting and here're some ideas from the discussion:

Fri, Oct 27, 9:53 AM · Reading-Infrastructure-Team-Backlog, Services (watching), Mobile, Wikipedia-iOS-App-Backlog, Wikipedia-Android-App-Backlog, iOS-app-Bugs, Android-app-Bugs
Pchelolo added a comment to T178983: Malformed HTTP message in EventBus logs.

So, we have an event that's > 100 Mb of serialized JSON... Wonderful. Do you think it's possible to dump it somewhere on the filesystem to find out what it is? It's probably creating some issues in the current JobQueue and MediaWiki as well; it's even more abnormal than the previously found 17 Mb events. Perhaps temporarily increase the max_buffer_size to something like 200 Mb so that tornado can at least accept the event and log it?

Fri, Oct 27, 7:55 AM · Analytics-Kanban, Services (next), EventBus
Pchelolo renamed T179064: ORES internal server error for edit with many added links from ORES eternal server error for edit with many added links to ORES internal server error for edit with many added links.
Fri, Oct 27, 7:39 AM · Scoring-platform-team (Current), Services (watching), ORES

Thu, Oct 26

Pchelolo added a comment to T179058: RB and CP logs disappeared from Logstash.

One more bit: I've tried very hard to locate any log entries that would have different actual types for the problematic fields and did not succeed. This might suggest that the order of messages is not the reason, but maybe I've just missed a couple of sneaky records.

Thu, Oct 26, 11:55 AM · Operations, Patch-For-Review, ChangeProp, RESTBase, Services (watching), Wikimedia-Logstash
Pchelolo created T179064: ORES internal server error for edit with many added links.
Thu, Oct 26, 11:12 AM · Scoring-platform-team (Current), Services (watching), ORES
Pchelolo added a comment to T179058: RB and CP logs disappeared from Logstash.

I've found that particular log entry from 2017-10-26T06:17:47 in parsoid logs locally on the machine and nothing is unusual, it's just a normal Parsoid log entry.

Thu, Oct 26, 10:07 AM · Operations, Patch-For-Review, ChangeProp, RESTBase, Services (watching), Wikimedia-Logstash

Wed, Oct 25

Pchelolo added a comment to T178997: AddUsagesForPageJob doesn't really report execution status.

Same happens with UpdateConstraintsTableJob

Wed, Oct 25, 4:54 PM · EventBus, Analytics, MW-1.31-release-notes (WMF-deploy-2017-11-07 (1.31.0-wmf.7)), Wikibase-Quality-Constraints, Need-volunteer, Services (watching), Wikidata

Oct 25 2017

Pchelolo added a project to T178997: AddUsagesForPageJob doesn't really report execution status: Services (watching).
Oct 25 2017, 12:38 PM · EventBus, Analytics, MW-1.31-release-notes (WMF-deploy-2017-11-07 (1.31.0-wmf.7)), Wikibase-Quality-Constraints, Need-volunteer, Services (watching), Wikidata
Pchelolo created T178997: AddUsagesForPageJob doesn't really report execution status.
Oct 25 2017, 12:37 PM · EventBus, Analytics, MW-1.31-release-notes (WMF-deploy-2017-11-07 (1.31.0-wmf.7)), Wikibase-Quality-Constraints, Need-volunteer, Services (watching), Wikidata
Pchelolo added a project to T178983: Malformed HTTP message in EventBus logs: Services (next).
Oct 25 2017, 10:35 AM · Analytics-Kanban, Services (next), EventBus
Pchelolo created T178983: Malformed HTTP message in EventBus logs.
Oct 25 2017, 10:35 AM · Analytics-Kanban, Services (next), EventBus

Oct 17 2017

Pchelolo added a comment to T176627: Trial replacing Electron with headless Chromium in the render service.

Thanks, I'll submit fixes soon. Btw, we've moved to gerrit and the new repo lives at https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/services/chromium-render. I'll ping you once the patch is ready.

Oct 17 2017, 12:39 PM · Services (watching), Patch-For-Review, Readers-Web-Kanban-Board, Readers-Web-Backlog, Proton, Electron-PDFs
Pchelolo added a comment to T176627: Trial replacing Electron with headless Chromium in the render service.

Left some comments regarding the code: https://github.com/kodchi/mediawiki-services-chromium-render/commit/d02e5a57ec1e986f0992edaf6b8c8169d13b5203

Oct 17 2017, 11:22 AM · Services (watching), Patch-For-Review, Readers-Web-Kanban-Board, Readers-Web-Backlog, Proton, Electron-PDFs
Pchelolo created T178362: Drop feed keyspaces from Cassandra 2.
Oct 17 2017, 10:29 AM · RESTBase, Cassandra, Services (done)
Pchelolo added a comment to T158100: Deprecate and remove the public title/{title} endpoint.

The public /revision hierarchy was removed, as well as secondary indexing usage in Cassandra. Deleting the secondary indexes does not really delete the tables, so the cleanup should be done manually. Here's the list of tables that should be removed:

Oct 17 2017, 10:25 AM · Services (done), RESTBase-API, RESTBase
Pchelolo added a comment to T178078: RESTBase logs disappeared from logstash.

The logs are back where they belong, so I guess the ticket can be resolved. Thank you @fgiunchedi

Oct 17 2017, 10:11 AM · Patch-For-Review, Traffic, Operations, Wikimedia-Logstash, Services (watching)
Pchelolo added a project to T178037: Parsoid uses non-canonical URL encoding: Parsoid.

: and $ are reserved characters, so URLs which differ in how these characters are encoded are not considered equal. The web server will consider them equivalent so this is not a big deal in practice, but it will split Varnish and other caches, and maybe confuse semantic web applications.

Oct 17 2017, 10:00 AM · Parsoid, Services (later), RESTBase
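
The point about reserved characters can be shown with Python's standard `urllib.parse`. This is only an illustrative sketch: the title below is a made-up example, not one from the task, and the snippet does not reproduce what Parsoid or RESTBase actually do.

```python
from urllib.parse import quote, unquote

# Hypothetical page title containing both reserved characters.
title = "Template:Foo$1"

# Encode every reserved character, or leave ':' and '$' as they are.
encoded_all = quote(title, safe="")
encoded_min = quote(title, safe=":$")

# Compared as strings (which is how a cache key is compared) they differ.
same_as_cache_key = encoded_all == encoded_min

# After percent-decoding, both name the same resource.
same_after_decoding = unquote(encoded_all) == unquote(encoded_min)

print(encoded_all, encoded_min, same_as_cache_key, same_after_decoding)
```

A cache keyed on the raw URL string would store the two forms as separate entries, which is the split described in the comment.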

Oct 12 2017

Pchelolo created T178078: RESTBase logs disappeared from logstash.
Oct 12 2017, 3:17 PM · Patch-For-Review, Traffic, Operations, Wikimedia-Logstash, Services (watching)
Pchelolo added a comment to T176689: Cannot use $ref in parameter objects in the Swagger spec.

Fixed by https://github.com/wikimedia/hyperswitch/pull/84

Oct 12 2017, 2:13 PM · Services (done), HyperSwitch
Pchelolo added a comment to T176627: Trial replacing Electron with headless Chromium in the render service.

Is there a cheaper way to find out? I've jokingly suggested renting a hefty VPS for 30 minutes but is that an option? /cc @GWicke @mobrovac @Pchelolo

Oct 12 2017, 2:01 PM · Services (watching), Patch-For-Review, Readers-Web-Kanban-Board, Readers-Web-Backlog, Proton, Electron-PDFs
Pchelolo claimed T176689: Cannot use $ref in parameter objects in the Swagger spec.

Indeed we need to register all the definitions in ajv when doing the validations.

Oct 12 2017, 10:34 AM · Services (done), HyperSwitch

Oct 11 2017

Pchelolo added a comment to T173639: Hovercard text extract is broken for `* ` sequence in parenthesis.

Oh, ok, didn't understand that. It's completely doable, will make a PR later today

Oct 11 2017, 4:02 AM · Readers-Web-Backlog (Tracking), TextExtracts, Page-Previews
Pchelolo added a comment to T173639: Hovercard text extract is broken for `* ` sequence in parenthesis.

@bearND unfortunately we can't do that in the RESTBase layer. The bug is in the TextExtracts extension itself: for the request we're making to it from RB we only get content up to the asterisk: https://cs.wikipedia.org/w/api.php?action=query&prop=extracts&exintro=true&exsentences=5&titles=Marek_Eben

Oct 11 2017, 3:52 AM · Readers-Web-Backlog (Tracking), TextExtracts, Page-Previews