
502 /504 Bad Gateway issue on Petscan
Closed, Invalid · Public


See instead

Working Petscans frequently return a 502 Bad Gateway error.

Examples I use regularly, and which frequently enough return a 502, are:

From the user's POV, there doesn't seem to be any rhyme or reason. Sometimes the Petscan works. Sometimes a 502 is returned.

  1. What is the cause of these 502s?
  2. Can the underlying problem be fixed, please?
  3. If the problem cannot be fixed, can we have a more informative error message - e.g. if there's a query timeout, could we be informed of this?
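To illustrate what point 3 is asking for, here is a minimal sketch, not PetScan's actual code. It assumes the service can wrap a query in a deadline and answer with its own message instead of letting the proxy emit a bare 502. The function name `answer_with_deadline` and the message wording are hypothetical.

```python
import concurrent.futures
from typing import Callable, Tuple


def answer_with_deadline(run_query: Callable[[], str], seconds: float) -> Tuple[int, str]:
    """Run `run_query` in a worker thread and give up waiting after `seconds`.

    Returns an (HTTP status, body) pair. On timeout the caller gets an
    explicit explanation rather than an unexplained gateway error.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(run_query)
    try:
        return 200, future.result(timeout=seconds)
    except concurrent.futures.TimeoutError:
        return 504, f"Query did not finish within {seconds:g} s; try a narrower query."
    finally:
        # Do not block on the worker: it keeps running in the background.
        # A real service would also have to cancel the database query itself.
        executor.shutdown(wait=False)
```

The sketch only changes what the user sees; it does nothing to make the underlying query faster.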

Event Timeline

Hi, please use the "Issues" link on top of to report issues as Petscan does not track its issues in Wikimedia Phabricator. Thanks.

So far as I know, the fault is not with Petscan, but with the WMF infrastructure on which it operates. Please do not be so hasty to dismiss problems like this.

Ideally Cloud-Services will liaise with @Magnus either to bottom out & solve the issue; or else to provide an explanation for the 502 issue which we can use to console ourselves each time it happens.

zhuyifei1999 added a subscriber: zhuyifei1999.

As far as I know, if loads, there is nothing going wrong with the WMCS networking & routing, which would normally cause 502s.

If nothing is wrong with networking / routing, a 502 means the application itself is taking way too long to respond, and you should get the tool maintainer to debug their application and see what it is spending the time on. You can reopen this task if the time is all wasted on WMCS infrastructure.

How about we work out what's going on BEFORE we peremptorily close this issue. "As far as I know" does not cut it.

OK, so what appears to happen is that SQL queries time out and take PetScan with them. Note:

  • I wrote some code that re-arranges certain large queries into smaller ones, which cuts down on the timeouts; that code has been live for weeks
  • That works fine on the dev machine but not reliably on the production machine
  • The dev machine has fewer resources than production, but is otherwise identical (OS etc)

As this does not fail reproducibly, it's either some odd bug in my code, or some situation on the DB replicas.
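For readers unfamiliar with the approach in the first bullet, the general idea of re-arranging one large query into smaller ones can be sketched as below. This is not PetScan's code: `chunked`, `run_in_batches` and the `run_query` callback are hypothetical names, and the batch size is something that would have to be tuned against the replicas.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(
    items: Sequence[T],
    size: int,
    run_query: Callable[[Sequence[T]], List[R]],
) -> List[R]:
    """Call `run_query` once per batch and concatenate the results in order.

    `run_query` stands in for whatever issues the SQL for one batch, for
    example a SELECT restricted to the page IDs in that batch.
    """
    results: List[R] = []
    for batch in chunked(items, size):
        results.extend(run_query(batch))
    return results
```

Each individual query then has less work to do before any per-query time limit applies, at the cost of more round trips.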

Also, I just clicked on the two examples. They took 133 and 173 seconds, and returned 50 and 1 results, respectively. No 502s, though I have seen those occasionally.

Thanks Magnus. I run those two daily and get perhaps 20% 502s, without an obvious pattern. There are others (e.g. listed on ) which more reliably 502.

I think it's for you to decide whether to hold this ticket open or close it.

Testing on dev now. Looks like frequent "Lost connection to MySQL server during query" errors from the DB replicas.
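One common way to cope with intermittent dropped connections is to retry the query with a growing delay. The sketch below shows that generic technique under stated assumptions; it is not a description of what PetScan does. `LostConnection` is a hypothetical stand-in for whatever exception the database driver raises for this error, and `retry` is an illustrative helper.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

R = TypeVar("R")


class LostConnection(Exception):
    """Hypothetical stand-in for the driver's 'Lost connection' error."""


def retry(
    run: Callable[[], R],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (LostConnection,),
    sleep: Callable[[float], None] = time.sleep,
) -> R:
    """Call `run` up to `attempts` times, doubling the wait after each failure.

    Only exceptions listed in `retry_on` trigger a retry; the last failure
    is re-raised so the caller still sees the original error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return run()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Retrying helps only if the failures are transient; a query that always exceeds the replica's limits will simply fail more slowly.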

Sorry to burden you with this. As a user 502s are v.frustrating, but I can also imagine it's v.frustrating for you to be asked to debug when you could be making snowmen instead.

Mixed results. The following three worked like a charm:

*Articles with no wikidata item
:*to 5 levels of depth - [ auto-run]
:*to 6 levels of depth - [ auto-run]
:*to 7 levels of depth - [ auto-run]

But the next three all 502d

*Articles, with wikidata items specifying 'human' but with no gender
:*to 5 levels depth - [ auto-run]
:*to 6 levels depth - [ auto-run]
:*to 7 levels depth - [ auto-run]

Restricted Application added a subscriber: alaa. · Aug 6 2019, 8:15 PM

There are some PetScan outages, reason unknown so far, but outside those, the above queries all work fine.

I did/do not see any successful Petscan query for days. Always getting a: 504 Gateway Time-out.

Aklapper closed this task as Invalid. · Apr 29 2020, 12:07 PM

This seems to be tracked in , as Magnus uses to track issues.
Hence closing as invalid here as this task is tagged as VPS-Projects.

If this issue has been investigated and if this issue is/was a problem with the Cloud-VPS infrastructure itself, please reopen and tag it as Cloud-VPS.

@Aklapper actually the repo is (the bitbucket one is the old C++ version).

Aklapper renamed this task from 502 Bad Gateway issue on Petscan to 502 /504 Bad Gateway issue on Petscan. · May 1 2020, 9:17 AM

For smaller queries, I think I don't know about it in detail.

It's my dev server, should work for any query size. Feel free to use when the main site is down.