
wm-bot logs broken
Closed, Resolved, Public

Description

Trying to access the wm-bot IRC logs results in the following message:

Could not connect:

(That’s the entire response content. See P7751 for the full response including headers.)

Event Timeline

This is caused by some WMF Labs-related issue. It seems that the huggle-pg instance was stuck for a very long time; it is now unstuck, but there are still issues getting wm-bot to connect to it.

Petrb renamed this task from wm-bot log browser broken “could not connect” to wm-bot logs broken. Nov 2 2018, 11:36 AM
Petrb reopened this task as Open.
Petrb triaged this task as High priority.

This is caused by some weird issue when connecting to the PostgreSQL server: connecting with the psql client works just fine, but wm-bot's Npgsql library fails with:

Description: No route to host
Stack trace:
  at Npgsql.NpgsqlClosedState.Open (Npgsql.NpgsqlConnector context) [0x00000] in <filename unknown>:0
  at Npgsql.NpgsqlConnector.Open () [0x00000] in <filename unknown>:0
  at Npgsql.NpgsqlConnectorPool.GetPooledConnector (Npgsql.NpgsqlConnection Connection) [0x00000] in <filename unknown>:0

It would help if someone from the WMF Labs staff explained what exactly happened to the huggle-pg instance. I believe its IP changed, and I also believe that some of the software on the instance changed. Was it possibly updated by someone?
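One way to narrow down a "No route to host" error like the one above is to check, from the client machine, which addresses the database hostname resolves to and whether each one accepts a TCP connection. The following is only a minimal sketch: the function name `probe` is hypothetical, port 5432 is the PostgreSQL default and is assumed here rather than confirmed by the thread, and the check says nothing about how Npgsql itself resolves or caches addresses.

```python
import socket


def probe(host, port=5432, timeout=3.0):
    """Try a TCP connection to every address that `host` resolves to.

    Returns a list of (address, error) tuples; error is None on success.
    Raises socket.gaierror if the name cannot be resolved at all.
    """
    results = []
    # getaddrinfo yields one entry per resolved address (IPv4 and/or IPv6).
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in infos:
        address = sockaddr[0]
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            results.append((address, None))
        except OSError as exc:
            # Covers "No route to host", "Connection refused", timeouts, etc.
            results.append((address, str(exc)))
    return results


# Example call (hostname taken from the thread, port assumed):
#   probe("huggle-pg.huggle.eqiad.wmflabs")
```

If the probe succeeds for the address that psql uses but the application still fails, the application may be connecting to a different (for example cached or hard-coded) address.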

huggle-pg got moved from the main/eqiad region to eqiad1-r. Its IP is now 172.16.2.31; previously it would have been in the 10.0.0.0/8 range. These moves were announced, including in emails from @Andrew to the huggle project administrators dated 19 and 31 October. I imagine there is a security group rule somewhere that needs updating. Why is an instance in the wm-bot project relying on an instance in the huggle project?
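Since the move took the instance out of the 10.0.0.0/8 range, any address still configured on the client side can be checked against that old range. This is a small sketch using Python's standard `ipaddress` module; the helper name `in_old_range` is hypothetical, and the 10.x address in the usage comment is a made-up example, not the instance's real former IP.

```python
import ipaddress

# The range the instance would have been in before the move.
OLD_RANGE = ipaddress.ip_network("10.0.0.0/8")


def in_old_range(address):
    """Return True if `address` lies inside the pre-move 10.0.0.0/8 range."""
    return ipaddress.ip_address(address) in OLD_RANGE


# in_old_range("172.16.2.31") is False (the new address from the thread);
# in_old_range("10.1.2.3") is True (made-up example of a stale address).
```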

Andrew wrote:

There were some irregularities with the huggle-pg.huggle.eqiad.wmflabs
instance -- in particular, I had to substitute in a slightly-different
COW base image. So I'm keeping the old VM around for a while as a
backup -- you'll see it as SHUTDOWN in horizon in the 'eqiad' region but
if all goes well you can just ignore it.

(And, in any case, this is a Trusty VM so hopefully you're already on
track to destroying and rebuilding it with Stretch. Let me know if you
need a link to the task and explanation of all that.)

No, there isn't any problem with a security rule. As I said, I can connect to it from the wm-bot2 instance using psql (PostgreSQL's CLI) just fine. It's wm-bot's Npgsql library that isn't able to connect there, probably because of some kind of bug in the library itself.

The reason we use huggle's PostgreSQL server is historical: it was to save resources on Labs. This database already existed and was large enough to host wm-bot's logs, so I decided to use it instead of spinning up a whole new instance within the wm-bot project. I would actually prefer to use some shared PostgreSQL service, but there is none, so running our own PostgreSQL servers is the only solution for now.

I am wondering what the change of COW base image actually means for us.

If Labs has the resources for this, I could probably create a new PostgreSQL server on top of a supported OS, but it would need at least 20 GB of storage; right now wm-bot's IRC logs in SQL take up over 12 GB.

> No, there isn't any problem with a security rule. As I said, I can connect to it from the wm-bot2 instance using psql (PostgreSQL's CLI) just fine. It's wm-bot's Npgsql library that isn't able to connect there, probably because of some kind of bug in the library itself.

Huh, true, that is weird. I'd have a go with it but I don't have access to the wm-bot project.

> The reason we use huggle's PostgreSQL server is historical: it was to save resources on Labs. This database already existed and was large enough to host wm-bot's logs, so I decided to use it instead of spinning up a whole new instance within the wm-bot project. I would actually prefer to use some shared PostgreSQL service, but there is none, so running our own PostgreSQL servers is the only solution for now.

Okay. What is wm-bot-pg.wm-bot.eqiad.wmflabs?

> Okay. What is wm-bot-pg.wm-bot.eqiad.wmflabs?

Good question.