
Move wm-bot instance to Trusty
Closed, Resolved, Public

Description

With regard to T143349 (Deprecate precise instances in Labs by 2017-03-31), wm-bot's current Precise instance will be removed by the end of March. We need to move wm-bot's instance to Trusty.

Event Timeline

OK, can we maybe create a new project for wm-bot so that we can finally nuke the "bots" project? I know that some of you folks will again suggest moving it to Tool Labs, but that environment is just not flexible enough. It would mean a massive refactoring of the bot and also the removal of many useful features.

So if possible, please create a new project for wm-bot and we can create a new server there.

@Andrew is it possible to request some general purpose storage? We used to have /data/project for this in the past, but this new project doesn't have this mount point and instances are created incredibly small. We need somewhere to store the channel logs and the Postgres data.

You can enable role::labs::lvm::srv on your instance and force a puppet run via sudo -i puppet agent --test --verbose. This will create a partition that fills the remainder of your instance's disk quota and mount it at /srv on the instance.

If that's not enough storage, we can help you attach to NFS as well, which is what the old /data/project mounts were. Getting local instance disk beyond the default VM quotas requires @Andrew to create custom image sizes for you. That is possible, but we would need you to tell us how much storage you need, and we would need to make sure we have a labvirt host that is capable of providing that much space locally.

OK, I think that should be enough for now. In the end we probably don't need more than 20 GB for these logs; I am just not sure about Postgres.
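A quick way to confirm that the /srv partition created by the role really offers the roughly 20 GB estimated above is a small free-space check. This is only a sketch: the function name `has_enough_space` is hypothetical, and the path and threshold are taken from the comments above rather than from any confirmed configuration.

```python
import shutil

GIB = 1024 ** 3  # bytes in one GiB


def has_enough_space(path, required_gib, usage=shutil.disk_usage):
    """Return True if the filesystem holding `path` has at least
    `required_gib` GiB free.

    `usage` is injectable so the check can be exercised without a real
    mount; by default it is shutil.disk_usage, which raises an error if
    `path` does not exist.
    """
    free_bytes = usage(path).free
    return free_bytes >= required_gib * GIB


# Example call on the instance (assumes /srv is already mounted):
#   has_enough_space("/srv", 20)
```

This says nothing about how large the Postgres data will grow; that still has to be estimated separately.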

The real problem right now is that I can't get the web proxy to work. I created wm-bot2.wmflabs.org in Horizon, but it doesn't even resolve.

[…]
The real problem right now is that I can't get the web proxy to work. I created wm-bot2.wmflabs.org in Horizon, but it doesn't even resolve.

Negative DNS caching? Works for me:

[tim@passepartout ~]$ host wm-bot2.wmflabs.org
wm-bot2.wmflabs.org has address 208.80.155.156
[tim@passepartout ~]$
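If negative DNS caching is the cause, the name should start resolving once the cached "not found" answer expires, so a simple retry loop is enough to see when that happens. The sketch below is illustrative only: `wait_until_resolves` is a hypothetical helper, and the attempt count and delay are arbitrary assumptions, not values taken from the Labs DNS setup.

```python
import socket
import time


def wait_until_resolves(hostname, attempts=5, delay=60.0,
                        resolver=socket.gethostbyname, sleep=time.sleep):
    """Try to resolve `hostname` up to `attempts` times.

    Returns the IPv4 address as a string on success, or None if every
    attempt fails. `resolver` and `sleep` are injectable for testing.
    """
    for attempt in range(attempts):
        try:
            return resolver(hostname)
        except socket.gaierror:
            # Lookup failed; wait before retrying unless this was the last try.
            if attempt < attempts - 1:
                sleep(delay)
    return None


# Example call:
#   wait_until_resolves("wm-bot2.wmflabs.org")
```

Retrying does not flush any cache; it only waits the cached negative answer out.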

It started resolving, but it still doesn't work. Now I get 504 Gateway Time-out.

@Petrb hi, what port are you using with the web proxy and have you enabled the port through the firewall?

Also, why not move to Debian Jessie?

@Paladox I used the image recommended by @Andrew. You were right, it was the firewall blocking it.
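A 504 from the proxy combined with a blocked port can be checked with a plain TCP connection attempt against the backend. This is a minimal sketch, assuming it is run from a host whose traffic passes through the same firewall rules as the proxy; `port_is_open` is a hypothetical helper and the host and port in the example call are placeholders, since the thread does not state which port wm-bot's web service uses.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within
    `timeout` seconds, False otherwise.

    A firewall that silently drops packets typically shows up as a
    timeout, while a closed port is refused immediately; both are
    reported as False here.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call with placeholder values:
#   port_is_open("wm-bot-instance.example", 80)
```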

Ok thanks. Did enabling it in the firewall work?

Yes, the bot is now up and running on Jessie. I need to test whether the web hooks work, which is now the most important thing; then we need to migrate the logs and the SQL DB.

Please don't terminate the old wm-bot instance yet, there may be something I forgot to copy from it.

There is one problem: I need to run identd, which requires a public IP address. Can we get one for the wm-bot project? The previous bot also had one.

This comment was removed by Andrew.

Granted, as per T158520. Let me know if you have trouble assigning it; the Horizon interface should be straightforward.

Is this the reason for "The requested URL /browser/index.php was not found on this server." at http://wm-bot.wmflabs.org/browser/index.php?display=%23wikimedia-collaboration , or should I file a separate task?

I think this is expected but in case it's not, no new content for #wikimedia-mobile has been logged for 10+ days: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-mobile/

Hello, yes, this is related to both issues. wm-bot is still logging all channels, the data are just stored somewhere else now. I need to move the web services and merge the log files; that should be done soonish.

Hi, the browser is back. Also, please don't use the old "bots.wmflabs.org" domain; it was deprecated some time ago. The correct one is wm-bot.wmflabs.org.

I can't guarantee there will be no more outages: 2 more instances need to be reinstalled.

Thanks, I've asked for someone with channel permissions to update the URL to: https://wm-bot.wmflabs.org/logs/%23wikimedia-mobile/

Are there still pending tasks here, or is this resolved?

Hello,

wm-bot depends on the huggle-pg instance, which has not been migrated yet. It's not so easy: it's a Postgres database, and it's pretty huge. Hopefully I will be able to resolve this this week, but I am extremely busy these days.

I mean, if huggle-pg is down, wm-bot itself will still be operational, but we lose access to the SQL-based IRC logs. We probably don't want that; the database is kinda useful and valuable. The problem is that migrating to another server is complicated: it's a live SQL database to which new data are added every second, so any outage, even a second long, is a problem here.

We can't reinstall it or migrate it completely online; Postgres isn't advanced enough to do that. Still, I would like to have as small a gap in the IRC logs as possible, so whichever approach we take should be the one that takes the least time.

I would almost prefer doing a simple dist-upgrade. The problem is that I already tried this on the huggle instance: the ssh session timed out in progress, and now I can't log back in to the instance because LDAP somehow broke on it. The instance is working and the service it provides is up and running, but I can't ssh back in to finish the upgrade. I would rather avoid this problem with the Postgres instance.

@Petrb what about migrating the Postgres DB to a MySQL one? Will that work? MySQL is advanced enough to support online migration, so you won't need to take the DB down.

I am not aware of any such feature and highly doubt it.

We got rid of MySQL, which was originally used, because it lacks many features that enterprise-grade RDBMSs like PostgreSQL or Oracle provide. Moving back to MySQL would be a massive step back.

Also, I don't see how you could migrate from Postgres to MySQL online. That isn't possible either, so this definitely would not work.

Probably the easiest way to do this would be to switch wm-bot to a newly created DB on wm-bot-pg and then start migrating the existing data. I will have a look into this later, but I am too busy now. I am not sure whether this would break the sequences, though.
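On the sequence concern: if the new database starts its ID sequence at 1 and the old rows are imported later with their original IDs, the two ranges can collide. One possible way around this, assuming the log tables use sequence-backed IDs (the thread does not confirm the schema), is to start the new sequence well above the old maximum before wm-bot begins writing. The helper below only builds the SQL statement; `sequence_bump_sql`, the sequence name, and the headroom value are hypothetical, while `setval` is the standard PostgreSQL function.

```python
import re


def sequence_bump_sql(sequence_name, old_max_id, headroom=1_000_000):
    """Build a PostgreSQL statement that makes the next nextval() on
    `sequence_name` return old_max_id + headroom + 1.

    The headroom leaves room for rows the old database keeps receiving
    until the switch. Passing `false` as the third setval argument means
    the given value itself is the next one handed out.
    """
    # Accept only plain (optionally schema-qualified) identifiers.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_.]*", sequence_name):
        raise ValueError(f"unexpected sequence name: {sequence_name!r}")
    start = old_max_id + headroom + 1
    return f"SELECT setval('{sequence_name}', {start}, false);"


# Example with a hypothetical sequence name:
#   sequence_bump_sql("logs_id_seq", 123456789)
```

The old rows can then be copied over with their original IDs while new rows land in the higher range.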

The easiest and safest solution would probably be using something like Bucardo (live migration tutorial).

Continuous archiving or hot standby would probably be the best solution if you want or need a solution in PostgreSQL itself.

MariaDB is just as bad as MySQL.

Anyway, I've decided to take the live update approach; it's the easiest option at the moment.

I just nuked the wm-bot instance in the bots project. There is one more instance, "botbot", that I will look into; maybe there is something on it I would like to archive for the future. After that we can probably nuke the whole "bots" project, which is, by the way, probably the second oldest project of Wikimedia Labs :)

By the way, botbot runs Ubuntu 14.04 (Trusty), so it doesn't block anything.