Enable maps beta service
Closed, Declined · Public

Description

In order to test new data schemas, and otherwise have a proper pre-production test environment, I propose we use some of the maps-test servers to set up a beta service.

It will be publicly accessible, e.g. via maps.wmflabs.org, and the WMF beta cluster will use it (either in addition to, or instead of, the production maps).

While kartotherian could in theory run in a VM, the Postgres and Cassandra DB must reside on the real hardware.

Event Timeline

Labs has an OSM DB however it was imported without hstore, if I remember correctly. Also, Cassandra can probably run on a VM cause up to 160G / partitions are supported while a single fully rendered layer is <70G.

Correct, but what I would like is an environment that has a full, self-updating DB just like production, plus we can easily run some experiments on it. I don't think virtualized storage will work well, either postgres or cassandra - seems that VM kills all the IO

Do we need the full data set for experimenting? We could probably import only a small area (osm2pgsql --bbox). That would obviously limit the kinds of experiments we can do.

This is an honest question, I'm not trying to push us to use VMs. I just want to make sure this decision is explicit.
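To make the small-area import concrete, here is a minimal sketch of how such a command could be assembled. The helper name `osm2pgsql_bbox_command`, the database name `gis`, the file name and the bounding box are all illustrative assumptions, not anything used on the maps servers. The flags shown (`--create`, `--slim`, `--database`, `--bbox`, `--hstore`) are standard osm2pgsql options, but the exact set needed for a production-like import would have to be checked against the real setup.

```python
import shlex


def osm2pgsql_bbox_command(pbf_path, database, bbox, hstore=True):
    """Build an osm2pgsql argument list that imports only a bounding box.

    bbox is (min_lon, min_lat, max_lon, max_lat) in degrees, which is the
    order osm2pgsql expects for --bbox. This helper is hypothetical and
    only assembles the command; it does not run it.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    if not (-180 <= min_lon < max_lon <= 180 and -90 <= min_lat < max_lat <= 90):
        raise ValueError(f"invalid bounding box: {bbox!r}")

    cmd = [
        "osm2pgsql",
        "--create",            # fresh import rather than appending a diff
        "--slim",              # keep the intermediate tables needed for updates
        "--database", database,
        "--bbox", f"{min_lon},{min_lat},{max_lon},{max_lat}",
    ]
    if hstore:
        cmd.append("--hstore")  # store tags in an hstore column
    cmd.append(pbf_path)
    return cmd


if __name__ == "__main__":
    # Example values only: a box roughly covering Switzerland.
    cmd = osm2pgsql_bbox_command("extract.osm.pbf", "gis", (5.9, 45.8, 10.5, 47.9))
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` on a host where osm2pgsql and the target database are available. How small the box can be while still being useful for experiments is exactly the open question above.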

Labs has an OSM DB however it was imported without hstore, if I remember correctly.

The OSM DB in labs has been imported WITH hstore.

Also, Cassandra can probably run on a VM cause up to 160G / partitions are supported while a single fully rendered layer is <70G.

Indeed.

Correct, but what I would like is an environment that has a full, self-updating DB just like production,

We have that in labs, as already pointed out above.

plus we can easily run some experiments on it.

That might or might not be so easy. With some more details it may be possible to flesh it out better.

I don't think virtualized storage will work well, either postgres or cassandra - seems that VM kills all the IO

What makes you think that?

@akosiaris I think we are talking about two totally different targets here. The OSM DB in wmflabs is set up for wmflabs users to host various community maps projects. This is a wonderful goal, but different from our needs. Moreover, we really should treat that service as "production" because users rely on it, which limits what we can experiment with.
The Interactive team needs a simplified clone of production (2 backend servers are enough), where we can experiment with different configurations, DB structures, alternative tile storage strategies, performance measurement, and other tasks. About a year ago the Discovery department was forced to rent a server outside of WMF, at a significant cost, just so we could move forward on maps. I hope we can set up an experimentation platform on real hardware within WMF infrastructure.

After talking on IRC with @Gehel, we've come to the conclusion that baremetal labs is quite probably the best way forward for this. This should satisfy both your requirement for physical hosts and the requirement that testing/development infrastructure not live in the production space/realm.

@Yurik we probably don't all have the same definition of production here. This future labs-beta service will have some expectation of availability (obviously), but probably not at the same level as other production services. It will most probably have a much lower expectation in terms of stability (changes to the service contract), as being able to experiment is the main goal. And given those lower expectations, we can probably give you more access to those servers ("more" needs to be defined).

@yuvipanda, @chasemp: labs is your area of expertise, so your feedback is welcome. It seems to me that this is a situation similar to that of the Cirrus relevance forge servers (T131184), except that we now have a proper place to put baremetal servers in labs.

As we have old maps-test servers in codfw, with the appropriate SSD already installed, it might make sense to keep those servers. I have no idea if we are ready to host labs servers in codfw.

I don't know that I understand what is meant by baremetal in labs here. We don't do what is being discussed here I think. Here is an outline I made to help with these conversations.

The closest approximation we have would be (to quote the description)

... kartotherian could in theory run in a VM, the Postgres and Cassandra DB must reside on the real hardware.

That is, with the DBs on physical hardware in labs-support, managed as production hosts.

We have historically made sure that labs-support allocations are beneficial for labs in general, for example ensuring the elasticsearch cluster there is meant to be queried by all of labs. In this case if we are talking deployment-prep I imagine it could make sense though.

Honestly, I don't know, but I am wondering about the assertion that the Postgres and Cassandra DB must reside on the real hardware. Where does the reasoning stem from? How do we know?


As we have old maps-test servers in codfw, with the appropriate SSD already installed, it might make sense to keep those servers. I have no idea if we are ready to host labs servers in codfw.

This would be fairly novel and probably not ideal. We do not have a labs presence in codfw other than some test setup and offsite backups.

We can have a beta cluster with only a subset of the production data (for example, a single country, or a few square kilometres of land). Having a small dataset should allow us to test most of what needs testing and keep the performance requirements aligned with what is possible on labs.

Additional requirements we should put on this test environment:

  • we can break it whenever we want
  • anyone in the team can experiment on all aspects of the project
  • it is based as much as possible on the same code as production, for application code and for automation code (puppet)

Honestly, I don't know, but I am wondering about the assertion that the Postgres and Cassandra DB must reside on the real hardware. Where does the reasoning stem from? How do we know?

PostgreSQL works better on bare metal, both in terms of performance and because there is no VM layer to screw up data consistency. It also tends to be easy to keep it on bare metal, as it's well-established software and the only access needed is through port 5432.

One of the constraints we have here is that our labs environment is purely VM-based at this point. As we don't really need performance for a test environment, and data consistency is not a critical requirement, we *should* be fine on VMs. We should at least create this environment and see if it meets our needs. If it does not, we'll find another way...

Given current status of the maps project.