
Report on the initial growthbook installation PoC
Closed, Resolved (Public)

Description

Now that we have a Growthbook instance running on k8s, we should make sure that we track what we've learned during the experiment. In particular:

  • Need to introduce yet another storage system (MongoDB)
  • Further need to package Growthbook in a standard way before going to production
  • Monitoring / alerting of the various components
  • Further refinements to the MongoDB deployment, backups, redundancy, ...
  • Definition of SLOs
  • ...

Event Timeline

We have installed Growthbook in the dse-k8s-eqiad Kubernetes cluster. As we're only experimenting with it at the moment, we aimed for speed of execution rather than a reliable deployment. As such, we have cut some corners and neglected some tasks that we usually address when deploying a production-ready system.

We have also learned enough about Growthbook itself to be able to discern potential sources of friction for the future if we run it in production.

MongoDB

First off, Growthbook requires a mongoDB instance to run. This is an issue for us for several reasons:

  • we lack the SRE experience to reliably manage a mongoDB cluster. We'd need to ramp up on a new datastore, figure out how to manage backups, etc, from scratch.
  • right now, we have deployed a single mongo server. While its data is persisted in Ceph, we're not running any kind of HA cluster. Deploying such a cluster in Kubernetes (should we want the provided reliability guarantees) will take some design and getting it wrong before getting it right.
  • we could use the community operator to deploy a cluster to Kubernetes; however, backups are part of the Enterprise release
  • the SSPL licensing might also be a source of contention and disagreement within the Foundation, as were Growthbook and Statsig themselves. To circumvent this issue, we currently ship mongoDB, installed from a tarball, in the growthbook Docker image. This way, we don't have to "taint" our apt repo with non-OSI software.
  • as a more general point, it forces us to introduce yet another storage system. We tried to circumvent this issue by relying on FerretDB, which exposes a mongoDB <---> PostgreSQL translation layer. However, we ran into issues that we couldn't explain, forcing us to deploy mongoDB itself. We're not keen on maintaining mongoDB as a "snowflake" system if we can avoid it. The reliance on mongo itself is however baked into the Growthbook source.

Monitoring and alerting

  • We currently have no metrics, dashboards, or alerting in place.
  • A cursory search didn't turn up any Prometheus metrics exposed by Growthbook, so any monitoring might have to be done in a very synthetic fashion (is the pod running? this sort of thing)
  • We'd need to define an availability SLO (after having figured out the level of service we'd want to guarantee)
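To make the synthetic-monitoring and SLO points above more concrete, here is a minimal sketch of what a black-box probe and an availability computation could look like. Everything in it is an assumption for illustration: the names `ProbeResult`, `probe`, `availability` and `error_budget_remaining` are hypothetical, the probe simply treats any 2xx HTTP answer as "up", and no Growthbook-specific health endpoint is assumed to exist.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    """Outcome of a single synthetic check (hypothetical type)."""
    ok: bool


def probe(url: str, timeout: float = 5.0) -> ProbeResult:
    """Return ok=True if the URL answers with a 2xx status within the timeout.

    The URL to probe is up to the caller; which Growthbook URL is a
    meaningful health signal is an open question, not something this
    sketch answers.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return ProbeResult(ok=200 <= response.status < 300)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or HTTP error status.
        return ProbeResult(ok=False)


def availability(results: list[ProbeResult]) -> float:
    """Fraction of probes that succeeded over the observation window."""
    if not results:
        raise ValueError("at least one probe result is required")
    return sum(1 for r in results if r.ok) / len(results)


def error_budget_remaining(results: list[ProbeResult], slo_target: float) -> float:
    """Fraction of the error budget still unspent.

    1.0 means no failed probe, 0.0 means the budget is exactly consumed,
    and a negative value means the SLO is violated for this window.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failures = (1.0 - slo_target) * len(results)
    failures = sum(1 for r in results if not r.ok)
    return 1.0 - failures / allowed_failures
```

With a 99% target, for example, 5 failed probes out of 1000 would consume about half of the window's error budget. The actual target would still have to come out of the SLO discussion mentioned above.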

Misc

  • As an aside, Growthbook itself was not particularly difficult to deploy. We don't expect the chart maintenance itself to be overwhelming. Right now, we have embedded the mongoDB Deployment within the growthbook-backend subchart, which we might want to untangle at some point.
  • We don't currently have any service documentation page, nor do we have cookbooks or documented operations for Growthbook or mongoDB.

Note

We could reach out to the FerretDB development team, in the hope of understanding and fixing the issue we faced during our PoC, if we feel that using PostgreSQL as a backend might be a sounder investment. We could also modify Growthbook itself to support PG as a backend. However, because the two datastores differ in nature (document vs relational), the Growthbook design itself might be influenced by the document-based properties of mongo. We could possibly rely on jsonb fields to store documents in PostgreSQL, together with JSONB indexing, to re-implement Growthbook-over-PG.
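As a rough illustration of the jsonb idea, the sketch below turns a very restricted mongo-style equality filter into a PostgreSQL query that uses the jsonb containment operator (`@>`), which a GIN index can serve. It is not how Growthbook or FerretDB work: the table layout (one `doc` jsonb column), the function names and the `%s` placeholder style (as used by psycopg) are all assumptions, and only top-level scalar equality is handled, since mongo and jsonb semantics diverge for arrays, nested documents and null values.

```python
import json
import re

# Hypothetical layout: each collection is a table with a single jsonb column "doc".
_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")
_SCALARS = (str, int, float, bool)


def _check_identifier(name: str) -> None:
    """Reject anything that is not a plain lowercase SQL identifier."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe table name: {name!r}")


def gin_index_ddl(table: str) -> str:
    """DDL for a GIN index that can serve jsonb containment (@>) lookups."""
    _check_identifier(table)
    return f"CREATE INDEX {table}_doc_gin ON {table} USING GIN (doc jsonb_path_ops)"


def equality_filter_to_sql(table: str, mongo_filter: dict) -> tuple[str, tuple[str]]:
    """Translate {"field": scalar, ...} into a parametrised containment query.

    Query operators ($gt, $in, ...), dotted paths, nested documents, arrays
    and null values are rejected rather than translated incorrectly.
    """
    _check_identifier(table)
    for key, value in mongo_filter.items():
        if key.startswith("$") or "." in key:
            raise ValueError(f"unsupported key: {key!r}")
        if not isinstance(value, _SCALARS):
            raise ValueError(f"unsupported value for {key!r}: {value!r}")
    # The filter is passed as a bound parameter, never interpolated into the SQL.
    payload = json.dumps(mongo_filter, sort_keys=True)
    return f"SELECT doc FROM {table} WHERE doc @> %s::jsonb", (payload,)
```

Even this narrow case shows where the real cost would sit: every query operator Growthbook relies on would need an equivalent translation, which is essentially the problem FerretDB already tries to solve.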

A quick search of the growthbook issue tracker shows no user interest in supporting another database.

brouberol claimed this task.