Development and test environments for AQS 2.0 Cassandra services
Closed, Resolved (Public)

Description

Ideally we'd have hosted development and staging environments running Cassandra with real (or mock) data, and/or be able to easily spin them up (for example, during CI). However, at a minimum, some tooling (Docker Compose?) that can be used to locally spin up an environment, and load it with sample data, may be necessary to facilitate the AQS 2.0 project.


See also:
https://gerrit.wikimedia.org/r/679295 (a work-in-progress Docker Compose environment for AQS)
https://gitlab.wikimedia.org/eevans/aqs (an alternative work-in-progress Docker Compose environment for AQS)

Details

Title | Reference | Author | Source Branch | Dest Branch
minor build fixes | clarakosi/aqs!1 | clarakosi | builds | main
minor build fixes | repos/generated-data-platform/aqs/pageviews!2 | eevans | builds | main

Event Timeline

Eevans triaged this task as Medium priority. (Aug 4 2021, 7:20 PM)
Eevans created this task.

There is some basic work, done for another task, on a docker-compose setup for AQS (in its current state) + Cassandra + test data here: https://gerrit.wikimedia.org/r/c/analytics/aqs/+/679295

Thanks!

I took a look at this with an eye toward extending it for our use, but I think I concluded against it. What's in r679295 starts an instance of Cassandra and a container to run the legacy AQS service. This old version of RESTBase initializes the schema, and I assume the test data you're referring to is what the tests create.

What we need here is Cassandra (+ schema + some test data) and Druid (+ some test data), and (TTBMK) we will not need the legacy service running for anything. That leaves an intersection of mostly "Cassandra".
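Whatever tooling ends up starting Cassandra, the schema and test data can only be loaded once the node is accepting connections. A minimal sketch of that gate, assuming Cassandra's default native-protocol port (9042); the function name and timeouts are made up for illustration, and an open port is only a first check, not proof that the node is ready to serve CQL:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: wait a little and try again.
            time.sleep(interval)
    return False
```

A setup script could call something like `wait_for_port("localhost", 9042)` before creating the schema and loading data.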

Additionally, it probably does not make sense for AQS 2.0 to reuse the existing repo since it isn't likely they'll share any usable history (especially if we implement using Golang, as has been discussed). The legacy repository would benefit from having r679295 in the interim though.

I stubbed out https://gitlab.wikimedia.org/eevans/aqs as a jumping-off point (based on @hnowlan's Gerrit change).

Thoughts?

As an experiment: I used a query (à la cqlsh) on the production Cassandra cluster, plus some shell hackery, to create a COPY-compatible CSV file of test data for import (see d6399bd). This gives us real data, which we could use (for example) to compare API results with those from production, and it's fairly straightforward to obtain.
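For illustration only, here is roughly what that kind of conversion might look like in Python rather than shell. This is not the code in d6399bd. It assumes the input is cqlsh's usual pipe-delimited SELECT output (a header row, a dashed rule, data rows, and a "(N rows)" footer), and the function name and the column names in the usage note are hypothetical. It would also mis-split any value that itself contains a `|`.

```python
import csv
import io
import re


def cqlsh_table_to_csv(text: str) -> str:
    """Convert cqlsh's pipe-delimited SELECT output into CSV text."""
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and the dashed rule under the header.
        if not stripped or set(stripped) <= set("-+"):
            continue
        # Skip the "(N rows)" footer.
        if re.fullmatch(r"\(\d+ rows?\)", stripped):
            continue
        rows.append([cell.strip() for cell in line.split("|")])
    out = io.StringIO()
    # The first row written is the column header.
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()
```

Given a result with columns such as `project | article | views`, this yields a CSV whose first line is `project,article,views`. Because the header row is kept, the file would need to be loaded with cqlsh's `COPY ... FROM ... WITH HEADER = true`.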

This comment was removed by Eevans.
BPirkle renamed this task from Develepment and test environment for AQS to Development and test environment for AQS. (Jun 23 2022, 10:51 PM)
BPirkle renamed this task from Development and test environment for AQS to Development and test environments for AQS. (Sep 13 2022, 2:27 AM)
JArguello-WMF renamed this task from Development and test environments for AQS to [Needs updating according to new conventions] Development and test environments for AQS. (Apr 11 2023, 6:27 PM)
FJoseph-WMF renamed this task from [Needs updating according to new conventions] Development and test environments for AQS to Development and test environments for AQS 2.0 Cassandra services. (May 10 2023, 1:38 PM)
FJoseph-WMF updated the task description.