
[Commons Impact Metrics] Add test data in AQS's test environments to back up new AQS service
Open, Medium | Public | 8 Estimated Story Points

Description

Once the API design and Cassandra/Druid datasources are specified (T358679),
and the base pipeline is ready (T358699),
we can populate the corresponding AQS test environment with test data,
so that the new AQS service can run its integration tests.

Tasks:

  • Grab queries that extract the Commons Impact Metrics data from Iceberg and format them into the expected Cassandra/Druid format from T358707.
  • Add the queries and the generated data to the corresponding AQS test env repository.
  • Add the remaining necessary schema and script code to the repo.
  • Test that the data loads correctly when starting up and bootstrapping the test env.
  • Code-review and merge.
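
As a rough illustration of the first two tasks, the Python sketch below turns rows extracted from Iceberg into CSV text that a test env bootstrap script could load. The function name and the column names (`category`, `dt`, `media_file_count`) are hypothetical placeholders; the real columns are whatever the format from T358707 specifies.

```python
import csv
import io

# Hypothetical column order; the real one should follow the format from T358707.
COLUMNS = ["category", "dt", "media_file_count"]


def rows_to_csv(rows, columns=COLUMNS):
    """Serialize extracted rows (dicts) into CSV text with a header line.

    Rows are sorted by the listed columns so the generated test data is
    deterministic and diffs cleanly in code review.
    """
    ordered = sorted(rows, key=lambda r: tuple(str(r[c]) for c in columns))
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in ordered:
        # Keep only the expected columns; a missing one raises KeyError early.
        writer.writerow({c: row[c] for c in columns})
    return buf.getvalue()
```

Sorting before writing is a design choice rather than a requirement: it keeps regenerated test data stable, so a changed query shows up as a small diff in the test env repository.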

Definition of done:

  • The test env starts up and bootstraps correctly
  • The test env contains all necessary test data for the new service to run integration tests
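
One way to make the second criterion checkable is a small helper that reports which required tables are missing or empty after bootstrap. This is only a sketch: the function name is hypothetical, and how the per-table row counts are collected (for example, by querying the test env) is left out and assumed to happen elsewhere.

```python
def missing_test_data(row_counts, required_tables):
    """Return the required tables that are absent or empty, sorted by name.

    row_counts maps a table name to the number of rows found after bootstrap.
    """
    return sorted(t for t in required_tables if row_counts.get(t, 0) <= 0)
```

An empty result would mean every required table holds at least one row; it does not by itself show that the data covers every integration test case.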

Event Timeline

mforns set the point value for this task to 8.

Waiting on a decision: adopt the Cassandra Gateway, or keep the existing process of pulling data directly from the DB.

TODO for Cassandra Gateway

  • DP needs to implement our endpoints (4-5 lines of code per endpoint) (wait time currently: 1 month)
  • Significant refactoring of AQS required (14 endpoints, changes to ~2 layers: logic & data, plus unit tests)

Benefits: staging environment, shorter/cleaner code; the gateway already exists and should be used because it is a better way to retrieve data

Decision: Let's proceed and deliver as is, and plan for this work as future tech debt paydown in sync with Data Persistence's timeline.