AQS 2.0: Page Analytics: add testing data for legacy endpoint to Cassandra testing env
Closed, Resolved, Public

Description

Per T327931: AQS 2.0: Page Analytics: implement legacy data endpoint, we need to implement the "legacy/pagecounts" endpoint within the Page Analytics service.

This endpoint, in production, will hit the Legacy Pagecounts dataset. We therefore need representative data from that dataset extracted and available in our Cassandra testing environment to develop and run tests against.

As part of this, document the procedure used to extract the data and prepare it for the testing env.

Completion criteria:

  • necessary data is identified
  • necessary data is extracted from production
  • necessary data is added to the Cassandra testing env
  • all steps are documented on the AQS 2.0 site

Anticipated QA needs:

  • no existing tests that execute against the Cassandra testing env break (no regressions)
  • new data loads into the testing env (mostly just check for no errors, as the endpoint won't exist yet to actually try the data)

Event Timeline

Might be a good task for @FGoodwin since she's familiar with the magic here

@FGoodwin @BPirkle Just to confirm: is an endpoint like this one valid to test against for this task? https://wikimedia.org/api/rest_v1/metrics/legacy/pagecounts/aggregate/en.wikipedia.org/all-sites/monthly/2016050101/2016081501

Not yet.

The endpoint you mentioned is exposed by the production AQS 1.0 service, and is not related to AQS 2.0. We will be adding a corresponding endpoint to the AQS 2.0 Page Analytics service under T327931: AQS 2.0: Page Analytics: implement legacy data endpoint, but that has not occurred yet. When it does, you'll need to test it locally against the mock dataset (aka testing environment) using a url like http://localhost:8087/metrics/legacy/pagecounts/aggregate/en.wikipedia.org/all-sites/monthly/2016050101/2016081501
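
For illustration, the local URL above has the same path shape as the production one, without the /api/rest_v1 prefix. A minimal Python sketch of assembling it is below. The function name and the parameter names are hypothetical labels for the path segments in the example URL, not names taken from the service code, and the base address is simply the localhost one quoted above.

```python
from urllib.parse import quote

# Base URL of the locally running service, as quoted in the comment above.
LOCAL_BASE = "http://localhost:8087"


def legacy_pagecounts_url(project, access_site, granularity, start, end,
                          base=LOCAL_BASE):
    """Assemble a legacy/pagecounts aggregate URL.

    The parameter names are illustrative labels for the path segments seen
    in the example URL; they are assumptions, not taken from the service code.
    """
    segments = [project, access_site, granularity, start, end]
    # Percent-encode each segment so unexpected characters cannot alter the path.
    path = "/".join(quote(str(s), safe="") for s in segments)
    return f"{base}/metrics/legacy/pagecounts/aggregate/{path}"


url = legacy_pagecounts_url(
    "en.wikipedia.org", "all-sites", "monthly", "2016050101", "2016081501"
)
print(url)
```

Passing a different `base` would let the same helper point at another host once the AQS 2.0 endpoint exists somewhere other than localhost.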

But that isn't done yet either - for now, for this current task, T332172: AQS 2.0: Page Analytics: add testing data for legacy endpoint to Cassandra testing env, all we are doing is adding the data to the mock dataset (aka testing environment) so that the team can create the new "legacy" endpoint. The reason is that they need data to test against while they are creating the endpoint, and for the unit/integration tests to execute against.

That means there's not a lot for QA to do on this task. Really, all you can do is:

  1. pull down the changes from https://gitlab.wikimedia.org/frankie/aqs-docker-test-env then start up the env and make sure it loads without error, using the normal steps:
       a. make startup
       b. wait for the "Startup Complete" message
       c. switch to another command line tab/window
       d. make bootstrap
       e. wait for data to load and confirm you don't see any error messages
  2. run the normal local tests to make sure nothing was broken in the existing data when adding the new data, using the normal steps:
       a. pull the latest Page Analytics code via git. If you don't have a copy of the Page Analytics gerrit repo, the clone command is here: https://gerrit.wikimedia.org/r/admin/repos/generated-data-platform/aqs/page-analytics,general
       b. make (to build the service)
       c. make test (to run unit tests)
       d. go run . (to execute the service)
       e. switch to another command line tab/window
       f. make itest (to run integration tests)
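
The "confirm you don't see any error messages" check in the steps above can be partly automated. The following is only a sketch: the output format of the make targets is not documented in this task, so the keyword pattern is an assumption, and `find_error_lines` and `run_and_check` are hypothetical helper names.

```python
import re
import subprocess

# Assumption: lines containing these words indicate a problem. The real
# output format of the make targets is not documented in this task.
ERROR_PATTERN = re.compile(
    r"\b(errors?|exceptions?|failed|failure|fatal)\b", re.IGNORECASE
)


def find_error_lines(output):
    """Return the lines of `output` that match ERROR_PATTERN."""
    return [line for line in output.splitlines() if ERROR_PATTERN.search(line)]


def run_and_check(command):
    """Run `command` to completion and return (exit_code, suspicious_lines)."""
    result = subprocess.run(command, capture_output=True, text=True)
    # Scan stdout and stderr together, since errors may land on either stream.
    combined = result.stdout + "\n" + result.stderr
    return result.returncode, find_error_lines(combined)
```

For example, `run_and_check(["make", "bootstrap"])` could be called from the aqs-docker-test-env checkout, and `run_and_check(["make", "itest"])` from the Page Analytics checkout. A non-zero exit code or a non-empty list is a prompt to read the full output, not a verdict by itself, especially given the one known failing test. `make startup` and `go run .` appear to stay in the foreground, so they are better left in their own terminal as the steps describe.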

Really, all you're doing is confirming that adding the new data to the mock dataset didn't break anything that was already working.

Please note that one Page Analytics test was failing before, and (unless someone already fixed it and I didn't notice) it'll still fail after. That's not a reason for QA Fail on this task, because we expect it and will be fixing it under separate tasks. There just shouldn't be any new failures.

@BPirkle thanks for the detailed and concise explanation and breakdown!

Test status: QA PASS

Newly added data has no effect on existing tests. I was able to run the integration tests.