**Objective**: Analyze the quality of the data collected with the new Metrics Platform-based instrument (`mediawiki_web_ui_actions`) by comparing it to the data collected with the existing non-Metrics Platform-based instrument (`desktop_web_ui_actions`), and determine whether the migration to Metrics Platform can proceed or whether there are issues that need to be resolved first.
## Prerequisites for data QA
- [] Mapping of old instrumentation to new instrumentation: <link to document/spreadsheet>
  - This is primarily driven by the engineer performing the migration
  - The analyst reviews the mapping for completeness and correctness, and contributes/assists as needed
- [] Specific QA needs have been identified and agreed upon
  - [] Analyst decides, in collaboration with engineers/PM, whether to QA the whole instrument or only its key parts, with the rest assumed to be okay
  - [] Document the parts and the relevant queries:
    - (1) overall counts by action and sub-action (if applicable)
    - **if relevant** (2) counts by specific identifiers (e.g. by session)
    - **NOTE**: these queries will almost always be limited by time in some way
- [X] New instrument has been deployed and activated ([[ https://phabricator.wikimedia.org/T351298 | Link to Phab task ]])
- [X] Engineer has verified that events are flowing in and that the instrument is not producing schema validation errors.
  - [X] Verify that events are flowing in:
    - [X] EventGate Grafana dashboard
    - [X] Kafka by Topic Grafana dashboard
    - [X] EventStreams
  - [X] New instrument doesn't have any schema validation errors
- [] Analyst and PM have agreed on the priority of the QA work relative to other needs/requests (e.g. the PM may need to wait longer for some analysis so that the analyst can do the QA work)
- [] Documentation of old and new table names and date of deployment for analyst's reference:
| Instrument | Table name | Stream deployed (if applicable) |
|------------|------------|---------------------------------|
| Old | <name of table with data to compare to> | N/A |
| New | <name of table with data to compare> | <YYYY-MM-DD> |
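Because the comparison queries are limited by time, it can help to fix the comparison window up front from the deployment date recorded above. The following is a minimal sketch, not the agreed-upon method: the function names, the 14-day cap, the `dt` timestamp column, the `action` column, and the `OLD_TABLE` placeholder are all assumptions for illustration and need to be replaced with the names from the mapping and the table above.

```python
from datetime import date, timedelta


def comparison_window(deployed, today, max_days=14):
    """Return (start, end), inclusive, covering only complete days
    on which both instruments were collecting data."""
    start = deployed + timedelta(days=1)  # skip the partial deployment day
    end = today - timedelta(days=1)       # skip the partial current day
    if end < start:
        raise ValueError("no complete day of data since deployment yet")
    # Cap the window (assumed default: 14 days) to keep the queries cheap.
    end = min(end, start + timedelta(days=max_days - 1))
    return start, end


def count_query(table, start, end, ts_col="dt", action_col="action"):
    """Build a time-limited 'counts by action' query as a string.

    `ts_col` and `action_col` are hypothetical column names; the old and
    new tables may well name these fields differently.
    """
    end_exclusive = end + timedelta(days=1)  # half-open interval [start, end + 1 day)
    return (
        f"SELECT {action_col} AS action, COUNT(*) AS n_events\n"
        f"FROM {table}\n"
        f"WHERE {ts_col} >= '{start.isoformat()}'"
        f" AND {ts_col} < '{end_exclusive.isoformat()}'\n"
        f"GROUP BY {action_col}"
    )


# Example with a placeholder table name and made-up dates.
window_start, window_end = comparison_window(date(2024, 1, 10), date(2024, 2, 1))
print(count_query("OLD_TABLE", window_start, window_end))
```

Using the same window for both tables keeps the old and new counts directly comparable; the day of deployment is excluded because the new instrument only collected data for part of it.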
## Data QA checklist
If more than one instrument is being migrated, these steps need to be completed for each one.
- [] Count the daily number of schema validation errors for:
  - [] Old instrument
  - [] New instrument
- [] Compare counts of events by action and sub-action (as defined in the mapping from prerequisites)
- [] **if relevant** Compare counts by specific identifier (as defined in the prerequisites)
- [] Upload QA notebooks to GitLab, making sure to follow [[ https://foundation.wikimedia.org/wiki/Legal:Data_publication_guidelines | data publication guidelines ]]
- [] Document any issues found (or notable observations) on this ticket
- [] Resolve this ticket
**NOTE**: If any issues were identified that require fixing the new instrument, data QA of the fixed instrument will need to be filed as a new Phab task. Some of the checked prerequisites will carry over.
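The count comparison in the checklist above could be sketched as follows, assuming the per-action counts from each instrument have already been queried into plain dictionaries. Everything here is illustrative: `compare_counts`, the 5% tolerance, and the action names are hypothetical, and the real old-to-new pairs come from the mapping in the prerequisites.

```python
def compare_counts(old_counts, new_counts, mapping, tolerance=0.05):
    """Compare event counts between the old and new instrument.

    old_counts / new_counts: dict of action name -> event count
    mapping: dict of old action name -> new action name
    tolerance: assumed acceptable relative difference (5% by default)
    """
    rows = []
    for old_action, new_action in mapping.items():
        old_n = old_counts.get(old_action, 0)
        new_n = new_counts.get(new_action, 0)
        # Relative difference is undefined when the old count is zero.
        rel_diff = (new_n - old_n) / old_n if old_n else None
        rows.append({
            "old_action": old_action,
            "new_action": new_action,
            "old_n": old_n,
            "new_n": new_n,
            "rel_diff": rel_diff,
            "flagged": rel_diff is None or abs(rel_diff) > tolerance,
        })
    # New-instrument actions that the mapping does not account for.
    unmapped_new = sorted(set(new_counts) - set(mapping.values()))
    return rows, unmapped_new


# Made-up counts and action names, for illustration only.
old = {"click.logo": 1000, "click.search": 200, "init": 50}
new = {"click/logo": 990, "click/search": 150, "extra": 5}
action_map = {
    "click.logo": "click/logo",
    "click.search": "click/search",
    "init": "init",
}

results, unmapped = compare_counts(old, new, action_map)
for row in results:
    print(row)
print("unmapped new actions:", unmapped)
```

Flagged rows and unmapped actions are candidates for the issues documented on this ticket; what tolerance is acceptable is a judgment call for the analyst and engineers, not something this sketch decides.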
------
See [[ https://docs.google.com/document/d/1sCD4I0zHn5BrL6TFB-Hh_YwZDafFKoJjZUrYfyQSQIM/edit | Metrics Platform Instrument Migration Data QA Process Description ]] for more details.