Data QA of wikilambda_zobject_join table
Closed, Resolved, Public

Description

The new wikilambda_zobject_join WikiLambda table (created by T357552) will be sqooped into the Data Lake once the data transfer patches are deployed. It will be used for certain Wikifunctions "inventory metrics", such as T355637 and T355638. The location of the table is given in this patch.

Once we start receiving data in production, we will need to QA the aggregate data to confirm that it is logged as expected.

The three data transfer tasks (declaring the table in the Analytics environment, adding the table to the sqoop config, and triggering the sqoop job every 24 hours) are all children of T363439, as is this task.

Event Timeline

The table contents look good, and the table grows by a reasonable amount every 24 hours, as expected. The number of functions recorded in the table is consistently very close to the number of functions listed at https://www.wikifunctions.org/wiki/Special:ListObjectsByType/Z8, but there usually seem to be a few more functions in the table than on the wiki listing, which is a bit puzzling.
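For anyone repeating this check, here is a minimal sketch of the comparison, not the query actually used for this QA. It assumes the function ZIDs have already been pulled from the sqooped table and from the wiki listing into two plain lists; the helper name `compare_function_counts` and the sample ZIDs are hypothetical.

```python
def compare_function_counts(table_zids, wiki_zids):
    """Compare function ZIDs seen in the table with those on the wiki listing.

    Both arguments are iterables of ZID strings. How they are obtained
    (Data Lake query, wiki listing export) is outside this sketch.
    """
    # The join table can hold several rows per function, so deduplicate first.
    table_set = set(table_zids)
    wiki_set = set(wiki_zids)
    return {
        "table_count": len(table_set),
        "wiki_count": len(wiki_set),
        # ZIDs present on one side only are the ones worth inspecting by hand.
        "only_in_table": sorted(table_set - wiki_set),
        "only_in_wiki": sorted(wiki_set - table_set),
    }


# Hypothetical sample data, for illustration only.
report = compare_function_counts(
    ["Z801", "Z802", "Z802", "Z10000"],
    ["Z801", "Z802"],
)
print(report)
```

Listing the ZIDs that appear on only one side, rather than just comparing totals, should make it easier to tell which entries account for a small surplus in the table.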

That small discrepancy is gone today, and there are several possible explanations for the times it did occur, so I am closing this. I will continue to keep an eye on it.