I'd like to request the addition of the user Central ID to the mediawiki_history table in Hive. It is not urgent, but I think it would greatly improve the workflow of many users who analyze that table. A couple of other people who frequently use the mediawiki_history table have also expressed interest in having this.
Currently, the mediawiki_history table has wiki_db and event_user_id, which together identify a unique user per wiki database, but there's no way to analyze users across wiki databases without joining on a Central ID from another table. The table that contains the Central ID is centralauth.localuser in MariaDB. Both tables are very large and stored in separate systems (Hive and MariaDB), so there's no easy way to join them, other than for small subsets.
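To illustrate the join that currently has to be done by hand, here is a minimal sketch in plain Python on a few made-up rows. The localuser column names (lu_wiki, lu_local_id, lu_global_id), the output column name user_central_id, and all sample values are assumptions for illustration only, not a proposed schema.

```python
from collections import defaultdict

# Made-up rows shaped like mediawiki_history (only the two key columns).
history_rows = [
    {"wiki_db": "enwiki", "event_user_id": 101},
    {"wiki_db": "dewiki", "event_user_id": 7},
    {"wiki_db": "enwiki", "event_user_id": 202},
    {"wiki_db": "frwiki", "event_user_id": 55},  # no matching localuser row
]

# Made-up rows shaped like centralauth.localuser (column names assumed).
localuser_rows = [
    {"lu_wiki": "enwiki", "lu_local_id": 101, "lu_global_id": 9001},
    {"lu_wiki": "dewiki", "lu_local_id": 7, "lu_global_id": 9001},
    {"lu_wiki": "enwiki", "lu_local_id": 202, "lu_global_id": 9002},
]


def attach_central_id(history, localuser):
    """Left-join history rows to localuser on (wiki, local user id)."""
    lookup = {
        (row["lu_wiki"], row["lu_local_id"]): row["lu_global_id"]
        for row in localuser
    }
    return [
        # user_central_id is a hypothetical column name; None means no match.
        {**row, "user_central_id": lookup.get((row["wiki_db"], row["event_user_id"]))}
        for row in history
    ]


def wikis_per_central_user(enriched):
    """Group wiki databases by Central ID, skipping unmatched rows."""
    wikis = defaultdict(set)
    for row in enriched:
        if row["user_central_id"] is not None:
            wikis[row["user_central_id"]].add(row["wiki_db"])
    return dict(wikis)


enriched = attach_central_id(history_rows, localuser_rows)
cross_wiki = wikis_per_central_user(enriched)
```

With the Central ID available directly in mediawiki_history, the lookup step would no longer be needed and cross-wiki grouping like the second function could run against the Hive table alone.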
For context:
- Slack discussion that led to this request, with a use case example
- request for this task in the data-products Slack channel