Data Platform Request Form
Is this a request for a:
- Dataset
- Data Pipeline
- Data Feature
Is this a change to something existing:
- Yes - please provide details of existing datasets/data pipelines (wiki links, Git URL, names of jobs, etc)
- No
Please provide the description of your request:
Please make the centralauth.globaluser table available in the Data Lake.
This request would cover the same use case as this older ticket, but more broadly, so I think it is the better solution.
Use Case: (Please briefly explain what this feature will be used for):
If this table were in the Data Lake, we could easily join it to tables like mediawiki_history, so that:
- We can use the immutable global user ids as global identifiers instead of usernames, which can change. This will make analyses that count or track unique global users more reliable.
- We can join data keyed by global user id to tables that are keyed by username and have no global user id, like mediawiki_history.
- Example use case: I frequently need to join the Event Registration data, which is keyed by global user id, to the mediawiki_history table, which does not have that id. For my current workaround, see the "Add local user id and wiki per global user id" section in this notebook.
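To illustrate the kind of join this would enable, here is a minimal pandas sketch on toy data. It is only an illustration under assumptions: the column names (gu_id, gu_name, event_user_text, event_user_id, wiki_db) are my assumptions about the two tables, the registrations frame and its event_id column are hypothetical stand-ins for the Event Registration data, and the real join would run against the Data Lake tables rather than in-memory frames.

```python
import pandas as pd

# Toy stand-in for centralauth.globaluser (column names are assumed).
globaluser = pd.DataFrame({
    "gu_id": [101, 102],
    "gu_name": ["Alice", "Bob"],
})

# Toy stand-in for mediawiki_history rows (column names are assumed).
history = pd.DataFrame({
    "wiki_db": ["enwiki", "dewiki", "enwiki"],
    "event_user_text": ["Alice", "Alice", "Carol"],
    "event_user_id": [7, 12, 9],
})

# Hypothetical registration data keyed by global user id.
registrations = pd.DataFrame({"gu_id": [101], "event_id": [55]})

# Attach the global user id to each history row by matching on username.
# An inner join drops rows whose username has no match in globaluser.
history_with_gid = history.merge(
    globaluser, how="inner", left_on="event_user_text", right_on="gu_name"
)

# For each registered global user, list the (wiki, local user id) pairs
# found in the history rows.
local_ids = (
    registrations.merge(history_with_gid, on="gu_id", how="inner")
    [["gu_id", "event_id", "wiki_db", "event_user_id"]]
    .drop_duplicates()
    .reset_index(drop=True)
)

print(local_ids)
```

The sketch matches on username only to attach the global user id once; after that, everything downstream is keyed by the global user id.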
Ideal Delivery Date:
No specific delivery date, but sooner would be better.