One of the pain points:
Lack of discoverability & documentation: fields in schemas are poorly documented; bugs and known issues are undocumented or hard to find; key details are brief or missing; available data sources are undocumented or difficult to browse; and undocumented data quality issues (especially around datetimes) require workarounds.
While there is some documentation at wikitech:Analytics/Data Lake, it is by no means exhaustive: event data is missing from it, data dictionaries are incomplete, and known issues are usually not documented. This effort addresses the discoverability & documentation pain point in order to improve data dexterity.