The username-to-user_id mapping is inconsistent across revisions in the dump.
This could be a consequence of how and when the username field gets updated in the revision table. If so, it would be nice to have a clear explanation of what to expect (e.g. deleted users, username changes, etc.).
We see a range of inconsistencies: many usernames matched with the same ID, many IDs matched with the same username, non-IP usernames with no ID, and completely missing user information.
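As a rough way to quantify these cases, the sketch below counts each kind of inconsistency with SQL run through Python's sqlite3 module. The table name `revision`, its columns (`rev_id`, `rev_timestamp`, `user_id`, `username`), and the use of NULL for missing values are assumptions made for illustration, not the actual dump schema.

```python
import ipaddress
import sqlite3


def is_ip(text):
    """Return True if text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def count_inconsistencies(conn):
    """Count the inconsistency types in a hypothetical `revision` table."""
    cur = conn.cursor()

    # user_ids that appear with more than one distinct username
    ids_with_many_names = cur.execute(
        """
        SELECT COUNT(*) FROM (
            SELECT user_id FROM revision
            WHERE user_id IS NOT NULL
            GROUP BY user_id
            HAVING COUNT(DISTINCT username) > 1
        )
        """
    ).fetchone()[0]

    # usernames that appear with more than one distinct user_id
    names_with_many_ids = cur.execute(
        """
        SELECT COUNT(*) FROM (
            SELECT username FROM revision
            WHERE username IS NOT NULL
            GROUP BY username
            HAVING COUNT(DISTINCT user_id) > 1
        )
        """
    ).fetchone()[0]

    # revisions with a username but no user_id, where the username is not an IP
    no_id_names = cur.execute(
        "SELECT username FROM revision "
        "WHERE user_id IS NULL AND username IS NOT NULL"
    ).fetchall()
    non_ip_without_id = sum(1 for (name,) in no_id_names if not is_ip(name))

    # revisions with neither a user_id nor a username
    missing_all = cur.execute(
        "SELECT COUNT(*) FROM revision "
        "WHERE user_id IS NULL AND username IS NULL"
    ).fetchone()[0]

    return {
        "ids_with_many_names": ids_with_many_names,
        "names_with_many_ids": names_with_many_ids,
        "non_ip_without_id": non_ip_without_id,
        "missing_all": missing_all,
    }
```

The first two counts are per distinct ID or username, while the last two are per revision row.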
Our approach is to associate each user_id with its most recent username and propagate that username to every revision carrying that user_id.
- Run SQL query to synchronize usernames with user_ids.
- Run SQL query to replace cases where the hostname is the username.
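The synchronization step could look roughly like the following sketch, which runs a single SQL UPDATE through Python's sqlite3 module. It is not the actual query used here: the table name `revision`, the columns `rev_id`, `rev_timestamp`, `user_id`, and `username`, and the choice of `rev_timestamp` (with `rev_id` as a tie-breaker) to define "most recent" are all assumptions. The hostname replacement step is not sketched.

```python
import sqlite3


def propagate_latest_username(conn):
    """Overwrite each revision's username with the username found on the
    most recent revision of the same user_id (hypothetical schema)."""
    conn.execute(
        """
        UPDATE revision
        SET username = (
            SELECT r2.username
            FROM revision AS r2
            WHERE r2.user_id = revision.user_id
              AND r2.username IS NOT NULL
            -- newest revision first; rev_id breaks timestamp ties
            ORDER BY r2.rev_timestamp DESC, r2.rev_id DESC
            LIMIT 1
        )
        -- rows without a user_id (e.g. IP edits) are left untouched
        WHERE user_id IS NOT NULL
        """
    )
    conn.commit()
```

Revisions whose user_id has no non-NULL username anywhere keep a NULL username, so those cases still need separate handling.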