Goal
User story: As a user, it would be helpful if I could see a human-readable version of the User-agent string in the CheckUser interface so that I could easily see which OSs and browsers a user is using.
Standard user-agent strings can often be parsed into human-understandable information. For instance, Mozilla/5.0 (Windows NT 10.0; WOW64; rv:55.0) Gecko/20100101 Firefox/55.0 indicates that the user is using Firefox version 55.0 on a computer with the 64-bit version of Windows 10. Some of the other parts (such as rv:55.0, Gecko/20100101, etc.) typically give no additional information for the purposes of CheckUser and can be ignored.
We should investigate having a user-agent parser (similar to what http://www.useragentstring.com/ does) that would show this basic human-understandable information to make it easier to interpret the UAs.
Acceptance criteria
- Given a UA string, split it into OS and browser
- Display the parsed information
- Add option to see complete UA if needed (mock tbd)
- If the UA is non-standard and cannot be split, display complete UA
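The criteria above could be sketched roughly as follows. This is only an illustration in Python, not a proposed implementation for CheckUser: the names `ParsedUA`, `parse_user_agent`, and `display`, as well as the tiny lookup tables, are hypothetical, and a real parser would need far broader coverage.

```python
import re
from typing import NamedTuple, Optional


class ParsedUA(NamedTuple):
    os: Optional[str]
    browser: Optional[str]
    raw: str  # always kept so the complete UA can still be shown on request


# Illustrative, deliberately incomplete tables (assumptions, not a real database).
WINDOWS_VERSIONS = {"10.0": "Windows 10", "6.3": "Windows 8.1", "6.1": "Windows 7"}
BROWSER_PATTERNS = [
    ("Firefox", re.compile(r"Firefox/(\d+(?:\.\d+)*)")),
    ("Chrome", re.compile(r"Chrome/(\d+(?:\.\d+)*)")),
]


def parse_user_agent(ua: str) -> ParsedUA:
    """Split a UA string into OS and browser; unknown parts stay None."""
    os_name = None
    match = re.search(r"Windows NT (\d+\.\d+)", ua)
    if match:
        version = match.group(1)
        os_name = WINDOWS_VERSIONS.get(version, "Windows NT " + version)
        # WOW64 / Win64 tokens indicate a 64-bit Windows installation.
        if "WOW64" in ua or "Win64" in ua:
            os_name += " (64-bit)"

    browser = None
    for name, pattern in BROWSER_PATTERNS:
        match = pattern.search(ua)
        if match:
            browser = f"{name} {match.group(1)}"
            break

    return ParsedUA(os_name, browser, ua)


def display(ua: str) -> str:
    """Return the parsed summary, or the complete UA if it cannot be split."""
    parsed = parse_user_agent(ua)
    if parsed.os is None or parsed.browser is None:
        return parsed.raw
    return f"{parsed.browser} on {parsed.os}"
```

With the example UA from the Goal section, `display(...)` would return "Firefox 55.0 on Windows 10 (64-bit)", while a UA it does not recognize, such as curl/7.58.0, would be returned unchanged.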
Implementation Strategies
There may not be a single strategy that works perfectly, so using a combination of strategies might be best.
Parsing Libraries
One strategy could be to utilize a parsing library. Sadly, a lot of user agents lie, so the parsed result might not be completely accurate. Here are some example libraries that seem well maintained and might pass a security review:
UA Databases
Another strategy would be to utilize a database of User Agents. There does not appear to be a freely licensed database. :(
Theoretically, this database could be built on Wikidata, but as far as I know it doesn't currently exist. There is currently an RFC for how to manage software versions. It seems reasonable to add a user agent property that could be used on software versions. We could then probably build a bot that would take the user agents from the major browsers' websites and insert them into Wikidata...