On enwiki, the user_properties table has about 60M rows, for only 20M users. This is incredible considering that user_properties is meant to only store non-default options, to reduce DB space. The index length is about 2.2 GB, and the data size is about 3.7GB.
By sampling, the number of user_properties rows per user can be estimated. The problem is very dependent on user_id, and is mostly confined to user_id values less than 10M, i.e. users created before mid-2009.
Sampling 1000 users with user_id<10M, we find that the main culprits are:
searchNs-1 : 968 users
skin: 964 users
thumbsize: 912 users
75% of the skin rows have an empty string as their value, which causes Skin::newFromKey() to return the default skin, same as if the row was missing. The rest are mostly "monobook", presumably manually set via the UsabilityInitiative OptIn extension.
"searchNs-1" is a bug, it relates to searching the special namespace, which is not possible. It is "0" in all sampled rows.
"thumbsize" is "3" in all sampled rows, which is not the default, the default is "4" on all WMF wikis other than svwiki. In addition to bloat of the user_properties table, this causes fragmentation of the parser cache. There's no way 91% of users prior to 2009 manually set this value, it must have been set by a bug.
We should remove unnecessary or incorrectly inserted rows, and ensure that this does not happen again (e.g. as a consequence of the resolution of T38316).