
Normalise the user agent column in CheckUser result tables
Open, Needs Triage, Public

Description

The user agent column in the CheckUser result tables contains a lot of duplicated values, as described in T305930 and T326379. This has become even more of a problem since T295073: <Org-Wide Impact> Google Chrome User-Agent Deprecation Impact. For example, on enwiki there are on average about 200 rows for each distinct user agent string. Rough calculations suggest that de-duplicating the column would make the cu_changes table on enwiki several gigabytes smaller.

As such, storing the user agent strings in a single table and referencing them by ID from the rows in the CheckUser result tables would be a step towards normalising the result tables and would save a not insignificant amount of space in the database.
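
As a rough illustration of the idea, the following is a minimal sketch using Python's built-in sqlite3 module rather than the production database. The lookup table name `cu_useragent`, its columns `cuua_id` and `cuua_text`, the referencing column `cuc_agent_id`, and the helper `acquire_agent_id` are all hypothetical names chosen for this example; only `cu_changes` comes from the description above, and the real schema may well differ.

```python
import sqlite3


def acquire_agent_id(conn, agent):
    """Return the ID for a user agent string, inserting it on first use.

    Hypothetical helper: the table and column names are assumptions.
    """
    row = conn.execute(
        "SELECT cuua_id FROM cu_useragent WHERE cuua_text = ?", (agent,)
    ).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute(
        "INSERT INTO cu_useragent (cuua_text) VALUES (?)", (agent,)
    )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical lookup table: one row per distinct user agent string.
    CREATE TABLE cu_useragent (
        cuua_id INTEGER PRIMARY KEY,
        cuua_text TEXT NOT NULL UNIQUE
    );
    -- Simplified stand-in for a result table: it stores an ID, not the string.
    CREATE TABLE cu_changes (
        cuc_id INTEGER PRIMARY KEY,
        cuc_agent_id INTEGER NOT NULL REFERENCES cu_useragent (cuua_id)
    );
    """
)

# 200 rows sharing one user agent, plus one row with a different one.
agents = ["Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"] * 200
agents.append("ExampleBot/2.0")

for agent in agents:
    conn.execute(
        "INSERT INTO cu_changes (cuc_agent_id) VALUES (?)",
        (acquire_agent_id(conn, agent),),
    )

n_rows = conn.execute("SELECT COUNT(*) FROM cu_changes").fetchone()[0]
n_agents = conn.execute("SELECT COUNT(*) FROM cu_useragent").fetchone()[0]

# Reading a row back joins on the ID to recover the full string.
last_agent = conn.execute(
    """
    SELECT cuua_text
    FROM cu_changes
    JOIN cu_useragent ON cuua_id = cuc_agent_id
    ORDER BY cuc_id DESC
    LIMIT 1
    """
).fetchone()[0]
```

In this sketch each long string is stored once, and every result row carries only a small integer, which is where the space saving would come from.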
