Page MenuHomePhabricator

Remove ss_row_id column from site_stats table
Closed, DeclinedPublic

Description

Since there will always be just one row in the site_stats table, the "ss_row_id" column should be completely removed. It is pointless.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 4 2017, 8:29 PM
GeoffreyT2000 renamed this task from Remove id field from site_stats table to Remove id column from site_stats table.Sep 4 2017, 8:35 PM
GeoffreyT2000 updated the task description. (Show Details)
Krinkle added a subscriber: Krinkle.EditedSep 4 2017, 8:49 PM

See also T56888: Fresh MW lists "-1 recent contributors" / Overhaul site_stats table

@GeoffreyT2000 At a more basic level, it might be more appropiate for this data not to live in its own database table. Or to significantly alter the table to become key/value based (e.g. instead of one row with columns for each key, it could have a key/value column only with rows for each pair).

row_id PRIMARY KEYtotal_editstotal_pagesusersactive_users
1100505020
key PRIMARY KEYvalue
total_edits100
total_pages50
users50
active_users20

However, as long as it is column-based it is best to have a primary key. Otherwise, it becomes considerably more difficult to correctly perform insertions or updates on the table. Without an ID column, the only way to retrieve the data would have to do an unrestricted query to SELECT everything, and limit it to a single result. If for some reason multiple rows do exist in this table, at this point you would have no guarantee which row is returned, and there would be no good way to ensure a change in an UPDATE query will affect the same row returned by the SELECT query.

With an ID column, this is resolved by locking on the ID. In this case, we enforce there only being one row by using the generic ID "1", or which ever ID is selected by the database backend.

A wider overhaul for the site_stats table to become key/value pair based seems reasonable, however given how small and simple this feature is, I suspect it might not justify the required effort to make a breaking change to this table only to rid it of an awkward primary key.

Krinkle renamed this task from Remove id column from site_stats table to Remove ss_row_id column from site_stats table.Sep 4 2017, 8:56 PM
Krinkle triaged this task as Normal priority.
Krinkle moved this task from Untriaged to Schema changes on the Wikimedia-Rdbms board.
Reedy added a subscriber: Reedy.Sep 5 2017, 11:39 AM

It's interesting that something so simple can apparently offend someone so much...

There's certainly scope for some use cases to make use of this; it could be possible that a wiki farm or similar, that used shared tables could reuse this table, rather than having multiple, for example

T172514: Add new PKs to tables.sql (mysql and sqlite) T17441: Some tables lack unique or primary keys, may allow confusing duplicate data

Propose to decline this task, unless we mark it as a low priority task to overhaul it completely as per Timo?

demon closed this task as Declined.Nov 15 2017, 2:27 AM
demon added a subscriber: demon.

There's an argument to be made for supporting multiple wikis in the same table, but in practice we don't support that and the primary key would need to be something like wfWikiId() anyway. Timo's idea about making it key/value based (and in which case we'd possibly want to extend the SpecialStatsAddExtra hook to allow writing back to this table).

But both of those aren't what's being asked for here. The column hurts nobody and has some upsides that Timo mentioned.

Propose to decline this task

Done.