Page MenuHomePhabricator

Fresh MW lists "-1 recent contributors" / Overhaul site_stats table
Closed, ResolvedPublic

Description

On Special:UserLogin/signup it shows "-1 recent contributors.

I installed MediaWiki-Vagrant and when I go to create an account on my new wiki, in [[Special:UserLogin/signup]] I am told that "devwiki is made by people like you. 1 edit, 0 pages, -1 recent contributors".

Attached:


NOTE: The main bug reported in this task was resolved and fixed by https://gerrit.wikimedia.org/r/349770. The bug fix has been released in 1.29.0 and 1.30.0.

This task remains open due to related follow-up refactoring that is still being reviewed.

See also:

Details

Reference
bz54888

Event Timeline

bzimport raised the priority of this task from to Low.
bzimport set Reference to bz54888.
bzimport added a subscriber: Unknown Object (MLST).
Deskana created this task.Oct 2 2013, 6:29 PM

swalling wrote:

I think this is because a developer counts as a negative contributor. ;)

All jokes aside, that is an odd bug, and if it occurs on a regular fresh MediaWiki install as well, it'd be nice to get rid of it.

I saw this recently when re-installing a test wiki.

Thanks for filing this bug, Dan.

swalling wrote:

Note that according to our spec this third item calls the {{NUMBEROFACTIVEUSERS}} magic word, so the error may actually be with that. You can test by calling that in a wiki page and seeing if you get the same result.

I think you're right, Steven. I've now created an account on the wiki so I can't verify that myself, but given that both that magic word and the page display zero, it seems likely.

brion added a comment.Oct 3 2013, 3:30 PM

It was a little tricky to track this down because the code doesn't reference messages fully by name, but I can confirm that it is using {{NUMBEROFACTIVEUSERS}} magic word:

'createacct-benefit-head3'        => '{{NUMBEROFACTIVEUSERS}}', # do not translate or duplicate this message to other languages

That magic word calls SiteStats::activeUsers(), which simply pulls an existing number out of the site_stats table:

static function activeUsers() {

		self::load();
		return self::$row->ss_active_users;

}

Looks like it's not updating invalid default data, or something.

ori added a comment.Oct 3 2013, 5:12 PM

(In reply to comment #5)

Looks like it's not updating invalid default data, or something.

$ cat ./maintenance/archives/patch-ss_active_users.sql

  • More statistics, for version 1.14

ALTER TABLE /*$wgDBprefix*/site_stats ADD ss_active_users bigint default '-1';

...wat.

I guess -1 was chosen to clearly differentiate between 'no stats available' and 'stat available and its value is zero', but I'd have guessed that that's what NULL is for.

Isn't Special:Statistics supposed to update ss_active_users if it hasn't been updated in the past 24 hours? It looks for the 'activeusers-updated' key and then runs SiteStatsUpdate::cacheUpdate().

saper added a subscriber: saper.Oct 28 2015, 10:17 AM

As of https://git.wikimedia.org/tree/mediawiki%2Fcore.git%2F.git/2cb813ff13f1b005fbb106b40fb9262430a528e1 I am getting:

<div class="mw-createacct-benefits-container">
		<h2>YbabelWiki is made by people like you.</h2>
		<div class="mw-createacct-benefits-list">
							<div class="mw-number-text icon-edits">
					<h3>1</h3>
					<p>edit</p>
				</div>
							<div class="mw-number-text icon-pages">
					<h3>0</h3>
					<p>pages</p>
				</div>
							<div class="mw-number-text icon-contributors">
					<h3>0</h3>
					<p>recent contributors</p>
				</div>
					</div>
	</div>

which looks good. special_stats has:

       ss_row_id: 1
  ss_total_edits: 1
ss_good_articles: 0
  ss_total_pages: 1
        ss_users: 1
 ss_active_users: 0
       ss_images: 0

Installed with a CLI installer.

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptOct 28 2015, 10:17 AM
saper set Security to None.

Yes, with happens with the Web Installer:

<div class="mw-createacct-benefits-container">
		<h2>YbabelWeb is made by people like you.</h2>
		<div class="mw-createacct-benefits-list">
							<div class="mw-number-text icon-edits">
					<h3>0</h3>
					<p>edits</p>
				</div>
							<div class="mw-number-text icon-pages">
					<h3>0</h3>
					<p>pages</p>
				</div>
							<div class="mw-number-text icon-contributors">
					<h3>-1</h3>
					<p>recent contributor</p>
				</div>
					</div>
	</div>
saper added a comment.EditedOct 28 2015, 10:48 AM

installer has a step to insert `array(

    'ss_row_id' => 1,
    'ss_total_edits' => 0,
    'ss_good_articles' => 0,
    'ss_total_pages' => 0,
    'ss_users' => 0,
    'ss_images' => 0
),`

into site_stats. ss_active_users is missing... but CLI installer get it right anyway (even before it runs DatabaseUpdater)

Change 342977 had a related patch set uploaded (by MZMcBride):
[mediawiki/core] Fix "-1 recent contributors" by setting a default of 1 instead

https://gerrit.wikimedia.org/r/342977

EddieGP claimed this task.Mar 30 2017, 9:10 PM

Change 342977 merged by jenkins-bot:
[mediawiki/core@master] Set default to 1 recent contributor instead of -1

https://gerrit.wikimedia.org/r/342977

matmarex closed this task as Resolved.Apr 21 2017, 6:21 PM
matmarex removed a project: Patch-For-Review.

Only took us a little under four years \o/

Well, there had to be 5 lines of code touched in 5 different files, four of them doing such horrible complicated things like exchanging -1 by 1. Obviously there is no way this could ever somehow have been fixed earlier ;-)

Confirmed it working: Coincidentally I'm just playing around with vagrant (Why don't virtualboxes just work ..well.. out of the box?). Had -1 recent contributors shown on Special:CreateAccount yesterday, just finished setting up a new box a few minutes ago, Special:CreateAccount now correctly shows "1 recent contributor". So everything fine, just as expected :-)

Yay! :D Thank you both for getting this fixed.

Change 342977 merged by jenkins-bot:
[mediawiki/core@master] Set default to 1 recent contributor instead of -1

https://gerrit.wikimedia.org/r/342977

This seems to have made things more inconsistent. In addition, it changes the default schema without a schema update - which is an anti-pattern, no matter how trivial it may be in this case, we shouldn't do that.

The actual default of -1 may seem odd, but is what we do with the other fields as well. We now have all fields set to 0 or -1, except active_users, which now oddly defaults to '1'. It also means the default now shows more active users than users in total, which doesn't seem right (users: 0, active_users: 1).

The fact that the sign up page -1 looked like an error because it was an error. The error is that the default row we create in the installer is missing the field for active_users. This was pointed out by @saper in T56888#1761319.

@MZMcBride Sorry, but I recommend we revert your patch and instead fix the row created by the installer to include a value for active users.

As it is, we now have three unresolved issues:

  • A merged schema schema without a db update for existing wikis.
  • An inconsistency for site_stats: Previously defaults were -1 or 0, active_users now oddly defaults 1. Also making active_users > users on a new install. This also affects {{NUMBEROFUSERS}}.
  • Code errors that insert a row without active_users trigger insertion of 1, whereas other code errors cause insertion of -1.
Krinkle reopened this task as Open.Apr 22 2017, 5:13 AM

Change 342977 merged by jenkins-bot:
[mediawiki/core@master] Set default to 1 recent contributor instead of -1

https://gerrit.wikimedia.org/r/342977

This seems to have made things more inconsistent. In addition, it changes the default schema without a schema update - which is an anti-pattern, no matter how trivial it may be in this case, we shouldn't do that.

Is just sucks that there is almost no code review activity on patchsets for weeks and then this kind of feedback annoyingly often comes in after something is merged rather than pointing to those (more or less obvious) things during the code review process, but that's another story (and of course not something to blame Krinkle for). So yeah, this definitely is an anti-pattern and we shouldn't do that.


For this very issue

The fact that the sign up page -1 looked like an error because it was an error. The error is that the default row we create in the installer is missing the field for active_users. This was pointed out by @saper in T56888#1761319.
@MZMcBride Sorry, but I recommend we revert your patch and instead fix the row created by the installer to include a value for active users.

I'm going to upload a patchset which does this. Should be fairly easy.

As it is, we now have three unresolved issues:

  • A merged schema schema without a db update for existing wikis.
  • An inconsistency for site_stats: Previously defaults were -1 or 0, active_users now oddly defaults 1. Also making active_users > users on a new install. This also affects {{NUMBEROFUSERS}}.
  • Code errors that insert a row without active_users trigger insertion of 1, whereas other code errors cause insertion of -1.

For site_stats in general

The actual default of -1 may seem odd, but is what we do with the other fields as well. We now have all fields set to 0 or -1, except active_users, which now oddly defaults to '1'. It also means the default now shows more active users than users in total, which doesn't seem right (users: 0, active_users: 1).

So, looking into tables.sql for site_stats we currently have:

ss_row_id int unsigned NOT NULL,
ss_total_edits bigint unsigned default 0,
ss_good_articles bigint unsigned default 0,
ss_total_pages bigint default '-1',
ss_users bigint default '-1',
ss_active_users bigint default '1',
ss_images int default 0

So this is inconsistent all over the place. At the time I'm creating a new wiki (i.e. a new database), why are there 0 articles and 0 pictures but -1 pages, -1 users and -1 active users by default?

We should have a follow-up that changes all of those so that we have an consistent schema. That patchset would have to include a proper schema change of course. That said, the decision is whether to put -1 or 0 here as a default.

While in theory we should never run in any case where there is no data in any of those fields, pratically mistakes and as such those things happen (as proven by this bug). So -1 is sensible in a way that it states "there is some odd mistake here, we have no data on this but we should have". So on the one hand, showing -1 recent contributor in the frontend looks horrible and should not happen, but in case it happens anyhow, showing -1 is better than 0 because there is no sense in it for a human reader: When reading -1 the reader directly knows "there is something broken here, I cannot rely on that value" where 0 might leave the impression that there is just nobody active on that wiki, this wiki has no pages at all etc. and at least takes a second look and little thought to notice that this actually is a mistake in the software.

On the other hand. 0 is sensible for the moment when a wiki is created, as it represents the statement "when creating a new database (i.e. a new wiki), there are 0 pages (of which 0 are good articles), 0 users (of which 0 are active), 0 images and 0 edits" here.

As one could've guessed by the amount of text I've been writing to give reasons for it, I'd prefer the former (-1). Any opinions?

Also feel free to move this to a new task if that's a more appropriate place for this.

gerritbot seems to be off for the weekend too, so here's the patch link: https://gerrit.wikimedia.org/r/#/c/349770/

The actual default of -1 may seem odd, but is what we do with the other fields as well. We now have all fields set to 0 or -1, except active_users, which now oddly defaults to '1'. It also means the default now shows more active users than users in total, which doesn't seem right (users: 0, active_users: 1).

This sounds like a reason to fix the users count. In the case of this task, there are different fields, so we can easily have different default values. This allows us to follow the principle of "do what I mean." Showing the text "-1 recent contributors" on the signup page violates this principle.

The fact that the sign up page -1 looked like an error because it was an error. The error is that the default row we create in the installer is missing the field for active_users. This was pointed out by @saper in T56888#1761319.

Is MediaWiki-Vagrant using the installer?

As it is, we now have three unresolved issues:

  • A merged schema schema without a db update for existing wikis.

How is this an issue? Existing wikis shouldn't be affected by a change in the default value and new wikis will have a better default value, meaning they won't display this very stupid and embarrassing "-1 recent contributors" text. This sounds good to me. Are you suggesting that existing wikis would want the default value to be 1 instead of -1? Why?

  • An inconsistency for site_stats: Previously defaults were -1 or 0, active_users now oddly defaults 1. Also making active_users > users on a new install. This also affects {{NUMBEROFUSERS}}.

Does this "inconsistency" matter? We have different fields and different default values for them. This is how schemas work.

Pointing to the user count being greater than active user count only highlights the incorrect user count. Why are we setting 'ss_users' => 0, in the installer?

  • Code errors that insert a row without active_users trigger insertion of 1, whereas other code errors cause insertion of -1.

I don't know what this means. What are code errors?

We could just get rid of the text on the signup page altogether. Does anyone still want it there? This wouldn't address the default values, but it would reduce the visibility dramatically.

@MZMcBride Those default values in the schema are only set if for some reason the row in site_stats is inserted without correctly inserting a value for this field. Theoretically this should never happen. We should never see code that does INSERT or REPLACE without explicitly setting all values of site_stats. Code that does only provide an update for specific rows should use UPDATE instead and not touch other fields. We also should never see code that does UPDATE a field to it's default value (explicitly saying UPDATE SET x DEFAULT(x) ). That is, during the installation process we're adding a single row to site_stats and initialize it with some values (which are all 0 at the moment). Afterwards, as said above, every code that updates/replaces this row should always only update explicitly some values and leaving the others untouched or (when replacing) adding real values for all fields.
In conclusion this does mean that the default value in the schema basically should never be used to define what will be in this field. If it is used anyways, some code updating/replacing/inserting it is wrong and needs to be changed. I guess this is why -1 was added as a default value for some fields in the first place: -1 makes absolutely no sense in a real world context and hence it is quite obvious that some code must be wrong when this value appears. That seems to be by design.

This sounds like a reason to fix the users count. In the case of this task, there are different fields, so we can easily have different default values. This allows us to follow the principle of "do what I mean." Showing the text "-1 recent contributors" on the signup page violates this principle.

Nobody wants to keep showing -1 recent contributors on the signup page. This is not about if we should solve this but about how to do that.

A merged schema schema without a db update for existing wikis.

How is this an issue?

As far as I know the DBA team is putting active effort into removing differences between the schema in HEAD and the schema on production servers. Also there seems to be basically some rule to always update the schema instead of just changing it for new wikis as not doing this will potetially create really odd issues. When you're running version X of mediawiki, your database schema should look identically no matter if you've just installed from scratch or updated from a older version. Also, see the explanation for code errors below, which has an example of the same code on the same codebase producing different errors, depending on how long the database already exists. So when I installed mediawiki one week ago from alpha and updated today to HEAD, with running some code like shown there, I'd get -1 inserted. When installing from scratch today and using the same HEAD, with the same code I'd insert 1. That's not something we want.

Code errors that insert a row without active_users trigger insertion of 1, whereas other code errors cause insertion of -1.

I don't know what this means. What are code errors?

Simple examples, given some extension produces wrong SQL requests which result into the following sent to the database:

INSERT INTO site_stats (ss_row_id, ss_total_edits, ss_good_articles, ss_total_pages, ss_users, ss_images) VALUES (1, 100, 5, 20, 7, 3)

is a code error as it does not update the ss_active_users field. It does trigger an insertion of 1 in the field ss_active_users on new wikis, because that's default value now. It does trigger a insertion of -1 in the field ss_active_users on older wikis, because we have no schema change for those applied.

INSERT INTO site_stats (ss_row_id, ss_total_edits, ss_good_articles, ss_total_pages, ss_users, ss_active_users) VALUES (1, 100, 5, 20, 7, 3)

is a code error as it does not update the ss_images field. It does trigger an insertion of the default: 0.

So for some fields, it will insert 0. For some, it will insert -1. And for ss_active_users, it will insert -1 or 1, depending on how old your mw installation is. That's just a awful mess.

@MZMcBride Those default values in the schema are only set if for some reason the row in site_stats is inserted without correctly inserting a value for this field. Theoretically this should never happen. We should never see code that does INSERT or REPLACE without explicitly setting all values of site_stats.

Then what is the point of having a default value at all? My understanding is that we specify a default value so that INSERT or REPLACE code that has not been changed will use a sane default value. In this case, we never want -1 to be inserted, so why specify it as a default value?

In conclusion this does mean that the default value in the schema basically should never be used to define what will be in this field. If it is used anyways, some code updating/replacing/inserting it is wrong and needs to be changed. I guess this is why -1 was added as a default value for some fields in the first place: -1 makes absolutely no sense in a real world context and hence it is quite obvious that some code must be wrong when this value appears. That seems to be by design.

This is insane. We should be a specifying default value that makes sense to insert when not explicitly set. We should not be setting traps and then exposing our traps to end-users when an INSERT fails to account for a new field.

Also, saying this is "quite obvious" and "by design" seems to ignore the empirical evidence that this bug has existed for years. If the goal was truly to fail loudly here and raise the visibility of this issue in order to get it fixed quickly, we've missed the mark dramatically.

As far as I know the DBA team is putting active effort into removing differences between the schema in HEAD and the schema on production servers. Also there seems to be basically some rule to always update the schema instead of just changing it for new wikis as not doing this will potetially create really odd issues. When you're running version X of mediawiki, your database schema should look identically no matter if you've just installed from scratch or updated from a older version.

The Wikimedia database team is interested in fixing substantive differences between schemas of Wikimedia wikis, such as different field types for the same field, different or missing indices, etc. This is not a substantive difference.

So when I installed mediawiki one week ago from alpha and updated today to HEAD, with running some code like shown there, I'd get -1 inserted. When installing from scratch today and using the same HEAD, with the same code I'd insert 1. That's not something we want.

You don't want a bug fix? Previously you got the wrong value inserted, now you're getting the correct value inserted. This is the "cost" of progress.

EddieGP added a comment.EditedApr 23 2017, 5:58 PM

You don't want a bug fix? Previously you got the wrong value inserted, now you're getting the correct value inserted. This is the "cost" of progress.

This is not what I said. I did not say "my wiki had this bug last week and doesn't have this bug now and that's bad". This is not about a comparison between "installed last week" vs. "installed today". It is about "installed last week and updated to latest version today" vs. "installed today". My sentence "That's not something we want." does not mean "I don't want this bug to be fixed.". Rather it tries to say "My database schema should look no different depending whether I upgraded my wiki from version X to version Y or built a new version Y wiki from scratch. At all, I'm running version Y in both cases!"


The Wikimedia database team is interested in fixing substantive differences between schemas of Wikimedia wikis, such as different field types for the same field, different or missing indices, etc. This is not a substantive difference.

Okay, agreed, my argument shouldn't really be with the DBA team. The point was that we should not drift how the schema looks on long-existing wikis (like WMF ones) and how it looks in tables.sql on master. Instead we should aim (and we do) to remove differences as far as anyhow possible. Of course fixing such things like this one wouldn't be priority for the DBA. But there is also no sense in introducing such new drifts. I agree with Krinkle here:

In addition, it changes the default schema without a schema update - which is an anti-pattern, no matter how trivial it may be in this case, we shouldn't do that.

Whenever the schema is to be changed, there should be both: A patch altering existing databases to the new schema and the change to the tables.sql files which are used for initializing new databases.


I don't know if it's clear enough, so I'll point it out again:
This task is about fixing the bug that on new wikis -1 recent contributors are shown. Why does this happen? It is because the default value is -1 in the schema AND the installer fails to define a value for ss_active_users when INSERTing the relevant row to the table. It's those two things playing together which are resulting in showing off -1 to end-users here. So there are two ways how to fix this: Change the default in the schema or change the installer not to fail INSERTing a value for ss_active_users. Your/our patch did the former. What Krinkle raised and what I agree to is that we should undo the former and instead do the latter. Yes, that would bring back -1 as the default in the schema. No, that would not bring back the -1 showing off to users.

This is because we see a couple of things that aren't good about the former (current) solution:

  • We're still having an INSERT in the installer that does not initialize a correct value at the place where it should.
  • We're changing the default schema to do something that is senseful at the point when a wiki gets created. The same default value doesn't need to be senseful when there is the same mistake in the code at some later point in time. If tomorrow some error-prone code REPLACEs the site_stats of the English Wikipedia and fails to correctly put something in ss_active_users, we're going to see the default value from the schema on Special:CreateAccount. What would be better to show off here: -1 or 1? While I agree that -1 is a awful way to indicate "something is wrong with the software", 1 doesn't indicate this at all. Instead, it gives the impression that just a single person is active on that wiki, while in reality thousands of people are. It is not easy to see that the 1 comes from an error in the software but it's "quite obvious" (and that's the only way how I meant that term) that -1 comes from a error in the software. Yes, chances are high that this is never going to happen. But why don't use the other solution when it deals with this case better?
  • We're drifting database schemas on existing and newly created wikis away from each other. Even if it's a really tiny difference, it is one, and - as said above - an unnecassary one.

Also, saying this is "quite obvious" and "by design" seems to ignore the empirical evidence that this bug has existed for years. If the goal was truly to fail loudly here and raise the visibility of this issue in order to get it fixed quickly, we've missed the mark dramatically.

It is quite obvious that this page showing -1 MUST be something wrong with the software. It seems to have been "by design" to fail loudly. That's how I meant this terms. And as far as I can see, the goal here "truly was to fail loudly and raise the visibility of this issue in order to get it" reported quickly. (Chances are that the ones who originally wrote the default of -1 in there and the people reviewing it weren't insane. They picked this value for a reason. It's reasonable to assume that this reason was to fail loudly instead of giving some senseful-looking number that most likely would be wrong.) Failing loudly will always only enable us to notice, report and investigate issues quickly. And yes, that usually leads to something being quickly fixed, but this "usually" does not mean "necessarily". Failing loudly itself will never directly result in something to get quickly fixed if just no developer cares to pick up a task for years. That's a problem we just can't expect code to solve. When something gets reported quickly and no developer cares to fix it (and that is what this tasks existence for years really proves) that is (IMHO) not a reason to stop failing loudly.

Proposal:

  • Use -1 as default for all site_stats field. Thus reverting https://gerrit.wikimedia.org/r/342977 which changed the default for the column instead of the actual row.
  • Fix installer. It already creates a row with the initial correct values, but didn't set active_users. That was the original bug and should be fixed by making that row include a value for active_users (presumably 1 since the installer usually creates the first user).
  • Include schema update for the schema change.

Is MediaWiki-Vagrant using the installer?

Yes.

Change 349770 had a related patch set uploaded (by Krinkle; owner: EddieGP):
[mediawiki/core@master] Revert, Follow-up: -1 recent contributors

https://gerrit.wikimedia.org/r/349770

Change 349770 merged by jenkins-bot:
[mediawiki/core@master] Revert, Follow-up: -1 recent contributors

https://gerrit.wikimedia.org/r/349770

Proposal:

Will work on that.

  • Fix installer. It already creates a row with the initial correct values, but didn't set active_users. That was the original bug and should be fixed by making that row include a value for active_users (presumably 1 since the installer usually creates the first user).

You did notice my revert/patch you merged already added at least something into the installer here? The only thing that differs right now is the "presumably 1" part, as it sets 0 currently, but that can easily be changed. Besides, from your comment on the revert

Ensure initial user creation by installer also increments users to 1 as expected.

I'm not sure if I'd should either initialize that table with value 1 for ss_users and ss_active_users (it's clear that we should do both, isn't it?) and don't touch site_stats at the point the user gets created or rather init it with all 0 and change values at the point the initial user gets created. IIRC you get to the step creating the user after the database was already initialized. Outcome should be the same though.

  • Include schema update for the schema change.

Goes with the first point. While at it it shouldn't be hard to add something for ss_active_users too, checking what the default value is right now and if it's not -1, change that. That'll catch the (I guess really few) databases initialized from alpha during this short period and as far as I can tell won't be more than a few lines of code similar to the ones to add for the other values anyways.

I think that'll be two patches to seperate the db part from the installer part to be able to delay, revert etc. the schema part if any problems make this necessary. Also wondering if we should file other tasks for this, as the thing originally outlined in this task is resolved.

Change 351012 had a related patch set uploaded (by Paladox; owner: EddieGP):
[mediawiki/core@REL1_29] Revert, Follow-up: -1 recent contributors

https://gerrit.wikimedia.org/r/351012

Change 351012 merged by jenkins-bot:
[mediawiki/core@REL1_29] Revert, Follow-up: -1 recent contributors

https://gerrit.wikimedia.org/r/351012

Correct me if I'm wrong, but it looks like we're now setting the field to 0, which perhaps isn't -1, but it's incorrect again?

Correct me if I'm wrong, but it looks like we're now setting the field to 0, which perhaps isn't -1, but it's incorrect again?

You're right and as said in T56888#3216917, I'm going to push a patch for this soon.

Proposal:

  • Use -1 as default for all site_stats field.

While at writing the patch and testing it locally in my vagrant box, I unfortunately noticed that ss_total_edits and ss_good_articles are "Type: bigint(20) unsigned", hence we can't set it's default value to -1. So we need to re-evaluate this, the only thing we will be able to ALTER SET DEFAULT -1 will be the ss_images field (and the ss_active_users field for those installations from alpha version perhaps made during the short period of setting 1 there). Looking at

DESCRIBE site_stats;
+------------------+---------------------+------+-----+---------+-------+
| Field            | Type                | Null | Key | Default | Extra |
+------------------+---------------------+------+-----+---------+-------+
| ss_row_id        | int(10) unsigned    | NO   | PRI | NULL    |       |
| ss_total_edits   | bigint(20) unsigned | YES  |     | 0       |       |
| ss_good_articles | bigint(20) unsigned | YES  |     | 0       |       |
| ss_total_pages   | bigint(20)          | YES  |     | -1      |       |
| ss_users         | bigint(20)          | YES  |     | -1      |       |
| ss_active_users  | bigint(20)          | YES  |     | -1      |       |
| ss_images        | int(11)             | YES  |     | 0       |       |
+------------------+---------------------+------+-----+---------+-------+

as they are all just numbers for statistics, I wonder if there is a good reason (as opposed to "because of history") to have bigint(20) unsigned on some, bigint(20) (signed) on some others and finally int(11) (signed) on ss_images or if this should be adressed as well. I don't see why we're thinking of the number of active users as some big number, but of the number of images as a small number. Also, unsigned seems senseful to me. Although it prevents us from using -1, values < 0 don't make any sense in those fields.

Also I note that all of these columns have Null: YES, so I wonder if it would be a bad idea to define NULL as the default value here? I'm no expert on databases, but as far as I got it "Null is [...] to indicate that a data value does not exist in the database." - which is exactly what the default value does if it is used, isn't it?

I think the possibly best thing to do is to alter the table to have Type: bigint(20) unsigned and Default: NULL on all columns, but I'm happy about all input I can get about this.

@EddieGP Thanks for the detailed analysis! Yep, agreed! I'd say make all the metric fields bigint unsigned with NULL as default. This will also make the schema change a bit more meaningful as opposed to merely changing a default.

EddieGP renamed this task from Fresh install of MediaWiki lists "-1 recent contributors" in Special:UserLogin/signup to Fresh MW lists "-1 recent contributors" / Overhaul site_stats table.May 5 2017, 7:34 PM
EddieGP added a comment.EditedMay 5 2017, 8:24 PM

As with the current implementation of the DatabaseUpdater class and subclasses, there is no function which allows to add schema changes that touch more than one column - so I would need to split up my patch from 5 sql files for 5 supported DBMS to 6 columns * 5 supported DBMS = 30 patch files. Fortunately, the mssql subclass has a function (that is not specific to mssql) that can be used to alter multiple columns - so by using that I could reduce the number from 30 to 25 necessary sql files. But I feel refactoring these classes is a better solution here, so I've proposed to move that function to DatabaseUpdater by pushing https://gerrit.wikimedia.org/r/#/c/352216/ up for review. To me it looks like there is no reason why this function should live in a subclass rather than in DatabaseUpdater itself. Moving it would allow me to use it in all five supported DBMS - reducing the amount of necessary sql files for the change to site_stats from 25 to 5. I guess further work on this schema change is blocked on this move (assuming nobody wants me to really push 25 oneliner sql files).

Krinkle moved this task from Unsorted to Change on the Schema-change board.May 8 2017, 12:44 AM

Change 352646 had a related patch set uploaded (by EddieGP; owner: EddieGP):
[mediawiki/core@master] Overhaul site_stats table

https://gerrit.wikimedia.org/r/352646

Change 367698 had a related patch set uploaded (by EddieGP; owner: EddieGP):
[mediawiki/core@master] Add primary keys to site_stats

https://gerrit.wikimedia.org/r/367698

Change 367698 abandoned by EddieGP:
Add primary keys to site_stats

Reason:
Fuck this.

https://gerrit.wikimedia.org/r/367698

Change 352646 abandoned by EddieGP:
[DNM] Overhaul site_stats table

Reason:
Obviously nobody cares. See my comment on https://gerrit.wikimedia.org/r/#/c/367698/

https://gerrit.wikimedia.org/r/352646

EddieGP removed EddieGP as the assignee of this task.Sep 22 2017, 8:36 PM

Change 367698 restored by EddieGP:
Add primary keys to site_stats

https://gerrit.wikimedia.org/r/367698

Change 367698 merged by jenkins-bot:
[mediawiki/core@master] Add primary keys to site_stats

https://gerrit.wikimedia.org/r/367698

Restricted Application added a subscriber: jhsoby. · View Herald TranscriptOct 17 2017, 2:01 PM

Change 384706 had a related patch set uploaded (by EddieGP; owner: EddieGP):
[mediawiki/core@REL1_30] Add primary keys to site_stats

https://gerrit.wikimedia.org/r/384706

Deskana removed a subscriber: Deskana.Oct 17 2017, 3:44 PM

Change 352646 restored by EddieGP:
[DNM] Overhaul site_stats table

https://gerrit.wikimedia.org/r/352646

Change 384706 merged by jenkins-bot:
[mediawiki/core@REL1_30] Add primary keys to site_stats

https://gerrit.wikimedia.org/r/384706

Change 352646 merged by jenkins-bot:
[mediawiki/core@master] Overhaul site_stats table

https://gerrit.wikimedia.org/r/352646

After merging this, left are the deployment to wm (T190780) and some changes to the docs on mw.o that I'll look into right away.

EddieGP closed this task as Resolved.Mar 27 2018, 8:08 AM
EddieGP claimed this task.