Page MenuHomePhabricator

geotags table is wasting a lot of auto_increment values
Closed, ResolvedPublic

Description

In dewiki, total number of rows in geotags table is around 1.8M but its auto increment value is 2.4B and while it's unsinged it's the highest ratio of that value to maximum value in the whole infra (context: T291332). If it continues like this, it will fully and unfixably break in a couple of years.

Possible causes or solutions:

  • It might be using upsert() a lot. Upsert is a terrible thing and every time it has to do update instead of insert, it wastes an auto-increment value.
  • There is something inherently wrong in the code, maybe it's pushing gt_id? because dewiki doesn't have 2.4B edit altogether to cause that many upserts.
  • Increasing the type to bigint and sweeping the problem under the rug.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

The extension does not use upsert.
Its delete + insert every coordinate on every LinksUpdate (normal edit or nulledit), because the compare if one coordinate is equal to another coordinate is strict comparing ints with strings (dim) and that is always false. Using weak compare work. Needs some work to get the correct types into the Coord class for compare.

Change 742272 had a related patch set uploaded (by Umherirrender; author: Umherirrender):

[mediawiki/extensions/GeoData@master] Cast database field to required scalar types in Coord::newFromRow

https://gerrit.wikimedia.org/r/742272

The extension does not use upsert.
Its delete + insert every coordinate on every LinksUpdate (normal edit or nulledit), because the compare if one coordinate is equal to another coordinate is strict comparing ints with strings (dim) and that is always false. Using weak compare work. Needs some work to get the correct types into the Coord class for compare.

I just love it when I report a bug and it magically gets a fix <3 Thanks!

Change 742272 merged by jenkins-bot:

[mediawiki/extensions/GeoData@master] Cast database field to required scalar types in Coord::newFromRow

https://gerrit.wikimedia.org/r/742272

Great find, but shouldn't we really add a testcase for this when we fix it ?

Umherirrender claimed this task.

The extension does not use upsert.
Its delete + insert every coordinate on every LinksUpdate (normal edit or nulledit), because the compare if one coordinate is equal to another coordinate is strict comparing ints with strings (dim) and that is always false. Using weak compare work. Needs some work to get the correct types into the Coord class for compare.

I just love it when I report a bug and it magically gets a fix <3 Thanks!

That's a bug tracker is for ;-)

Great find, but shouldn't we really add a testcase for this when we fix it ?

There are already tests for the equalTo and fullyEqualTo function, but not in combination with database rows, where typical all fields are strings (Coord::newFromRow).