Page MenuHomePhabricator

Some tables lack unique or primary keys, may allow confusing duplicate data
Closed, ResolvedPublic

Description

All database tables should have PRIMARY KEY's, or at minimum, a UNIQUE index

Related Objects

View Standalone Graph
This task is connected to more than 200 other tasks. Only direct parents and subtasks are shown here. Use View Standalone Graph to show more of the graph.
StatusSubtypeAssignedTask
InvalidNone
DuplicateNone
ResolvedNone
Resolved RobLa-WMF
ResolvedNone
ResolvedLSobanski
ResolvedNone
DeclinedNone
Resolveddaniel
Resolvedmatthiasmullie
ResolvedTTO
ResolvedMarostegui
ResolvedMarostegui
ResolvedAddshore
StalledNone
OpenNone
OpenNone
ResolvedMarostegui
ResolvedReedy
ResolvedReedy
ResolvedMarostegui
OpenLadsgroup
ResolvedMarostegui
ResolvedMarostegui
ResolvedMarostegui
ResolvedMarostegui
ResolvedReedy
ResolvedMarostegui
ResolvedSep 9 2018Marostegui
ResolvedMarostegui
DuplicateNone
ResolvedDannyS712
ResolvedUmherirrender
ResolvedDannyS712
ResolvedDannyS712
DeclinedNone
OpenNone
OpenNone

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Most are not too large, but oldimage will cause problem on commons in the future. We'll see.

ls_field_val is used in a FORCE INDEX, which is causing errors like:

[Exception Wikimedia\Rdbms\DBQueryError] (/srv/mediawiki/php-1.29.0-wmf.21/includes/libs/rdbms/database/Database.php:1075) A database query error has occurred. Did you forget to run your application's database schema updater after upgrading? 
Query: SELECT  log_id,log_type,log_action,log_timestamp,log_user,log_user_text,log_namespace,log_title,log_comment,log_params,log_deleted,user_id,user_name,user_editcount,(SELECT  GROUP_CONCAT(ct_tag SEPARATOR ',')  FROM `change_tag`    WHERE ct_log_id=log_id  ) AS `ts_tags`  FROM `logging` FORCE INDEX (PRIMARY) LEFT JOIN `user` ON ((log_user=user_id)) INNER JOIN `log_search` FORCE INDEX (ls_field_val) ON ((ls_log_id=log_id))   WHERE log_type = 'delete' AND log_action = 'revision' AND ls_field = 'rev_id' AND ls_value = 'X' AND (log_type != 'suppress') AND log_namespace = 'X' AND log_title = 'X' AND ((log_deleted & 9) != 9)  ORDER BY log_timestamp DESC LIMIT 26  
Function: IndexPager::buildQueryInfo (LogPager)
Error: 1176 Key 'ls_field_val' doesn't exist in table 'log_search' (10.64.0.206)

  #0 /srv/mediawiki/php-1.29.0-wmf.21/includes/libs/rdbms/database/Database.php(933): Wikimedia\Rdbms\Database->reportQueryError(string, integer, string, string, boolean)
  #1 /srv/mediawiki/php-1.29.0-wmf.21/includes/libs/rdbms/database/Database.php(1269): Wikimedia\Rdbms\Database->query(string, string)
  #2 /srv/mediawiki/php-1.29.0-wmf.21/includes/pager/IndexPager.php(368): Wikimedia\Rdbms\Database->select(array, array, array, string, array, array)
  #3 /srv/mediawiki/php-1.29.0-wmf.21/includes/pager/IndexPager.php(225): IndexPager->reallyDoQuery(string, integer, boolean)
  #4 /srv/mediawiki/php-1.29.0-wmf.21/includes/logging/LogPager.php(428): IndexPager->doQuery()
  #5 /srv/mediawiki/php-1.29.0-wmf.21/includes/pager/IndexPager.php(422): LogPager->doQuery()
  #6 /srv/mediawiki/php-1.29.0-wmf.21/includes/logging/LogEventsList.php(623): IndexPager->getBody()
  #7 /srv/mediawiki/php-1.29.0-wmf.21/includes/specials/SpecialRevisiondelete.php(224): LogEventsList::showLogExtract(OutputPage, string, Title, string, array)
  #8 /srv/mediawiki/php-1.29.0-wmf.21/includes/specialpage/SpecialPage.php(522): SpecialRevisionDelete->execute(NULL)
  #9 /srv/mediawiki/php-1.29.0-wmf.21/includes/specialpage/SpecialPageFactory.php(578): SpecialPage->run(NULL)
  #10 /srv/mediawiki/php-1.29.0-wmf.21/includes/MediaWiki.php(287): SpecialPageFactory::executePath(Title, RequestContext)
  #11 /srv/mediawiki/php-1.29.0-wmf.21/includes/MediaWiki.php(862): MediaWiki->performRequest()
  #12 /srv/mediawiki/php-1.29.0-wmf.21/includes/MediaWiki.php(523): MediaWiki->main()
  #13 /srv/mediawiki/php-1.29.0-wmf.21/index.php(43): MediaWiki->run()
  #14 /srv/mediawiki/w/index.php(3): include(string)
  #15 {main}

Change 351653 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@master] Use IGNORE INDEX(ls_log_id) instead of FORCE INDEX(ls_field_val)

https://gerrit.wikimedia.org/r/351653

Mentioned in SAL (#wikimedia-operations) [2017-05-03T15:13:06Z] <catrope@naos> Synchronized php-1.29.0-wmf.21/includes/logging/LogPager.php: Replace FORCE INDEX(ls_field_val) with IGNORE INDEX(ls_log_id) (https://gerrit.wikimedia.org/r/#/c/351653/ for T17441) (duration: 01m 14s)

Change 351653 merged by jenkins-bot:
[mediawiki/core@master] Use IGNORE INDEX(ls_log_id) instead of FORCE INDEX(ls_field_val)

https://gerrit.wikimedia.org/r/351653

Change 351705 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@wmf/1.29.0-wmf.21] Use IGNORE INDEX(ls_log_id) instead of FORCE INDEX(ls_field_val)

https://gerrit.wikimedia.org/r/351705

Change 351705 merged by jenkins-bot:
[mediawiki/core@wmf/1.29.0-wmf.21] Use IGNORE INDEX(ls_log_id) instead of FORCE INDEX(ls_field_val)

https://gerrit.wikimedia.org/r/351705

Change 370190 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[mediawiki/core@master] tables.sql: Convert UNIQUE keys into PK

https://gerrit.wikimedia.org/r/370190

I would like someone to help getting this: https://gerrit.wikimedia.org/r/#/c/370190/ merged and deployed.

To sum up, we have finished converting UNIQUE into PRIMARY keys and we would like this to be reflected on tables.sql so future wikis are created with the same PKs that we have deployed in core.

Thanks!

So... After 1.30.0-wmf.17 is everywhere, our DBA can drop the other two primary indexes that were blocked on a code change

ALTER TABLE /*_*/pagelinks DROP INDEX /*i*/pl_from;
ALTER TABLE /*_*/templatelinks DROP INDEX /*i*/tl_from;

hitcounter removed from core. externalinks now has el_id int unsigned NOT NULL PRIMARY KEY AUTO_INCREMENT,

I suspect archive and oldimage should be left alone due to the other major db overhauls to be done...

So querycache querycachetwo user_newtalk probably still want sorting?

Thank you very much @Reedy , I will create a separate task for deployment tracking- so we do not spam wmf-deployment related tasks on this mediawiki task: T174509

@Reedy I understand that T162774 should be resolved, too?

@Reedy I understand that T162774 should be resolved, too?

I think so, yup. Will do so

I suspect archive and oldimage should be left alone due to the other major db overhauls to be done...

Archive should have ar_id already- if it doesn't have on wmf, it would be a deployment error, but I belive that is not the case or at least it is not widespread.

For image I would ask for an estimation to reformat the image tables -if anyone plans to work on it soon-, if it is long term it may be reasonable to deploy a PK shorter term (it may even facilitate further schema changes). Edit: it already has img_name ?

I have yet to see querycache querycachetwo user_newtalk, but I suspect they may be much easier to fix and deploy, and not much of a problem.

Image does have img_name

oldimage also has oi_name, but in this case, it's not unique, as an image can have multiple revisions... oi_archive_name may work (but is quite a long string)

querycache has a patch... querycachetwo doesn't yet, neither does user_newtalk (need to decide what to use as a PK, likely a composite)

If we use qc_type, qc_value for querycache... We can probably use qcc_value, qcc_type for querycachetwo

Oh, sorry, I confused image and oldimage, of course, silly me.

Still a work in progress.

LSobanski claimed this task.
LSobanski subscribed.

This task is too large in scope. Tasks for individual tables exist and can be tackled separately.