Page MenuHomePhabricator

Add index log_type_action
Closed, ResolvedPublic

Description

Schema Change:

All wikis
tables: logging
changes: Add index log_type_action
commit status: pending review - https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/445177/

CREATE INDEX /*i*/log_type_action ON /*_*/logging(log_type, log_action, log_timestamp);

Some wikis
tables: logging
changes: Drop index type_action
commit status: pending review - https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/445177/

DROP INDEX /*i*/type_action ON /*_*/logging;

Combined SQL for some wikis:

ALTER TABLE logging
DROP INDEX IF EXISTS type_action,
ADD INDEX IF NOT EXISTS log_type_action (log_type, log_action, log_timestamp);

Schema change progress

  • wikitech
    • labswiki
    • labtestwiki

Related Objects

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 22 2014, 1:52 AM
bzimport set Reference to bz49199.

The gerrit link seems wrong?

The index name has no log* prefix while other newer indexes on the table do. Do we definitely want it applied as-is?

Adding in Matthias and Brad for their thoughts (as they are other reviewers on that gerrit change).

btw, to see what I copied from, this is where things were tracked before:
https://wikitech.wikimedia.org/w/index.php?title=Schema_changes&oldid=72637 (that's an old version of the page).

That Gerrit change doesn't add the index, it just makes use it. The index was added r93211 (July 2011) and altered to its current form in r94239 (August 2011).

Since both of those were put into SVN by Reedy, let's bring him in on this.

(In reply to comment #1)

The index name has no log* prefix while other newer indexes on the table do.
Do we definitely want it applied as-is?

It should be renamed to match the current naming conventions, IMO.

After all the changes on the logging table related to the actor refactoring, what should we do with this ticket?
Is this still needed? I cannot find the gerrit change btw

After all the changes on the logging table related to the actor refactoring, what should we do with this ticket?
Is this still needed? I cannot find the gerrit change btw

It's in SVN!

https://www.mediawiki.org/wiki/Special:Code/MediaWiki/94239
https://www.mediawiki.org/wiki/Special:Code/MediaWiki/93211

Per Brad, I added it badly named, so I guess I should fix that up. Noting some wikis will have no index, and some will have the badly named index

Change 445177 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/core@master] Rename index type_action to log_ type_action

https://gerrit.wikimedia.org/r/445177

SVN!! \o/
So the correct one is this one?:

CREATE INDEX /*i*/log_type_action ON /*_*/logging(log_type, log_action, log_timestamp);

It should be renamed to match the current naming conventions, IMO.

There's quite a few of them... Including more recent ones... ;)


-- Special:Log type filter
CREATE INDEX /*i*/type_time ON /*_*/logging (log_type, log_timestamp);

-- Special:Log performer filter
CREATE INDEX /*i*/user_time ON /*_*/logging (log_user, log_timestamp);
CREATE INDEX /*i*/actor_time ON /*_*/logging (log_actor, log_timestamp);

-- Special:Log title filter, log extract
CREATE INDEX /*i*/page_time ON /*_*/logging (log_namespace, log_title, log_timestamp);

-- Special:Log unfiltered
CREATE INDEX /*i*/times ON /*_*/logging (log_timestamp);

SVN!! \o/
So the correct one is this one?:

CREATE INDEX /*i*/log_type_action ON /*_*/logging(log_type, log_action, log_timestamp);

Yup! With a WIP patch to make it so on MediaWiki itself

You'll want to drop type_action at the same time if it exists...

After all the changes on the logging table related to the actor refactoring, what should we do with this ticket?

This index is unrelated to any of the actor table changes.

Is this still needed?

We should either create the index (with the modern-style name) in prod or remove it from MediaWiki's tables.sql.

It looks like T71713 would probably still benefit from the index.

You'll want to drop type_action at the same time if it exists...

type_action is still present in tables.sql
https://phabricator.wikimedia.org/source/mediawiki/browse/master/maintenance/tables.sql$1661

You'll want to drop type_action at the same time if it exists...

type_action is still present in tables.sql
https://phabricator.wikimedia.org/source/mediawiki/browse/master/maintenance/tables.sql$1661

Change 445177 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/core@master] Rename index type_action to log_ type_action

https://gerrit.wikimedia.org/r/445177

Cool! Once merged we can probably proceed then.
Would be also good to change the original description to match a bit the normal schema change template to avoid confusions: https://wikitech.wikimedia.org/wiki/Schema_changes#Workflow_of_a_schema_change

Reedy renamed this task from Add index type_action to Add index log_type_action.Jul 11 2018, 3:04 PM
Reedy updated the task description. (Show Details)
Reedy updated the task description. (Show Details)

Change 445177 merged by jenkins-bot:
[mediawiki/core@master] Rename index type_action to log_type_action

https://gerrit.wikimedia.org/r/445177

Merged into master, good to go now! :)

I have been taking a look at these indexes on enwiki, and we have two indexes in production that are not on tables.sql:

KEY `log_title_time` (`log_title`(16),`log_timestamp`),
KEY `log_title_type_time` (`log_title`(16),`log_type`,`log_timestamp`),

They look to be used, at least they are not being reported as unused on sys.schema_unused_indexes, so not sure if we should delete them, maybe commit them to tables.sql for now to make sure we have them there and have tables.sql consistent with what we have live, and later on a decide if we really need them dropped.

Is there any way to find which queries are using them? The (16) prefix on those strikes me as particularly odd, as does the fact that it's not (log_namespace, log_title(16), ...).

I'd think that queries using "log_title_time" in particular could probably use the existing page_time index instead, and anything using "log_title_type_time" could probably use a more sensible-seeming index on (log_namespace, log_title, log_type, log_timestamp).

Is there any way to find which queries are using them?

It should be available for hits since db start on sys/P_S, which access is blocked on T195578

It is certainly being used for some queries, I can see this counter increasing:

root@db1089.eqiad.wmnet[sys]> select rows_selected,select_latency from x$schema_index_statistics where table_name='logging' and index_name like 'log_title_%';
+---------------+----------------+
| rows_selected | select_latency |
+---------------+----------------+
|         14478 |  2739429620928 |
|           225 |    98277884848 |
+---------------+----------------+
2 rows in set (0.05 sec)

I don't doubt that it is being used. But as we've seen elsewhere the planner can sometimes make strange decisions. ;) Even if the planner's decision isn't strange, we might find that page_time and the other potential index mentioned in T51199#4969053 would work as well or better for the queries being made.

Can some of the queries from sys/P_S be copied here, or into a private paste, so I can look at them without waiting on T195578?

I don't think this is too useful https://phabricator.wikimedia.org/P8114 :(

Are you sure those are queries using the log_title_time or log_title_type_time indexes on logging? Those all look like they should be using PRIMARY for the logging table. None even have a condition on log_title.

No, those are all the logged queries involving the logging table that where logged on sys, that's why I said I didn't think it would be too useful :(

After analyzing the indexes on T217397 and testing a few hosts (T217397#4997997 and T217397#5010633) with the exact table definition on tables.sql, we are going to unify logging table based on tables.sql indexes.

The following hosts on s1 eqiad have the indexes unified with tables.sql (T217397#5010633)

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1003
  • dbstore1001
  • db1124
  • db1119
  • db1118
  • db1106
  • db1105
  • db1099
  • db1089
  • db1083
  • db1080
  • db1067

The following hosts on s4 eqiad have the indexes unified with tables.sql (T217397#5010633)

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1004
  • db1125
  • db1121
  • db1103
  • db1102
  • db1097
  • db1091
  • db1084
  • db1081
  • db1068

The following hosts on s5 eqiad have the indexes unified with tables.sql (T217397#5010633)

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1003
  • db1124
  • db1113
  • db1110
  • db1102
  • db1100
  • db1097
  • db1096
  • db1082
  • db1070

Mentioned in SAL (#wikimedia-operations) [2019-03-12T07:18:34Z] <marostegui> Deploy schema change on db2052 (s5 codfw master), this will generate lag on codfw T71127 T51199

s6 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1005
  • dbstore1001
  • db1125
  • db1113
  • db1098
  • db1096
  • db1093
  • db1088
  • db1085
  • db1061

s8 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1005
  • db1124
  • db1116
  • db1109
  • db1104
  • db1101
  • db1099
  • db1092
  • db1087
  • db1071

s7 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1003
  • db1125
  • db1116
  • db1101
  • db1098
  • db1094
  • db1090
  • db1086
  • db1079
  • db1062

s2 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1004
  • db1125
  • db1122
  • db1105
  • db1103
  • db1095
  • db1090
  • db1076
  • db1074
  • db1066

s3 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1004
  • db1124
  • db1123
  • db1095
  • db1075
  • db1077
  • db1078
Marostegui updated the task description. (Show Details)

All done! One less drift between production and tables.sql