TranslateMetadata::get does full table scans unnecessarily
Closed, Resolved, Public

Description

Even when visiting just a single translation unit page with action=edit, TranslateMetadata::get does a full table scan on the translate_metadata table. That is almost certainly unnecessary, probably hurts performance, and probably won't scale in the long term. If some places genuinely need a full scan, only those places should do one; others, like action=edit on a single page, shouldn't.
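To make the difference in query shape concrete, here is a minimal sketch in Python against an in-memory SQLite table. The column names tmd_group, tmd_key and tmd_value are taken from the queries quoted in this task; the primary key, the sample rows and both helper functions are assumptions made for illustration, not the extension's actual code.

```python
import sqlite3


def make_db():
    """Build a throwaway table shaped like translate_metadata (assumed schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE translate_metadata ("
        " tmd_group TEXT NOT NULL,"
        " tmd_key TEXT NOT NULL,"
        " tmd_value TEXT NOT NULL,"
        " PRIMARY KEY (tmd_group, tmd_key))"  # assumption: lookups are by group and key
    )
    rows = [(f"page-{i}", "maxid", str(i)) for i in range(1000)]
    db.executemany("INSERT INTO translate_metadata VALUES (?, ?, ?)", rows)
    return db


def get_with_full_scan(db, group, key):
    """Load the whole table, then pick one value. Returns (value, rows_read)."""
    everything = {}
    query = "SELECT tmd_group, tmd_key, tmd_value FROM translate_metadata"
    for g, k, v in db.execute(query):
        everything.setdefault(g, {})[k] = v
    rows_read = sum(len(values) for values in everything.values())
    return everything.get(group, {}).get(key), rows_read


def get_targeted(db, group, key):
    """Ask only for the row that is needed. Returns (value, rows_read)."""
    query = (
        "SELECT tmd_value FROM translate_metadata"
        " WHERE tmd_group = ? AND tmd_key = ?"
    )
    rows = db.execute(query, (group, key)).fetchall()
    return (rows[0][0] if rows else None), len(rows)
```

Both helpers return the same value for a given group and key; the second element of the result shows how many rows each approach had to read to get it.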

Details

	Subject	Repo	Branch	Lines +/-
	Avoid table scans of translate_metadata when possible	mediawiki/extensions/Translate	master	+96 -53


Related Objects

Mentioned Here: T204026: DBPerformance warning "Query returned XXXX rows: query: SELECT * FROM `translate_metadata`"

Event Timeline

Glaisher created this task. Jun 24 2016, 5:14 PM

Restricted Application added subscribers: Zppix, Aklapper. Jun 24 2016, 5:14 PM

Glaisher moved this task from Backlog to maintenance and operational issues on the MediaWiki-extensions-Translate board. Jun 24 2016, 5:15 PM

Originally the table was so small that it was easier and faster to just load everything instead of doing multiple queries. That is most likely no longer the case.

It could be replaced with a simple caching layer that supports batching. Callers should be checked so that appropriate batching is done where required.
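A caching layer with batching could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the MetadataCache class, its method names and the query counter are hypothetical, and the table is accessed through Python's sqlite3 module rather than MediaWiki's database layer. Only the table and column names come from this task.

```python
import sqlite3


class MetadataCache:
    """Hypothetical per-request cache keyed by group, filled in batches."""

    def __init__(self, db):
        self.db = db
        self.cache = {}       # group -> {key: value}
        self.query_count = 0  # only here to make the batching visible

    def preload(self, groups):
        """Fetch all metadata for the given groups with a single query."""
        # Skip groups that are already cached; dict.fromkeys removes duplicates.
        missing = [g for g in dict.fromkeys(groups) if g not in self.cache]
        if not missing:
            return
        # Remember groups without any rows too, so they are not queried again.
        for group in missing:
            self.cache[group] = {}
        placeholders = ",".join("?" * len(missing))
        query = (
            "SELECT tmd_group, tmd_key, tmd_value FROM translate_metadata"
            f" WHERE tmd_group IN ({placeholders})"
        )
        self.query_count += 1
        for group, key, value in self.db.execute(query, missing):
            self.cache[group][key] = value

    def get(self, group, key):
        """Return one value, loading just that group if it was not preloaded."""
        self.preload([group])
        return self.cache[group].get(key)
```

A caller that knows it will need several groups calls preload() once up front, so later get() calls are served from memory. A caller that needs a single group just calls get() and pays for one narrow query instead of a scan of the whole table.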

Nikerabbit triaged this task as Medium priority. Jun 25 2016, 1:04 PM

Some interesting query data:

SELECT tmd_key, COUNT(*) AS n, AVG(CHAR_LENGTH(tmd_value)) AS ave, SUM(CHAR_LENGTH(tmd_value)) AS total FROM translate_metadata GROUP BY tmd_key ORDER BY n DESC;
stdClass Object
(
    [tmd_key] => prioritylangs
    [n] => 17656
    [ave] => 0.8527
    [total] => 15056
)
stdClass Object
(
    [tmd_key] => maxid
    [n] => 6018
    [ave] => 1.5254
    [total] => 9180
)
stdClass Object
(
    [tmd_key] => priorityforce
    [n] => 278
    [ave] => 2.7518
    [total] => 765
)
stdClass Object
(
    [tmd_key] => priorityreason
    [n] => 278
    [ave] => 18.6151
    [total] => 5175
)
stdClass Object
(
    [tmd_key] => description
    [n] => 161
    [ave] => 33.3106
    [total] => 5363
)
stdClass Object
(
    [tmd_key] => name
    [n] => 161
    [ave] => 24.3043
    [total] => 3913
)
stdClass Object
(
    [tmd_key] => subgroups
    [n] => 161
    [ave] => 1609.7391
    [total] => 259168
)


> SELECT tmd_key, COUNT(*) AS n, AVG(CHAR_LENGTH(tmd_group)) AS ave, SUM(CHAR_LENGTH(tmd_group)) AS total FROM translate_metadata GROUP BY tmd_key ORDER BY n DESC;
stdClass Object
(
    [tmd_key] => prioritylangs
    [n] => 17656
    [ave] => 51.1121
    [total] => 902435
)
stdClass Object
(
    [tmd_key] => maxid
    [n] => 6018
    [ave] => 46.4762
    [total] => 279694
)
stdClass Object
(
    [tmd_key] => priorityforce
    [n] => 278
    [ave] => 50.7302
    [total] => 14103
)
stdClass Object
(
    [tmd_key] => priorityreason
    [n] => 278
    [ave] => 50.7302
    [total] => 14103
)
stdClass Object
(
    [tmd_key] => description
    [n] => 161
    [ave] => 27.4845
    [total] => 4425
)
stdClass Object
(
    [tmd_key] => name
    [n] => 161
    [ave] => 27.4845
    [total] => 4425
)
stdClass Object
(
    [tmd_key] => subgroups
    [n] => 161
    [ave] => 27.4845
    [total] => 4425
)

I'm not seeing a simple way to cache anything.

How many rows are typically needed in such a request? What about the number of TranslateMetadata::get calls too?

Change 494146 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/extensions/Translate@master] Avoid table scan translate_metadata queries by using batched preloading

https://gerrit.wikimedia.org/r/494146

gerritbot added a project: Patch-For-Review. Mar 3 2019, 11:04 PM

Krinkle added a project: Performance-Team (Radar). Mar 12 2019, 9:16 PM

Krinkle moved this task from Limbo to Perf recommendation on the Performance-Team (Radar) board.

Is this the same as T204026?

In T138619#5019310, @Krinkle wrote:

Is this the same as T204026?

More or less.

Change 494146 merged by jenkins-bot:
[mediawiki/extensions/Translate@master] Avoid table scans of translate_metadata when possible

https://gerrit.wikimedia.org/r/494146

ReleaseTaggerBot added a project: MW-1.33-notes (1.33.0-wmf.24; 2019-04-02). Mar 29 2019, 6:01 PM

Some cases still do a table scan, but those are pages or API modules that aggregate everything, so it is not avoidable there.
