Page MenuHomePhabricator

hidenondamaging is slow
Closed, ResolvedPublic

Description

It seems hidenondamaging in wikidatawiki is slow probably because there is no index on oresc_probability

Event Timeline

Restricted Application added subscribers: Zppix, Aklapper. · View Herald TranscriptJun 15 2016, 4:28 PM

This is the query that activating ORES does, taking between 1 minute and 20 seconds to execute, including a DEPENDENT SUBQUERY and reading 5 million rows:

MariaDB  db1026 wikidatawiki > EXPLAIN SELECT /* SpecialRecentChanges::doMainQuery Ladsgroup */ 
rc_id,rc_timestamp,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,
rc_this_oldid,rc_last_oldid,rc_type,rc_source,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,
rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,page_latest, 
(SELECT GROUP_CONCAT(ct_tag SEPARATOR ',') FROM `change_tag` WHERE ct_rc_id=rc_id ) AS `ts_tags`,
oresc_probability,0.8 AS ores_threshold 
FROM `recentchanges` 
LEFT JOIN `watchlist` ON (wl_user = '28859' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace)) 
LEFT JOIN `page` ON ((rc_cur_id=page_id)) 
LEFT JOIN `ores_classification` ON ((rc_this_oldid = oresc_rev AND oresc_is_predicted = 1 AND oresc_class = 1))
LEFT JOIN `ores_model` ON ((oresc_model = oresm_id AND oresm_is_current = 1))
WHERE rc_bot = '0' AND (rc_type != '6') AND (rc_timestamp >= '20160608000000') AND (rc_source != 'wb') AND 
((oresm_name = 'damaging' OR oresm_name IS NULL)) AND oresc_is_predicted = '1' AND (oresc_probability > '0.8') 
AND rc_patrolled = '0' AND rc_new IN ('0','1') 
ORDER BY rc_timestamp DESC 
LIMIT 50\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: recentchanges
         type: ref
possible_keys: rc_timestamp,new_name_timestamp,tmp_2
          key: tmp_2
      key_len: 1
          ref: const
         rows: 5700598
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: watchlist
         type: eq_ref
possible_keys: wl_user,namespace_title,wl_user_notificationtimestamp
          key: wl_user
      key_len: 265
          ref: const,wikidatawiki.recentchanges.rc_namespace,wikidatawiki.recentchanges.rc_title
         rows: 1
        Extra: 
*************************** 3. row ***************************
           id: 1
  select_type: PRIMARY
        table: ores_classification
         type: ref
possible_keys: oresc_rev,oresc_winner
          key: oresc_rev
      key_len: 4
          ref: wikidatawiki.recentchanges.rc_this_oldid
         rows: 1
        Extra: Using where
*************************** 4. row ***************************
           id: 1
  select_type: PRIMARY
        table: ores_model
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 2
          ref: wikidatawiki.ores_classification.oresc_model
         rows: 1
        Extra: Using where
*************************** 5. row ***************************
           id: 1
  select_type: PRIMARY
        table: page
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: wikidatawiki.recentchanges.rc_cur_id
         rows: 1
        Extra: 
*************************** 6. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: change_tag
         type: ref
possible_keys: change_tag_rc_tag
          key: change_tag_rc_tag
      key_len: 5
          ref: wikidatawiki.recentchanges.rc_id
         rows: 1
        Extra: Using index

Creating a better query would be a blocker to allowing the deploying of this functionality. Your aim should be on a few dozens of miliseconds, not seconds. I would recommend you checking the code of the other recentchanges filters.

jcrespo moved this task from Triage to Blocked external/Not db team on the DBA board.

Change 294693 had a related patch set uploaded (by Ladsgroup):
Performance boos on hidenondamaging

https://gerrit.wikimedia.org/r/294693

Ladsgroup moved this task from Backlog to Review on the Machine Learning Platform (Current) board.

By using INNER JOIN instead of LEFT JOIN in case of hidenondamaging, everything would tons and tons of times faster.

Change 294693 merged by jenkins-bot:
Performance boost on hidenondamaging

https://gerrit.wikimedia.org/r/294693

Change 294712 had a related patch set uploaded (by Ladsgroup):
Performance boost on hidenondamaging

https://gerrit.wikimedia.org/r/294712

Can you copy me the potential SQL queries generated by your code so I can run EXPLAIN on them on production to test them? Thank you.

EXPLAIN SELECT /* SpecialRecentChanges::doMainQuery Ladsgroup */ 
rc_id,rc_timestamp,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,
rc_this_oldid,rc_last_oldid,rc_type,rc_source,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,
rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,page_latest, 
(SELECT GROUP_CONCAT(ct_tag SEPARATOR ',') FROM `change_tag` WHERE ct_rc_id=rc_id ) AS `ts_tags`,
oresc_probability,0.8 AS ores_threshold 
FROM `recentchanges` 
LEFT JOIN `watchlist` ON (wl_user = '28859' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace)) 
LEFT JOIN `page` ON ((rc_cur_id=page_id)) 
INNER JOIN `ores_classification` ON ((rc_this_oldid = oresc_rev AND oresc_is_predicted = 1 AND oresc_class = 1))
LEFT JOIN `ores_model` ON ((oresc_model = oresm_id AND oresm_is_current = 1))
WHERE rc_bot = '0' AND (rc_type != '6') AND (rc_timestamp >= '20160608000000') AND (rc_source != 'wb') AND 
((oresm_name = 'damaging' OR oresm_name IS NULL)) AND (oresc_probability > '0.8') 
AND rc_patrolled = '0' AND rc_new IN ('0','1') 
ORDER BY rc_timestamp DESC 
LIMIT 50\G

That's the query you copied but with my changes (using INNER JOIN instead of LEFT JOIN)

I think the changes look good, 30K-100K rows visited using an index. It could be better, but I think for that I will have to talk to the tags maintainer (not an ORES issue).

Did you check what I briefly mentioned, which is using a specific recentchanges-role slave?

@jcrespo I wrote there, I hope I understood you correctly, We can't get that threshold from DB directly, we need to get user preferences and then determine the threshold based on that value.

No, I am talking about using a particular role when connecting to the database and selecting a DB_SLAVE. I get a 5x better performance when using a recentchanges slave instead of a main traffic one. Maybe your code already does it, but I can't tell.

Oh, this hook is not changing the DB slave role, It only adds condition and stuff, then send it back to mediawiki/core to run the DB query.

Change 294712 merged by jenkins-bot:
Performance boost on hidenondamaging

https://gerrit.wikimedia.org/r/294712

jcrespo closed this task as Resolved.Jun 16 2016, 3:40 PM