Page MenuHomePhabricator

No replacement for drunique in deprecated module deletedrevs
Closed, ResolvedPublic

Description

The flag makes the query return one revision per deleted title. This functionality is missing from the replacement module alldeletedrevisions.

Details

Related Gerrit Patches:

Event Timeline

MER-C created this task.Aug 29 2015, 9:34 AM
MER-C raised the priority of this task from to Needs Triage.
MER-C updated the task description. (Show Details)
MER-C added a project: MediaWiki-API.
MER-C added a subscriber: MER-C.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptAug 29 2015, 9:34 AM
Anomie added a subscriber: Anomie.

The way it was being done in the old module involves a MySQL misfeature. What exactly is the use case here, to help in planning a replacement?

MER-C added a comment.Sep 1 2015, 4:10 AM

To search the archive table for titles with a given prefix without having to iterate through the entire list of deleted revisions -- essentially like https://en.wikipedia.org/w/index.php?title=Special%3AUndelete&prefix=John . The current situation is not desirable if you hit a page with 5000 deleted revisions.

Anomie added a comment.Sep 1 2015, 7:29 PM

Would it be satisfactory if this only worked in generator mode and didn't work with 'gadruser', i.e. we just fix gadrgeneratetitles=1 to not internally process all the revisions? That looks like the best bet for a sane query.

Anomie added a subscriber: jcrespo.Sep 4 2015, 8:38 PM

It turns out I need to add one more restriction (unless @jcrespo has ideas?): you also need to pass gadrdir=newer.

Sample queries that seem to work:

SELECT DISTINCT ar_title,ar_namespace  FROM `archive`    ORDER BY ar_namespace ,ar_title  LIMIT 11
SELECT DISTINCT ar_title,ar_namespace  FROM `archive`   WHERE (ar_title LIKE 'Foo%' )  ORDER BY ar_namespace ,ar_title  LIMIT 11
SELECT DISTINCT ar_title,ar_namespace  FROM `archive`   WHERE ar_namespace = '0'  ORDER BY ar_title  LIMIT 11
SELECT DISTINCT ar_title,ar_namespace  FROM `archive`   WHERE ar_namespace IN ('0','1','2','3')   ORDER BY ar_namespace ,ar_title  LIMIT 11
SELECT DISTINCT ar_title,ar_namespace  FROM `archive` INNER JOIN `change_tag` ON ((ar_rev_id=ct_rev_id))  WHERE ct_tag = 'Addition of interwiki link'  ORDER BY ar_namespace ,ar_title  LIMIT 11
SELECT DISTINCT ar_title,ar_namespace  FROM `archive` INNER JOIN `change_tag` ON ((ar_rev_id=ct_rev_id))  WHERE ct_tag = 'Addition of interwiki link' AND ar_namespace IN ('0','1','2','3')   ORDER BY ar_namespace ,ar_title  LIMIT 11
SELECT DISTINCT ar_title,ar_namespace  FROM `archive`   WHERE (ar_namespace > 8 OR (ar_namespace = 8 AND ar_title >= 'Gadgets-definition'))  ORDER BY ar_namespace ,ar_title  LIMIT 11
SELECT DISTINCT ar_title,ar_namespace  FROM `archive`   WHERE (ar_title LIKE 'Foo%' ) AND (ar_namespace > 8 OR (ar_namespace = 8 AND ar_title >= 'Gadgets-definition'))  ORDER BY ar_namespace ,ar_title  LIMIT 11

If you try to do it with gadrdir=older, MySQL decides it has to filesort on the continuation queries for some reason:

SELECT DISTINCT ar_title,ar_namespace  FROM `archive`   WHERE (ar_namespace < 8 OR (ar_namespace = 8 AND ar_title <= 'Gadgets-definition'))  ORDER BY ar_namespace DESC,ar_title DESC LIMIT 11;
SELECT DISTINCT ar_title,ar_namespace  FROM `archive`   WHERE (ar_title LIKE 'Foo%' ) AND (ar_namespace < 8 OR (ar_namespace = 8 AND ar_title <= 'Gadgets-definition'))  ORDER BY ar_namespace DESC,ar_title DESC LIMIT 11;

Change 236190 had a related patch set uploaded (by Anomie):
ApiQueryDeletedRevisions: Optimize ascenting title generation

https://gerrit.wikimedia.org/r/236190

Change 236190 merged by jenkins-bot:
ApiQueryDeletedRevisions: Optimize ascending title generation

https://gerrit.wikimedia.org/r/236190

Anomie closed this task as Resolved.Sep 10 2015, 2:26 PM
Anomie claimed this task.

Should be deployed to WMF wikis with 1.26wmf23, see https://www.mediawiki.org/wiki/MediaWiki_1.26/Roadmap for the schedule.