
[Story] Search index 10,000 article placeholders on cywiki for testing and evaluation purposes
Closed, Resolved · Public

Description

As an expansion of T144592, we should increase the number of indexed article placeholders on cywiki tenfold, making 10,000 placeholders indexable.

The current, very limited trial gives us some indication of the number of spider requests to expect (currently 100-450 per day). If requests increase linearly with the number of indexable placeholders, we would get between 1,000 and 4,500 requests per day (summed over all cywiki placeholders). Part of the point of increasing the number of indexed placeholders is to see whether this actually happens.
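The linear projection above can be sketched as a short calculation. This is only an illustration of the arithmetic: the factor of 10 is taken from the "ten times" increase described in this task, and the function name `project_linear` is made up for the example.

```python
# Observed spider requests per day during the current trial (from this task).
observed_low, observed_high = 100, 450

# Assumption: the number of indexable placeholders grows by a factor of 10,
# and requests scale purely linearly with it.
scale_factor = 10


def project_linear(requests_per_day: int, factor: int) -> int:
    """Project daily requests under the linear-scaling assumption."""
    return requests_per_day * factor


projected_low = project_linear(observed_low, scale_factor)
projected_high = project_linear(observed_high, scale_factor)

# prints "Projected spider requests per day: 1,000 to 4,500"
print(f"Projected spider requests per day: {projected_low:,} to {projected_high:,}")
```

If the request volume measured after deployment falls well outside this range, the linear assumption does not hold.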

Given that this also draws even more attention to the placeholders, there are some tasks we should (at least potentially) fix before going ahead with this.

Details

Related Gerrit Patches:
operations/mediawiki-config : master · Index article placeholders up to Q16956 on cywiki

Event Timeline

hoo created this task. · Apr 5 2017, 8:38 AM
Restricted Application added a subscriber: Aklapper. · Apr 5 2017, 8:38 AM
hoo renamed this task from "Search index 10,000 article placeholders on cywiki for testing and evaluation purposes" to "[Story] Search index 10,000 article placeholders on cywiki for testing and evaluation purposes". · Apr 5 2017, 8:38 AM
hoo triaged this task as Normal priority.
hoo moved this task from Incoming to To Do Next on the ArticlePlaceholder board.
hoo added subscribers: Lucie, Lydia_Pintscher.
hoo added a comment. · May 19 2017, 7:05 PM

We plan to deploy this on Wednesday May 31.

Change 356414 had a related patch set uploaded (by Hoo man; owner: Hoo man):
[operations/mediawiki-config@master] Index article placeholders up to Q16956 on cywiki

https://gerrit.wikimedia.org/r/356414

hoo added a comment. · May 31 2017, 4:10 PM
mysql:wikiadmin@db1092 [wikidatawiki]> SELECT page_title
    -> FROM page
    -> INNER JOIN wb_entity_per_page ON epp_page_id = page_id
    -> INNER JOIN page_props AS pp_sl ON pp_sl.pp_page = page_id AND pp_sl.pp_propname = 'wb-sitelinks'
    -> INNER JOIN page_props AS pp_st ON pp_st.pp_page = page_id AND pp_st.pp_propname = 'wb-claims'
    -> WHERE pp_st.pp_value > 2 AND pp_sl.pp_value > 3
    -> AND NOT EXISTS(SELECT 1 FROM wb_items_per_site WHERE ips_site_id = 'cywiki' AND ips_item_id = epp_entity_id)
    -> ORDER BY epp_entity_id ASC
    -> LIMIT 10000,1;
+------------+
| page_title |
+------------+
| Q16956     |
+------------+
1 row in set (0.58 sec)
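For readers unfamiliar with the query pattern, here is a minimal sketch of the same idea on toy data, using Python's built-in `sqlite3`. The two-table schema (`items`, `items_per_site`) and the helper `find_cutoff` are simplified stand-ins invented for this example, not the actual Wikibase tables. Only the filter thresholds (more than 2 claims, more than 3 sitelinks, no sitelink for the target site) and the `LIMIT offset,1` trick mirror the query above.

```python
import sqlite3


def find_cutoff(conn, site, offset):
    """Return the item ID at position `offset` (0-based) among eligible items.

    Eligible here means: more than 2 claims, more than 3 sitelinks, and no
    sitelink for `site` yet (so a placeholder could be shown there).
    """
    row = conn.execute(
        """
        SELECT entity_id FROM items
        WHERE claims > 2 AND sitelinks > 3
          AND NOT EXISTS (
              SELECT 1 FROM items_per_site
              WHERE site_id = ? AND item_id = items.entity_id)
        ORDER BY entity_id ASC
        LIMIT ?, 1
        """,
        (site, offset),
    ).fetchone()
    return None if row is None else f"Q{row[0]}"


# Hypothetical, simplified schema and data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE items (entity_id INTEGER PRIMARY KEY, claims INTEGER, sitelinks INTEGER);
    CREATE TABLE items_per_site (site_id TEXT, item_id INTEGER);
    """
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        (1, 5, 10),  # eligible
        (2, 1, 10),  # too few claims
        (3, 5, 10),  # already has a cywiki article (see below)
        (4, 5, 2),   # too few sitelinks
        (5, 3, 4),   # eligible
        (6, 9, 9),   # eligible
    ],
)
conn.execute("INSERT INTO items_per_site VALUES ('cywiki', 3)")

cutoff = find_cutoff(conn, "cywiki", 2)
print(cutoff)  # prints "Q6": eligible items are Q1, Q5, Q6; offset 2 is the third
```

Note that `LIMIT n,1` skips `n` rows and returns the next one, so the row returned is the (n+1)-th eligible item in ascending ID order.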

Change 356414 merged by jenkins-bot:
[operations/mediawiki-config@master] Index article placeholders up to Q16956 on cywiki

https://gerrit.wikimedia.org/r/356414

Mentioned in SAL (#wikimedia-operations) [2017-05-31T16:33:39Z] <hoo@tin> Synchronized wmf-config/InitialiseSettings.php: Index article placeholders up to Q16956 on cywiki (T162244) (duration: 00m 42s)

hoo closed this task as Resolved. · May 31 2017, 4:35 PM
hoo claimed this task.

All placeholders up to and including https://cy.wikipedia.org/wiki/Arbennig:Am_y_Pwnc/Q16956 are now indexable!

Lucie awarded a token. · Jun 1 2017, 9:14 AM