
Track Count grouped by Number of labels, descriptions & aliases per item Wikidata
Closed, Declined | Public

Description

  • Count grouped by Number of labels, descriptions & aliases per item
    • Should be possible through the wb_terms table

Event Timeline

Addshore claimed this task.
Addshore raised the priority of this task from to Needs Triage.
Addshore updated the task description.
Addshore added subscribers: StudiesWorld, Addshore, Aklapper.

Although this should be possible through the wb_terms table, it is going to end up being a VERY slow query.

It could all be calculated using:

SELECT
	COUNT(*) AS count,
	term_entity_id,
	term_entity_type,
	term_type,
	term_language
FROM wikidatawiki.wb_terms
GROUP BY term_entity_type, term_type, term_language, term_entity_id;

Addshore set Security to None.

SELECT
	a.terms AS terms,
	a.type AS type,
	COUNT(*) AS count
FROM (
	SELECT
		term_entity_id AS entity,
		term_type AS type,
		COUNT(*) AS terms
	FROM wikidatawiki.wb_terms
	WHERE term_type IN ( 'label', 'description' )
	GROUP BY term_entity_id, term_type
) AS a
GROUP BY a.terms, a.type

This query takes an hour and 54 minutes to run >.>

There is no index on term_type ;(
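
To make the shape of the result easier to see, here is a minimal sketch that runs the nested query against a tiny in-memory SQLite table. The table is only a stand-in for wikidatawiki.wb_terms and the rows are made up for illustration; SQLite is an assumption chosen so the example runs anywhere, not the engine used in production. The query does not filter on term_entity_type, so the toy data contains items only.

```python
import sqlite3

# Toy stand-in for wikidatawiki.wb_terms (assumption: only the columns
# the query touches, with invented rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wb_terms ("
    " term_entity_id INTEGER, term_entity_type TEXT,"
    " term_type TEXT, term_language TEXT)"
)
rows = [
    (1, "item", "label", "en"),
    (1, "item", "label", "de"),
    (1, "item", "description", "en"),
    (2, "item", "label", "en"),
    (2, "item", "alias", "en"),  # excluded by the WHERE clause
    (3, "item", "label", "en"),
    (3, "item", "label", "fr"),
]
conn.executemany("INSERT INTO wb_terms VALUES (?, ?, ?, ?)", rows)

# Inner query: number of terms per entity and term type.
# Outer query: how many entities share each (term count, term type) pair.
HISTOGRAM_SQL = """
SELECT a.terms AS terms, a.type AS type, COUNT(*) AS count
FROM (
    SELECT term_entity_id AS entity, term_type AS type, COUNT(*) AS terms
    FROM wb_terms
    WHERE term_type IN ('label', 'description')
    GROUP BY term_entity_id, term_type
) AS a
GROUP BY a.terms, a.type
ORDER BY a.type, a.terms
"""

histogram = conn.execute(HISTOGRAM_SQL).fetchall()
for terms, term_type, count in histogram:
    print(f"{count} entities have {terms} {term_type}(s)")
```

Each output row reads as "this many entities have exactly this many terms of this type". Entities with no terms of a given type have no rows in the table, so they never show up in the result.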

Addshore claimed this task.

Okay, right now I am going to mark this as declined / not going to do.
The value of this right now is unclear.
If we try to move forward with this particular task again, using the dumps would be the only way forward!
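
As a rough sketch of what the dump-based route could look like, the following assumes the Wikidata JSON dump layout: one JSON array with one entity per line, where each entity carries "labels", "descriptions" and "aliases" maps keyed by language. The function name term_count_histogram and the sample entities are hypothetical and only there for illustration; nothing like this was actually run for this task.

```python
import json
from collections import Counter


def term_count_histogram(lines):
    """Map (term_type, number_of_terms) to the number of items with that count."""
    histogram = Counter()
    for line in lines:
        # Each entity sits on its own line, followed by a comma;
        # the first and last lines are the array brackets.
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue
        entity = json.loads(line)
        if entity.get("type") != "item":
            continue  # skip properties and other entity types
        labels = entity.get("labels", {})
        descriptions = entity.get("descriptions", {})
        aliases = entity.get("aliases", {})
        histogram[("label", len(labels))] += 1
        histogram[("description", len(descriptions))] += 1
        # Aliases are stored as a list per language, so count list entries.
        histogram[("alias", sum(len(v) for v in aliases.values()))] += 1
    return histogram


# Invented sample in the assumed dump layout.
sample_dump = [
    "[",
    json.dumps({
        "id": "Q1", "type": "item",
        "labels": {"en": {"language": "en", "value": "a"},
                   "de": {"language": "de", "value": "b"}},
        "descriptions": {"en": {"language": "en", "value": "c"}},
        "aliases": {"en": [{"language": "en", "value": "d"},
                           {"language": "en", "value": "e"}]},
    }) + ",",
    json.dumps({
        "id": "Q2", "type": "item",
        "labels": {"en": {"language": "en", "value": "f"}},
        "descriptions": {}, "aliases": {},
    }) + ",",
    json.dumps({"id": "P1", "type": "property", "labels": {},
                "descriptions": {}, "aliases": {}}),
    "]",
]

result = term_count_histogram(sample_dump)
for (term_type, n_terms), items in sorted(result.items()):
    print(f"{items} item(s) have {n_terms} {term_type}(s)")
```

For a real dump the same function could be fed a file object, for example one opened with gzip.open in "rt" mode. Unlike the wb_terms query, this also counts items that have zero terms of a given type.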