Maniphest T194185

Implement searching of 'depicts' on commons with the 'inscription' qualifier
Open, Low, Public

Assigned To

Authored By

	Cparle
	May 8 2018, 5:05 PM

Description

The inscription property refers to ‘inscriptions, markings and signatures on an object’ and is of type ‘monolingual text’.

When used as a qualifier with 'depicts', it refers to markings on the thing that is depicted. Example from Wikidata: https://www.wikidata.org/wiki/Q1136099 (see depicts > Province of New York > inscription, etc.)

The example I've been using to model this conceptually is band t-shirts: an image could depict a t-shirt with 'The Rolling Stones' written on it, and a user might want to find all images containing pictures of Rolling Stones t-shirts.

I can think of 3 different ways of implementing this, all with drawbacks/tradeoffs

Option 1
We can store the qualifier in the normal way, like this: P180=Q131151[P1682=The Rolling Stones] (see T193407). In that case we would only be able to find exact matches. For example, searching for haswbstatement:P180=Q131151[P1682=Stones] won't match P180=Q131151[P1682=The Rolling Stones].
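
A minimal sketch of the exact-match behaviour described in option 1. It only models the keyword strings quoted above; the helper names `statement_keyword` and `exact_match` are hypothetical and are not part of any existing search code.

```python
def statement_keyword(prop, value, qualifier_prop, qualifier_value):
    """Build a keyword string in the shape quoted in the task description."""
    return f"{prop}={value}[{qualifier_prop}={qualifier_value}]"


def exact_match(indexed_keywords, query_keyword):
    # A keyword-style field only matches when the whole string is identical.
    return query_keyword in indexed_keywords


indexed = {statement_keyword("P180", "Q131151", "P1682", "The Rolling Stones")}
full = statement_keyword("P180", "Q131151", "P1682", "The Rolling Stones")
partial = statement_keyword("P180", "Q131151", "P1682", "Stones")

print(exact_match(indexed, full))     # True
print(exact_match(indexed, partial))  # False
```

The second lookup fails because the partial inscription produces a different keyword string, which is the limitation option 1 accepts.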

Option 1a
Perhaps we could pass a regex to the haswbstatement keyword? This would require changes to the mapping of the statement_keywords field.
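
A rough illustration of the regex idea, using Python's `re` module as a stand-in. This is an assumption-laden sketch: the regex syntax supported by elasticsearch differs from Python's, and the helper `regex_match` is hypothetical.

```python
import re


def regex_match(indexed_keywords, prop, value, qualifier_prop, qualifier_regex):
    """Return the stored keywords whose qualifier value matches the regex."""
    # The fixed part of the keyword is escaped so that '[' is taken literally.
    prefix = re.escape(f"{prop}={value}[{qualifier_prop}=")
    pattern = re.compile(prefix + "(?:" + qualifier_regex + ")" + re.escape("]"))
    return [k for k in indexed_keywords if pattern.fullmatch(k)]


indexed = [
    "P180=Q131151[P1682=The Rolling Stones]",
    "P180=Q131151[P1682=Ramones]",
    "P180=Q131151",
]

print(regex_match(indexed, "P180", "Q131151", "P1682", ".*Stones.*"))
```

Only the first keyword is returned, so a partial inscription can be found while the link to the 'depicts' value is preserved.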

Option 2
Implement a specific elasticsearch solution just for this qualifier. For example, we could store the inscription in a fulltext field, which would mean a partial match would work. It'd be tricky to do, because we'd need to treat one qualifier differently from all the others, both when indexing and when searching. Also, if we did it this way I'm not sure how to store the fact that the inscription relates to a particular 'depicts' tag (or even whether that'd be possible). So someone could search for pictures containing Rolling Stones t-shirts and some of their results would contain blank t-shirts plus some other object with the text 'The Rolling Stones' inscribed on it.
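
The false-positive problem with option 2 can be shown with a toy in-memory model. Everything here is hypothetical: the document layout, the file IDs, the placeholder item `Q_OTHER`, and the `search` helper are illustrations only, not the real index structure.

```python
documents = [
    # A t-shirt carrying the inscription.
    {"id": "M1", "depicts": ["Q131151"], "inscriptions": ["The Rolling Stones"]},
    # A blank t-shirt plus some other object (placeholder ID) carrying the inscription.
    {"id": "M2", "depicts": ["Q131151", "Q_OTHER"], "inscriptions": ["The Rolling Stones"]},
    # A blank t-shirt with no inscription at all.
    {"id": "M3", "depicts": ["Q131151"], "inscriptions": []},
]


def search(docs, depicts_id, text):
    """Match on 'depicts' and on inscription words, stored independently."""
    terms = text.lower().split()
    hits = []
    for doc in docs:
        if depicts_id not in doc["depicts"]:
            continue
        words = " ".join(doc["inscriptions"]).lower().split()
        if all(term in words for term in terms):
            hits.append(doc["id"])
    return hits


print(search(documents, "Q131151", "stones"))  # ['M1', 'M2']
```

The partial match works, but M2 is returned even though its t-shirt is blank, because the inscription text is no longer tied to a specific 'depicts' statement.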

Option 3
Another possible approach is to use the Wikidata Query Service (WDQS) to run a SPARQL query, and then use the IDs as a filter for an elasticsearch query. Basically we'd ask WDQS for all pictures depicting a t-shirt inscribed with 'The Rolling Stones', take all the resulting IDs, and then search elasticsearch for anything else we wanted to search for, but only among the (max 1000) IDs we got from WDQS.

Note that this option depends on T194401
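
A sketch of the two-step flow in option 3, building the two requests without sending them. It assumes commons statements would be queryable through WDQS with the usual statement and qualifier prefixes (which is what T194401 would have to provide). The field name `text` and the helpers `build_sparql` and `build_es_query` are hypothetical.

```python
MAX_IDS = 1000  # the cap mentioned in the task description


def build_sparql(depicts_item, inscription_text):
    """Build a SPARQL query for files depicting an item with a given inscription."""
    # Escape backslashes and double quotes so the text stays inside the literal.
    escaped = inscription_text.replace("\\", "\\\\").replace('"', '\\"')
    return f"""
SELECT ?file WHERE {{
  ?file p:P180 ?statement .
  ?statement ps:P180 wd:{depicts_item} ;
             pq:P1682 ?inscription .
  FILTER(CONTAINS(LCASE(STR(?inscription)), LCASE("{escaped}")))
}}
LIMIT {MAX_IDS}
"""


def build_es_query(user_query, ids):
    """Restrict a normal text query to the IDs returned by WDQS."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"text": user_query}}],
                # Only the first MAX_IDS identifiers are passed along.
                "filter": [{"ids": {"values": list(ids)[:MAX_IDS]}}],
            }
        }
    }


sparql = build_sparql("Q131151", "The Rolling Stones")
es_query = build_es_query("concert", ["M1", "M2"])
print(sparql)
print(es_query)
```

Because the ID list is capped, files that satisfy both conditions but fall outside the first 1000 WDQS results would never reach the elasticsearch step.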

Option 1 is the easiest, but only does exact matches unless the regex idea (option 1a) works, and regexes might be difficult to expose on the frontend in a user-friendly way.
Option 2 is tricky to implement and may return some incorrect results, but would probably be more performant than option 3.
Option 3 is in between, implementation-wise. It is probably the slowest to run. Results will be more accurate than option 2, but because of limitations on passing data between WDQS and elasticsearch there will be edge cases where no results are returned even if appropriate results exist. This option depends on T194401.

Wikidata currently contains 7 items with depicts statements that have inscription qualifiers out of a total of ~70k items with depicts statements (~0.01%)

Related Objects

Status	Subtype	Assigned	Task
Declined		dchen	T118706 Conduct heuristic evaluation of image upload and insert flow in VisualEditor
Open		None	T115858 Design improvements for mw.ForeignStructuredUpload.BookletLayout
Open		None	T115865 Insert image in content immediately after it's uploaded, skipping the "General settings" step
Duplicate		None	T115864 Figure out if the description of the image can be used as the caption on-wiki
Open	Feature	None	T53032 When inserting an image, set its caption by default to be the Commons image description
Open	Feature	None	T39534 Wikimedia Commons should support searching by color
Duplicate		None	T39535 Wikimedia Commons should support filtering by color
Resolved		None	T19503 Provide metadata support on Wikimedia Commons
Resolved		None	T51662 VisualEditor: Use Multimedia/Wikidata's proposed rich structured meta-data in the image insertion dialog
Resolved		None	T68108 [Epic] Store media information for files on Wikimedia Commons as structured data
Resolved		• Ramsey-WMF	T199352 Deploy Structured Data on Commons with arbitrary Statements
Resolved		None	T215305 "Depicts and other statements on a bicycle": Qualifiers, and search by depicts statements, and other statements
Resolved		Cparle	T191633 Implement searching of 'depicts' on commons
Open		Cparle	T194185 Implement searching of 'depicts' on commons with the 'inscription' qualifier

Event Timeline

Cparle triaged this task as Medium priority. May 8 2018, 5:05 PM

Cparle created this task.

Cparle updated the task description. May 8 2018, 5:19 PM

EBernhardson moved this task from needs triage to watching / waiting on the Discovery-Search board. May 8 2018, 7:25 PM

Cparle updated the task description. May 9 2018, 9:00 AM

Cparle updated the task description. May 9 2018, 10:09 AM

Cparle mentioned this in T194255: Implement searching of 'depicts' on commons with the 'relative position within image' qualifier. May 9 2018, 11:52 AM

Cparle updated the task description. May 10 2018, 2:29 PM

Cparle mentioned this in T194401: Investigate storing commons data in BlazeGraph. May 10 2018, 3:08 PM

Cparle updated the task description.

Cparle updated the task description. May 10 2018, 3:11 PM

Cparle updated the task description.

Cparle updated the task description. May 10 2018, 3:13 PM

Cparle updated the task description. May 11 2018, 11:13 AM

Cparle updated the task description. May 11 2018, 11:21 AM

• Ramsey-WMF moved this task from Untriaged to Product owner backlog on the Multimedia board. May 17 2018, 4:57 PM

• Vvjjkkii renamed this task from Implement searching of 'depicts' on commons with the 'inscription' qualifier to addaaaaaaa. Jul 1 2018, 1:11 AM

• Vvjjkkii removed Cparle as the assignee of this task.

• Vvjjkkii raised the priority of this task from Medium to High.

• Vvjjkkii added projects: CheckUser, Connected-Open-Heritage-Batch-uploads (RAÄ-KMB_1_2017-02), Tamil-Sites, Gamepress, Hashtags, Jade, KartoEditor, Language-2018-Apr-June, New-Editor-Experiences, Mail, TCB-Team (now WMDE-TechWish).

• Vvjjkkii updated the task description.

• Vvjjkkii removed a subscriber: Aklapper.

CommunityTechBot renamed this task from addaaaaaaa to Implement searching of 'depicts' on commons with the 'inscription' qualifier. Jul 2 2018, 6:14 AM

CommunityTechBot assigned this task to Cparle.

CommunityTechBot lowered the priority of this task from High to Medium.

CommunityTechBot updated the task description.

CommunityTechBot removed projects: TCB-Team (now WMDE-TechWish), Mail, New-Editor-Experiences, Language-2018-Apr-June, KartoEditor, Jade, Hashtags, Gamepress, Tamil-Sites, Connected-Open-Heritage-Batch-uploads (RAÄ-KMB_1_2017-02), CheckUser.

CommunityTechBot added a subscriber: Aklapper.

Addshore moved this task from incoming to monitoring on the Wikidata board. Sep 19 2018, 7:09 AM

Cparle lowered the priority of this task from Medium to Low. Nov 15 2018, 5:17 PM

MarkTraceur added a project: Structured Data Engineering (Depicts and other statements on a bicycle). Mar 1 2019, 4:12 PM

• Ramsey-WMF edited projects, added Structured Data Engineering; removed Structured Data Engineering (Depicts and other statements on a bicycle), Multimedia-Team-Working-Board, SDC General, Wikidata. Aug 8 2019, 11:41 PM

Aklapper edited projects, added Structured-Data-Backlog; removed Multimedia. Aug 10 2019, 11:55 PM

Restricted Application added a project: Multimedia. Aug 10 2019, 11:55 PM

• Ramsey-WMF moved this task from Triage to Desired epics on the Structured-Data-Backlog board. Aug 13 2019, 7:30 PM

• Ramsey-WMF removed a project: Multimedia. Aug 14 2019, 5:54 PM

CBogen moved this task from Desired epics to Triage on the Structured-Data-Backlog board. Aug 25 2020, 4:34 PM

CBogen moved this task from Triage to SDoC Statements on the Structured-Data-Backlog board. Aug 25 2020, 4:36 PM

Closing out low/est priority tasks over 6 months old with no activity within last 6 months in order to clean out the backlog of tickets we will not be addressing in the near term. Please feel free to reopen if you think a ticket is important, but bear in mind that given current priorities and resourcing, it is unlikely for the Search team to pick up these tasks for the indefinite future. We hope that the requested changes have either been addressed by or made irrelevant by work the team has done or is doing -- e.g. upgrading Elasticsearch to a newer version will solve various ES-related problems -- or will be subsumed by future work in a more generalized way.

Re-opening tasks and removing from team workboard per IRC feedback given yesterday and discussion with MPham.
