Faceted search for Commons location of creation
Open, Needs Triage · Public

Description

Faceted search is a technique which involves augmenting traditional search techniques with a faceted navigation system, allowing users to narrow down search results by applying multiple filters based on faceted classification of the items. A faceted classification system classifies each information element along multiple explicit dimensions, called facets, enabling the classifications to be accessed and ordered in multiple ways rather than in a single, pre-determined, taxonomic order.

More than 800,000 files on Commons currently have a location of creation (P1071) statement; see https://commons.wikimedia.org/w/index.php?search=haswbstatement%3AP1071&title=Special:Search&profile=advanced&fulltext=1&advancedSearch-current=%7B%7D&ns0=1&ns6=1&ns9=1&ns12=1&ns14=1&ns100=1&ns106=1 . We should start using this data to provide faceted search by location: if I search for something as a user (for example "church"), I should get the option to drill down into the results by location.
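As a rough illustration of the drill-down step, the sketch below builds a search API request that combines a free-text term with the `haswbstatement` keyword used in the link above. The function name `build_search_url` and the example value Q55 (Netherlands) are illustrative assumptions, not part of any existing MediaSearch code. Note that `haswbstatement:P1071=Q55` only matches files whose statement value is exactly Q55, not files located in places inside the Netherlands, which is why a location tree is still needed.

```python
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def build_search_url(term, location_qid=None, limit=20):
    """Build a Commons search API URL (hypothetical helper for this sketch).

    Without location_qid, results are limited to files that have any
    location of creation (P1071). With location_qid, results are limited
    to files whose P1071 value is exactly that item.
    """
    if location_qid:
        statement = f"haswbstatement:P1071={location_qid}"
    else:
        statement = "haswbstatement:P1071"
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"{term} {statement}",
        "srnamespace": 6,  # File namespace
        "srlimit": limit,
        "format": "json",
    }
    return f"{COMMONS_API}?{urlencode(params)}"


# Example: "church" narrowed to files created in the Netherlands (Q55).
print(build_search_url("church", "Q55"))
```

This only constructs the URL; fetching it and rendering the facet options is left out on purpose.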

Location is largely hierarchical, with some quirks. For example, https://commons.wikimedia.org/wiki/File:%27t_Kopje_van_Bloemendaal_fake_mailbox.jpg sits under World -> Europe -> Netherlands -> Noord-Holland -> Bloemendaal.
This tree could be based on a query on Wikidata, hard-coded, defined in a configuration page in the MediaWiki namespace, or a combination of these. Making it configurable in the MediaWiki namespace, combined with a tree query on Wikidata, is probably the most scalable approach.
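To make the tree idea concrete, here is a minimal sketch of how per-location counts could be rolled up the hierarchy so that each level offers its children as facet options. The `PARENT` map is hard-coded and hypothetical; in practice it might be generated from a Wikidata query or read from a configuration page, as suggested above. The function names are also just for illustration.

```python
from collections import Counter

# Hypothetical child -> parent map for the example path in this task.
PARENT = {
    "Bloemendaal": "Noord-Holland",
    "Noord-Holland": "Netherlands",
    "Netherlands": "Europe",
    "Europe": "World",
}


def ancestors(location):
    """Return the path from a location up to the root, inclusive."""
    path = [location]
    seen = {location}
    while path[-1] in PARENT:
        parent = PARENT[path[-1]]
        if parent in seen:  # guard against cycles in quirky data
            break
        path.append(parent)
        seen.add(parent)
    return path


def facet_counts(result_locations):
    """Count each result once for its location and every ancestor."""
    counts = Counter()
    for location in result_locations:
        counts.update(ancestors(location))
    return counts


def children_of(node, counts):
    """Facet options shown after the user has drilled down to node."""
    return {loc: n for loc, n in counts.items() if PARENT.get(loc) == node}


# One location per search result; two files in Bloemendaal, one tagged
# only at country level.
counts = facet_counts(["Bloemendaal", "Bloemendaal", "Netherlands"])
print(children_of("Europe", counts))
print(children_of("Netherlands", counts))
```

The sketch assumes a single parent per location; places that belong to more than one parent (one of the quirks mentioned above) would need a list of parents and de-duplication per file.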

Event Timeline

We now have more than 1 million files with location of creation on Commons, see https://commons.wikimedia.org/w/index.php?search=haswbstatement%3AP1071&title=Special%3ASearch

@CBogen why did you remove the search projects? This looks very search-related to me.

The Discovery-Search project is mainly for backend search work; this is a front-end project in MediaSearch on Commons so it would be part of the Structured Data team's purview.

Without backend work, I don't think faceted search will work, or you will end up with a severely handicapped version of it.

I created an overarching task for faceted search. I would really like to see that implemented, and it would be great to get a conversation going: T337106: Faceted, structured data-based MediaSearch on Wikimedia Commons