Integrate Discourse into Wikimedia wikis search
Closed, Declined (Public)

Description

Summary

Search results on Wikimedia wikis only show content from within the wiki, not from other sources of discussion and documentation. Integrating Wikimedia Space search results would let users of Wikimedia wikis find more information from the Wikimedia movement, including technical and social news and discussion.

Requirements

When a user searches at Special:Search, they are presented with the on-wiki search results and, in addition, relevant search results from Wikimedia Space. Users can opt out of viewing the Space search results via an easily accessible method, such as a preference or a user CSS rule. Wikis can opt in to include search results. Users can also search Space results by enabling a filter in advanced search.

Search results are indexed in Elasticsearch and are kept up to date in the same way as the on-wiki index. Search results have proper relevancy (via scoring). The excerpt for each matching result will have the query highlighted in bold. The inclusion of search results can be configured on a per-wiki basis.
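As an illustration of the highlighting requirement only, here is a minimal Python sketch. The function name `highlight` and the choice of `<b>` markup are assumptions made for this example; a real integration would more likely rely on the highlighting that Elasticsearch itself can return.

```python
import html
import re


def highlight(excerpt: str, query: str) -> str:
    """Wrap each query term found in the excerpt in <b> tags (hypothetical helper)."""
    terms = [re.escape(term) for term in query.split()]
    if not terms:
        return html.escape(excerpt)
    pattern = re.compile("(" + "|".join(terms) + ")", re.IGNORECASE)
    # With one capturing group, re.split alternates: non-match, match, non-match, ...
    parts = pattern.split(excerpt)
    rendered = []
    for position, part in enumerate(parts):
        text = html.escape(part)  # escape each piece so the excerpt cannot inject markup
        rendered.append(f"<b>{text}</b>" if position % 2 == 1 else text)
    return "".join(rendered)
```

Matching is done on the raw text and each piece is escaped afterwards, so a query term cannot accidentally match inside an HTML entity.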

Feature

Search results from Wikimedia Space should include the following metadata:

  • Title of the topic on Space
  • Contextual excerpt that is returned for the keyword searched
  • Date of publication
  • Category and tags

Examples

Searching for "test" returns:

<Topic title>
<excerpt>
<date of publication> <categories> <tags>
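
The three-line layout above could be produced along these lines. This is a sketch under assumptions: the `SpaceResult` type, its field names, and the ISO date format are invented for illustration and are not part of any existing extension.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class SpaceResult:
    """Hypothetical container for the metadata listed in the Feature section."""
    title: str
    excerpt: str
    published: date
    categories: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)


def render_result(result: SpaceResult) -> str:
    """Render title, excerpt, and a metadata line, one per line."""
    meta = " ".join([result.published.isoformat(), *result.categories, *result.tags])
    return "\n".join([result.title, result.excerpt, meta])
```

The MVP described under Acceptance Criteria would only need the `title` field; the other fields cover the full feature.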

Benefit

Users who are new to the movement may assume that all movement knowledge is kept on-wiki. This is not the case. Including Wikimedia Space gives people more centralized results for what they are looking for. It also raises awareness of Wikimedia Space as a venue for news, help, and discussion. Movement organizers can also provide more insight into their discussions, events, and activities if they choose to use Wikimedia Space over alternatives (such as Facebook or Twitter) that cannot be indexed.

Acceptance Criteria

An MVP of search would show results from Space next to existing results, perhaps with simply the title of the Space topic.

A complete, stable extension can do the above as described, with compatibility with all major browsers, functionality with JavaScript disabled, ARIA compliance, and support for RTL languages.

Design

Similar to the "Results from sister projects" sidebar in current Wikimedia search results. Perhaps as a new section below, "Results from Wikimedia Space"

Event Timeline

The current storage format of search is that each mediawiki page is a document. Wrangling external things into this format might technically be possible, but would be a bit of work.

EBjune triaged this task as Lowest priority. Mar 29 2018, 5:09 PM
EBjune moved this task from needs triage to search-icebox on the Discovery-Search board.

As I understand it, Discourse will output .json if you ask it to. And Elasticsearch only cares about json. So wouldn't the work be to create a new index in Elasticsearch that consumes the .json available from Discourse? (I haven't done this, so I don't know the particulars)
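
A rough sketch of that idea, in Python: turn topic JSON into the newline-delimited body that the Elasticsearch bulk API accepts (an action line followed by a document line). The index name `space_topics` and the topic field names (`id`, `title`, `excerpt`, `created_at`, `category`, `tags`) are assumptions for illustration; the actual Discourse output would need to be checked. Fetching from Discourse and posting to Elasticsearch are left out.

```python
import json
from typing import Any, Dict, Iterable


def to_bulk_body(topics: Iterable[Dict[str, Any]], index_name: str = "space_topics") -> str:
    """Build an Elasticsearch bulk request body from topic dicts (field names assumed)."""
    lines = []
    for topic in topics:
        # Action line: index this document under the topic's id.
        action = {"index": {"_index": index_name, "_id": str(topic["id"])}}
        # Document line: only the metadata the task description asks for.
        doc = {
            "title": topic["title"],
            "excerpt": topic.get("excerpt", ""),
            "created_at": topic["created_at"],
            "category": topic.get("category"),
            "tags": topic.get("tags", []),
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n" if lines else ""
```

This does not address the point made earlier in the thread that the existing index treats each MediaWiki page as a document; it assumes a separate index for external content.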

As a poor man's substitute we could just add a "search on Discourse" button to the search page (from a default-on gadget, probably).

Sorry, I had missed / forgotten this request when creating a duplicate. I took the liberty of editing the description to explain possible use cases better.

I think the reference to follow is https://www.mediawiki.org/wiki/Cross-wiki_Search_Result_Improvements/Design

I don't know what the state of the art is for displaying related search results, but as a user I think I remember seeing results from other projects in a separate column. The equivalent here would be to see Discourse results in a separate column when showing the results of a query in MediaWiki.org.

Yup. An example search on enwiki (works the same in a private window) shows some sister projects in a sidebar.

I think a more fruitful starting place for this sort of thing would be doc.wikimedia.org. It's probably simpler as a semi-static thing, and I think at these early stages it has more relevant info than Discourse does at present (although ideally search would search all the things).

Qgil raised the priority of this task from Lowest to Low. Edited Sep 23 2019, 6:34 AM

@Bawolff Do you mean for testing? When it comes to real users and real content, there is not much connection between https://discuss-space.wmflabs.org and https://doc.wikimedia.org.

EDITED: Ah, I see the title and first paragraph were still mentioning MediaWiki.org alone. Maybe this was the point of confusion. We have re-purposed this task to cover any Wikimedia wiki.

Qgil renamed this task from Integrate Discourse into mediawiki.org search to Integrate Discourse into Wikimedia wikis search. Sep 23 2019, 6:39 AM
Qgil updated the task description.

Bawolff's point stands: there are several Wikimedia-related static websites nowadays.

OK, can these other possible search integrations be filed in separate tasks? Each integration has its reasons. I guess some work will be common to all of them, and some work will have to be specific.

The original use case was integrating discourse-mediawiki.wmflabs.org content into mediawiki.org search. That was sort of hypothetical, in that the effort to make discourse-mediawiki a truly active and useful support channel has not been put in yet, so integrating it in search would probably not be worth the effort. Since then the main use case has shifted to integrating discuss-space.wmflabs.org content into some unspecified wiki search...

...which brings me to my main point: there isn't necessarily a one Discourse -> one wiki relationship with Space, so this might need further thought. Consider that there is no way to disable sister project searches the way you can disable e.g. searching talk pages, so low-relevance results will be confusing. I think 1) there should be a way to match categories to wikis (e.g. the huwiki Space category should be surfaced on Hungarian Wikipedia but not elsewhere; the Tremendous Wiktionary UG category should only show on wiktionaries); 2) there should be a way to suppress categories (especially events, as there are a lot of them and text-based search is not particularly useful for them); 3) search should only show discussions in the right language (so this might depend on T226721: Multilingual Wikimedia Space getting a little more advanced).
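
To make the three rules concrete, here is a small Python sketch of how they might combine into one filter. Everything in it is hypothetical: the result fields (`category`, `lang`), the shape of the category-to-wiki mapping, and the wiki identifiers are assumptions for illustration, not an existing configuration format.

```python
from typing import Any, Dict, List, Set


def visible_results(
    results: List[Dict[str, Any]],
    wiki_id: str,
    wiki_lang: str,
    category_wikis: Dict[str, Set[str]],
    suppressed: Set[str],
) -> List[Dict[str, Any]]:
    """Keep only the Space results that should be shown on the given wiki."""
    kept = []
    for result in results:
        category = result["category"]
        # Rule 2: suppressed categories (e.g. events) are never shown.
        if category in suppressed:
            continue
        # Rule 1: a mapped category is shown only on its listed wikis;
        # categories without a mapping are shown everywhere.
        allowed = category_wikis.get(category)
        if allowed is not None and wiki_id not in allowed:
            continue
        # Rule 3: only show discussions in the wiki's language.
        if result["lang"] != wiki_lang:
            continue
        kept.append(result)
    return kept
```

For example, with `category_wikis = {"huwiki": {"huwiki"}}` and `suppressed = {"events"}`, a topic in the huwiki category would be kept on Hungarian Wikipedia and dropped everywhere else, while event topics would be dropped on every wiki.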

There are no active Discourse instances in Wikimedia currently (discourse-mediawiki.wmflabs.org and Space have been made read-only).
Hence boldly declining this task as there is no use case for Wikimedia currently.

Makes sense, thank you.

However, I'll note that this objective (and several others) would be reached if the discussions were imported into the wikis from Discourse.