
Adding "search inside" feature to wikisource
Open, Needs Triage · Public

Description

A search box for books and sources on Wikisource that lets readers jump to the parts they want using keywords or phrases.

Event Timeline

Restricted Application added a subscriber: Aklapper. · Jan 29 2017, 10:58 PM

To add a search form to a work, you can use the {{engine}} template, or add an inputbox more generally.

Restricted Application added projects: Discovery, Discovery-Search. · Jan 30 2017, 11:35 AM

As Sam mentioned, {{engine}} is really useful and is used in many places on English Wikisource. It works well for searching inside (drilling down into) a root page or work that is set out in subpages; works, or a number of our Wikisource: namespace pages, are available as examples. It also works well on an Index: page to search the Page: namespace (check the index pages for the four volumes of Alumni Oxonienses), and we haven't really had to customise it with special components from CirrusSearch.

If you can be more specific about the types of searches you are looking to use, then we can give more specific advice.

There's not much for Discovery-Search to do here. The search engine supports this feature. An inputbox with the appropriate prefix should solve the problem, as @Samwilson said. So, I'm removing Discovery-Search from this.
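As a rough illustration of what "an inputbox with the appropriate prefix" ends up sending to the search engine, here is a minimal Python sketch that builds a work-scoped search URL. The function names are hypothetical, and the parameter layout simply mirrors the search links quoted later in this thread; it is not taken from the {{engine}} template itself.

```python
from urllib.parse import urlencode

# Base URL as it appears in the search links quoted in this thread.
SEARCH_ENDPOINT = "https://en.wikisource.org/w/index.php"


def scoped_query(term: str, work: str) -> str:
    """Combine a search term with a prefix: filter for one work.

    The prefix: filter is placed last, as in the example queries
    in this thread.
    """
    return f"{term} prefix:{work}"


def scoped_search_url(term: str, work: str) -> str:
    """Build a Special:Search URL restricted to a work and its subpages."""
    params = {
        "title": "Special:Search",
        "fulltext": "1",
        "search": scoped_query(term, work),
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(scoped_search_url('"preliminary tests"', "Popular Science Monthly"))
```

The reader only supplies the term; the work title is fixed by whoever places the search form on the page.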

Thanks @Deskana.

Something odd seems to happen when searching pages that use the <pages /> tag and the search string is found in the page title.

For example, this search finds [[Popular Science Monthly/Volume 70/February 1907/A Vocabulary Test]] but the extract just shows the text from the pages tag:
test prefix:Popular Science Monthly
https://en.wikisource.org/wiki/Special:Search?search=test&prefix=Popular+Science+Monthly&fulltext=Search+PopSciMon&fulltext=Search&searchToken=ab6s4as745mjqn4vsrqng7d3q

Whereas this search finds the same page, but this time correctly shows the excerpt from the page (i.e. transcluded content):
"preliminary tests" prefix:Popular Science Monthly
https://en.wikisource.org/w/index.php?title=Special:Search&profile=default&fulltext=1&search=%22preliminary+tests%22+prefix%3APopular+Science+Monthly&searchToken=bjnyo7bivlgmhq1f5pwm48zi3
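To compare the extracts for several queries side by side, one option is to ask the MediaWiki Action API for search snippets rather than reading the results page. The sketch below only builds the request URLs and makes no network calls. The helper name is hypothetical, and it is an assumption that the API snippet matches the extract shown on Special:Search.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikisource.org/w/api.php"


def snippet_request_url(term: str, work: str, limit: int = 5) -> str:
    """Build an Action API URL that returns search snippets for one work."""
    params = {
        "action": "query",
        "list": "search",
        # Same "term prefix:Work" form as the queries quoted in this thread.
        "srsearch": f"{term} prefix:{work}",
        "srprop": "snippet",
        "srlimit": str(limit),
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


# Queries discussed in this thread: two whose term also occurs in the
# page title or metadata, and one that only matches the transcluded text.
QUERIES = ["test", "kirkpatrick", '"preliminary tests"']

if __name__ == "__main__":
    for query in QUERIES:
        print(snippet_request_url(query, "Popular Science Monthly"))
```

Fetching each URL and reading the `snippet` field of the results should show whether the extract comes from the transcluded text or from the title and metadata.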

@Samwilson I think this is a general issue with how the extract is chosen. A search for "kirkpatrick prefix:Popular Science Monthly" also appears to have the same problem as your first query, because that's also included in the metadata. Generally, I think the extract should only show what a user would consider to be content, rather than metadata like this.

I'm very unfamiliar with Wikisource, so please correct me if I'm wrong or if I've misunderstood the point you were making.

@Samwilson I think this is a general issue with how the extract is chosen. A search for "kirkpatrick prefix:Popular Science Monthly" also appears to have the same problem as your first query

Here's the search I did, for convenience: https://en.wikisource.org/w/index.php?title=Special:Search&profile=default&fulltext=1&search=kirkpatrick+prefix%3APopular+Science+Monthly

@Deskana: yes, that's the same problem. I guess it searches titles and, if it finds a match there, just retrieves the part of the body text that also contains that text? And doesn't expand templates in the body text? I'd say it's a bug with ProofreadPage (and maybe it is) because it does work correctly when the search term isn't in the title. I'm not at all familiar with how the extract is chosen in the different cases.

This comment was removed by Billinghurst.

I suspect this would also be useful for Wikibooks.

{{engine}} can be useful, but I doubt it's ideal from a UI perspective.

  • The search function should be accessible from the regular search box itself, through either a checkbox or a set of options (like on GitHub) for selecting "Search Wikisource" or "Search <work>".
  • Exposing the "prefix:" syntax to the user is also confusing. The scope should probably be its own marker on the search page, with no search syntax exposed.