
Search result snippets should skip parenthetical phrases (like Google does)
Open, Lowest, Public, Feature

Description

Compare the following:

Barack Hussein Obama II. (Barack-Hussein-Obama-en-US-pronunciation. ogg | b | ə | ˈ | r | ɑː | k | _ | h | uː | ˈ | s | eɪ | n | _ | oʊ | ˈ | ...

  • vs. -

Barack Hussein Obama II is the 44th and current President of the United States, having taken office in 2009. He is the first African American...

or

The Kingdom of Saudi Arabia. (المملكة العربية السعودية ar | Al Mamlaka al ʻArabiyya as Suʻūdiyya commonly known as Saudi Arabia. us-Saudi Arabia-...

  • vs. -

The Kingdom of Saudi Arabia is, in land area, the third largest Arab country and the largest country in the Middle East. It is bordered by...

The first versions are from our current search results. The second versions are what the search results would look like if we skipped the part in parentheses (like Google seems to do in most cases).
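A minimal sketch of the requested behaviour, assuming the snippet is built from plain text in which the parentheses are still balanced (that is, before the text is truncated with "..."). The function name `skip_parentheticals` is hypothetical and is not part of any existing search backend:

```python
import re


def skip_parentheticals(text: str) -> str:
    """Drop balanced (...) groups, including nested ones, then tidy spacing.

    Hypothetical helper for illustration only. An unclosed "(" causes
    everything after it to be dropped, so run this before truncation.
    """
    kept = []
    depth = 0  # current nesting level of parentheses
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            kept.append(ch)
    cleaned = "".join(kept)
    cleaned = re.sub(r"\s{2,}", " ", cleaned)         # collapse doubled spaces
    cleaned = re.sub(r"\s+([.,;:])", r"\1", cleaned)  # no space before punctuation
    return cleaned.strip()
```

Because the helper counts nesting depth instead of using a single regular expression, a phrase such as "A (b (c) d) e" is reduced to "A e" in one pass.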


Version: unspecified
Severity: enhancement

Details

Reference
bz28088

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 11:36 PM
bzimport added a project: CirrusSearch.
bzimport set Reference to bz28088.
bzimport added a subscriber: Unknown Object (MLST).

While we're at it, we should probably remove ref tags as well.
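A rough sketch of what removing ref tags could look like, assuming the snippet source is raw wikitext that uses the `<ref>...</ref>` and `<ref ... />` forms. The function name `strip_ref_tags` is hypothetical, and a regex approach like this will not cope with every edge case a real parser handles:

```python
import re

# Self-closing reuse of a named reference, e.g. <ref name="x" />
REF_SELF_CLOSING = re.compile(r"<ref\b[^>]*/>", re.IGNORECASE)
# Paired form, e.g. <ref name="x">citation text</ref>
REF_PAIR = re.compile(r"<ref\b[^>]*>.*?</ref\s*>", re.IGNORECASE | re.DOTALL)


def strip_ref_tags(wikitext: str) -> str:
    """Remove <ref> footnotes and their contents (hypothetical helper)."""
    # Self-closing tags go first so they cannot be mistaken for the
    # opening half of a paired tag.
    without_self_closing = REF_SELF_CLOSING.sub("", wikitext)
    return REF_PAIR.sub("", without_self_closing)
```

The `\b` after `ref` keeps the patterns from matching the `<references />` tag.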

I was going to try to patch this in on trunk, but search on my trunk checkout seems to be totally wonky. In particular, it doesn't seem to give beginning snippets for title matches, but rather term-highlighting snippets from the middle of the article (which often ends up just being the interlanguage links). If we're not doing beginning snippets any more, this bug should be marked invalid.

OK, it looks like we have 3 different pieces of software for doing search, and my local install is not set up the same way that the search on the cluster is. It's likely that this bug isn't even filed under the right product and component. If anyone knows more about search, feel free to move it.

Looks like this probably belongs under the Lucene-search extension.

Feel free to look at that and take a stab at it if you have any interest in Java. I don't think we have an active maintainer for that.

[Merging "MediaWiki extensions/Lucene Search" into "Wikimedia/lucene-search2", see bug 46542. You can filter bugmail for: search-component-merge-20130326 ]

Restricted Application added a subscriber: Aklapper.
Deskana renamed this task from Search result snippets should skip parenthetical phrases to Search result snippets should skip parenthetical phrases (like textextracts does). Dec 31 2015, 3:53 AM
Deskana lowered the priority of this task from Medium to Lowest.
Deskana set Security to None.
Deskana moved this task from Inbox to User interface and experience on the CirrusSearch board.
Deskana moved this task from Needs triage to Search on the Discovery-ARCHIVED board.
kaldari renamed this task from Search result snippets should skip parenthetical phrases (like textextracts does) to Search result snippets should skip parenthetical phrases (like Google does). May 1 2017, 6:46 PM
Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:02 AM