
Index (for search) sighted revisions only on de-wp
Open, Lowest, Public, Feature

Description

Author: gnu1742

Description:
Please evaluate the possibility of having the search engine index sighted revisions only. This is to prevent vandalism from being displayed in a prominent place.

Background: After the elections in Israel a few days ago, the articles about candidates such as Benjamin Netanjahu are both interesting for readers and a target for vandals.
This article was vandalised on 12 February at 04:29 CET in a horrible way (think of Hitler and the Shoah and you'll guess what I mean). The vandalism was undone about 15 minutes later, and the revision has since been deleted. Unfortunately, the daily search indexing of de-wp took place during that quarter of an hour, so the vandalism showed up every time someone searched for Netanjahu. When I was told about this, I asked at #wikimedia-tech for a manual re-index of de-wp.

Every article on de-wp has a sighted revision by now, so this solution would prevent offenses like the one described above.


Version: unspecified
Severity: enhancement
URL: http://de.wikipedia.org/wiki/Spezial:Suche?search=Benjamin+Netanjahu&fulltext=1

Details

Reference
bz17475

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 21 2014, 10:28 PM
bzimport added a project: CirrusSearch.
bzimport set Reference to bz17475.

(In reply to comment #1)

Isn't this already done?

Never mind, missed the term "search engine" :)

rainman wrote:

The index is updated using articles from the OAI repository and/or XML dumps. AFAIK, OAI doesn't distinguish between sighted and unsighted revisions. Not sure whether dumpBackup has a switch to get sighted versions only.

gnu1742 wrote:

Well, that is the reason why I started with 'please evaluate the possibility...'... Anyway: What is OAI?

Edits call SearchUpdate(), so normally a revert would update the search index... does the Lucene engine heed those calls?

rainman wrote:

No, the Lucene backend periodically fetches new changes from the OAI repository (which contains all the changes made to the wiki). AFAIK, SearchUpdate is only used in the built-in MW search.
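To make the timing problem concrete, here is a minimal sketch of what a periodic fetch would pick up under the current "newest revision" policy versus the requested "newest sighted revision" policy. It is an illustration only: `Revision`, `revision_to_index`, and the minute-based timestamps are hypothetical names and units, not part of OAI, lucene-search, or FlaggedRevs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Revision:
    rev_id: int
    minute: int    # minutes after an arbitrary reference time (assumed unit)
    sighted: bool  # whether the revision has been reviewed
    text: str


def revision_to_index(
    history: Sequence[Revision], poll_minute: int, sighted_only: bool
) -> Optional[Revision]:
    """Return the newest revision existing at poll_minute.

    With sighted_only=True, unsighted revisions are skipped.
    """
    candidates = [
        r
        for r in history
        if r.minute <= poll_minute and (r.sighted or not sighted_only)
    ]
    return max(candidates, key=lambda r: r.minute, default=None)


# A page that is vandalised and reverted about 15 minutes later.
history = [
    Revision(1, 0, True, "good article text"),
    Revision(2, 100, False, "vandalised text"),
    Revision(3, 115, False, "good article text"),  # revert, not yet sighted
]

# An index run that happens to fall inside the vandalism window:
unfiltered = revision_to_index(history, poll_minute=105, sighted_only=False)
filtered = revision_to_index(history, poll_minute=105, sighted_only=True)
```

In this toy timeline the unfiltered run picks up the vandalised revision, while the sighted-only run stays on the last reviewed revision. The trade-off is that a good but unreviewed edit is also held back until it is sighted.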

gnu1742 wrote:

Has anything been done about this issue? It's the bug's 1st birthday in a few days ;-)

[Merging "MediaWiki extensions/Lucene Search" into "Wikimedia/lucene-search2", see bug 46542. You can filter bugmail for: search-component-merge-20130326 ]

This looks possible, though definitely low priority, with the new CirrusSearch: as it uses some queue system, I suppose the index update could be delayed until the revision is approved. (May need to be moved to FlaggedRevs component if the search system already allows such modifications.)
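The idea of delaying the index update until approval could look roughly like the sketch below. This is a toy model under stated assumptions, not CirrusSearch or FlaggedRevs code: `DeferredIndexer` and its `on_edit`, `on_approve`, and `on_discard` methods are hypothetical names, and the "index" is just a dictionary.

```python
from typing import Dict, Tuple


class DeferredIndexer:
    """Toy search index that applies an edit only after it is approved."""

    def __init__(self) -> None:
        self.index: Dict[str, str] = {}                # title -> indexed text
        self.pending: Dict[int, Tuple[str, str]] = {}  # rev_id -> (title, text)

    def on_edit(self, rev_id: int, title: str, text: str) -> None:
        # Queue the edit instead of indexing it immediately.
        self.pending[rev_id] = (title, text)

    def on_approve(self, rev_id: int) -> None:
        entry = self.pending.pop(rev_id, None)
        if entry is None:
            return  # unknown or already handled revision
        title, text = entry
        self.index[title] = text
        # Older pending revisions of the same page are superseded.
        self.pending = {
            r: (t, x)
            for r, (t, x) in self.pending.items()
            if not (t == title and r < rev_id)
        }

    def on_discard(self, rev_id: int) -> None:
        # A reverted or deleted revision never reaches the index.
        self.pending.pop(rev_id, None)


indexer = DeferredIndexer()
indexer.on_edit(1, "Benjamin Netanjahu", "good article text")
indexer.on_approve(1)
indexer.on_edit(2, "Benjamin Netanjahu", "vandalised text")
indexed_during_vandalism = indexer.index["Benjamin Netanjahu"]
indexer.on_discard(2)
```

Here the vandalised revision sits in the pending queue and is dropped on revert, so the indexed text never changes. A real implementation would also have to handle pages with no approved revision and wikis where FlaggedRevs is not enabled.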

This will be significantly mitigated with CirrusSearch because it updates articles right after they are changed. I'm not sure how it interacts with FlaggedRevs at the moment but I'll be sure to have a look at some point in the future.

Change 104675 had a related patch set uploaded by Chad:
Support FlaggedRevs

https://gerrit.wikimedia.org/r/104675

Change 104675 abandoned by Chad:
Support FlaggedRevs

Reason:
With the latest implementation in Iff0bf5d5 this isn't needed anymore. We'll add a hook to FlaggedRevs to support it.

https://gerrit.wikimedia.org/r/104675

demon lowered the priority of this task from Medium to Lowest.
demon set Security to None.
Restricted Application added a subscriber: Aklapper.
Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:02 AM
Aklapper removed a subscriber: Manybubbles.