Page MenuHomePhabricator

Change to SpecialSearchResults hook breaks extension
Closed, ResolvedPublic

Description

Since the developer didn't respond in [0, 1] as the when, by whom the changes where made to the SpecialSearchResults hook, I'll reiterate the issue in this task:

[2, 3] is expecting the ref-by to modify/augment results after they have been fetched in a postFilter process.

  • If you need to change the hook signature then please follow [4].
  • If you think those ref-by are inappropriate then please provide examples on how to achieve the same functionality and not just drop a bomb like "... no reason to use by-ref on objects unless the hook is supposed to replace them (this one isn't)".

PS: For an external extension maintainer (like myself) it is utterly annoying that changes are made without considering the wider audience (that are not WMF affiliate).

[0] https://gerrit.wikimedia.org/r/#/c/332667/
[1] https://github.com/wikimedia/mediawiki/blob/c43655edc9bb71819325e4bdd2cfe19f04009b19/includes/specials/SpecialSearch.php#L374
[2] https://github.com/SemanticMediaWiki/SemanticInterlanguageLinks/blob/master/src/HookRegistry.php#L321-L332
[3] https://github.com/SemanticMediaWiki/SemanticInterlanguageLinks/blob/master/src/Search/SearchResultModifier.php#L145-L171
[4] https://www.mediawiki.org/wiki/Deprecation

Event Timeline

mwjames created this task.Jan 18 2017, 11:22 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJan 18 2017, 11:22 AM
Legoktm added a subscriber: Legoktm.

The change that modified the hook signature should be reverted.

Restricted Application added a project: Discovery. · View Herald TranscriptJan 18 2017, 8:08 PM

I don't think just replacing search result set object would really work. It will probably break a bunch of stuff. Mediawiki has/had a bunch of by-ref hooks that aren't really supposed to be by-ref, as I remember, and I'm not sure what would be the use case for just throwing out a specific result set and replacing it with another.

That said, we can restore the reference if it breaks things. @EBernhardson, would like to hear your comments on that.

Regardless of whether or not extensions should be able to fully replace an object, removing & from hook calls breaks all extensions that have the & in their definition, and needs to follow the deprecation policy: https://www.mediawiki.org/wiki/Deprecation.

@Legoktm I'm not sure how it is possible to do. I.e., how exactly could we make deprecation notices there?

As you can see in [1], quite a few things have already operated on $titleMatches and $textMatches by the time that hook is called, changing them out in the middle of the function will lead to a wide variety of unexpected and hard to debug issues. What is the goal of SMW here? I'm sure we can find a viable solution.

In the past it's been done by deprecating the old hook and creating a new one with a different signature that doesn't include &. In any case, if you're going to make a breaking change like this, it needs to be announced per the policy and have proper justification - both of which are missing.

Change 332884 had a related patch set uploaded (by Chad):
Unbreak extensions using SpecialSearchResults hook

https://gerrit.wikimedia.org/r/332884

demon added a subscriber: demon.Jan 18 2017, 10:27 PM

Both this hook and ShowSearchHitTitle (see gerrit 332878) are being partially reverted to restore their references per Kunal. This not only broke third party extensions, but extensions we have deployed currently (which is unacceptable--cf T155668, T155667)

Even if an extension isn't replacing it, mismatched signatures caused spurious warnings that are no fault of the third party developer.

Yes, references are evil, but hooks cannot be changed on a whim.

demon triaged this task as Unbreak Now! priority.Jan 18 2017, 10:32 PM
Restricted Application added subscribers: Jay8g, TerraCodes. · View Herald TranscriptJan 18 2017, 10:32 PM

Change 332884 merged by jenkins-bot:
Unbreak extensions using SpecialSearchResults hook

https://gerrit.wikimedia.org/r/332884

demon closed this task as Resolved.Jan 18 2017, 10:45 PM
demon claimed this task.

Fixed in master. Production bugs will follow-up in the tasked I linked in the previous comment.

In the past it's been done by deprecating the old hook and creating a new one with a different signature that doesn't include &.

Not sure how that works. Should we not call the old hook anymore? That would instantly break all extensions using it. Or should we call both old and new one? But that might cause weird side effects on double-hooking... Is there any example in the code of such hooks?

mwjames added a comment.EditedJan 19 2017, 4:39 AM

changing them out in the middle of the function will lead to a wide variety of unexpected and hard to debug issues.

I'm fully aware of that but the hook is the best we got as a means to modify a result set after it has been fetched and before it is displayed to a user.

What is the goal of SMW here? I'm sure we can find a viable solution.

In this particular case of the cited extension:

  • On the search page we provide a language dropdown where a user can choose a content language [0, 1]
  • After the user has entered the search term and started the search, we inspect the returning result set (which can come from Cirrus or MySQL and where &$titleMatches, &$textMatchecomes into play) and compare the titles that were found with SMW's stored language annotation and filter those that don't match the user input

This process allows us to use the power of the search backend (without having to know anything about it) and still provide access to annotations by applying a postFilter process before it is displayed to the user. [0, 1] demonstrates how the diminution currently works with the search term Lorem to return different results depending on the language selected because of the available annotations (or missing ones) for the initial set of results.

[0] https://sandbox.semantic-mediawiki.org/w/index.php?title=Sp%C3%A9cial%3ARecherche&profile=sil&fulltext=Search&search=Lorem&profile=sil&languagefilter=fr
[1] https://sandbox.semantic-mediawiki.org/w/index.php?title=Sp%C3%A9cial%3ARecherche&profile=sil&fulltext=Search&search=Lorem&profile=sil&languagefilter=en

Should we not call the old hook anymore? That would instantly break all extensions using it. Or should we call both old and new one? But that might cause weird side effects on double-hooking... Is there any example in the code of such hooks?

All the discussion whether using ref-by is a good or bad practise does not alter the fact that you modified the signature of a hook and have it potentially break systems that rely on its signature (luckily our integration tests broke before that), this is just not best practice.

compare the titles that were found with SMW's stored language annotation and filter those that don't match the user input

This will probably break total counts (since those are calculated from original results before the hooks runs). This also may break interwiki results as original result sets may have contained ones (and the check is done before the hook is run) but MappedSearchResultSet doesn't support them as far as I can see.

If you really want to manipulate SearchResultSet, it probably would be better to use the original object than to create a new one. Maybe we can add some API for filtering results? Though I think it may be too late on SpecialSearchResults to modify the set object... Maybe instead we need a better hook for just filtering results - I thought we have one but couldn't find one.

Another option might be to plug in early - e.g. by decorating SearchEngine class?

I also don't like hooking SpecialSearchResults in this way since it means API results and GUI results can be completely different. As far as I can see, there is no modification of API search result, which means any tool using the API gets results that do not match what you see in Special:Search.