
Document the resolved redirect fix so consumers can update their code
Closed, ResolvedPublic

Description

In T92796, we fixed it so that resolved redirects have indices. We need to write some documentation about how the fix changes the API output, so that consumers can update as appropriate.

Event Timeline

Deskana raised the priority of this task from to Medium.
Deskana updated the task description.
Deskana subscribed.

I'm not really sure where the documentation goes, but here is a reasonable first step, I guess:


Sample API response:

{
    "batchcomplete": "",
    "query": {
        "redirects": [
            {
                "index": 1,
                "from": "Football (association football)",
                "to": "Ball (association football)"
            }
        ],
        "pages": {
            "-1": {
                "ns": 0,
                "title": "Ball (association football)",
                "missing": ""
            },
            "11480": {
                "pageid": 11480,
                "ns": 0,
                "title": "Foo Prefix Test",
                "index": 2
            }
        }
    }
}

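When iterating over the "pages" object, not every contained object will have a search position (labeled "index"). Pages without one are represented (possibly multiple times) in the "redirects" array, where each entry's "to" field matches the page title. One simple example of resolving indexes in JavaScript, taking the lowest index when several redirects point to the same page: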

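// x is the parsed API response; pages reached via a redirect have no "index".
$.each( x.query.pages, function ( k, v ) {
    if ( v.index === undefined ) {
        // Collect the indexes of every redirect that resolves to this page.
        var chosen = ( x.query.redirects || [] )
            .filter( function ( r ) { return r.to === v.title; } )
            .map( function ( r ) { return r.index; } );
        if ( chosen.length ) {
            // Keep the best (lowest) search position.
            v.index = Math.min.apply( undefined, chosen );
        }
    }
} );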
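
For consumers that do not use jQuery, the same idea can be sketched in plain JavaScript. This is only an illustration run against the sample response above: the helper name pagesInSearchOrder is hypothetical, and "lowest redirect index wins" is an assumed policy, not something the API prescribes.

```javascript
// Sample response, copied from the example earlier in this comment.
const response = {
    batchcomplete: "",
    query: {
        redirects: [
            { index: 1, from: "Football (association football)", to: "Ball (association football)" }
        ],
        pages: {
            "-1": { ns: 0, title: "Ball (association football)", missing: "" },
            "11480": { pageid: 11480, ns: 0, title: "Foo Prefix Test", index: 2 }
        }
    }
};

// Hypothetical helper: returns copies of the pages, sorted by search position.
// Pages without their own "index" borrow the lowest index of any redirect
// whose "to" matches their title (an assumed policy).
function pagesInSearchOrder(apiResponse) {
    const query = apiResponse.query || {};
    const lowestByTarget = new Map();
    for (const redirect of query.redirects || []) {
        if (typeof redirect.index !== "number") {
            continue; // redirect carries no search position
        }
        const previous = lowestByTarget.get(redirect.to);
        if (previous === undefined || redirect.index < previous) {
            lowestByTarget.set(redirect.to, redirect.index);
        }
    }
    // Pages that still have no index sort to the end.
    const rank = (page) =>
        typeof page.index === "number" ? page.index : Number.MAX_SAFE_INTEGER;
    return Object.values(query.pages || {})
        .map((page) => ({
            ...page,
            index: page.index !== undefined ? page.index : lowestByTarget.get(page.title)
        }))
        .sort((a, b) => rank(a) - rank(b));
}

const ordered = pagesInSearchOrder(response);
console.log(ordered.map((page) => page.title));
// → [ 'Ball (association football)', 'Foo Prefix Test' ]
```

Unlike the jQuery snippet, this version returns new objects rather than modifying the response in place.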

Thanks for the documentation, Erik!

I've asked the primary customer of this API change, @Jdforrester-WMF, to tell us here whether it works for him. I've also pinged some of the Mobile Apps Team to provide a response too, as they're heavy search API consumers.

Having read through T92796, I'd like to request we take another stab at this.

Specifically, the title of T92796 was...

Prefix search API doesn't return "index" field when "redirects" is enabled.

The merged work-around doesn't actually fix this. And while I can appreciate the merged work-around, I think it results in increased rather than decreased complexity. Additionally, every consumer will have to contend with this complexity, so we get a multiplier effect.

Could we have one more go at this?

@Mhurd: @Anomie specifically rejected what you are asking for; you'd have to convince him it should be done the other way.

I'll basically second what @Mhurd said.
While this technically provides a (much appreciated) solution by including the index in the redirects array, it's more of a workaround, and will require clients to add more code, and increase overall complexity unnecessarily.

If you simply put the index field in the pages array (where all the other unredirected indexes are), then none of the existing clients would have to update any code at all. It would just work.
I would strongly maintain my original request that it be done this way.

If you simply put the index field in the pages array (where all the other unredirected indexes are), then none of the existing clients would have to update any code at all. It would just work.

Without actually answering all the questions in T92796#1121265, continuing to complain about it is going to accomplish nothing useful.

@Anomie Here we go:

What index should be returned for the redirect target?

It should be the same index that would have been returned for the unredirected item.

Is that necessarily valid when the redirect target wasn't actually returned as the result of the prefixsearch?

That's precisely what we're hoping to get by resolving redirects -- titles that weren't returned as the result of prefixsearch.

What if both the redirect and the target page are returned in the same result set?

We (the clients) already combine duplicate result items; this wouldn't present an issue. (In certain cases we perform a prefix search and a full-text search simultaneously, and combine any duplicate results from the two)

What if multiple redirects were returned pointing to the same result page? Which one wins?

Since we already combine duplicate result items into one, we would show only the single result page to which any multiple redirects resolve.
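
As a rough illustration of the client-side de-duplication described in these answers, here is a minimal sketch. The helper name mergeByTitle, the "Target A"/"Target B" data, and the rule "keep the lowest index" are all assumptions made for the example; it also assumes every item already carries a numeric index.

```javascript
// Hypothetical helper: merge any number of result lists into one,
// keeping a single entry per title. When a title appears more than once,
// the entry with the lowest index is kept (an assumed policy).
// Assumes every item has a numeric "index".
function mergeByTitle(...lists) {
    const merged = new Map();
    for (const list of lists) {
        for (const item of list) {
            const existing = merged.get(item.title);
            if (existing === undefined || item.index < existing.index) {
                merged.set(item.title, item);
            }
        }
    }
    return [...merged.values()].sort((a, b) => a.index - b.index);
}

// Made-up data for illustration only.
const listA = [
    { title: "Target A", index: 2 },
    { title: "Target B", index: 3 }
];
const listB = [
    { title: "Target B", index: 1 },
    { title: "Target A", index: 4 }
];

console.log(mergeByTitle(listA, listB).map((r) => r.title + ":" + r.index));
// → [ 'Target B:1', 'Target A:2' ]
```

Whether "lowest index wins" is the right rule depends on the client; indexes that come from different searches are not necessarily comparable.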

Would all end-users agree with you on the answers to all of these questions, or is it limited to your particular use case? What about for uses of this extra-data feature besides prefixsearch, would the answers still make sense?

I'm sure not all end-users would agree, but I think most would. If I search for "obama", it's much more useful if the first result is "Barack Obama", complete with thumbnail and description, instead of the redirect "Obama", without either of those things. It's also less useful when the first result is "I Got a Crush on Obama", which happens when we don't know how to order the redirected titles. Can you clarify what you mean regarding uses of this feature besides prefixsearch?

On review, I'm not sure T92796 is actually fixed… Though this documentation seems to cover how to use the work-around (thanks), the issue for us was that the API response wasn't suitable to feed into a widget, and I think that's still the case now. Adding technical debt to every client is a poor solution.

Please also keep in mind that whatever solutions you are proposing for integrating the redirect indexes, this is *NOT* search-specific code. This is top-level generic MediaWiki code for taking the results of one API action and feeding them into another action. The answers to Anomie's questions need to be relevant not only for search but for every possible API action that can be and is performed in MediaWiki.

I don't have a better solution, and I think adding technical debt to every client is the best option we have as long as you want to continue using the API's generic handling for feeding the result of one API action into another.

@Anomie Here we go:

What if both the redirect and the target page are returned in the same result set?

We (the clients) already combine duplicate result items; this wouldn't present an issue. (In certain cases we perform a prefix search and a full-text search simultaneously, and combine any duplicate results from the two)

You didn't really answer the question. How does it generically choose which index to keep?

What if multiple redirects were returned pointing to the same result page? Which one wins?

Since we already combine duplicate result items into one, we would show only the single result page to which any multiple redirects resolve.

Same.

Can you clarify what you mean regarding uses of this feature besides prefixsearch?

Because this feature isn't "add the search index from the generator", it's "add arbitrary data from the generator". So far all your answers were extremely specific to the search use case.

ksmith subscribed.

This documentation work has been done. The actual coding work, which had been marked as done and resolved, is being re-opened, pending further discussions.