Expose 'redirect chain is longer than $wgMaxRedirects' to API
Open, Needs Triage, Public

Description

Ideally, before $wgMaxRedirects is set to a value greater than 1 on Wikimedia sites (i.e. T65388: set $wgMaxRedirects = 2 on dewiki and T67064: Increase $wgMaxRedirects on enwiki), the API should somehow indicate when a title has a redirect chain that is longer than $wgMaxRedirects.

This might be done by simply exposing the value of $wgMaxRedirects in siteinfo, allowing the client to determine if the redirect chain is too long.
And/Or an API warning when the redirect chain is too long for a title queried.
And/Or prop=info includes a flag in the "redirects" block to indicate which redirect target the server will actually load.

jayvdb created this task. Jul 14 2015, 4:58 PM
jayvdb updated the task description.
jayvdb raised the priority of this task to Needs Triage.
jayvdb added subscribers: jayvdb, XZise.
Restricted Application added a subscriber: Aklapper. Jul 14 2015, 4:58 PM
Xqt added a subscriber: Xqt. Jul 14 2015, 6:03 PM
Anomie added a subscriber: Anomie. Jul 15 2015, 1:05 AM

This might be done by simply exposing the value of $wgMaxRedirects in siteinfo

That would be easy to do.

And/Or an API warning when the redirect chain is too long for a title queried.

It's unlikely the API would actually know, since it just uses the data that's stored in the redirect table. So someone would have to update the database schema to include that information in said table.

And/Or prop=info includes a flag in the "redirects" block to indicate which redirect target the server will actually load.

prop=info doesn't have a "'redirects' block". It does have a boolean to indicate whether the page is a redirect (reflecting the page_is_redirect database field), but there's no indication as to what the target of the redirect might be. And if T31115: add redirect target value on page info (ApiQueryInfo) ever does get done, it's not going to report the whole chain. It'll use the redirect table which only stores the final target.

If something to report the whole chain does get made, it would have to have a very low maximum limit.

This might be done by simply exposing the value of $wgMaxRedirects in siteinfo

That would be easy to do.

Happy with that approach.

And/Or an API warning when the redirect chain is too long for a title queried.

It's unlikely the API would actually know, since it just uses the data that's stored in the redirect table. So someone would have to update the database schema to include that information in said table.

And/Or prop=info includes a flag in the "redirects" block to indicate which redirect target the server will actually load.

prop=info doesn't have a "'redirects' block".

Sorry, I could have been clearer: I meant the pageset module, or whatever it is called now.

It does have a boolean to indicate whether the page is a redirect (reflecting the page_is_redirect database field), but there's no indication as to what the target of the redirect might be. And if T31115: add redirect target value on page info (ApiQueryInfo) ever does get done, it's not going to report the whole chain. It'll use the redirect table which only stores the final target.

If something to report the whole chain does get made, it would have to have a very low maximum limit.

Doesn't this report the whole chain?

http://communitytest.wikia.com/api.php?action=query&titles=User:Jayvdb/R1&redirects=
https://en.wikipedia.org/w/api.php?action=query&titles=User:Legoktm/R1&redirects=

Anomie moved this task from Unsorted to Needs Code on the MediaWiki-API board. Jul 15 2015, 2:08 PM

Not exactly. Apparently ApiPageSet's redirects parameter actively doesn't honor $wgMaxRedirects.

Say you have R1 → R2 → R3 → R4 → Target, with $wgMaxRedirects = 2. Assuming the database links tables were properly updated, ?action=query&titles=R1&redirects=1 would claim a chain of R1 → R3 → Target: it queries the final target of R1 getting R3, then of R3 getting Target.

Fetching the real chain would involve loading the content of every page in the chain to extract the redirect target from it, since we don't actually store the intermediate points in the chain anywhere. Or else changing how we store redirects.
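The behaviour described above can be modelled with a small simulation. This is only a sketch of the explanation in this comment, not MediaWiki code: `build_redirect_table` and `api_reported_chain` are hypothetical names, and the model assumes the redirect table stores, for each redirect, the page reached after following at most $wgMaxRedirects hops of wikitext links.

```python
from typing import Dict, List


def build_redirect_table(links: Dict[str, str], max_redirects: int) -> Dict[str, str]:
    """Model the stored redirect table (assumption: each redirect maps to the
    page reached after at most max_redirects hops of wikitext links)."""
    table = {}
    for source in links:
        current = source
        for _ in range(max_redirects):
            if current not in links:
                break  # reached a page that is not a redirect
            current = links[current]
        table[source] = current
    return table


def api_reported_chain(table: Dict[str, str], start: str) -> List[str]:
    """Follow stored targets repeatedly, as redirects=1 is described to do."""
    chain = [start]
    seen = {start}
    while chain[-1] in table:
        nxt = table[chain[-1]]
        if nxt in seen:
            break  # guard against circular redirects
        chain.append(nxt)
        seen.add(nxt)
    return chain


# Wikitext links for R1 → R2 → R3 → R4 → Target
links = {"R1": "R2", "R2": "R3", "R3": "R4", "R4": "Target"}
table = build_redirect_table(links, max_redirects=2)
chain = api_reported_chain(table, "R1")
print(" -> ".join(chain))  # R1 -> R3 -> Target
```

With `max_redirects=1` the same model returns every page in the chain, which matches the case where the API output and the wikitext agree.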

jayvdb added a comment. Edited Jul 15 2015, 4:06 PM

Not exactly. Apparently ApiPageSet's redirects parameter actively doesn't honor $wgMaxRedirects.

Say you have R1 → R2 → R3 → R4 → Target, with $wgMaxRedirects = 2. Assuming the database links tables were properly updated, ?action=query&titles=R1&redirects=1 would claim a chain of R1 → R3 → Target: it queries the final target of R1 getting R3, then of R3 getting Target.

Ironically, this is good news! I hope.

It means clients can detect the value of $wgMaxRedirects by comparing the wikitext with the data provided by redirects=1.
That is, we can verify that $wgMaxRedirects is 1 if the wikitext redirect target is the same as the first target reported by redirects=1.
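A client-side check along these lines might look like the sketch below. The helper names are hypothetical, the regular expression only handles the English `#REDIRECT [[...]]` syntax, and titles are compared without any normalization, so treat it as an illustration of the comparison rather than a robust detector. It assumes the `redirects` list has the `from`/`to` shape returned by action=query with redirects=1. Note that when the whole chain is only one hop long, the two values agree regardless of $wgMaxRedirects, so a match is only informative for longer chains.

```python
import re
from typing import Dict, List, Optional

# Simplified pattern (assumption: English magic word, no localized aliases)
REDIRECT_RE = re.compile(r"^\s*#REDIRECT\s*:?\s*\[\[\s*([^\]\|#]+)", re.IGNORECASE)


def wikitext_redirect_target(wikitext: str) -> Optional[str]:
    """Extract the immediate redirect target from page wikitext, if any."""
    match = REDIRECT_RE.match(wikitext)
    return match.group(1).strip() if match else None


def first_hop_matches(wikitext: str, api_redirects: List[Dict[str, str]]) -> bool:
    """True if the first target reported by redirects=1 equals the wikitext target."""
    target = wikitext_redirect_target(wikitext)
    if target is None or not api_redirects:
        return False
    return api_redirects[0]["to"] == target


# Data shaped like the R1 → R2 → R3 → R4 → Target example with $wgMaxRedirects = 2
reported = [{"from": "R1", "to": "R3"}, {"from": "R3", "to": "Target"}]
print(first_hop_matches("#REDIRECT [[R2]]", reported))  # False
```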

Fetching the real chain would involve loading the content of every page in the chain to extract the redirect target from it, since we don't actually store the intermediate points in the chain anywhere. Or else changing how we store redirects.

Not necessarily advocating the following, but ...

It will only involve loading the content for $wgMaxRedirects - 1 pages in the chain using the current storage, if I understand it correctly.

For R1 → R2 → R3 → R4 → Target, with $wgMaxRedirects = 2, only the target of R1 needs to be extracted, as then the two chains can be merged together:

R1 → R3 → Target
   R2 → R4 → Target
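
The merge step can be sketched as a round-robin interleave of the chains. `merge_chains` is a hypothetical helper, and the sketch assumes the chains are listed in order of their starting pages (R1's chain first, then R2's, and so on) and that there are no circular redirects.

```python
from itertools import zip_longest
from typing import List


def merge_chains(chains: List[List[str]]) -> List[str]:
    """Interleave chains position by position, keeping each title once."""
    merged: List[str] = []
    for column in zip_longest(*chains):
        for title in column:
            # zip_longest pads shorter chains with None
            if title is not None and title not in merged:
                merged.append(title)
    return merged


full_chain = merge_chains([
    ["R1", "R3", "Target"],  # reported by redirects=1 for R1
    ["R2", "R4", "Target"],  # reported for R2, the wikitext target of R1
])
print(" -> ".join(full_chain))  # R1 -> R2 -> R3 -> R4 -> Target
```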

XZise added a comment. Jul 15 2015, 6:10 PM

For R1 → R2 → R3 → R4 → Target, with $wgMaxRedirects = 2, only the target of R1 needs to be extracted, as then the two chains can be merged together:

R1 → R3 → Target
   R2 → R4 → Target

Well, it'll need to extract $wgMaxRedirects - 1 targets. Given R1 → R2 → R3 → R4 → R5 → R6 → Target, if $wgMaxRedirects is:

  • 1: All are returned; nothing additional needs to be done.
  • 2: R1, R3, R5 and Target are returned by default; additionally, R2 (and its chain) must be fetched.
  • 3: R1, R4 and Target are returned; additionally, R2 (→R5→Target) and R3 (→R6→Target) must be fetched.
  • 4: R1, R5 and Target are returned; additionally, R2 (→R6→Target), R3 (→Target) and R4 (→Target) must be fetched.
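
The list above follows from simple index arithmetic on a linear chain. The sketch below uses hypothetical helper names and assumes a non-circular chain R1 … Rn → Target in which each stored target skips ahead by $wgMaxRedirects hops; the first chain in each result is what the API reports for R1, the rest are the extra chains that must be fetched.

```python
from typing import List


def reported_chain(start: int, redirect_count: int, max_redirects: int) -> List[str]:
    """Chain reported for the redirect at zero-based index `start`."""
    titles = [f"R{i + 1}" for i in range(start, redirect_count, max_redirects)]
    return titles + ["Target"]


def chains_needed(redirect_count: int, max_redirects: int) -> List[List[str]]:
    """Chains starting at R1 .. R(max_redirects), capped at the chain length."""
    starts = range(min(max_redirects, redirect_count))
    return [reported_chain(s, redirect_count, max_redirects) for s in starts]


for n in (1, 2, 3, 4):
    chains = chains_needed(6, n)
    print(n, [" -> ".join(c) for c in chains])
```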

Another question: how do circular redirects work? Is R1 → R2 → R3 → R1 returned as R1 –(R2)→ R3 –(R1)→ R2 –(R3)→ R1?

Krinkle moved this task from Untriaged to API on the MediaWiki-Redirects board. Jul 31 2017, 9:19 PM