"Next 200" link in category page broken
Closed, Resolved
Public

Description

Author: dbenbenn

Description:
At [[Category:Images with unknown copyright status]], the "next 200" link is

http://en.wikipedia.org/w/index.php?title=Category:Images_with_unknown_copyright_status&from=0212006

If you click the link, all the images that were visible before are still visible.
Furthermore, the "next 200" and "previous 200" links both use "from=0212006".


Version: 1.6.x
Severity: normal

bzimport added a project: MediaWiki-Categories. Via Conduit, Nov 21 2014, 9:05 PM
bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz4912.
bzimport created this task. Via Legacy, Feb 8 2006, 12:51 AM
bzimport added a comment. Via Conduit, Feb 11 2006, 9:42 PM

gangleri wrote:

Hallo!

from=0212006 is the "category key".
http://en.wikipedia.org/w/index.php?title=template:no%20license&action=edit
http://en.wikipedia.org/w/index.php?title=template:no%20license&action=history
shows how this happened.
http://en.wikipedia.org/w/index.php?title=Image%3ATheUsed.JPG&diff=37823581&oldid=26403756
updated the sort keys and the links.

In the category, the first and the last image have the same sort key. This is a
deadlock.
Bug 2177: Expanding sort order BEHIND the sort key
would offer a workaround for such situations.
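
The deadlock can be illustrated with a simplified model. This is only a sketch: the `page_from` helper and the way the "next" key is taken from the first row past the page are assumptions made for illustration, not MediaWiki's actual code.

```python
PAGE_SIZE = 200

def page_from(members, from_key, limit=PAGE_SIZE):
    """Return up to `limit` (sortkey, title) pairs with sortkey >= from_key.

    Hypothetical stand-in for a category listing that pages on the
    sort key alone.
    """
    ordered = sorted(members)
    return [m for m in ordered if m[0] >= from_key][:limit]

# 250 images that all received the same sort key (as happened via the template).
members = [("0212006", f"Image:{i:03d}.jpg") for i in range(250)]

first = page_from(members, "")

# Assumption: the "next 200" link carries only the sort key of the first
# row that did not fit on the page.
next_key = sorted(members)[PAGE_SIZE][0]
second = page_from(members, next_key)

# The "next" page is identical to the first one: no progress is possible.
print(second == first)
```

Because every row compares equal on the sort key, no value of `from` can ever point past the first 200 rows.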

Users should specify 'SORT KEY'. MediaWiki should use
{{SORT KEY}}{{NAMESPACE}}:{{PAGENAME}} internally.

best regards reinhardt [[user:gangleri]]

bzimport added a comment. Via Conduit, Feb 11 2006, 9:43 PM

gangleri wrote:

Oops! MediaWiki should use
{{SORT KEY}}{{PAGENAME}}:{{NAMESPACE}} internally.

brion added a comment. Via Conduit, Mar 13 2006, 10:02 PM

*** Bug 5241 has been marked as a duplicate of this bug. ***

bzimport added a comment. Via Conduit, Mar 14 2006, 2:57 AM

William.Allen.Simpson wrote:

(copied from Bug 5241)
I've also reported at
http://en.wikipedia.org/w/index.php?title=Wikipedia%3AVillage_pump_%28technical%29#Category_Sort_Blank_bug

When there are a large number of items in category sort blank
(category with pipe blank "[[...| ]]"), the next 200 do not appear.

Likewise, jumping to 0-9 and then trying previous 200, they do not
page back to the beginning. So the problem is both directions.

In [[:Category:Redirects with possibilities]], you can see the
problem. It's existed this way for several months, so please
don't fix this until the developers can look at it!

Yes, I know the problem is a bad category in [[Template:R to decade]].
But again, it's been that way for months, so leave it alone for
testing purposes.

I've already fixed (during the past two weekends) the problem in 4
templates that made this same massive 6,000+ entry bug at
[[:Category:Unprintworthy redirects]]. That's why I was looking for
more examples, to determine whether it was a one time thing.

So, how long before this known bug will be fixed?

bzimport added a comment. Via Conduit, Apr 28 2006, 1:25 AM

gangleri wrote:

(In reply to comment #2)

Oops! MediaWiki should use
{{SORT KEY}}{{PAGENAME}}:{{NAMESPACE}} internally.

Some more details on this:

{{SORT KEY}}{{special character}}{{PAGENAME}}{{special
character}}{{GENERICNAMESPACEORDER}}

a) {{special character}} should be a character with a value less than space;
a tab could be used

b) {{GENERICNAMESPACE}} or better {{GENERICNAMESPACEORDER}} would avoid
interference with namespace localization.

bzimport added a comment. Via Conduit, Jun 23 2008, 6:09 AM

verdy_p wrote:

Actually, the effective sort key should use at least 3 levels to match with the Unicode UCA algorithm:

  1. The provided key converted internally to all capitals, and with all non-letters converted to spaces, then all spaces compacted
  2. The provided key with its original case with just non-letters converted to spaces then all spaces compacted
  3. The provided key as is.

Note: the first character of part 1 must not be changed to a space but must be kept if it's not a letter. However, it should be capitalized if it's a letter. The reason is that it will be used to generate distinctive subgroups in the displayed list. If it's a space, it must be preserved even if the rest of the spaces after it can be compressed.

To compress the resulting key, the part 2 can be trimmed for the characters at end that are common at end of part 2. The same can be done for part 3 (but it must still be compared with the original part 2).

Then, to form the effective sort key, the three parts should be concatenated with a separator lower than a space. If part 3 is empty, you don't need to concatenate it and its leading separator; if both parts 2 and 3 are empty, you don't need to concatenate either of them or their leading separators. If part 2 is empty but part 3 is not, then the empty part 2 must still be generated (meaning that part 3 will be separated from part 1 by two separators).

Finally, as the provided sort key is not necessarily unique (it may be different from the full page used by default, including the namespace name and colon), an additional part should be added with a separator to the previous key; it won't be needed if the (non-compressed) part 3 (the original provided sort key) is identical to the full page name.
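
The construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation anyone in this thread wrote: the function names are hypothetical, a tab is assumed as the separator, and the optional compression of parts 2 and 3 is left out (so the rules about omitting empty parts never come into play).

```python
import re

SEP = "\t"  # assumed separator: a tab sorts lower than a space

def _letters_only(text):
    """Replace every run of non-letter characters with a single space."""
    return re.sub(r"[\W\d_]+", " ", text)

def level1(key):
    """Part 1: capitals only, non-letters folded to spaces, first character kept."""
    if not key:
        return ""
    head, tail = key[0], key[1:]
    body = _letters_only(tail).upper()
    if head == " ":
        # A leading space is preserved; the spaces after it are compacted.
        body = body.lstrip(" ")
    return head.upper() + body

def effective_sortkey(key, full_page_name):
    """Join the three levels, then add the page name if the key is not unique by itself."""
    parts = [level1(key), _letters_only(key), key]
    if key != full_page_name:
        parts.append(full_page_name)
    return SEP.join(parts)

# Two pages sharing the blank sort key still get distinct effective keys.
a = effective_sortkey(" ", "Image:A.jpg")
b = effective_sortkey(" ", "Image:B.jpg")
print(a != b)
```

With the page name appended as the last part, pages that share a provided sort key no longer compare equal, which is the property the pagination links need.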

Note that Wikipedia still uses the full page name for its default sort key; it would probably be better if it used by default only the page name, then a separator, then the namespace, because it would avoid having to specify <nowiki>{{PAGENAME}}</nowiki> as the explicit sort key in many pages. Note that the namespace itself could be compressed by replacing it with just the namespace number right-padded to 3 characters with zeroes.

This should work correctly in Wiktionary, where it could be tested on long lists of words in various languages. (This algorithm is already used on French Wiktionary with a template generating the sort key; however, it lacks the string reversal for part 3, needed for correct French sort order... This template-based implementation is, however, quite tricky; and because a tab is not handled correctly within categories, the separator chosen there is " !", a space followed by an exclamation mark.)

Ideally, MediaWiki should implement the full UCA algorithm; however, the computed sort keys will not be displayable, even though they will probably be even more compact. With the full UCA, you'll face problems like per-language tailoring, needed so that multi-character graphemes recognized as one letter will sort correctly, and so that the tailored capitalization rules for that language (using special case mappings like Turkish or Azeri for the conversion of dotted-i and undotted-I) will work reliably. Tailoring may also be needed when non-letters are used as part of the language's alphabet (such as apostrophes in the middle of a grapheme cluster making a single letter).

bzimport added a comment. Via Conduit, Oct 20 2008, 2:50 PM

ayg wrote:

*** Bug 16021 has been marked as a duplicate of this bug. ***

bzimport added a comment. Via Conduit, Oct 20 2008, 2:50 PM

ayg wrote:

Note that the cl_sortkey field is varchar(70). We simply don't have the space to concatenate much of anything to the end. We could reserve the last several bytes for some encoding of the page_id, but that's really unnecessary: we already have an index on (cl_to, cl_sortkey, cl_from), so a sort on (cl_sortkey, cl_from) would provide unique results without having to modify existing sort keys. So the URL would look like

http://en.wikipedia.org/w/index.php?title=Category:Images_with_unknown_copyright_status&from=0212006&fromid=12345

Of course, the results would be sorted in a meaningless order when sort keys are the same, but the order would be consistent, and so this bug would not occur. To make the sort order slightly more meaningful, we could append the full page name to custom sort keys, keeping in mind that it would quite likely get truncated for long page names.
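
A minimal sketch of the idea, assuming rows are plain (cl_sortkey, cl_from) tuples held in memory. The `page_after` helper is hypothetical; a real implementation would express the same comparison in SQL against the existing index.

```python
PAGE_SIZE = 200

def page_after(rows, from_key="", from_id=0, limit=PAGE_SIZE):
    """Return (page, next_cursor) for rows ordered by (cl_sortkey, cl_from).

    The cursor is the (sortkey, id) pair of the first row that did not fit,
    which corresponds to the &from=...&fromid=... pair in the URL.
    """
    ordered = sorted(rows)
    start = (from_key, from_id)
    # Fetch one extra row to learn whether a next page exists.
    window = [r for r in ordered if r >= start][: limit + 1]
    page = window[:limit]
    next_cursor = window[limit] if len(window) > limit else None
    return page, next_cursor

# 450 pages that all share the sort key from the original report.
rows = [("0212006", page_id) for page_id in range(1, 451)]

seen = []
pages = 0
cursor = ("", 0)
while cursor is not None:
    page, cursor = page_after(rows, *cursor)
    seen.extend(page)
    pages += 1

print(pages, len(seen))
```

Since the page id is unique, the (sort key, id) pair always identifies exactly one row, so each "next" link starts strictly after the previous page even when every sort key is identical.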

Using the full UCA algorithm would be great, anyone up for implementing that? :) That would solve bug 164, but isn't needed here at all. Given current space constraints of 70 characters (when page titles are up to 255 + namespace), we can't use anything as complicated as three sorting levels without a schema change. We can't even use one sorting level and expect it to be unique.

Assigning to self, although I can't promise I'll get to it anytime soon. This will need to modify IndexPager so it can sort on two columns at once.

bzimport added a comment. Via Conduit, Oct 20 2008, 5:16 PM

verdy_p wrote:

See my comment in bug 164 for how the pseudo-UCA sort works in French Wiktionary. (I designed it; it still has some known caveats and limitations, but it works remarkably well, and the way it is implemented allows automating the feeding of sort keys, with an algorithm that is quite easy to understand.)
Look at the documentation page of "[[Modèle:clé de tri]]" (French for "Template:sort key"). It is written in French, but if you need help and can't read French, ask me or some of the admins in French Wiktionary.

Catrope added a comment. Via Conduit, Oct 21 2008, 3:49 PM

(In reply to comment #8)

Of course, the results would be sorted in a meaningless order when sort keys
are the same

IMO, the sort order for duplicate sort keys doesn't *have* to be meaningful: users should know that the sorting order for duplicate sort keys is undefined, or at least that if they want control over the sort order, they should just use unique sort keys.

bzimport added a comment. Via Conduit, Oct 23 2008, 12:09 AM

ayg wrote:

Agreed. Also, it would mess up the URLs somewhat if we started appending the article title all over the place. Maybe leave that for later; for now the id would be fine for disambiguation (when someone wants to do it).

bzimport added a comment. Via Conduit, Sep 23 2009, 9:09 PM

chinchi29 wrote:

I can't reproduce this. Was this fixed?

bzimport added a comment. Via Conduit, Sep 23 2009, 11:13 PM

ayg wrote:

No, AFAIK, although the exact page linked to doesn't show the problem anymore. To reproduce, add [[Category:Bug 4912 test case| ]] or something to more than 200 pages on some wiki or other, and then observe that the resulting category page doesn't paginate correctly.

Unassigning from self since I'm not likely to do anything about this in the foreseeable future.

TheDJ added a comment. Via Conduit, Jun 21 2010, 11:30 PM

*** Bug 23803 has been marked as a duplicate of this bug. ***

Svick added a comment. Via Conduit, Sep 26 2010, 9:24 PM

Note that this bug also occurs when using the API: I tried enumerating subcategories of [[Category:Gastropod genera without authority reference]], and because more than 200 pages had the sort key of space, my program entered an infinite loop.

bzimport added a comment. Via Conduit, Sep 26 2010, 9:34 PM

ayg wrote:

This is fixed for common cases in trunk with the categorylinks rewrite. I don't know if the API has caught up yet. It's still not totally bulletproof, if the custom sort key is really long (200+ bytes) -- we need to take the id into account for that. So I'll leave the bug open.

Bawolff added a comment. Via Conduit, Sep 26 2010, 9:36 PM

(In reply to comment #15)

Note that this bug also occurs when using the API: I tried enumerating
subcategories of [[Category:Gastropod genera without authority reference]] and
because more than 200 pages had the sortkey of space, my program entered
infinite loop.

Note that it's possible to work around this in the API (at least for whatever version Wikimedia wikis are using) by not using the cmnamespace parameter and filtering by namespace on the client side.
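
A sketch of the client-side filtering, assuming each category member arrives as a dict with `pageid`, `ns` and `title` fields. The sample data is invented for illustration, and fetching the unfiltered member list is left out.

```python
CATEGORY_NS = 14  # namespace number of "Category:" pages in MediaWiki

def subcategories(members):
    """Keep only members in the Category namespace.

    Hypothetical helper: `members` is the unfiltered member list, i.e.
    requested without the cmnamespace parameter.
    """
    return [m for m in members if m.get("ns") == CATEGORY_NS]

# Invented sample data in the assumed shape.
sample = [
    {"pageid": 1, "ns": 0, "title": "Some article"},
    {"pageid": 2, "ns": 14, "title": "Category:Some subcategory"},
    {"pageid": 3, "ns": 6, "title": "File:Example.jpg"},
]

titles = [m["title"] for m in subcategories(sample)]
print(titles)
```

The trade-off is that the client downloads every member of the category and discards most of them, which costs bandwidth.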

Gustronico added a comment. Via Conduit, Mar 8 2011, 10:35 PM

Right now, "Next 200" links are broken in *all* categories containing 200+ items. Tested in en:wiki and es:wiki.

Bawolff added a comment. Via Conduit, Mar 8 2011, 11:54 PM

(In reply to comment #18)

Right now, "Next 200" links are broken in *all* categories containing 200+
items. Tested in en:wiki and es:wiki.

That's a separate issue (so in general it's probably a separate bug). However, it's probably related to the fact that we're in the middle of changing the way categories work (a change to how articles are sorted, for multilingual goodness and whatnot), so it's probably a temporary issue while the categories get updated, and should go away on its own.

PrimeHunter added a comment. Via Conduit, Mar 9 2011, 12:16 AM

The "next" link in categories currently has a url saying &pagefrom=.
It partially works if the url is manually changed to &from= as it has said before.
I only say partially works because it apparently only considers the first character after &from=.

Catrope added a comment. Via Conduit, Mar 9 2011, 11:11 AM

(In reply to comment #20)

The "next" link in categories currently has a url saying &pagefrom=.
It partially works if the url is manually changed to &from= as it has said
before.
I only say partially works because it apparently only considers the first
character after &from=.

This was fixed at 00:43 UTC.

Catrope added a comment. Via Conduit, Mar 9 2011, 11:12 AM

Closed as FIXED because the bug as filed is also fixed on WMF, per comment 19.
