Page MenuHomePhabricator

Table sorting broken by colspan/rowspan
Closed, ResolvedPublic

Description

Author: circeus

Description:
Tables with separating headers cannot use the new "sortable" class without
breaking. Example of tables where this would be useful are Tornado outbreak
tables, like http://en.wikipedia.org/wiki/Late-November_2005_tornado_outbreak
and http://en.wikipedia.org/wiki/Wisconsin_Tornado_Outbreak_of_August_2005. See
also http://en.wikipedia.org/wiki/List_of_Anuran_families. Implementing a
solution to [[Bug 4740]] (and limiting the sort to within <tbody>s) would solve
this.


Version: unspecified
Severity: enhancement

Details

Reference
bz8028

Event Timeline

bzimport raised the priority of this task from to Low.Nov 21 2014, 9:28 PM
bzimport set Reference to bz8028.
bzimport added a subscriber: Unknown Object (MLST).

ayg wrote:

For the time being, it could just look for something with a colspan spanning the
whole table and consider that a break. Smaller colspans would need to be broken
up for this to work sanely, too.

joost wrote:

Actually, sorting is restricted to within the tbody in the new version of the script you're using, didn't that solve this
problem?

ayg wrote:

No, because <tbody> and friends aren't available in MediaWiki markup (see bug
4740). At least not right now. If that were fixed, this would greatly reduce
the need to handle colspan/rowspan. It wouldn't eliminate it, however: consider
http://en.wikipedia.org/w/index.php?title=Comparison_of_layout_engines_%28CSS%29&oldid=98156450#Properties
How to deal with that is an interesting question, though, because I'm not sure
what sort of heuristics would be appropriate generally. Some handling would
probably be better than none, though: maybe breaking the merged cells into
separate cells with colspan=rowspan=1 with the same contents as the original for
the purposes of sorting, then after sorting is done, remerging any adjacent
cells with the same contents?

joost wrote:

hmmm, that isn't to simple, but more people seem to want that :) let me think about it...

josh wrote:

Sorting may not be perfect, but errors can be avoided if the return value of function ts_getInnerText(el) is changed for undefined elements:

function ts_getInnerText(el) {
if (typeof el == "string") return el;

Begin Changes
if (typeof el == "undefined") { return el };
return empty string instead
if (typeof el == "undefined") { return "" };
End Changes

if (el.innerText) return el.innerText; //Not needed but it is faster
var str = "";

var cs = el.childNodes;
var l = cs.length;
for (var i = 0; i < l; i++) {

		switch (cs[i].nodeType) {
			case 1: //ELEMENT_NODE
				str += ts_getInnerText(cs[i]);
				break;
			case 3:	//TEXT_NODE
				str += cs[i].nodeValue;
				break;
		}

}
return str;
}

jimhu wrote:

It appears that it's not just colspan/rowspan.. the rows all have to have the same number of cells.

tcncv_nospam wrote:

I have taken on the task to get this implemented and a currently have a test version that seems to work fairly well. To summarize, the enhancements do the following.

  1. Before sorting, rowspans will be exploded, so that each row is self contained and can be moved without corrupting the table structure.
  2. During sorting, colspans are recognized and counted when retrieving column values, so that the proper sort value is retrieved from each row. Each column in a colspan range is treated as having the same value. Colspans are preserved – they are not split.
  3. After sorting, some cell ranges may be recombined under certain restrictive conditions (still being refined). Also, the class="autorowspan" option can be applied to column headers or the entire table to enable more aggressive rowspan combines, such as combining cells in the newly sorted column that were not originally combined.

My current demo code in [http://en.wikipedia.org/wiki/User:Tcncv/sorttables.js], and some sample tables are located in [http://en.wikipedia.org/wiki/User:Tcncv/Table_Sort_Demo]. The demo requires that you add "importScript('User:Tcncv/sorttables.js');" to your User:Xxx/monobook.js file (or whatever skin you use).

I plan on doing some code cleanup and more testing before formally submitting the change (with documentation and test cases). I have tested it on recent versions of Firefox, Netscape and IE, but will need help testing on other browsers and older versions. I have already received some encouraging feedback from some WikiProject Olympics editors, but would like to hear from a wider audience. If you have the time, please take a look and let me know what you think.

ayg wrote:

You're assuming that it's logical to break up a cell with rowspan > 1 into multiple cells with rowspan = 1, each with the contents of the original? I imagine that would fail in some cases, but it seems unlikely that there's any other sane way to handle these.

tcncv_nospam wrote:

(In reply to comment #8)

Yes, that seemed the obvious approach. Potentially, logic could be written to identify rowspanned cells that would retain their adjacency after the sort, leaving them intact, but that could get very complicated. In order to keep the enhancements modularized and minimize changes to the core sort routine, I chose to take a pre-processing and post-processing approach.

The fact that it is necessary to break up the cells is somewhat mitigated by the third item above – logic to merge some cells after the sort. Currently, merging is limited to the currently sorted column (or adjacent, sequentially selected columns) and will only merge those cells that were originally merged (tracked by cell ID), unless the row header or table has class="autorowspan" present, which enables more aggressive merges operations.

Since eventually, we are going jQuery. I note that there is a MIT/GPL license jQuery module Tablesorter that handles col/rowspan. We might want to look into that in the future.

Since we now have jQuery... The tablesroter looks really nice at least in theory (I haven't tested this on wiki).

Assigning to self, since I'm working on implementing it.

Diff between stock tablesorter.com and MW version

Attached:

This patch adds the jquery tablesorter

I converted the tablesorter.com plugin for MW. Changes include:

  1. Sites can specify custom collations.

The script accepts an object "tableSorterCollation" which contains a lookup table, how specific characters should be treated.
For example, after setting "tableSorterCollation={'ä':'ae', 'ß':'ss'};" in the site's common.js any string containing an ä or Ä will be sorted as if it were a 'ae'.

  1. Table rows can be forced to use a specific data type.

By setting class="sort-{Parsername}", the row will be parsed with the specified algorithm. class="sort-date" would force date sorting etc.
The following parsers are available: text, IPAddress, number, url, currency, date, isoDate, usLongDate, time

  1. Execution time is reduced by half or more.

Sorting a 935 row * 8 columns table:

Browser Before After

Chrome 10 90ms 42ms
Safari 5 115ms 48ms
Firefox 4 412ms 87ms
IE8 720ms 115ms

  1. Based on the content language and the mdy vs dmy preference, the parser can understand dates such as "17. März '11". wgMonthNames=[] and wgMonthNamesShort=[]

in the content language and the mdy vs dmy preference are exported to js; A table containing the following dates would be sorted correctly:

  1. Jan. 01

23 Feb 1992
9.02.05
13 November 2001
14 Oktober '76

Notes: This patch requires the patch from Bug 4740. It was tested in ie6-8, chrome, safari 5, ff3 & ff4

attachment new.diff ignored as obsolete

is it backward compatible with our current class naming yet ?

This patch adds the jquery tablesorter

Yes, .sortable is obviously supported, .unsortable as well.
The hacking with sortBottom is not needed anymore, since the patched parser would create an (unsorted) <tfoot> for the following syntax.

-class="sortbottom"

!Total: 15!!!!!!Total: 29.55!!

}

(updated patch fixes a bug and has better readability in one section)

Attached:

Nice, i'll makes some test runs locally, but this looks very fine.

Do you have ideas on how to keep the disjoint between the stock tablesorter code and our code as small as possible ? That would help future maintenance.

The first line of the patch is bogus, and unfortunately, I can't get the code to run atm. Do i need to enable anything somewhere ?

yep, should have excluded the images, sry. Did you apply Bug 4740 ?

Afaik the jQuery tablesorter code needs to have the heading and footer in thead and tfoot. Currently this is unsupported in MediaWiki (bug 4740), so this script is not needed yet (atleast not yet to be loaded by default on every page)

*** Bug 17515 has been marked as a duplicate of this bug. ***

Forgot to close it.
Col & Rowspans work after r86088 and followups

  • Bug 28859 has been marked as a duplicate of this bug. ***

r86088 is seriously broken and may get reverted; reopening. Needs unit tests (see initial basic unit tests in r90595 which show up errors; will also need tests for the particular case this bug is about.)

A test case for rowspan was added in r90657, confirming that rowspan groups are being split up into individual duplicated cells prior to sorting. I don't see any evidence of the re-merging described above in comment 7, so that might not have made it into the new version.

There's also no test treatment of the issue originally mentioned in this bug -- tables using colspan groups to add breaks / subheaders to a long table such as on http://en.wikipedia.org/wiki/Late-November_2005_tornado_outbreak

Way up top there's a mention of the idea that fixing bug 4740 (support for thead, tbody, tfoot elements of table) would allow that to be resolved -- I think by letting folks group each set of rows into a separate <tbody>, and sorting within each one...?

I presume this would require setting the breaker rows to be individually unsortable, and grouped into the <tbody>s that they describe (they can't be separate <thead> chunks, you can only have one <thead> at the top).

So the table at [[Wisconsin_Tornado_Outbreak_of_August_2005]] might be restructured something like:

<table> <thead> <tr><th>F#</th>...</tr> </thead> <tbody> <tr><th colspan="7">Minnesota</th></tr> <tr><td>F0</td>...</tr> </tbody> <tbody> <tr><th colspan="7">Wisconsin</th></tr> <tr><td>F0</td>...</tr> <tr><td>F1</td>...</tr> </tbody> </table>

(In reply to comment #26)

Way up top there's a mention of the idea that fixing bug 4740 (support for
thead, tbody, tfoot elements of table) would allow that to be resolved -- I
think by letting folks group each set of rows into a separate <tbody>, and
sorting within each one...?

Currently you can specify tfoot and thead, but not tbodies: Any number of rows that's not separated by a thead is wrapped by a tbody. I don't really see the need for a new syntax for tbodies since the problem is trivial to solve: Just split it into seperate tables. You could even style them with margin:0 to make them look like only one...

sumanah wrote:

+reviewed (inferring that patch has been reviewed sufficiently in the last several comments; if it hasn't been reviewed enough, please change "reviewed" to "need-review")

Now that bug 38911 is fixed, I think we can finally close this ticket. We now handle rowspan and colspan, in both headers and content quite extensively.

Hello,

This bug should be reopened as it's not yet resolved.

When tables only have "rowspan", sorting works well. When there is "colspan" it works only partially (for the columns before the "colspan"). And when both "colspan" and "rowspan" are present, it's havoc.

Cf. the first tables of [[fr:Prix Eisner|Prix Eisner]] on the French Wikipedia.

Gwendal: Please file a new separate ticket with steps to reproduce.

(In reply to Andre Klapper from comment #31)

Gwendal: Please file a new separate ticket with steps to reproduce.

Hi Andre, I don't understand what you mean by "Step to reproduce". What kind of steps?

Gwendal: Please file a new separate ticket and follow https://www.mediawiki.org/wiki/How_to_report_a_bug - thanks!

(In reply to Andre Klapper from comment #33)

Gwendal: Please file a new separate ticket and follow
https://www.mediawiki.org/wiki/How_to_report_a_bug -

Hello, here is the new report
[[Bugzilla:72534]]