Page MenuHomePhabricator

Wrong message dates in Flow on certain OS/architectures
Open, LowPublic

Description

I just installed flow on a wiki and it seems to work great except that the posts' dates are all wrong: they appear to be from 46 years ago, although I just added them.

I checked the server clock and it is correct.

Event Timeline

gregoire raised the priority of this task from to Needs Triage.
gregoire updated the task description. (Show Details)
gregoire subscribed.
Restricted Application added a subscriber: Aklapper. · View Herald Transcript

When you say "46 years ago", do you mean the posts are dated to December 31, 1969? (Or perhaps January 1, 1970?)

What version of core, and what version of Flow are you using? Are you using them from a git branch or a tarball?

Is the wiki public (that would let us test directly)?

@Catrope they are dated January 1, 1970. This sounds suspiciously like the unix epoch but the date is not constant (i.e. time passes, a comment added today is dated January 2, 1970).

An other strange thing is that, right after I add a new comment to a topic, all the posts in this particular topic have the right date:

Selection_002.png (524×865 px, 21 KB)

But when I reload the page, the dates go back to 46 years in the past:
Selection_003.png (521×862 px, 22 KB)

@Mattflaschen I use mediawiki 1.25.1 and Flow tarbal Flow-REL1_25-576f705.tar.gz (Flow has no version in the Special:Version page).

The wiki is password protected but I can probably arrange to give you access if that would help.

@Catrope they are dated January 1, 1970. This sounds suspiciously like the unix epoch but the date is not constant (i.e. time passes, a comment added today is dated January 2, 1970).

That is strange.

An other strange thing is that, right after I add a new comment to a topic, all the posts in this particular topic have the right date:

In this case, the date is rendered by the client.

I'm trying to look at this and I have trouble understanding where the problem comes from.
Here is the HTML node that shows the date of a post:

<span datetime="338560000" class="flow-timestamp">
	<span class="flow-timestamp-user-formatted">January 4, 1970 11:02 PM</span>
	<span id="zmys8cnabb" class="flow-timestamp-ago">46 years ago</span>
</span>

The timestamp there gives me yet an other date:

$ date -d @338560000
Tue Sep 23 14:26:40 CEST 1980

But if I remove the three last 0, at least I get the displayed date:

$ date -d @338560
Sun Jan  4 23:02:40 CET 1970

An other post has a negative time stamp:

<span datetime="-533037000" class="flow-timestamp">
	<span class="flow-timestamp-user-formatted">December 25, 1969 8:56 PM</span>
	<span id="e5olkr1gt9" class="flow-timestamp-ago">46 years ago</span>
</span>

Ok, I tryed to look in the database. If I understand correctly, the posts are in the table flow_revision and the only field that looks like a timestamp there is flow_revision.rev_mod_timestamp but in my case, the value is always NULL.

mysql> select rev_mod_timestamp from flow_revision;
+-------------------+
| rev_mod_timestamp |
+-------------------+
| NULL              |
| NULL              |
| NULL              |
| NULL              |
| NULL              |
| NULL              |
...

So unless I should look elsewhere for the timestamp value, I'm guessing that this is what's causing my problem with the display of dates?

So unless I should look elsewhere for the timestamp value, I'm guessing that this is what's causing my problem with the display of dates?

No. The timestamp of the revision is stored as part of the UUID. See https://phabricator.wikimedia.org/diffusion/EFLW/browse/ebdf52d837de/includes/Formatter/RevisionFormatter.php;ebdf52d837de9c63be3b03d9eb3021f965823367$181 . rev_mod_timestamp is only set when the revision is a moderation action (moderation actions create new revisions).

Do you mean rev_id? That seems to be a binary blob, how can I inspect it?

Do you mean rev_id? That seems to be a binary blob, how can I inspect it?

Yes, it is binary. You can use:

SELECT LOWER(HEX(rev_id)) WHERE ...

With the lower hex, you can do (in a PHP shell):

Flow\Model\UUID::create( $lowerHex )->getTimestamp()

Also, the UUIDs are in chronological order when the binary value is sorted.

I have the same bug on a wiki I installed for a client; this one is public on " remix [subdomain of] antipode-mjc DOT com " (I prefer its domain is not indexed by search engines). You can find tests about these wrong dates on the pages "User_talk:WikiSysop" and "User_talk:Léa_Bibli". The server is in CET/CEST timezone (UTC+1/UTC+2).

I used the process given by @Mattflaschen to retrieve some values from the database. For the discussion on "User_talk:WikiSysop" I just added (2015-11-25T14:25:18+0100), I have:

  • LOWER(HEX(rev_id)) = 0544fb475cf5045a990fee

When I ask the UUID with " Flow\Model\UUID::create( $lowerHex )->getTimestamp() ", I obtain:

  • 20151125132518 on a local install on my computer (MW master=1.27alpha/90720a3, Flow master=1.1/602c885, PHP 7.0.0RC7)
  • 19691231182958 on the wiki where the is the bug (MW 1.25.1 from tarball, Flow tarball REL1_25 576f705, PHP 5.6.6)

I further examinated the code and the different results, with the following code, extracted from Flow/includes/Model/UUID.php, function hex2timestamp:

<?php

$hex = '0544fb475cf5045a990fee';
$msTimestamp = hexdec( substr( $hex, 0, 12 ) ) >> 2;

echo substr( $hex, 0, 12 )."\n";
echo hexdec( substr( $hex, 0, 12 ) )."\n";
echo $msTimestamp."\n";
echo intval( $msTimestamp / 1000 )."\n";

I obtain on my local install with PHP 5.6.12:

0544fb475cf5
5793831673077
1448457918269
1448457918

and on the wiki where there is the bug (on a OVH shared hosting), with PHP 5.6.6:

0544fb475cf5
5793831673077
-19802307
-19802

I tested with various PHP versions, both on my computer (5.5.6, 5.6.12, 7.0.0RC7) and on OVH shared hosting (5.3.29, 5.4.38, 5.5.22, 5.6.6, 7.0.0RC5). It really depends on the computer, not the PHP version.

The bug is located on the bitwise shifting ( $a >> 2). So the differences comes from different internal representations of the integers (32-bit/64bit or big/little-endian); I’m not really sure of me when I manipulate these different representation, but I can come with a workaround, which works on both computers:

<?php

$hex = '0544fb475cf5045a990fee';
$hexTimestamp = hexdec( substr( $hex, 0, 12 ) );
$msTimestamp = $hexTimestamp >> 2;
$msTimestamp2 = ($hexTimestamp-($hexTimestamp%4))/4;

echo substr( $hex, 0, 12 )."\n";
echo hexdec( substr( $hex, 0, 12 ) )."\n";
echo $msTimestamp."\n";
echo intval( $msTimestamp / 1000 )."\n";
echo intval( $msTimestamp2 / 1000 )."\n";

Now the last line is always correct: 1448457918. But such a change must be double-checked by someone, who better masters this than me, and perhaps there are performance issues with this solution.

PS: I introduce this change on the wiki mentionned above; the dates on the right correctly changed, but not on the title, probably it is computed elsewhere with the same formula.

I have some other elements, for information:

  • on my local computer, the system constant PHP_INT_SIZE, the size of the integers, is 8 (=64bits); I have an Intel Core i3;
  • on the OVH shared computer, PHP_INT_SIZE is 4 (=32bits); it’s an AMD Opteron.

When PHP cannot represent an integer because of its size, it changes it to a float, so in the example above, $hexTimestamp is a float and the shifting is applied on this float (IEEE 754 64bits if I understood correctly, even when PHP_INT_SIZE is 4). According to the doc, shifting is always applied on integers (but I didn’t find how the type casting float->integer is done, since it’s too big for an integer; there is probably an overflow somewhere).

@Mattflaschen I tried to follow how is computed the "46 years ago" in the title, but I didn’t succeed for now. Do you know where the origin computation comes from? Is it another representation than hexadecimal in class UUID?

What I found for now is: it is displayed in handlebars/timestamp.handlebars, called from includes/TemplateHelper.php (TemplateHelper::timestamp).

Mattflaschen-WMF renamed this task from Wrong message dates in Flow to Wrong message dates in Flow on certain OS/architectures.Nov 27 2015, 8:02 PM
Mattflaschen-WMF set Security to None.

I still have to test on the wiki, but basically the problem is when signed integers are only 32 bits; it is sort of "bug of year 2038" but almost permanent on these architectures because the unit is milliseconds instead of seconds (only January of 1970 can be represented). A workaround is to use floats instead of integers since floats are always 64 bits (with 53 bits for the significand => 285,426 years can be represented).

The first step to fix this is to modify UUID::hex2timestamp by remplacing the " >> 2 " shift by a division by 4 (rounded to zero, or equivalently remove the number modulo 4 and divide by 4); this works because the operation is made on floats instead of integers. Then, second step, since the last modification of a discussion is saved in table flow_workflow, column workflow_last_update_timestamp, existing discussions must be updated. And possibly, when there are divisions or other numerical operations, these must be done on floats or strings, e.g. in classes TemplateHelper and TopicListFormatter. For instance, for a string representing a decimal number, one can multiply by 1000 simply by appending '000' at the end of the string (in TopicListFormatter::generateTopicMetadata).

Thanks for doing this research. We could also look at using BCMath where helpful. We're already using bcadd in one place.

Note, more discussion about this, in https://www.mediawiki.org/wiki/Topic:T39dw6ol5ys9riqa by two unconnected 3rd party potential extension users