Page MenuHomePhabricator

Talk page meta data inconsistent
Closed, ResolvedPublicBUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  • Visit Scott McKenzie Talk page, Barack Obama Talk page and Darjeeling talk page

What happens?:
Inconsistent metadata displayed at a topic level

What should have happened instead?:
Number of replies, and date should be displayed for all topics on all pages

Other information (browser name/version, screenshots, etc.):
Version 2.7.50417-r-2022-08-02

Screenshot_20220816-204616.png (2×1 px, 487 KB)

Screenshot_20220816-204533.png (2×1 px, 366 KB)

Screenshot_20220816-204642.png (2×1 px, 329 KB)

Event Timeline

It looks like the timestamp fields in the DiscussionTools API are formatted differently for different articles, which is throwing off our deserialization logic.
For some articles they are in ISO 8601 format ([date]T[time]Z) as seen here:

https://en.wikipedia.org/w/api.php?action=discussiontoolspageinfo&format=json&page=Talk:Couscous&prop=threaditemshtml&formatversion=2

...but for other articles they seem to be in the plain database format (YYYYMMDDHHmmss) as seen here:

https://en.wikipedia.org/w/api.php?action=discussiontoolspageinfo&format=json&page=Talk:Darjeeling&prop=threaditemshtml&formatversion=2

@DLynch Is that expected? Will we plan to standardize them in one format or the other?
We can update our logic to handle both cases, but I'd like to make sure there aren't other potential cases, too.

@Dbrant this is indeed expected. Check out getTimestampString which explains exactly the cutoff date (see 109a50cb34c5219c6cf1c302da9c5a72dd11ce79). The reasoning, as I recall, is that the timestamp string is part of the ID and we didn't want to change already-existing IDs.

That said, I think we could change it so that we just give you a consistent timestamp. We're already building that string from a DateTimeImmutable after all, so we could standardize onto anything helpful. What's good for you here? Unix timestamp?

(I'd have to double-check and make sure that we're not actually relying on that timestamp field to be consistent with the ID anywhere, I suppose.)

Thanks, @DLynch --
Generally we're used to seeing ISO 8601 timestamps, which occur pretty often in other parts of the API. We've already updated our logic to account for both types of timestamps in this particular case, so as long as it won't be changed to something else again soon, I think we're good to go.

ppelberg edited projects, added Editing-team (Kanban Board); removed Editing-team.
ppelberg added subscribers: JMinor, Dbrant.

@JMinor: the Editing Team is preparing a change to the DiscussionTools API so that it offers clients a consistent timestamp format.

Can you please share what – if any – concerns the iOS team would have with the DiscussionTools API outputting timestamps in ISO 8601 format, per the suggestion @Dbrant shared in T315400#8166272?

Once the Editing Team hears back from y'all (iOS) on the above, we can make the change necessary to ensure the API outputs timestamps in a consistent manner.

@Tsevener please see previous comment, re: can iOS consume ISO 8601 timestamps

@DLynch @ppelberg Yes, we can consume ISO 8601 format, as long as it's consistent.

Currently we have this format set in our codebase - https://github.com/wikimedia/wikipedia-ios/blob/main/Wikipedia/Code/NSDateFormatter%2BWMFExtensions.m#L3. This formatter actually fails on these talk page dates because of the milliseconds, but I can easily add another formatter that considers milliseconds (i.e. @"yyyy'-'MM'-'dd'T'HH':'mm':'ss'.'SSS'Z', and it successfully deserializes. So as long as it's always in that expected format, we should be good.

Amusingly enough, it's a case where the milliseconds are entirely superfluous -- our actual level of granularity here is per-minute... because everything here is based on parsing the dates in wikitext signatures, which are HH:mm.

@DLynch @ppelberg Yes, we can consume ISO 8601 format, as long as it's consistent.

Wonderful and thank you for the prompt reply, @Tsevener. We'll go ahead and update the DiscussionTools API to output timestamps in ISO 8601 format.

Change 827018 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/extensions/DiscussionTools@master] API: always output ISO8601 in the timestamp field

https://gerrit.wikimedia.org/r/827018

Change 828015 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/extensions/DiscussionTools@master] ThreadItem jsonSerialize: make sure callback is applied last

https://gerrit.wikimedia.org/r/828015

@DLynch @ppelberg Yes, we can consume ISO 8601 format, as long as it's consistent.

Currently we have this format set in our codebase - https://github.com/wikimedia/wikipedia-ios/blob/main/Wikipedia/Code/NSDateFormatter%2BWMFExtensions.m#L3. This formatter actually fails on these talk page dates because of the milliseconds, but I can easily add another formatter that considers milliseconds (i.e. @"yyyy'-'MM'-'dd'T'HH':'mm':'ss'.'SSS'Z', and it successfully deserializes. So as long as it's always in that expected format, we should be good.

Note that ISO 8601 does not refer to a specific timestamp format: https://en.wikipedia.org/wiki/ISO_8601, for example the PHP implementation does not included milliseconds: https://www.php.net/manual/en/class.datetimeinterface.php#datetime.constants.iso8601_expanded

I think we can and should use an ISO8601-compatible format that doesn't include milliseconds. I'd rather users could handle strings with or without the seconds (as for DT it will always be ":00"), but maybe for simplicity and future compatibility we should include them.

formatoutput
old IDs2022-05-16T23:49:00.000Z
DateTimeInterface::ISO86012022-08-30T00:49:00+0000
new IDs20220830004900

To copy it in for reference, that'd be akin to 2022-08-24T11:15:21Z.

Which, incidentally, I think comes from wfTimestamp( TS_ISO_8601, $timestamp ).

Patch has been updated so that it'll have exactly the timestamps used by the action=query APIs.

Change 828015 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] ThreadItem jsonSerialize: make sure callback is applied last

https://gerrit.wikimedia.org/r/828015

Change 827018 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] API: always output ISO8601 in the timestamp field

https://gerrit.wikimedia.org/r/827018

Waiting until it's deployed and y'all confirm it works for you.

I was able to parse and display the Main_Page talk page in the beta cluster, which seems to be using the new format, so that looks good there from the iOS side. I haven't noticed the new format in production yet.

https://en.wikipedia.beta.wmflabs.org/w/api.php?action=discussiontoolspageinfo&format=json&page=Talk:Main_Page&prop=threaditemshtml&formatversion=2

@Tsevener it merged yesterday, so it won't be out across all production wikis until next week's train finishes next Thursday.

@Tsevener it merged yesterday, so it won't be out across all production wikis until next week's train finishes next Thursday.

Except for a few Wikipedias (Catalan, Hebrew, Italian if I didn’t miss any) that are included in group1 and thus should get it next Wednesday instead of Thursday, as well as testwiki and some closed wikis, which are group0 (Tuesday).