
IAB incorrectly uses an older version of an archived page when a newer version is available and should be used
Open, Needs Triage · Public · BUG REPORT

Description

Steps to Reproduce:

See this edit by IAB

https://sv.wikipedia.org/w/index.php?title=SI-enhet&diff=46292857&oldid=46024503

Actual Results:

Archived version from 23 December, 2018 is used

https://web.archive.org/web/20181223053741/https://www.bipm.org/en/measurement-units/

This version contains "old facts" that are not to be used as a source in the article.

Expected Results:

Archived version from 27 June, 2019 should have been used.

https://web.archive.org/web/20190627034200/https://www.bipm.org/en/measurement-units/

This version contains "new facts" that are used as a source in the article

Questions:

  • Why is the old version from 2018 chosen by IAB?
  • Does IAB have a problem in interpreting the Swedish version of the Webbref template? Parameter "hämtdatum" is the one that corresponds to "accessdate" and the value "27 juni 2019" means "June 27, 2019".
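To make the second question concrete, here is a minimal sketch of how a value such as "27 juni 2019" from "hämtdatum" could be read as a calendar date. It is only an illustration: the function name `parse_swedish_date` and the month table are assumptions made for this example, and it says nothing about how IAB actually parses the Webbref template.

```python
from datetime import date

# Swedish month names mapped to month numbers (illustrative table).
SWEDISH_MONTHS = {
    "januari": 1, "februari": 2, "mars": 3, "april": 4,
    "maj": 5, "juni": 6, "juli": 7, "augusti": 8,
    "september": 9, "oktober": 10, "november": 11, "december": 12,
}


def parse_swedish_date(text):
    """Parse a 'day month year' string such as '27 juni 2019'.

    Hypothetical helper for this example, not part of IAB.
    """
    parts = text.strip().lower().split()
    if len(parts) != 3:
        raise ValueError(f"expected 'day month year', got {text!r}")
    day, month_name, year = parts
    if month_name not in SWEDISH_MONTHS:
        raise ValueError(f"unknown Swedish month name: {month_name!r}")
    return date(int(year), SWEDISH_MONTHS[month_name], int(day))


access_date = parse_swedish_date("27 juni 2019")
print(access_date.isoformat())
```

If the bot handled "hämtdatum" along these lines, the SI-enhet reference would carry an access date in June 2019, not in 2018.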

Event Timeline

Larske created this task. Oct 3 2019, 4:27 PM
Restricted Application added a project: Internet-Archive. Oct 3 2019, 4:27 PM
Restricted Application added a subscriber: Cyberpower678.

https://tools.wmflabs.org/iabot/index.php?page=manageurlsingle&url=https%3A%2F%2Fwww.bipm.org%2Fen%2Fmeasurement-units%2F suggests that it found the source with the access date of 2018-11-23 somewhere on some wiki. IABot picked the closest snapshot to that and saved it for future use as it is a guaranteed working snapshot with similar, if not the same, data as the original from that time frame. If you want the bot to use a different snapshot, you can use the provided link and change the archive URL the bot should use. Changes are applied immediately.
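For readers who want to check which snapshot lies closest to a given date, the Wayback Machine offers a public availability endpoint that accepts a URL and a target timestamp. The sketch below only builds such a query URL and sends nothing. Whether IABot uses this endpoint internally is an assumption, and the helper name `availability_query` is made up for this example.

```python
from urllib.parse import urlencode

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url, timestamp):
    """Build a query asking for the snapshot closest to `timestamp`.

    `timestamp` uses the Wayback format YYYYMMDD (optionally with hhmmss).
    Hypothetical helper for this example; it performs no network request.
    """
    return AVAILABILITY_ENDPOINT + "?" + urlencode(
        {"url": page_url, "timestamp": timestamp}
    )


# Target the access date stated in the SI-enhet article (27 June 2019).
print(availability_query("https://www.bipm.org/en/measurement-units/", "20190627"))
```

Opening the printed URL in a browser returns a small JSON document describing the closest snapshot, which can be compared with the archive URL stored in the IABot interface.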

Thanks for the answer. I assume that "it found" should be read as "IAB found".
What I don't understand is how "somewhere on some wiki" should be read or interpreted.
Why should "somewhere on some wiki" have an influence on a specific reference, including a specific access date, in a specific article in a specific wiki?
We do not want IAB to replace a reference to a site with a random snapshot that is "a guaranteed working snapshot with *similar* data". We want a snapshot that is likely to contain the *same data* as was present at "the specified access date". In this particular case I am told that there is a significant difference between the contents from December 2018 and from June 2019.

In the article https://sv.wikipedia.org/wiki/Positron on Swedish Wikipedia, the Webbref template stated that the access date was 12 June 2019 (hämtdatum=12 juni 2019). In the article https://sv.wikipedia.org/wiki/SI-enhet, the access date was 27 June 2019 (hämtdatum=27 juni 2019).
Shouldn't that information be used as a basis for finding snapshots dated closer to these dates in June 2019 than a snapshot from the previous year (23 December 2018)?
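The selection I would expect can be sketched as follows, using only the two snapshot timestamps quoted in this report and the access dates mentioned above. This is an assumption about the desired behaviour, not a description of IAB's code, and `closest_snapshot` is a hypothetical name.

```python
from datetime import date, datetime

# The two Wayback snapshot timestamps quoted in this report.
SNAPSHOTS = ["20181223053741", "20190627034200"]


def parse_wayback_timestamp(timestamp):
    """Convert a 14-digit Wayback timestamp (YYYYMMDDhhmmss) to a datetime."""
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S")


def closest_snapshot(timestamps, access_date):
    """Return the timestamp nearest in time to midnight of `access_date`.

    Hypothetical helper for this example, not part of IAB.
    """
    target = datetime(access_date.year, access_date.month, access_date.day)
    return min(
        timestamps,
        key=lambda ts: abs(parse_wayback_timestamp(ts) - target),
    )


for accessed in (date(2019, 6, 12), date(2019, 6, 27), date(2018, 11, 23)):
    print(accessed.isoformat(), "->", closest_snapshot(SNAPSHOTS, accessed))
```

With the June 2019 access dates from the Positron and SI-enhet articles, the June 2019 snapshot is nearest. Only an access date such as 2018-11-23 makes the December 2018 snapshot the nearest one.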

It seems risky to use the "Modify URL data" interface to change which snapshot IAB should use for this URL, as there might be other references in other articles in svwiki (or maybe even other wikis; what is the scope of "Modify URL data"?) that have other access dates that should be used, in the future if not today. Is it possible to clear the "Archive URL" field completely in the "Modify URL data" interface, so that IAB "builds a new opinion" on which snapshot to use the next time it encounters this URL? By the way, the URL is not dead today, see https://www.bipm.org/en/measurement-units/, but it may have been temporarily unavailable when IAB tested it on 5 September 2019.

I repeat my question: Is there something in the template (Webbref) usage in the two articles mentioned above that IAB does not understand?
If so, we may have to correct the template usage in several other articles as well.
In particular, does IAB understand the parameter "hämtdatum" (accessdate) in the Swedish template? If so, why isn't it used to find a snapshot?