
Start adding on-wiki mappings for archive_url and archive_date
Open, Needs Triage, Public

Description

When generating a citation from an Archive.org URL, either manually (T298307, T98680) or automatically as a fallback (T95388), we should fill out archive_url and archive_date in the local template.
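As a rough illustration of the data involved: a Wayback Machine snapshot URL normally has the form `https://web.archive.org/web/<14-digit timestamp>/<original URL>`, so both the archive date and the original URL can be read off the URL itself. The sketch below is not Citoid's implementation. The function name `split_wayback_url` and the returned `url` key are assumptions made for illustration, and it only handles full 14-digit timestamps.

```python
import re
from datetime import datetime

# Wayback snapshot URLs: /web/<YYYYMMDDhhmmss>[optional modifier such as id_]/<original URL>
WAYBACK_RE = re.compile(
    r"^https?://web\.archive\.org/web/(\d{14})(?:[a-z]{2}_)?/(.+)$"
)


def split_wayback_url(url):
    """Return citation fields for a Wayback snapshot URL, or None if it is not one."""
    match = WAYBACK_RE.match(url)
    if match is None:
        return None
    timestamp, original = match.groups()
    # Keep only the date part of the snapshot timestamp, in ISO format.
    archive_date = datetime.strptime(timestamp, "%Y%m%d%H%M%S").date().isoformat()
    return {"url": original, "archiveUrl": url, "archiveDate": archive_date}
```

With this split, the archive.org address would land in the archive parameters and the embedded address would become the citation's main URL.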

The parameters proposed in https://gerrit.wikimedia.org/r/c/mediawiki/services/citoid/+/375810 are:

  • archiveUrl
  • archiveDate
  • urlStatus - leaving this out for now due to the complexity of mapping to a localised attribute value (alive/dead, si/non etc.)

We should be able to start updating each local wiki's TemplateData right away to map these onto the local parameter names.

GlobalSearch query (looking for "citoid": { in the TemplateData maps):
https://global-search.toolforge.org/?q=%5C%22citoid%5C%22%3A+%5C%7B&regex=1&namespaces=10&title=
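For anyone adapting the search, the URL-encoded query can be unpacked with the Python standard library. This is only a convenience sketch; the variable names are arbitrary. Namespace 10 is the Template namespace.

```python
from urllib.parse import urlsplit, parse_qs

SEARCH_URL = (
    "https://global-search.toolforge.org/"
    "?q=%5C%22citoid%5C%22%3A+%5C%7B&regex=1&namespaces=10&title="
)

# parse_qs percent-decodes values and turns "+" into a space.
params = parse_qs(urlsplit(SEARCH_URL).query, keep_blank_values=True)

print(params["q"][0])           # \"citoid\": \{
print(params["regex"][0])       # 1
print(params["namespaces"][0])  # 10
```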

Event Timeline

@Mvolz I'm not sure how urlStatus would work given the value type is different per wiki:

  • on en.wiki it is one of various strings (dead/live/deviated/unfit...)
  • on fr.wiki it is a boolean/datetime (brisé le, i.e. "broken since", = 4 juin 2018 or oui).
  • on es.wiki it is a boolean (urlmuerta = [|no])
Esanders renamed this task from Start add on-wiki mappings for archive_url and archive_date to Start adding on-wiki mappings for archive_url and archive_date. Sep 16 2024, 12:48 PM

However, if we don't provide a URL status, the default is "dead", even when the site is alive. That's okay for now as long as we're only doing this for dead/inaccessible sites, but if in the future we wanted to add it for live sites as well, it's something we'd have to think about. I agree that at this stage it doesn't make sense to have it.

Though we might get complaints, e.g. if we add a link when we're blocked, we're de facto marking nytimes links as dead, because that's the default if we don't include the parameter. I don't see an easy way around this anyway, because if we're blocked we can't know that for sure from the status code (we could assume 415 means blocked and 404 means not available, but so many different codes are used here).

Trizek-WMF subscribed.

For Tech News:

[technical] "When Citoid generates a reference based on an archive.org URL, we currently have no way of populating the archive-url and archive-date parameters in the citation template. The archive.org URL is treated as the "original" URL, not as an archive link. The Editing Team is working on fixing this. We are asking communities to preemptively add mappings for these parameters to the citoid map within the TemplateData for each citation template."

Should include the actual parameter names (when we decide on them.... )

Parenthesis matters here! :)

Either a documentation page-link, or an example-edit link (if it will be easily understood by non-English template-wranglers), would also be good to include in the future Tech News entry. Thanks!

If we want generic instructions, they could read as follows:

The following key-value pairs should be added to the maps > citoid entry in the TemplateData for each citation template:

"archiveUrl": "<name of the archive URL parameter in the current template>"
"archiveDate": "<name of the archive date parameter in the current template>"

For example, at English Wikipedia's Template:Cite_web this would be:

"archiveUrl": "archive-url"
"archiveDate": "archive-date"
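To show where these pairs sit in the overall structure, here is a minimal sketch that adds them to a TemplateData object under maps > citoid. The helper name `add_archive_mappings` and the trimmed `template_data` sample are hypothetical; a real TemplateData block contains many more parameters and map entries, and on-wiki the change would simply be a manual edit to the template's documentation page.

```python
import json


def add_archive_mappings(template_data, archive_url_param, archive_date_param):
    """Add archiveUrl/archiveDate to maps > citoid without overwriting existing entries."""
    citoid_map = template_data.setdefault("maps", {}).setdefault("citoid", {})
    citoid_map.setdefault("archiveUrl", archive_url_param)
    citoid_map.setdefault("archiveDate", archive_date_param)
    return template_data


# Trimmed, hypothetical TemplateData used only for illustration.
template_data = {
    "params": {"url": {}, "title": {}, "archive-url": {}, "archive-date": {}},
    "maps": {"citoid": {"url": "url", "title": "title"}},
}

add_archive_mappings(template_data, "archive-url", "archive-date")
print(json.dumps(template_data["maps"], indent=2))
```

Using `setdefault` means a wiki that has already mapped these keys keeps its existing values; only missing entries are filled in.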