IABot: new domain for webrecorder.io (conifer.rhizome.org)
Open, Medium · Public · BUG REPORT

Assigned To

None

Authored By

	Micler
	Oct 19 2022, 12:53 AM

Description

Steps to replicate the issue (include links if applicable):

In en.wikipedia.org, save a page with {{cite web |archive-url=https://conifer.rhizome.org/...rest-of-url...}}.
- Example: https://en.wikipedia.org/w/index.php?diff=1112536795 with archive-url https://conifer.rhizome.org/micler/my-public-collection/20220926195305/https://www.rferl.org/a/minsk-gunfight-kgb-video/31484513.html
Wait a while for InternetArchiveBot to visit the page

What happens?:

InternetArchiveBot considers the archive-url invalid and overwrites it. Example: https://en.wikipedia.org/w/index.php?diff=1116708092

What should have happened instead?:

InternetArchiveBot should recognize URLs containing "conifer.rhizome.org" as valid archive URLs and handle them the same way as "webrecorder.io" URLs; only the domain name differs. The Webrecorder.io tool was renamed to Conifer and moved to this new domain in 2020. Announcement posts about it: (1) (2) (3)

https://webrecorder.io redirects to the new site, so that domain should continue to be allowed too.

I have verified that the existing regex in resolveWebRecorderURL() will work with just a change to the domain. (https://github.com/internetarchive/internetarchivebot/blob/60a2f488f5cacae662de93a7ae031e01b2a76cd4/app/src/Core/APII.php#L4082)
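For illustration only, here is a minimal Python sketch of what accepting both domains could look like. It is not the regex from resolveWebRecorderURL() (the bot itself is written in PHP); the pattern, the function name `parse_archive_url`, and the assumed URL layout (host / user / collection / 14-digit timestamp / original URL) are hypothetical and inferred only from the example archive-url above.

```python
import re
from datetime import datetime

# Hypothetical pattern, NOT the one used by InternetArchiveBot.
# Assumed layout, taken from the example URL in this report:
#   https://<host>/<user>/<collection>/<YYYYMMDDhhmmss>/<original URL>
ARCHIVE_RE = re.compile(
    r"^https?://(?P<host>webrecorder\.io|conifer\.rhizome\.org)"
    r"/(?P<user>[^/]+)/(?P<collection>[^/]+)"
    r"/(?P<timestamp>\d{14})/(?P<original>\S+)$",
    re.IGNORECASE,
)


def parse_archive_url(url):
    """Return the parts of a Webrecorder/Conifer archive URL, or None."""
    match = ARCHIVE_RE.match(url)
    if match is None:
        return None
    parts = match.groupdict()
    # Convert the 14-digit snapshot timestamp into a datetime object.
    parts["captured_at"] = datetime.strptime(parts["timestamp"], "%Y%m%d%H%M%S")
    return parts


example = (
    "https://conifer.rhizome.org/micler/my-public-collection/20220926195305/"
    "https://www.rferl.org/a/minsk-gunfight-kgb-video/31484513.html"
)
result = parse_archive_url(example)
print(result["host"], result["captured_at"], result["original"])
```

Under these assumptions, the old and new hosts share one alternation group, so every other part of the pattern stays the same for both domains.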

Software version (skip for WMF-hosted wikis like Wikipedia):

N/A - en.wikipedia.org

Other information (browser name/version, screenshots, etc.):

You might wonder why I would bother with this archiving service. TL;DR: it handles dynamic pages better because it captures a WARC from a browser session. In my particular edit (https://en.wikipedia.org/w/index.php?diff=1112536795) I wrote a comment explaining that web.archive.org did not work.

Thanks for your help!

Related Objects

Duplicates Merged Here: T330342: Add Conifer to archive providers

Event Timeline

Micler created this task. · Oct 19 2022, 12:53 AM

Restricted Application added a project: Internet-Archive. · Oct 19 2022, 12:53 AM

Until resolved, you can add {{cbignore}} to keep the bot off the citation. That's what I did for the few I added. BTW, ghostarchive.org uses the same Webrecorder technology on the backend. It won't work in every case as a substitute for Conifer, but in many it will.

Harej moved this task from Inbox to Backlog: URLs on the InternetArchiveBot board. · Oct 19 2022, 3:18 PM

Aklapper removed a subscriber: InternetArchiveBot. · Oct 21 2022, 12:38 PM

Harej triaged this task as Medium priority. · Nov 14 2022, 9:46 PM

Harej removed a project: Internet-Archive. · Nov 18 2022, 12:24 AM

Basically what we need to do is set up conifer.rhizome.org as an alternate domain for webrecorder.io (or vice versa). We've done this for other archive providers so not a huge deal. Noting this glitched edit that was reported: https://en.wikipedia.org/w/index.php?title=Tennis_Masters_Series_singles_records_and_statistics&diff=1137735504&oldid=1137068353&diffmode=source
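As a rough sketch of the alternate-domain idea, the Python snippet below maps several hostnames to one provider key. This is not how InternetArchiveBot stores its provider configuration; the table `ARCHIVE_HOST_ALIASES`, the key `"webrecorder"`, and the function `archive_provider` are hypothetical names used only for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical alias table: every hostname listed here is treated as the
# same archive provider. The provider key "webrecorder" is an assumption.
ARCHIVE_HOST_ALIASES = {
    "webrecorder.io": "webrecorder",
    "conifer.rhizome.org": "webrecorder",
}


def archive_provider(url):
    """Return the provider key for a known archive host, else None."""
    host = (urlsplit(url).hostname or "").lower()
    return ARCHIVE_HOST_ALIASES.get(host)


print(archive_provider("https://conifer.rhizome.org/micler/my-public-collection/"))
print(archive_provider("https://webrecorder.io/"))
```

With a table like this, supporting a renamed service would mean adding one entry rather than duplicating the provider's handling.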

Harej merged a task: T330342: Add Conifer to archive providers. · Feb 27 2023, 7:49 PM
