Green_Cardamom (GreenC)
User

Projects

User does not belong to any projects.

User Details

User Since
Jun 4 2016, 1:17 PM (62 w, 5 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
Green Cardamom

Recent Activity

Yesterday

Green_Cardamom added a comment to T172737: Compare HTTP responses for NO RESPONSE FROM SERVER with external API.

Good timing as I just completed IMP yesterday (verified 4 million URLs in 2 months) and now back to running WaybackMedic on enwiki for the moment.

Wed, Aug 16, 4:48 PM · InternetArchiveBot (v1.5)

Tue, Jul 25

Green_Cardamom added a comment to T171023: {{webarchive}} and __FORMAT__.

Removed from enwiki, but there are still some in other languages.

Tue, Jul 25, 2:24 PM · InternetArchiveBot (v1.4)

Wed, Jul 19

Restricted Application assigned T171057: Non-archive archive.org URLs in database to Cyberpower678.
Wed, Jul 19, 2:18 PM · InternetArchiveBot, Internet-Archive
Restricted Application assigned T171023: {{webarchive}} and __FORMAT__ to Cyberpower678.
Wed, Jul 19, 2:26 AM · InternetArchiveBot (v1.4)

Jul 13 2017

Green_Cardamom added a comment to T170413: Unable to update URL even though API reports success.

True, because WebCite drops anything beyond ?url .. and theoretically it should work if it's "+" or "%20" in the query. But what if it is a site that is not flexible in these ways?

Jul 13 2017, 8:43 PM · InternetArchiveBot

Jul 12 2017

Restricted Application assigned T170413: Unable to update URL even though API reports success to Cyberpower678.
Jul 12 2017, 12:17 PM · InternetArchiveBot

Jul 10 2017

Green_Cardamom added a comment to T170142: IABot API - truncates at %20 with modifyurl.

Some of them are not taking.

Jul 10 2017, 4:13 PM · InternetArchiveBot, Internet-Archive
Green_Cardamom added a comment to T170142: IABot API - truncates at %20 with modifyurl.

Ugh, that's a small oversight. Fortunately it won't be difficult to go back and rerun all the ones that had a % in the URL. For some reason I was intentionally decoding the URL before encoding; I don't remember why, but a single encoding with no pre-decode will probably work.

Jul 10 2017, 3:46 PM · InternetArchiveBot, Internet-Archive
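
The difference between a single encoding and a decode-then-encode pass, as described in the comment above, can be illustrated with a minimal Python sketch. This is not the code of IMP or IABot; the function names and the example.com URL are hypothetical, and it assumes the receiver decodes the parameter value exactly once.

```python
from urllib.parse import quote, unquote


def encode_once(url: str) -> str:
    """Percent-encode the URL a single time for use as a parameter value."""
    return quote(url, safe="")


def decode_then_encode(url: str) -> str:
    """Pre-decode variant: existing escapes are decoded before encoding."""
    return quote(unquote(url), safe="")


# Hypothetical URL that already contains a literal %20 escape.
original = "http://example.com/my%20file.html"

# What a receiver recovers after decoding the parameter value once.
received_single = unquote(encode_once(original))
received_predecoded = unquote(decode_then_encode(original))

print(received_single)      # the literal %20 survives the round trip
print(received_predecoded)  # the %20 has become a plain space
```

With the pre-decode step, the receiver ends up with a space where the source URL had %20, so any later processing that stops at whitespace would see a shortened URL. Whether that is what happened in this task is an assumption, not something the comment confirms.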
Restricted Application assigned T170142: IABot API - truncates at %20 with modifyurl to Cyberpower678.
Jul 10 2017, 1:10 PM · InternetArchiveBot, Internet-Archive

Jun 26 2017

Green_Cardamom added a comment to T168794: Unable to manage URLs containing +.

Phew excellent.

Jun 26 2017, 11:30 PM · InternetArchiveBot (v1.4), Internet-Archive
Green_Cardamom added a comment to T168794: Unable to manage URLs containing +.

Yeah I've come across a lot like it.. would a db script change IDs? That is what IMP uses to track what it processed.

Jun 26 2017, 11:26 PM · InternetArchiveBot (v1.4), Internet-Archive

Jun 25 2017

Restricted Application assigned T168794: Unable to manage URLs containing + to Cyberpower678.
Jun 25 2017, 3:41 PM · InternetArchiveBot (v1.4), Internet-Archive

Jun 20 2017

Green_Cardamom updated the task description for T168351: v1.1 beta3 unable to update URLs.
Jun 20 2017, 2:42 AM · InternetArchiveBot (v1.4), Internet-Archive
Restricted Application assigned T168351: v1.1 beta3 unable to update URLs to Cyberpower678.
Jun 20 2017, 2:40 AM · InternetArchiveBot (v1.4), Internet-Archive

Jun 18 2017

Restricted Application assigned T168202: https:/// in database to Cyberpower678.
Jun 18 2017, 1:22 PM · InternetArchiveBot (v1.4)

Jun 15 2017

Green_Cardamom added a comment to T167605: Unable to override archive validation check in Tool/API.

Is v1.1beta3 live? How do I check the running version?

Jun 15 2017, 3:32 PM · InternetArchiveBot (v1.4), Internet-Archive
Green_Cardamom added a comment to T167512: IABot API wishlist.

Great, thank you for #1. For #2, the thinking is that it needs to be able to track that a link has been verified; most of the time it makes no change to the database because the link verified OK. Being able to track this is important so it can go back and re-process the database to catch any new archive URLs that were added, without re-processing the same URLs it already verified, which is very time consuming. The exception is links that have passed an expiration date and need to be re-verified, hence the need for a date of verification. Maybe it can be tracked locally, not sure; what do you think? I thought that if it was tracked in the IABot database, anyone could then run IMP.

Jun 15 2017, 3:25 PM · InternetArchiveBot (v1.4)

Jun 12 2017

Green_Cardamom added a comment to T167512: IABot API wishlist.

Oh no! Well, the more I think about #2, the more sense it makes to track it locally, but I'm still thinking about it. #4 is just an idea with no immediate application/need, but it could open possibilities. #1 is a reverse lookup needed for debugging (I needed it today, for example). #3 will be key to saving links.

Jun 12 2017, 12:54 AM · InternetArchiveBot (v1.4)

Jun 11 2017

Green_Cardamom updated the task description for T167605: Unable to override archive validation check in Tool/API.
Jun 11 2017, 11:42 PM · InternetArchiveBot (v1.4), Internet-Archive
Green_Cardamom updated the task description for T167605: Unable to override archive validation check in Tool/API.
Jun 11 2017, 8:50 PM · InternetArchiveBot (v1.4), Internet-Archive
Green_Cardamom renamed T167605: Unable to override archive validation check in Tool/API from Unable to override archive validatio check in Tool/API to Unable to override archive validation check in Tool/API.
Jun 11 2017, 5:41 PM · InternetArchiveBot (v1.4), Internet-Archive
Green_Cardamom updated the task description for T167605: Unable to override archive validation check in Tool/API.
Jun 11 2017, 5:40 PM · InternetArchiveBot (v1.4), Internet-Archive
Restricted Application assigned T167605: Unable to override archive validation check in Tool/API to Cyberpower678.
Jun 11 2017, 5:39 PM · InternetArchiveBot (v1.4), Internet-Archive

Jun 9 2017

Restricted Application assigned T167512: IABot API wishlist to Cyberpower678.
Jun 9 2017, 2:08 PM · InternetArchiveBot (v1.4)
Green_Cardamom closed T167411: Protocol-relative URLs and SSL as Invalid.
Jun 9 2017, 2:07 AM · InternetArchiveBot
Green_Cardamom added a comment to T167411: Protocol-relative URLs and SSL.

Started a BOTREQ

Jun 9 2017, 2:07 AM · InternetArchiveBot

Jun 8 2017

Green_Cardamom added a comment to T167411: Protocol-relative URLs and SSL.

I started a discussion at Village Pump technical maybe it will answer some questions and/or lead to a bot that does conversions

Jun 8 2017, 3:15 PM · InternetArchiveBot
Green_Cardamom added a comment to T167411: Protocol-relative URLs and SSL.

Huh you're right it's forcing https. That's crazy. It breaks the whole point of PR and breaks many URLs.

Jun 8 2017, 3:00 PM · InternetArchiveBot
Green_Cardamom updated the task description for T167411: Protocol-relative URLs and SSL.
Jun 8 2017, 2:19 PM · InternetArchiveBot
Restricted Application assigned T167411: Protocol-relative URLs and SSL to Cyberpower678.
Jun 8 2017, 1:37 PM · InternetArchiveBot

Jun 5 2017

Green_Cardamom added a comment to T166928: Blank archiveurl/archivedate/url arguments in CS1|2 templates.

You're right I assumed that was VE,

Jun 5 2017, 1:49 PM · VisualEditor

Jun 3 2017

Green_Cardamom placed T166928: Blank archiveurl/archivedate/url arguments in CS1|2 templates up for grabs.
Jun 3 2017, 2:29 AM · VisualEditor
Green_Cardamom updated the task description for T166928: Blank archiveurl/archivedate/url arguments in CS1|2 templates.
Jun 3 2017, 2:25 AM · VisualEditor
Restricted Application assigned T166928: Blank archiveurl/archivedate/url arguments in CS1|2 templates to Cyberpower678.
Jun 3 2017, 2:22 AM · VisualEditor

Jun 1 2017

Green_Cardamom added a comment to T166791: Double {{webarchive}}.

In the first diff, it added the same {{webarchive}} twice, even though the source URLs are different.

Jun 1 2017, 4:25 PM · InternetArchiveBot (v1.4)
Restricted Application assigned T166792: Archive.is date format to Cyberpower678.
Jun 1 2017, 2:49 PM · InternetArchiveBot (v1.4)
Restricted Application assigned T166791: Double {{webarchive}} to Cyberpower678.
Jun 1 2017, 2:31 PM · InternetArchiveBot (v1.4)

May 26 2017

Green_Cardamom added a comment to T165504: Double wayback URL.

It's definitely a lot fewer - between the time of the fix and May 22 this is the only occurrence.

May 26 2017, 3:13 PM · InternetArchiveBot (v1.3)
Green_Cardamom reopened T165504: Double wayback URL as "Open".
May 26 2017, 2:29 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T165504: Double wayback URL.

It appears in 1.3.2 on May 21

May 26 2017, 2:29 PM · InternetArchiveBot (v1.3)

May 23 2017

Green_Cardamom added a comment to T165827: Truncated webcite URL.

I processed about 8000 webcite links randomly .. rechecking the API with the wikitext .. and found they mismatch in about 5% of cases, most of the time due to being truncated at the &. This is not an easy problem because the WebCite API is slow, and there are so many WebCite URLs in the database and wikitext. But I think this should be fixed somehow.

May 23 2017, 1:46 PM · InternetArchiveBot
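
To show how a URL can end up truncated at the &, here is a small Python sketch. It assumes the query-style WebCite form (`query?url=...`) with a hypothetical example.com target; `embedded_url` is an illustrative helper, not part of IABot or the WebCite API.

```python
from urllib.parse import parse_qs, quote, urlsplit


def embedded_url(webcite_url: str) -> str:
    """Return the value of the url= parameter as a standard query parser sees it."""
    query = urlsplit(webcite_url).query
    return parse_qs(query)["url"][0]


# Hypothetical source URL whose own query string contains an ampersand.
target = "http://example.com/page?id=1&lang=en"

unencoded = "http://www.webcitation.org/query?url=" + target
encoded = "http://www.webcitation.org/query?url=" + quote(target, safe="")

print(embedded_url(unencoded))  # stops at the first &, losing lang=en
print(embedded_url(encoded))    # the full source URL is recovered
```

When the embedded URL is not percent-encoded, everything after its first & is read as a separate parameter of the outer URL, which matches the kind of mismatch described in the comment above.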

May 22 2017

Green_Cardamom added a comment to T166008: Missing {{dead link}} tags.

It fixed 6 out of 8

May 22 2017, 2:02 PM · InternetArchiveBot (v1.3)

May 21 2017

Green_Cardamom closed T162722: Converting + to %20 as Resolved.
May 21 2017, 7:43 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T162722: Converting + to %20 .

Number 3 is a variant of 2. When bare links are converted to Wayback links they get sanitized. If the Wayback link stops working in the future, one needs to extract the original URL from the Wayback URL to search other services, and since it was previously sanitized it may not be found at the new service. This is mostly true for archive.is, since they crawled Wikipedia, saving URLs as they were found at the time of the crawl.

May 21 2017, 7:43 PM · InternetArchiveBot (v1.3)
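
Extracting the original URL from a Wayback URL, as mentioned in the comment above, might look like the following Python sketch. The regular expression, the helper name `split_wayback`, and the example.com URLs are assumptions for illustration; real Wayback URLs have more variants than this pattern covers.

```python
import re
from typing import Optional, Tuple

# Assumed layout: web.archive.org/web/<timestamp>[<2-letter modifier>_]/<source URL>
WAYBACK_RE = re.compile(
    r"^https?://web\.archive\.org/web/(\d{1,14})(?:[a-z]{2}_)?/(.+)$"
)


def split_wayback(url: str) -> Optional[Tuple[str, str]]:
    """Return (timestamp, source URL) exactly as embedded, or None if no match."""
    match = WAYBACK_RE.match(url)
    if match is None:
        return None
    return match.group(1), match.group(2)


as_found = "https://web.archive.org/web/20170521194300/http://example.com/search?q=a+b"
sanitized = "https://web.archive.org/web/20170521194300/http://example.com/search?q=a%20b"

print(split_wayback(as_found))
print(split_wayback(sanitized))
```

The two results differ only in the + versus %20, which is the point: a service that stored the URL as it appeared in the wikitext would be searched with a different string once the link has been sanitized.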
Green_Cardamom reopened T162722: Converting + to %20 as "Open".
May 21 2017, 3:27 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T162722: Converting + to %20 .

Sample ongoing bot wars over the %20/+ in query string.. not a complete list

May 21 2017, 3:27 PM · InternetArchiveBot (v1.3)
Restricted Application assigned T166008: Missing {{dead link}} tags to Cyberpower678.
May 21 2017, 2:31 PM · InternetArchiveBot (v1.3)

May 20 2017

Green_Cardamom closed T165912: GreenC bot edit war: duplicate {{webarchive}} as Invalid.
May 20 2017, 5:53 PM · InternetArchiveBot
Green_Cardamom added a comment to T165912: GreenC bot edit war: duplicate {{webarchive}}.

Actually forget this.. found other edit warring cases and will open a new ticket later.

May 20 2017, 5:53 PM · InternetArchiveBot
Restricted Application assigned T165912: GreenC bot edit war: duplicate {{webarchive}} to Cyberpower678.
May 20 2017, 5:23 PM · InternetArchiveBot
Green_Cardamom closed T165827: Truncated webcite URL as Resolved.
May 20 2017, 1:56 PM · InternetArchiveBot
Green_Cardamom added a comment to T165827: Truncated webcite URL.

Ok .. found it manually by random check. What I'll do is verify them all in the next batch and see how many if any show up and determine by those numbers.

May 20 2017, 1:55 PM · InternetArchiveBot
Green_Cardamom added a comment to T165827: Truncated webcite URL.

Ok got it re: how to test for bug in code vs database.

May 20 2017, 1:48 PM · InternetArchiveBot
Green_Cardamom reopened T165827: Truncated webcite URL as "Open".
May 20 2017, 1:37 PM · InternetArchiveBot
Green_Cardamom added a comment to T165827: Truncated webcite URL.

Is there a way to tell when the URL was added to the database? I assumed it was just added by IABot but from what you're saying it was already in the database. This would help before reporting problems to determine if the link already existed in the database.

May 20 2017, 1:36 PM · InternetArchiveBot
Green_Cardamom closed T165826: bot:unknown in conversion from webarchive to citeweb as Invalid.
May 20 2017, 1:20 PM · InternetArchiveBot
Green_Cardamom added a comment to T165826: bot:unknown in conversion from webarchive to citeweb.

I see ok.

May 20 2017, 1:20 PM · InternetArchiveBot
Green_Cardamom added a comment to T165827: Truncated webcite URL.

I checked the WebCite API results and it looks correct,

May 20 2017, 2:38 AM · InternetArchiveBot
Restricted Application assigned T165827: Truncated webcite URL to Cyberpower678.
May 20 2017, 2:36 AM · InternetArchiveBot
Restricted Application assigned T165826: bot:unknown in conversion from webarchive to citeweb to Cyberpower678.
May 20 2017, 2:01 AM · InternetArchiveBot

May 16 2017

Green_Cardamom added a comment to T164048: Revert 225 bot edits adding https:///.

Fix done on svwiki.

May 16 2017, 10:53 PM · InternetArchiveBot (v1.3), Internet-Archive
Restricted Application assigned T165505: Adding |#= to cites to Cyberpower678.
May 16 2017, 3:32 PM · InternetArchiveBot
Restricted Application assigned T165504: Double wayback URL to Cyberpower678.
May 16 2017, 3:15 PM · InternetArchiveBot (v1.3)

May 13 2017

Green_Cardamom added a comment to T165164: IABot replacing plus sign with space (and thus breaking archiveurls).

WaybackMedic is fixing in the wikitext so no permanent damage. It's running a week or two behind IABot.

May 13 2017, 5:11 PM · InternetArchiveBot

May 12 2017

Green_Cardamom added a comment to T165153: arkivurl= should contain actual URL, not a Wikimedia-related test URL.

There are no other articles containing the link.

May 12 2017, 1:44 PM · InternetArchiveBot (v1.3), Internet-Archive
Restricted Application assigned T165156: _embed in wayback URL to Cyberpower678.
May 12 2017, 1:10 PM · InternetArchiveBot (v1.3), Internet-Archive
Restricted Application assigned T165155: Recursive cite to Cyberpower678.
May 12 2017, 12:56 PM · InternetArchiveBot

May 11 2017

Restricted Application assigned T165004: webcache (Google) and https to Cyberpower678.
May 11 2017, 1:30 AM · InternetArchiveBot (v1.3)

May 10 2017

Green_Cardamom added a comment to T164969: Bare links should not be archived during "Add archives to all URLs".

This is a design issue how to make the defaults. Do you want to default to method A or method B? It actually is confusing:

May 10 2017, 6:35 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T164969: Bare links should not be archived during "Add archives to all URLs".

It's the same issue regardless. Why would anyone change a working live external link to an archive link? It's going to be the same problem no matter the language.

May 10 2017, 6:23 PM · InternetArchiveBot (v1.3)
Restricted Application assigned T164972: Interface link back to Wikipedia to Cyberpower678.
May 10 2017, 6:21 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T164969: Bare links should not be archived during "Add archives to all URLs".

You can click the other box to restrict to references but that should be the default because users are confused they need to check both boxes. There is rarely a case where you wouldn't click both boxes. The second box should be the opposite, to allow adding to all links not just references.

May 10 2017, 6:18 PM · InternetArchiveBot (v1.3)
Restricted Application assigned T164969: Bare links should not be archived during "Add archives to all URLs" to Cyberpower678.
May 10 2017, 6:12 PM · InternetArchiveBot (v1.3)

May 9 2017

Restricted Application assigned T164874: Looping edits to Cyberpower678.
May 9 2017, 9:07 PM · InternetArchiveBot (v1.3)
Restricted Application assigned T164872: Converting ':' to '%3A' to Cyberpower678.
May 9 2017, 9:04 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T162722: Converting + to %20 .

Is it going to modify the original URL in the wikitext, as in the |url= field? That is the URL used to create snapshots at other archive providers and to find snapshots there. If the snapshot was created years ago and the source URL is later modified, it won't be possible to find the snapshot (in cases where sanitization occurs).

May 9 2017, 8:58 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T164839: Keep anchor.

It looks like the anchor is being stripped by the API which makes preserving more complex. Medic has the same issue.

May 9 2017, 5:46 PM · InternetArchiveBot (v1.3)
Restricted Application assigned T164839: Keep anchor to Cyberpower678.
May 9 2017, 2:33 PM · InternetArchiveBot (v1.3)

May 8 2017

Green_Cardamom added a comment to T162722: Converting + to %20 .

That's a good solution.

May 8 2017, 6:45 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T164778: Converting & to '&amp;'.

WaybackMedic has been fixing these in the wikitext as of a few weeks ago. A handful show up in each batch run.

May 8 2017, 6:34 PM · InternetArchiveBot (v1.3)
Restricted Application assigned T164778: Converting & to '&amp;' to Cyberpower678.
May 8 2017, 6:27 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T162722: Converting + to %20 .

There are 20+ other archive services. In my experience they are not as flexible as Wayback when it comes to interchangeable interpretation of encoding.

May 8 2017, 6:11 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T162722: Converting + to %20 .

Seems straightforward to create rawurldecode-v2 that traps certain edge cases (+ in query) and passes the rest through rawurldecode(). This gives you total control over edge case problems that come up.

May 8 2017, 5:50 PM · InternetArchiveBot (v1.3)
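
The "rawurldecode-v2" idea in the comment above could be sketched as follows. This is a Python illustration, not IABot's PHP code: `rawurldecode_v2` is a hypothetical name, `urllib.parse.unquote` stands in for PHP's `rawurldecode()`, and the choice of which query strings to trap (those containing +, %20, or %2B) is an assumption.

```python
from urllib.parse import unquote


def rawurldecode_v2(url: str) -> str:
    """Decode percent-escapes, but leave a query string untouched when it
    contains a plus sign or an escaped space/plus (the trapped edge case)."""
    base, q_sep, rest = url.partition("?")
    query, f_sep, fragment = rest.partition("#")

    lowered = query.lower()
    if "+" in query or "%20" in query or "%2b" in lowered:
        decoded_query = query            # edge case: pass through verbatim
    else:
        decoded_query = unquote(query)   # everything else: ordinary decode

    return unquote(base) + q_sep + decoded_query + f_sep + unquote(fragment)


print(rawurldecode_v2("http://example.com/a%3Ab?q=x+y%26z"))
print(rawurldecode_v2("http://example.com/a%3Ab?q=x%3Ay"))
```

Keeping the trap in one wrapper means further edge cases can be added in a single place as they come up, while every other URL still goes through the ordinary decode.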
Green_Cardamom added a comment to T162722: Converting + to %20 .

Why does it need to sanitize the + and %20? These are problematic.

May 8 2017, 5:12 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T162722: Converting + to %20 .

Apparently it is sometimes significant on the Wayback machine (and always on archive.is). The problem might be Wayback Machine treats the entire URL as a path .. no query. Or maybe it depends, an older version of Wayback software did it that way and newer versions are able to differentiate but depends on how it got added to their database.

May 8 2017, 5:01 AM · InternetArchiveBot (v1.3)
Green_Cardamom reopened T162722: Converting + to %20 as "Open".
May 8 2017, 4:39 AM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T162722: Converting + to %20 .

Still happening v.1.3

May 8 2017, 4:39 AM · InternetArchiveBot (v1.3)

May 7 2017

Restricted Application assigned T164673: Ref mangled with certain garbage data input to Cyberpower678.
May 7 2017, 3:39 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T164172: Conversion of archive,is from short to long.

This one returns 200 even though it is unavailable

May 7 2017, 1:23 AM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T164172: Conversion of archive,is from short to long.

Sometimes other codes. For example this URL returns 503

May 7 2017, 1:20 AM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T164172: Conversion of archive,is from short to long.

A strategy I use to reduce bandwidth is pipe the request through head like this

May 7 2017, 1:07 AM · InternetArchiveBot (v1.3)

May 5 2017

Green_Cardamom added a comment to T164172: Conversion of archive,is from short to long.

I know, this stuff is really complex. Don't expect error free :) Things are better now than before and keep improving is all that matters.

May 5 2017, 5:47 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T164172: Conversion of archive,is from short to long.

BTW, other than this I can't find any other problem. It's difficult to test, but the few tests I ran looked good.

May 5 2017, 5:21 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T164172: Conversion of archive,is from short to long.

I know, which is why I was curious that it didn't work in that case. Was it because of archive.fo instead of .is?

May 5 2017, 5:16 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T164172: Conversion of archive,is from short to long.

Yes it works as is. But every other service uses the 14-digit timestamp. The archive.is system is strange (a European-style syntax) and a point of complication for future bots and confusion among editors. Having a consistent format is a good idea considering it's so easy to do.

May 5 2017, 4:53 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T164172: Conversion of archive,is from short to long.

When it converts to long form it doesn't strip the "." and "-" from the 14-digit date

May 5 2017, 4:14 PM · InternetArchiveBot (v1.3)
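
Stripping the "." and "-" to get a plain 14-digit timestamp, as the comment above asks for, is a small transformation. The Python sketch below assumes the punctuated layout is YYYY.MM.DD-hhmmss; the function name and sample value are hypothetical.

```python
import re

# Assumed archive.is long-form date layout: YYYY.MM.DD-hhmmss
ARCHIVE_IS_DATE = re.compile(r"^(\d{4})\.(\d{2})\.(\d{2})-(\d{6})$")


def to_14_digit(stamp: str) -> str:
    """Convert a punctuated archive.is date into a 14-digit timestamp."""
    match = ARCHIVE_IS_DATE.match(stamp)
    if match is None:
        raise ValueError(f"unrecognized archive.is date: {stamp!r}")
    return "".join(match.groups())


print(to_14_digit("2017.05.05-161400"))  # 20170505161400
```

Rejecting anything that does not match the expected layout, rather than blindly deleting punctuation, avoids producing a timestamp with the wrong number of digits.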
Green_Cardamom added a comment to T164172: Conversion of archive,is from short to long.

That's a good idea. They'll get re-added into the DB over time. Medic is adding them into the wikitext so bases are covered. Before that, though, give me a little time to test the current IABot code -- I need to find some test cases that are problematic. Make sure it's working right before starting over.

May 5 2017, 3:02 PM · InternetArchiveBot (v1.3)

May 4 2017

Green_Cardamom added a comment to T164172: Conversion of archive,is from short to long.

The API will make a huge difference.

May 4 2017, 9:09 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T164172: Conversion of archive,is from short to long.

Good news about the tool; I'll check it out and test. It's still too many for manual work. Medic has edited about 1000 articles a day for the past 3 months (compared to about 4000 for IABot), covering all sorts of problems. That's over 100,000 link changes of encoding, URL, date, deletions, etc.; that data needs to be fed into the database somehow. Medic is fixing tons of problems and it's not getting reflected in the database, so the errors keep getting replicated back into the wikitext, further amplified by xwiki and by one bad link in the database being used in many articles.

May 4 2017, 8:46 PM · InternetArchiveBot (v1.3)
Green_Cardamom added a comment to T164172: Conversion of archive,is from short to long.

Sanitizer shouldn't remove fragments unless Wayback.

May 4 2017, 8:20 PM · InternetArchiveBot (v1.3)