Software developer for hire
User Details
- User Since
- Oct 16 2014, 12:59 AM (599 w, 4 d)
- Availability
- Available
- IRC Nick
- hare
- LDAP User
- Harej
- MediaWiki User
- Harej
Mar 11 2026
I have resumed testing on Japanese Wikipedia. Let me know if anything goes wrong. https://meta.wikimedia.org/wiki/InternetArchiveBot#Starting_and_stopping_the_bot has instructions for stopping the bot. In Japanese: https://meta.wikimedia.org/wiki/InternetArchiveBot/ja#%E3%83%9C%E3%83%83%E3%83%88%E3%81%AE%E5%A7%8B%E5%8B%95%E3%83%BB%E5%81%9C%E6%AD%A2
Per https://ja.wikipedia.org/w/index.php?title=Template:Cite_web&diff=next&oldid=100897193 and the comment in T317246, I think this task can be closed.
Mar 4 2026
Dump has been downloaded. Thank you!
Mar 2 2026
My age public key is age1z938t9yu9secya4agylsfthg53pyyzumq39274pkpd0l0ypl0veq78g4l6
Feb 12 2026
Full database dump would be great if you have it.
Jan 26 2026
The redir1 redirect service stopped working long ago. I do not know why it stopped working.
Jan 18 2026
Dec 21 2025
Dec 13 2025
That isn't a problem with the data; it's a problem with the updater.
Dec 12 2025
It works! Thank you
Dec 8 2025
A Wikidata-only data.jnl file (gzip compressed) has been uploaded to my file server in 5 GB segments. It is available for download here: https://files.scatter.red/orb/2025/12/
Nov 2 2025
Jun 17 2025
While I got warnings, you are actually right that they are just warnings and it still works.
Jun 7 2025
May 10 2025
The Librarybase project is ongoing at https://librarybase.org. There is also a prototype database of citations appearing on Wikipedia at https://wikipediacitations.scatter.red. Otherwise I am not familiar with any progress on this kind of work, though it's something I am still interested in.
May 4 2025
Demo is done; this task can be closed at any time.
The demo has been done! This task can be closed any time.
May 3 2025
Apr 24 2025
Apr 22 2025
Apr 21 2025
I also added a function to purge expired cache at the beginning of the script call.
Added caching for:
- the Public Suffix List
- Individual lists of domains
- URLs added for a given combination of old revision ID and new revision ID
Whether a given URL matches is not cached, since that can change as the contents of the domain list change. But whether a URL appeared in a given revision should not change (unless there was a bug in the URL extraction mechanism).
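The caching described above can be sketched roughly as follows. This is a minimal illustration in Python, not the script's actual code: the `ExpiringCache` class, the `revision_pair_key` helper, the key format, and the TTL values are all assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ExpiringCache:
    """In-memory cache whose entries carry an expiry timestamp (hypothetical)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        # The clock is injectable so expiry can be exercised without sleeping.
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def purge_expired(self) -> int:
        """Drop every expired entry; intended to run once at script start."""
        now = self._clock()
        expired = [key for key, (expires_at, _) in self._entries.items()
                   if expires_at <= now]
        for key in expired:
            del self._entries[key]
        return len(expired)

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._entries[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if expires_at <= self._clock():
            # Expired but not yet purged: treat it as a miss.
            del self._entries[key]
            return None
        return value


def revision_pair_key(old_rev_id: int, new_rev_id: int) -> str:
    """Cache key for the URLs added between two revisions (assumed format)."""
    return f"added-urls:{old_rev_id}:{new_rev_id}"
```

In this sketch the Public Suffix List and the domain lists would get short TTLs, while the added URLs for a revision pair could be kept much longer, since that result should not change. Match results are deliberately never stored.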
Apr 15 2025
Mar 12 2025
Mar 7 2025
Mar 6 2025
Feb 10 2025
Dec 3 2024
My input and the resulting error
Nov 14 2024
I now get this error:
[6aab8836cf02dc6764ce8a29] 2024-11-14 19:19:54: Fatal exception of type "Lcobucci\JWT\Signer\Key\FileCouldNotBeRead"
Oct 25 2024
Oct 24 2024
As part of solving this, we can store revision comparison data in browser cache. Given a combination of revisions, the URLs added are stored, and then screened against whatever copy of the list is loaded.
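A rough sketch of that idea, in Python for illustration only: a plain dictionary stands in for the browser cache, and the function names and the simple suffix-based host matching are assumptions, not the script's real logic.

```python
from typing import Dict, Iterable, List, Set, Tuple
from urllib.parse import urlsplit

# Stand-in for the browser cache: (old_rev_id, new_rev_id) -> URLs added.
added_urls_cache: Dict[Tuple[int, int], List[str]] = {}


def remember_added_urls(old_rev_id: int, new_rev_id: int,
                        urls: Iterable[str]) -> None:
    """Store the URLs added between two revisions; this result is stable."""
    added_urls_cache[(old_rev_id, new_rev_id)] = list(urls)


def host_matches(host: str, listed_domains: Set[str]) -> bool:
    """True if the host or any parent domain is listed (simplified matching)."""
    labels = host.lower().split(".")
    return any(".".join(labels[i:]) in listed_domains
               for i in range(len(labels)))


def screen_cached_urls(old_rev_id: int, new_rev_id: int,
                       listed_domains: Set[str]) -> List[str]:
    """Screen cached URLs against whichever copy of the list is loaded now."""
    flagged = []
    for url in added_urls_cache.get((old_rev_id, new_rev_id), []):
        host = urlsplit(url).hostname
        if host and host_matches(host, listed_domains):
            flagged.append(url)
    return flagged
```

Because only the revision comparison is stored, reloading an updated domain list changes the screening result without requiring the comparison to be fetched again.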
Still an issue as of 1.11
This should be resolved with the post-Wikimania rewrite.
I have two servers with 384 GB of RAM each, plus a workstation with 256 GB of RAM that can be used for testing. (I have tested QLever successfully on it before.) One of the servers has a bit over 2 TB free; the other has over 7 TB free. (Capacity can be added with funding.) I think either or both could be used to run QLever. What I would want to know is:
Oct 11 2024
No longer occurs in the latest version of Citation Watchlist.
Alternative strategy to *not* do this actually gets better results.
Sep 19 2024
What I have learned, much to my delight, is that this API endpoint exists... for current revisions.
Aug 20 2024
Aug 13 2024
This task and T40265 should be merged, since they seem to describe the same underlying problem: if a character is treated as punctuation, it is not treated as part of the URL.
Aug 3 2024
Page seems to save fine after that, though now I can't test my function because doing so requires multiline input (that's a different problem).