
Re-implement functionality of old CommonsDelinker bot
Closed, Resolved · Public

Description

The CommonsDelinker (https://commons.wikimedia.org/wiki/User:CommonsDelinker) delinks images from Wikimedia projects when these get deleted on Commons. No development has been happening on it and it's barely maintained. Wikidata is not supported yet; for this we have T66794. Lots of other parts are hacked-up code, outdated code (pywikibot compat instead of core) or just plain broken. Maybe it's better to implement the functionality as a MediaWiki extension and completely drop the Delinker.

Basic workflow:

  • File gets deleted on Commons
  • Delinker goes over all usage and removes the file usage

Of course you could also add new use cases like restoring images when a file gets restored:

  • File gets deleted on Commons
  • Delinker goes over all usage and removes the file usage
  • File gets restored
  • Delinker reverts all the delinks of the file (probably only where delinker made the last edit)
  • Creating a log of delinks/restores
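
The two workflows above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the `Page` and `Delinker` classes, the revision tuples and the exact-match `[[File:...]]` marker are hypothetical and are not taken from the old bot or from any rewrite.

```python
from dataclasses import dataclass, field

BOT = "CommonsDelinker"  # bot account name from the task description


@dataclass
class Page:
    """Hypothetical in-memory page: a list of (editor, text) revisions."""
    revisions: list

    @property
    def text(self):
        return self.revisions[-1][1]

    @property
    def last_editor(self):
        return self.revisions[-1][0]


@dataclass
class Delinker:
    pages: dict                                # title -> Page
    log: list = field(default_factory=list)    # (action, filename, title)

    def on_delete(self, filename):
        """File deleted: remove its usage from every page that embeds it."""
        marker = f"[[File:{filename}]]"        # simplified: exact match only
        for title, page in self.pages.items():
            if marker in page.text:
                page.revisions.append((BOT, page.text.replace(marker, "")))
                self.log.append(("delink", filename, title))

    def on_restore(self, filename):
        """File restored: undo delinks, but only where the bot made the last edit."""
        delinked = [t for (action, f, t) in self.log
                    if action == "delink" and f == filename]
        for title in delinked:
            page = self.pages[title]
            if page.last_editor == BOT and len(page.revisions) >= 2:
                # Re-save the text from just before the bot's delink edit.
                page.revisions.append((BOT, page.revisions[-2][1]))
                self.log.append(("restore", filename, title))
```

Pages that somebody else edited after the delink are skipped on restore, which matches the "only where delinker made the last edit" caveat; the log doubles as the record of delinks and restores.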

Other nice functionality could be the notification of projects that files are up for deletion or the ability to do a global replace of an image.

Event Timeline

Multichill updated the task description.
Multichill raised the priority of this task from to Lowest.
Glaisher added a subscriber: Glaisher.

Rather than working only for Commons/global files, it'd be useful to have this work for delinking local files locally as well.

Steinsplitter set Security to None. Jan 11 2015, 5:33 PM
Steinsplitter added subscribers: Didym, revi, Krd.
hoo added a subscriber: hoo. Jan 11 2015, 5:36 PM
hoo added a subscriber: Addshore.
Ricordisamoa updated the task description. Jan 11 2015, 6:13 PM
Fae awarded a token. Jan 11 2015, 6:17 PM
Fae added a subscriber: Fae.
Alan added a subscriber: Alan. Jan 12 2015, 2:46 PM

I don't think extensions should be editing pages, especially in cases like this one where the edits being made depend heavily or entirely on parts of the page's old content. I think we'd be better off with a rewrite of the bot.

Addshore added a comment. Edited Jan 12 2015, 11:16 PM

One way this could possibly happen (outside of creating some random system user) would be that when an admin on Commons deletes a file they have the option to delink all usages of the file (as the user that deleted the file).
This would also allow the extension to easily work in the way Glaisher mentioned above.

One way this could possibly happen (outside of creating some random system user) would be that when an admin on Commons deletes a file they have the option to delink all usages of the file (as the user that deleted the file).

{{Strong oppose}} Wikis would be flooded with thousands of edits by random users. Having a single account that does the whole job makes checking their edits waaaay easier.

Well in theory these edits would not need to be checked as they are made by the extension, and they should be able to be marked as bot edits and basically ignored.

If there is a need to check the edits, maybe because we can't ensure they would be made correctly, then this probably shouldn't be an extension.

Well in theory these edits would not need to be checked as they are made by the extension, and they should be able to be marked as bot edits and basically ignored.

They are not ignored, for example, on the Italian Wikipedia: https://it.wikipedia.org/wiki/Wikipedia:Bot/Autorizzazioni/Archivio/2010#CommonsDelinker

If there is a need to check the edits, maybe because we can't ensure they would be made correctly, then this probably shouldn't be an extension.

This is one of the reasons I wouldn't be very comfortable with a CommonsDelinker extension.

MichaelMaggs added a subscriber: MichaelMaggs.

If there is disagreement about creating an extension, at least the bot needs a rewrite. I have the feeling that the bot will die soon (big error log; I restarted the bot a dozen times in the last weeks *sigh*).

Magnus added a subscriber: Magnus. Jan 14 2015, 1:42 PM

I'll have a go at a rewrite on labs tools. Thoughts:

  • Probably PHP, unless anyone objects :-) Any good, current Wiki editing libraries out there, or do I have to roll my own?
  • I couldn't find any examples of the "warning" feature. Is that still used?
  • This seems like an opportunity to do "local wiki rescuing" of files to be deleted. Should this be bundled? Would it get too complicated (e.g. no fair use on dewiki etc.)?
  • Anyone interested in maintaining is welcome!

I created User:CommonsDelinker/Rewrite to track information and describe the tool.

I'll have a go at a rewrite on labs tools. Thoughts:

  • Probably PHP, unless anyone objects :-) Any good, current Wiki editing libraries out there, or do I have to roll my own?

No objections. Happy to hear that someone is thinking about rewriting delinker :-)

The only stable PHP bot framework I know is https://en.wikipedia.org/wiki/Wikipedia:Peachy - not sure if Peachy will be maintained forever :/

Addshore added a comment. Edited Jan 14 2015, 4:46 PM

@Magnus if you're using PHP, check out https://github.com/addwiki, specifically the mediawiki-api lib.
If there is anything missing there that you want to be able to do, just drop me an issue or an email and I'll implement it, or submit a pull request!
Peachy is messy, but could work.

I'll have a go at a rewrite on labs tools. Thoughts:

Are you sure you want to load this burden onto yourself?

  • Anyone interested in maintaining is welcome!

We already tried that. If you write something new, nobody will help you and you're on your own. Don't say I didn't warn you.
Maintenance is exactly the reason why I'm suggesting to make it into an extension.

While I really don't desperately need more workload, I am running >100 tools at the moment; after the initial setup, one more hopefully won't make much of a difference. The Foundation won't pick this one up with a long stick, but they are preparing to take son-of-WDQ off my hands, which should give me some wiggle room. Worst case, we'll end up with a slightly more current, unmaintained codebase...

Finding volunteers to help with long-term maintenance is always a problem, especially with PHP. If you were to use the pwb framework it would probably be easier in the long term to get people involved, given the increasing popularity of Python. But of course better PHP than no tool at all, and the person doing the coding gets to choose.

I don't think extensions should be editing pages, especially in cases like this one where the edits being made depend heavily or entirely on parts of the page's old content. I think we'd be better off with a rewrite of the bot.

I think this ship has sailed. MassMessage does it, Wikibase does it, Flow did it (but won't going forward), and maintenance scripts do it all the time

Anyways, count me in the "implement it as an extension" camp, and I'm willing to help with CR and get it through the deployment process, but lack time to actually work on it.

Also https://gerrit.wikimedia.org/r/#/c/151371/ was my attempt to convert it from compat to core, except I never really figured out how to run it so I could test my changes. Anyone is free to pick that up :)

Legoktm raised the priority of this task from Lowest to High. Jan 14 2015, 10:13 PM

Update: Experimental code can detect newly deleted files, find pages to unlink from, write to tool database. And, Wikidata unlink:
https://www.wikidata.org/wiki/Special:Contributions/CommonsDelinquent
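
For readers wondering how "detect newly deleted files" and "find pages to unlink from" could look against the public MediaWiki API, here is a rough sketch that only builds the query URLs. It is an assumption for illustration, not how the experimental bot works (it may well read the database replicas instead). The helper names are hypothetical; `list=logevents` and `prop=globalusage` are existing API modules, but whether `globalusage` still returns rows for an already deleted title is not verified here.

```python
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def deletion_log_url(limit=50):
    """URL listing recent deletion log entries in the File namespace (6)."""
    params = {
        "action": "query",
        "list": "logevents",
        "letype": "delete",
        "lenamespace": 6,
        "lelimit": limit,
        "format": "json",
    }
    return f"{COMMONS_API}?{urlencode(params)}"


def global_usage_url(filename, limit=50):
    """URL asking which pages on other wikis use File:<filename>."""
    params = {
        "action": "query",
        "prop": "globalusage",
        "titles": f"File:{filename}",
        "gulimit": limit,
        "format": "json",
    }
    return f"{COMMONS_API}?{urlencode(params)}"
```

Fetching and paging through the results (continuation parameters) is left out on purpose.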

  • Probably PHP, unless anyone objects :-) Any good, current Wiki editing libraries out there, or do I have to roll my own?

I object :-) The existing and *cough* working *cough* codebase is written in Python, and Pywikibot is by far the most complete and up-to-date framework for wiki bots.

  • This seems like an opportunity to do "local wiki rescuing" of files to be deleted. Should this be bundled? Would it get too complicated (e.g. no fair use on dewiki etc.)?

That's an interesting feature, but it should be enabled on a wiki-by-wiki basis, and it seems rather low-priority.

  • Anyone interested in maintaining is welcome!

Maintaining is quite a big word, but I think the Pywikibot team could have a look at the code once in a while.

Well, the current Delinker is written in Python, and this bug exists because no one is maintaining it, so Python is actually the one language we know for a fact will not be maintained ;-)

Anyway, I've started in PHP, and it looks good so far. The addwiki framework works well for my purposes so far.

If someone wants to seriously maintain the Python Delinker, please let me know so I don't waste more time writing a replacement that will never be used, and close this bug. Otherwise...

Steinsplitter added a comment. Edited Jan 15 2015, 12:44 PM

Maintaining is quite a big word, but I think the Pywikibot team could have a look at the code once in a while.

Really? ... I have been trying for a year to find a maintainer (who is familiar with Pywikibot) but without success. -sigh-

To be honest, I have spent a lot of time in the last year trying to find someone from the Pywikibot team..... I also spent a lot of time trying to find someone to move delinker to toolslabs.

Steinsplitter added a comment. Edited Jan 15 2015, 12:50 PM

If someone wants to seriously maintain the Python Delinker, please let me know so I don't waste more time writing a replacement that will never be used, and close this bug. Otherwise...

Agreed. If someone would like to seriously maintain the Python Delinker, let us know now. If no dev wants to maintain the Python Delinker, then I'll go ahead and grant Magnus access to delinker on toolslabs.

Fae added a comment. Jan 15 2015, 1:50 PM

Steinsplitter, why not go ahead. We all rely on Magnus' best judgement anyway.

Tangent:
It is interesting seeing varied points of view about our volunteer programming community here (this is "we"). I have played with various stuff; at this moment I use pywikibot and Python to fiddle around and occasionally do something interesting. However, I have no strong views on whether relying on Python-based services is better or worse in terms of long-term maintenance than knocking them out in PHP; they are just scripting languages and our community should remain as agnostic as possible.

I suggest we avoid imagining independent unpaid volunteers behave like development "teams", as we mostly act alone and asynchronously. We sometimes act collegiately but avoid stepping on each other's toes, and our motivations vary wildly, as do our free time, skills and temperament. This is more tricky than herding cats, and any would-be shepherd would be making a mistake to think we are even the same animal. I'll ponder where to raise this with the wider community, as I see little in the way of WMF-sourced funding being put aside apart from non-global chapter-type events, with comparatively little measurable impact on maintainability of volunteer-led services or repeatability of volunteer-created innovation beyond the budget year.

In T86483#979433, @Fae wrote:

Steinsplitter, why not go ahead. We all rely on Magnus' best judgement anyway.

Agree :). Added Magnus to tools-delinker... so he can access the delinker passwd.

My bot is now in manual testing (no cronjob), with initial interface:
https://tools.wmflabs.org/commons-delinquent/

Meanwhile, does anyone know about this bot:
https://en.wikipedia.org/wiki/User:Filedelinkerbot

Yay bot wars!

Meanwhile, does anyone know about this bot:
https://en.wikipedia.org/wiki/User:Filedelinkerbot

It is a bot written specifically for the English Wikipedia; it exists on dewiki too. I talked with @Krd a while ago and he told me that the bot is written for dewiki and enwiki and that he does not want to operate it globally.

Steinsplitter added a comment. Edited Jan 16 2015, 7:04 PM

My bot is now in manual testing (no cronjob), with initial interface:

Once your bot handles the most important things, can you please move it to the CommonsDelinker account? (The CommonsDelinker account is already well known and flagged as a bot on a great many wikis; the password should be in the folder.)

Thanks again for working on the bot!!

@Magnus Is it possible to extend the functionality for local (non-Commons) files?

Krd removed a subscriber: Krd. Jan 17 2015, 9:56 AM
Multichill added a subscriber: Multichill.

Delinked myself from the tools account. Good luck.

Ahonc added a subscriber: Ahonc. Jan 17 2015, 4:26 PM

I suggest that an option is also needed for temporary deletion of an image (for example for history merging or splitting). In such cases Delinker should not delink images.

I suggest that an option is also needed for temporary deletion of an image (for example for history merging or splitting). In such cases Delinker should not delink images.

Waiting 4 (?) mins before delinking should be enough.
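
A grace period like the one suggested here could be sketched as a simple filter. The function name and the input shapes (a dict of deletion times and a set of restored names) are hypothetical; only the 4-minute figure comes from the discussion.

```python
from datetime import datetime, timedelta

GRACE = timedelta(minutes=4)  # delay suggested in the thread


def ready_to_delink(deletions, restored, now):
    """Return files deleted at least GRACE ago and not restored since.

    deletions: dict of filename -> deletion time (datetime)
    restored:  set of filenames that exist again (e.g. after a history merge)
    """
    return sorted(
        name for name, deleted_at in deletions.items()
        if now - deleted_at >= GRACE and name not in restored
    )
```

A file that is deleted and restored within the window, as in a history merge or split, never reaches the delinking step.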

Alan added a comment. Edited Jan 17 2015, 5:03 PM

Bug :-(

https://he.wikipedia.org/wiki/%D7%A9%D7%99%D7%97%D7%AA_%D7%9E%D7%A9%D7%AA%D7%9E%D7%A9:CommonsDelinquent

Hi, edits made by this bot on hewiki remove only the image name and not the whole image link (e.g. in [[File:IMAGENAME|thumb|image desc]] only IMAGENAME is removed, instead of the whole link including the options and description). For example: [1][2]. Please ask for the account to be unblocked once the software is fixed (you can do it with {{בקשת שחרור חסימה}}). ערן - talk 16:51, 17 January 2015 (IST)

Example: https://he.wikipedia.org/w/index.php?title=%D7%A4%D7%95%D7%A8%D7%98%D7%9C:%D7%90%D7%99%D7%A9%D7%99%D7%9D/%D7%94%D7%99%D7%95%D7%9D_%D7%91%D7%94%D7%99%D7%A1%D7%98%D7%95%D7%A8%D7%99%D7%94/14_%D7%91%D7%9E%D7%A8%D7%A5&diff=prev&oldid=16448970
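
To make the reported difference concrete, here is a minimal sketch of removing the whole file link rather than just the name. It is not the bot's actual fix: `remove_file_link` is a hypothetical helper, it only handles the `[[File:...]]` form (no galleries, templates or infobox parameters), and it matches case-insensitively, which is looser than MediaWiki's own title rules. Localized namespace names would have to be passed in via `namespaces`.

```python
import re


def remove_file_link(wikitext, filename, namespaces=("File", "Image")):
    """Remove every [[File:<filename>...]] link, including options and caption."""
    ns = "|".join(re.escape(n) for n in namespaces)
    # Match the link opening up to the end of the file name; the name must be
    # followed by "|" or "]]" so that "A.jpg" does not match "A.jpg.png".
    start_re = re.compile(
        r"\[\[\s*(?:%s)\s*:\s*%s\s*(?=\||\]\])" % (ns, re.escape(filename)),
        re.IGNORECASE,
    )
    out, pos = [], 0
    while True:
        m = start_re.search(wikitext, pos)
        if not m:
            out.append(wikitext[pos:])
            return "".join(out)
        # Walk forward counting [[ and ]] so that nested links in the
        # caption do not end the file link early.
        depth, i = 0, m.start()
        while i < len(wikitext):
            if wikitext.startswith("[[", i):
                depth += 1
                i += 2
            elif wikitext.startswith("]]", i):
                depth -= 1
                i += 2
                if depth == 0:
                    break
            else:
                i += 1
        if depth != 0:  # unbalanced brackets: leave the rest untouched
            out.append(wikitext[pos:])
            return "".join(out)
        out.append(wikitext[pos:m.start()])
        pos = i
```

Counting brackets instead of using a single regular expression is a deliberate choice: captions often contain their own `[[links]]`, and a naive pattern would stop at the first `]]`.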

MichaelMaggs renamed this task from Implement delinker functionality as a MediaWiki extension to Re-implement functionality of old CommonsDelinker bot. Jan 17 2015, 6:49 PM

Should we merge T66794 into this?

Increased delay time to 4 min after deletion, and fixed the he.wp bug (I think).

I have switched my rewritten bot to the CommonsDelinker account. I am also running the new bot continuously now.

The new code doesn't do file replacement or category replacement yet, so I suggest we keep the old bot running until I get that working.

Alternatively, if someone from the "old crew" can turn the delinking in the old bot off but keep the replacement part running, that would even be better.

I have beefed up the new bot interface a little: https://tools.wmflabs.org/commons-delinquent/

Looks lovely Magnus!!!!!

@Magnus: Lovely! Your work is highly appreciated :)

The new code doesn't do file replacement or category replacement yet, so I suggest we keep the old bot running until I get that working.

We have a bot for category replacement. So you only need to implement the global file replacement :)

Yes, thanks Magnus. You've made a lot of people very happy!

File replacement in the new code has been tested and appears to work.

I suggest turning the old Delinker off now. It might also help to prominently link from the old to the new interface.

Is there an urgent need for the old database to be imported into the new one? I haven't looked at the old DB, so no idea how complicated/lossy that would be. As far as I'm concerned, we can leave the old interface running to allow queries on the old DB.

Current code has been pushed to bitbucket. Barring the unavoidable small bugs that are bound to crop up over time, I think this bug can be closed.

Finally, I am currently the only maintainer on the commons-delinquent tool. I'd be happy to add others, if only to restart the bot if need be and I'm unavailable.

I suggest turning the old Delinker off now. It might also help to prominently link from the old to the new interface.

I commented out (#) the crontab entries.

Is there an urgent need for the old database to be imported into the new one? I haven't looked at the old DB, so no idea how complicated/lossy that would be. As far as I'm concerned, we can leave the old interface running to allow queries on the old DB.

Not highly urgent.

Finally, I am currently the only maintainer on the commons-delinquent tool. I'd be happy to add others, if only to restart the bot if need be and I'm unavailable.

I changed the user pages on Commons, enwiki and Meta so that users can see that you are now the new maintainer :) (Can you please add me? toolslabs is not 100% stable, so sometimes the webservice etc. needs a restart.)

Fixed in common.js : https://commons.wikimedia.org/w/index.php?title=MediaWiki:Common.js&diff=147115387&oldid=146778868

Use https://bitbucket.org/magnusmanske/commons-delinquent/issues as the bug tracker?

@Magnus: A last question: https://commons.wikimedia.org/w/index.php?title=User:CommonsDelinker/commands&diff=147110803&oldid=147110544 CommonsDelinkerHelper is flagged as +sysop to remove requests. If I read your code correctly, CommonsDelinker is doing that. Should I ask to move the +sysop flag from CommonsDelinkerHelper to CommonsDelinker? Or should the page be protected using AbuseFilter?

Well, the page is protected, and CommonsDelinker can edit it, so I assume it is already +sysop. Is there a need for CommonsDelinkerHelper anymore?

Steinsplitter added a comment. Edited Jan 19 2015, 4:11 PM

Well, the page is protected, and CommonsDelinker can edit it, so I assume it is already +sysop. Is there a need for CommonsDelinkerHelper anymore?

CommonsDelinker can't edit it (the edit was made before I removed the crons). I posted a request at BN https://commons.wikimedia.org/w/index.php?title=Commons%3ABureaucrats%27_noticeboard&diff=147118501&oldid=147117849 to fix this.

Steinsplitter closed this task as Resolved. Jan 19 2015, 4:14 PM
Steinsplitter claimed this task.
Steinsplitter reassigned this task from Steinsplitter to Magnus.
Restricted Application added a subscriber: Matanya. Jul 8 2015, 2:07 PM