Improving Magnus' tools (tracking)
Closed, DeclinedPublic

Description

Standardize deployment

Patch contributors want their changes to go live as soon as possible.

Apply consistent code conventions

Different styles may cause headache in the long run.

Write tests and turn CI on

Decreasing chances of accidentally breaking something would greatly encourage occasional contributors.

Remove dead code

To make it easier to see at a glance what is being run and what isn't.

Document how to set up a development environment

So that patch contributors can test their changes locally before submitting them for review.

Use detailed commit messages

Too many "misc"s, don't you think?

Split reusable components into standalone libraries

Not to reinvent the wheel!

Add internationalization (T110161)

Essential to reach a broader audience.

Do a security review

Many people rely on Magnus' tools.

Event Timeline

Ricordisamoa raised the priority of this task from to Needs Triage.
Ricordisamoa updated the task description.
Ricordisamoa added a project: Community-Tech.
Ricordisamoa subscribed.
DannyH renamed this task from How can we improve Magnus' tools? to Improving Magnus' tools (tracking). Oct 28 2015, 7:45 PM

I'm going bold and pinging @Magnus who might be interested.

Do we want to make moving some of his tools into production a goal to aspire to?

I'm going bold and pinging @Magnus who might be interested.

Hi @Ricordisamoa, if this is a project you'd like the Community Tech team to undertake, I suggest you add it to the Community Wishlist Survey as that's our main channel for taking on tasks for now. It ensures the community is on board and in approval of what we have on our roadmap. Thanks!

Do we want to make moving some of his tools into production a goal to aspire to?

Some of them might benefit from production-level resources, but no tool shall be 'taken away' from Magnus' ultimate control within this task.

In T115537#1813005, @NiharikaKohli wrote:

Hi @Ricordisamoa, if this is a project you'd like the Community Tech team to undertake, I suggest you add it to the Community Wishlist Survey as that's our main channel for taking on tasks for now. It ensures the community is on board and in approval of what we have on our roadmap. Thanks!

The scope of this task is very broad (I tagged it Tracking-Neverending on purpose), definitely too broad for it to be actionable as a whole: thus it is not listed in the Community Wishlist Survey. Nevertheless, I believe this is precisely the kind of task the Community Tech team — and, more generally, anyone involved in the poorly supported technical side — should be dealing with.
Thanks for your interest.

Finally, someone realized just how much shoddy code I deployed! :-)

Some thoughts on the individual points:

  • Standardize deployment: I do try to apply pull requests in as timely a fashion as I can. AFAICT, there are currently no pull requests pending on BitBucket.
  • Apply consistent code conventions: I do try to use a consistent style across my code, which is some K&R bastardisation with lots'o'spaces, but few comments ;-)
  • Write tests and turn CI on: In my space, tests seem to be more trouble than they are worth. Is the category tree correct? Well, it changes several times a day, so what is correct? For all but the most complex tools (e.g. CatScan2), users will complain quickly if I break something, so there's my test! Not sure how Continuous integration would factor in here; I don't debug often, but when I do, I do it in production!
  • Remove dead code: Gladly, if I can remember what code is obsolete...
  • Document how to set up a development environment: Yup. That would likely involve either running on Labs and using my shared code, or duplicating that. As many of my tools depend on Labs databases, the former is recommended.
  • Use detailed commit messages: I don't always remember to commit a change. Then when I do, I'm not sure what the old changes were. Hence, "misc".
  • Split reusable components into standalone libraries: I do that! Check out /data/project/magnustools/public_html/php and /data/project/magnustools/public_html/resources! common.php, common.js, wikidata.js, and (increasingly) wikidata.php are used a lot. I also use a single index.html for most of my tools. Of course, some of these are cruft magnets, but I can't remove "dead code", because I have no idea which of it is used by which tool.
  • Do a security review: Sounds sensible. Then again, many of my tools just run some read-only queries on WMF databases, so little harm can be done here. Some tools do store user-provided contents, like mix'n'match or the Wikidata games. Those might warrant a closer look.

As for "productizing" some of my tools - I'd be very happy if that happened! I don't see it as being "taken away" from me, rather as the tool getting quality improvements and (paid) support!
In fact, I did ask for some of my tools to become "production", even if that means a total rewrite. I am so relieved that the Wikidata SPARQL service is taking off, as this means I might be able to retire WDQ at some point.
One less creaking time sink to worry about :-)

As for "productizing" some of my tools

I think when we are talking about that, it is essential not only to look at an 'end product', but also to see if we can identify parts that are 'common' to tools and that we can build services and APIs around so that more tools can be created. We need to find the pieces that we can turn into Lego bricks so people's dream castle projects can be built more easily. Look for instance at the Wikidata game. There are definitely building blocks in there which could be used in a more general fashion for workflows that are not Wikidata-specific. There are general API concepts, services and UI tooling there that could be abstracted and then used by a dozen similar systems to do tasks on wikis!

I think that is much more important. We cannot productize every single major service, and in my opinion there isn't actually much of a reason to do so. But we can productize the common parts of many of such tools and make it easier to create many similar tools for many many niche editor markets. I hoped WikiGrok was going to make this clear, but it seems to have been too much focused on the edges, and not enough on the building blocks.

There are certainly some "code blocks" that could be abstracted; however, I think we should be careful not to create Yet Another Wikipedia/Wikidata PHP library, one that would likely be bound to Labs at that.

Another approach to "building blocks" is lists of pages/items that can be exchanged between tools. I have made several efforts in that regard, PagePile being the latest one. There is a PHP library for that, programmer's intro, some code documentation etc. It is available in many of my tools, but so far, no one else has picked it up.

Another approach to "building blocks" is lists of pages/items that can be exchanged between tools. I have made several efforts in that regard, PagePile being the latest one. There is a PHP library for that, programmer's intro, some code documentation etc. It is available in many of my tools, but so far, no one else has picked it up.

PagePile is a good example of something I'd like to see built into MediaWiki: we need good tools for the curation of pages for editorial purposes. (Gather, as I gather, is for curation of pages for readers.) This would be useful for such things as WikiProjects.

PagePile is exactly the type of element that I was thinking about in my comments.

Of course I bow to Magnus for his astounding contributions to the movement. My words shall by no means be intended as criticism of him or his deeds. On the contrary, my aim is to streamline maintenance of ubiquitous tools in need of love.

No offense was taken :-)

I am very well aware that my code is not exactly up to industry standard. Anything to improve that, within my time bandwidth, is welcome.

Finally, someone realized just how much shoddy code I deployed! :-)

Your tools are constantly praised and always to the point: that's not quite common in Wikimedia nowadays :-)

  • Standardize deployment: I do try to apply pull requests in as timely a fashion as I can. AFAICT, there are currently no pull requests pending on BitBucket.

I'm very glad you're being timely on merging pull requests, but I was referring to the time they take to hit production: for example, this change was accepted weeks ago but does not appear to have made it to wdq.wmflabs.org as I write.

  • Apply consistent code conventions: I do try to use a consistent style across my code, which is some K&R bastardisation with lots'o'spaces, but few comments ;-)

Caught red handed! One line like many others...

  • Write tests and turn CI on: In my space, tests seem to be more trouble than they are worth. Is the category tree correct? Well, it changes several times a day, so what is correct? For all but the most complex tools (e.g. CatScan2), users will complain quickly if I break something, so there's my test! Not sure how Continuous integration would factor in here; I don't debug often, but when I do, I do it in production!

Unfortunately this could discourage new contributors from refactoring code.

  • Remove dead code: Gladly, if I can remember what code is obsolete...

Likely not high priority, but profiling might be of use.

  • Document how to set up a development environment: Yup. That would likely involve either running on Labs and using my shared code, or duplicating that. As many of my tools depend on Labs databases, the former is recommended.

Integrating a Labs-like environment into Vagrant could be a step forward.

  • Split reusable components into standalone libraries: I do that! Check out /data/project/magnustools/public_html/php and /data/project/magnustools/public_html/resources! common.php, common.js, wikidata.js, and (increasingly) wikidata.php are used a lot. I also use a single index.html for most of my tools. Of course, some of these are cruft magnets, but I can't remove "dead code", because I have no idea which of it is used by which tool.

Unfortunately that's far from intuitive for local development.
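
On the point that nobody knows which shared file is used by which tool: a rough first pass could be a static scan for include/require statements. The following is only a sketch under assumptions, not how the tools are actually organised: the function name map_shared_usage and the directory layout in the usage note are hypothetical, and it only sees literal string paths, so includes built at runtime are missed.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches PHP include/require statements with a literal string path, e.g.
#   require_once('../php/common.php');
INCLUDE_RE = re.compile(
    r"""\b(?:include|require)(?:_once)?\s*\(?\s*['"]([^'"]+)['"]"""
)


def map_shared_usage(tools_root):
    """Return {included file name: sorted list of PHP files that include it}.

    Hypothetical helper: walks every *.php file under tools_root and records
    which files it pulls in via include/require with a literal path.
    """
    usage = defaultdict(set)
    for php_file in Path(tools_root).rglob("*.php"):
        text = php_file.read_text(encoding="utf-8", errors="replace")
        for included in INCLUDE_RE.findall(text):
            # Key on the base name only; paths computed at runtime are not seen.
            usage[Path(included).name].add(str(php_file.relative_to(tools_root)))
    return {name: sorted(users) for name, users in usage.items()}
```

Calling map_shared_usage on a checkout and printing the result gives one line of evidence per shared file; a shared file that never shows up as a key is a candidate for closer inspection, not proof that it is dead.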

Finally, someone realized just how much shoddy code I deployed! :-)

Your tools are constantly praised and always to the point: that's not quite common in Wikimedia nowadays :-)

:-)

  • Standardize deployment: I do try to apply pull requests in as timely a fashion as I can. AFAICT, there are currently no pull requests pending on BitBucket.

I'm very glad you're being timely on merging pull requests, but I was referring to the time they take to hit production: for example, this change was accepted weeks ago but does not appear to have made it to wdq.wmflabs.org as I write.

Oh, this is on production already. I can see it in the file on both WDQ machines. But it doesn't show in the web site. Apparently, some caching issue. Weird. But the code is live, and has been for a while.

  • Apply consistent code conventions: I do try to use a consistent style across my code, which is some K&R bastardisation with lots'o'spaces, but few comments ;-)

Caught red handed! One line like many others...

Ah. OKKKKK... if that's bothering people...
Do we have some code formatting tool installed on Labs? 'cause I'm not going to fix all of those manually...

  • Write tests and turn CI on: In my space, tests seem to be more trouble than they are worth. Is the category tree correct? Well, it changes several times a day, so what is correct? For all but the most complex tools (e.g. CatScan2), users will complain quickly if I break something, so there's my test! Not sure how Continuous integration would factor in here; I don't debug often, but when I do, I do it in production!

Unfortunately this could discourage new contributors from refactoring code.

But unless those new contributors save me more work than writing tests in the first place...

  • Remove dead code: Gladly, if I can remember what code is obsolete...

Likely not high priority, but profiling might be of use.

  • Document how to set up a development environment: Yup. That would likely involve either running on Labs and using my shared code, or duplicating that. As many of my tools depend on Labs databases, the former is recommended.

Integrating a Labs-like environment into Vagrant could be a step forward.

SEP :-)

  • Split reusable components into standalone libraries: I do that! Check out /data/project/magnustools/public_html/php and /data/project/magnustools/public_html/resources! common.php, common.js, wikidata.js, and (increasingly) wikidata.php are used a lot. I also use a single index.html for most of my tools. Of course, some of these are cruft magnets, but I can't remove "dead code", because I have no idea which of it is used by which tool.

Unfortunately that's far from intuitive for local development.

Well, most of the PHP code will be unusable on non-Labs environments anyway, at least the parts that deal with databases.

Here's a thought: It might be worth picking one tool and going through all the motions - tests, comments, whitespace, etc - and seeing how much effort is involved. Any candidates that come to mind?
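
On the "what is correct?" objection to tests: one common way around constantly changing data is to test the tool's logic against a frozen snapshot instead of the live database. The sketch below assumes a made-up traversal function (descendants) and made-up category names; it is not taken from any of the tools discussed here. The data source is passed in as a callable, so the test never touches Labs.

```python
from collections import deque


def descendants(root, get_subcats, max_depth=3):
    """Breadth-first walk of a category tree, tolerant of cycles.

    get_subcats is any callable mapping a category name to its direct
    subcategories; in a real tool it would query the database.
    """
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        cat, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand categories at the depth limit
        for sub in get_subcats(cat):
            if sub not in seen:
                seen.add(sub)
                queue.append((sub, depth + 1))
    seen.discard(root)
    return sorted(seen)


# Frozen snapshot standing in for live data (made-up category names).
FIXTURE = {
    "Science": ["Physics", "Biology"],
    "Physics": ["Optics", "Science"],  # cycle back to the root
    "Biology": [],
    "Optics": ["Lasers"],
}


def test_descendants():
    fetch = lambda cat: FIXTURE.get(cat, [])
    assert descendants("Science", fetch, max_depth=2) == [
        "Biology", "Optics", "Physics"
    ]
    assert descendants("Science", fetch, max_depth=3) == [
        "Biology", "Lasers", "Optics", "Physics"
    ]
```

The live tree can change as often as it likes; the test only checks that the traversal handles depth limits and cycles the same way after a refactoring.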

I'm very glad you're being timely on merging pull requests, but I was referring to the time they take to hit production: for example, this change was accepted weeks ago but does not appear to have made it to wdq.wmflabs.org as I write.

Oh, this is on production already. I can see it in the file on both WDQ machines. But it doesn't show in the web site. Apparently, some caching issue. Weird. But the code is live, and has been for a while.

Apologies.

Caught red handed! One line like many others...

Ah. OKKKKK... if that's bothering people...

I'm always puzzled about which style to use for a patch, you know...

Do we have some code formatting tool installed on Labs? 'cause I'm not going to fix all of those manually...

https://tools.wmflabs.org/stylize/ might come in handy, but I'm happy to fix all of them if you point me to your style guide :-)

Unfortunately this could discourage new contributors from refactoring code.

But unless those new contributors save me more work than writing tests in the first place...

Got it :-)

Tool Labs volunteer Tim Landscheidt (@scfc) has published a list of suggestions for tool developers and maintainers. Especially #3 and #4 sound relevant to this task.

No open tasks are being tracked in this Tracking-Neverending task. Can this task be closed nowadays? If not, which specific subtasks are missing?

No reply; Boldly closing.