Page MenuHomePhabricator

What makes a high quality MediaWiki extension? - Hackathon session
Closed, ResolvedPublic

Description

What makes a high quality MediaWiki extension nice?

This will be a roundtable kind of session where we discuss the things we look for in other MediaWiki extensions while reviewing them that are positive and negative. This will then be turned into a checklist of sorts that developers can look towards for advice on how to make a high quality MediaWiki extension.

Some examples of easy list items might be:

  • Having PHPUnit tests
  • Having a Help:Extension:Foo page on mediawiki.org
  • Using the standard directory layout of resources/ includes/ i18n/ etc.
  • ...

Even if you aren't at the hackathon in person, please contribute more suggestions in the comments.

Event Timeline

I've scheduled this for Thursday at 10am, and announced it on wikitech-l.

  • Has a clear separation of concerns between what it actually does, and how its presented to the user
  • Provides all the same functionality over API and web, without duplicating implementation
  • Avoids global state

[non-developer view]

  • It reads and has into account global mediawiki configuration such as read-only mode
  • It works well in a distributed environment (concurrency, multiple databases, clustering)
  • If it needs persistance, it creates nice SQL (primary keys, indexes where needed) and uses some caching mechanism where/if necessary
  • It is easy to debug (for example, it throws reasonable logs at different levels, and exceptions only for situations it should not handle)
  • Documentation (a good introduction of what it does, how to install it -if it is non trivial-, how to configure it, how to uninstall it)
  • Good use of the MediaWiki framework (e.g. using framework provided functions instead of native PHP or external library functions)

Btw.: Maybe some names of extensions that are considered to be "good ones" could be posted here.

A good extension name, that explains intuitively what it does and avoids generic words

On its wiki page:

  • Quick explanation of what it does (you'd be surprised how many fail this)
  • Clear description of exactly which MW versions it works with
  • Complete list of any dependencies and how to install them
  • Compare and contrast with similar extensions
  • All variables described in one place, from most-used to most-obscure
  • Add to the appropriate extension categories

In the bundle:

  • A README file that summarized the docs and gives detailed installation directions
  • Prominent $VERSION indication in the README and main php file (ideally all files)
  • Use standard files names (e.g. LICENSE not COPYING)

This is an interesting topic. During the hack-a-thon I did a session on "building better software" and one of the topics discussed was along these lines. We had some 3rd party users/developers of MediaWiki that wanted to see if there was a way to "rate" extensions based on certain criteria like your describing. Very interested in the outcome of this session (not there in person unfortunately).

From @Bawolff:

  • Can I read the code with reasonable effort? Is it convoluted / unnecessarily complex?
  • MW coding conventions: variable names use camel case
    • Does it run PHPCS/MW-CS
  • parser functions/tags need parser tests
  • use MW functionality/wrappers for things like WebRequest vs $_GET. etc. Don't reinvent MW
  • if adding database tables, it should use the update.php hook
  • Don't disable OutputPage
  • Don't disable parser cache unless you have a really good reason
  • Know when you should use ParserOutput methods vs. same methods in OutputPage

For security specifically:

  • Database access should always use database abstraction layer, and not open us up to SQLI
  • Shelling out should escape arguments
  • Does it use MW's validation/sanitization methods e.g. HTML class, htmlspecialchars, etc.
  • i18n escape as close to output as possible. document whether functions take/except wikitext vs HTML
  • Don't touch HTML after it has been sanitized (common pattern is to use regex, but that's bad)
  • All write actions should be protected with CSRF

More to come...

From @TheDJ, @Krenair, @Quiddity :

  • Organized: consistent naming, directory structure that is logical and not messy
    • not having a giant root level directory
  • Make sure it's not English Wikipedia-specific or Wikimedia specific
  • Don't hardcode wikicode / templates or stuff, especially in a way that's not configurable for other websites
  • Don't reimplement MediaWiki
  • Don't reimplement libraries that exist, don't reinvent the wheel
  • Code should be readable by someone who is familiar in that area
  • skin and extension functionality should not be tightly integrated.
  • Don't call out to static functions in an unrelated class because it wasn't refactored or thought out well
  • id's should only be for things you can navigate to. Everything else should be a class. Even if you want to only use it once, just use a unique class
  • Use HTML elements for their intended purpose. Aka use a <button> and not a <div> with a click handler for a11y.
  • In JS especially, don't dump everything into one function. If you're indented off the screen, you're doing it wrong
  • Have co-maintainers!
  • Don't load external resources for privacy and performance
  • Need a LICENSE/COPYING file
  • Documentation should have screenshots in multiple languages, ideally one in RTL
  • Documentation should discuss some edge cases that were tested
  • Consistently use <pre>, <code>, <source> in documentation
  • Test against RTL languages!
  • Don't have hardcoded non-translatable strings in your code, use the proper l10n functions (wfMessage)
  • Use the normal MediaWiki bug tracker / code review systems
  • Use code comments to document why you do things, not what you do. In long blocks of code, adding comments stating what each paragraph does is nice for easy parsing, but generally, comments should focus on the questions that can't be answered by just reading the code.
  • Use PSR-4: one class per file, file name/path reflects class name. Classes should preferably be in the MediaWiki\Extensions\<extension name> namespace.
  • Use dependency injection, avoid static calls for other than utility methods + hook entry points
    • Don't overuse private visibility in services
  • Use structured logging, with meaningful levels
  • Expose your Javascript methods so 1) user scripts can access them, 2) it is easy to debug them. Do not make everything private.
  • Create a Vagrant role for your extension.
  • Document hooks used in the extension infobox, it's a nice method of exposing examples so that other developers can learn.
  • Store your extension on gerrit so that others can update it for core deprecations
  • Declare the compatibility policy of your extension.

Would be cool to have a Toolforge tool which can check some of the criteria and provide you with a TODO list.

@Tgr Is a namespace convention like MediaWiki\Extensions\<extension name> or MediaWiki\Skins\<skin name> somewhere documented? Or better are there any examples of extensions that already do use such namespaces? Background of my question is, that the BlueSpice-team is going to use namespaces (and composer based PSR-4 autoloading) for future development. And we are still discussing about conventions. I want our new code to be as close to MediaWiki conventions as possible.

One item that was discussed during the Vienna Hack-a-thon was having some form of rating system that would be available for extensions. This rating would be perhaps based on some objective measures (derived from some the criteria mentioned in this session) as well as some subjective developer ratings. Looking to do something like this for Code Health generally speaking.

@Osnard I don't think there is an established convention and most extensions actually just use <extensionname> but IMO using MediaWiki\Extensions\... follows from the PSR-4 recommendation that namespaces should start with a vendor prefix. Vendor prefixes are reasonably unique; extension names are not (ie. there is no guarantee someone out there is not writing a library with the same name as your extension, and once another extension tries to use that library as a dependency, you are in trouble).

I don't think there is an established convention and most extensions actually just use <extensionname> but IMO using MediaWiki\Extensions\... follows from the PSR-4 recommendation that namespaces

I don't think that is a reasonable argument for extensions that have been using PSR-4 for some time or are fairly distinguishable on its own. I fail to see the correlation between the use of MediaWiki\Extensions\... and being a "high quality MediaWiki extension".

For previous and future discussion about what namespace to use: T166010: The Great Namespaceization Effort

I just created the follow-up task (or at least one of them) for this session. I don't think there's anything left to do *here* except continue the conversation. Let's finish tagging all the items in the etherpad with REQUIRED, SHOULD, and GOLD and then determine next steps after that.

Thanks again, @Legoktm

I wonder if we can play into the libraries.io project or at least steal some ideas in developing our own metrics:

They have some great metrics already defined:
https://libraries.io/packagist/zordius%2Flightncandy
https://libraries.io/npm/react
https://libraries.io/npm/react/sourcerank

We could build on this score by adding metrics such as open patchsets, oldest patchset etc...

I think this is an important part of holding people accountable for code stewardship or having conversations about why those metrics are low.

https://libraries.io/packagist/mediawiki%2Fcore/sourcerank

I like the concept of SourceRank and think something like that would be good for extensions. I think we'd want to add some additional code health attributes though. But I do like the inclusion of things like documentation. Perhaps another Rank attribute is whether or not the extension has a steward.