Decide what to do with Wikibase JS-only libraries regarding the build/deployment of Wikidata code
Closed, ResolvedPublic

Description

Background: there are several mainly-JS libraries used by Wikibase and installed during Wikidata build step.
These include:

  • serialization/javascript
  • data-values/javascript
  • data-values/value-view
  • wikibase/data-model-javascript
  • wikibase/javascript-api
  • wikibase/serialization-javascript

All of these libraries contain some PHP code interacting with MediaWiki and are installed as PHP dependencies from Packagist using Composer.
All of them are actually independent of MediaWiki. The only thing they do is register their code as ResourceLoader modules, and currently also register QUnit tests to be run using the MW test loader. The latter is out of scope here as it has already been tackled separately.

Problem: as all these libraries are required by Wikibase/Wikidata, they must also be deployed together with it. Unlike other library dependencies of Wikibase, these libraries could not be moved to mediawiki-vendor, as they do have MW stuff in them (as said above). Also, ResourceLoader must be able to find the files used by the modules these libraries define.

Below there is a list of options we've come up with so far. This ticket is a place to ask questions, discuss those options, suggest new ones, etc.

  1. Move all those libraries to Wikibase git repository
    • pro: dependency problem goes away
    • con: dependencies on MediaWiki, Wikibase and other libs can creep in without being caught by CI
    • con: code provided by those libs is no longer versioned as now - all only work with current Wikibase master
    • con: even more code in Wikibase.git
    • con: no way for those libs to be used by other users than Wikibase (but are there any?)
  2. Turn those libs into npm packages and check them in to a particular place in the Wikibase git repository to be picked up by RL
    • pro: easy integration with RL (know where files are)
    • pro: libs can be still released and developed independently from Wikibase/MW
    • con: libs are maintained in two places
    • con: needs manual committing after each release of a lib (Composer no longer used)
  3. Turn those libs into npm packages and add them as git submodules to Wikibase git repository
    • pro: easy integration with RL (know where files are)
    • pro: libs can be still released and developed independently from Wikibase/MW
    • pro: libs only maintained in a single place
    • con: still requires manual action after a new version of a lib is released (committing the updated submodule reference)
    • con: using git submodules might cause issues with CI and/or deploying the code?
  4. Turn those libs into npm packages and have a script that pulls the required version of each lib (e.g. using Wikibase's package.json) into a known location next to other Wikibase code
    • pro: easy integration with RL (know where files are)
    • pro: libs can be still released and developed independently from Wikibase/MW
    • pro: libs only maintained in a single place - how to ensure local changes are not made though? Just ignore this fact?
    • con: the "build" step is not entirely removed
    • could running this script be somehow synced/co-triggered with cutting the deployment branch of Wikibase, so that the process actually is automated?
  5. Move those libraries to the Wikibase git repository and make them npm packages.
    • pro: dependency problem goes away
    • pro: build step goes away
    • pro: easy integration with RL (know where files are)
    • pro: libs can be still released and published independently from Wikibase/MW
    • pro: libs only maintained in a single place
    • con: even more code in Wikibase.git
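
To make the "script that pulls the required version of each lib" option a bit more concrete, here is a rough sketch of the planning half of such a script. It only works out which library version should land in which directory; the actual fetching (for instance with npm itself) is left out. The helper name `planLibInstall`, the target directory `lib/resources` and the package names and versions are all made up for illustration and do not come from Wikibase.

```javascript
'use strict';
const path = require('path');

// Hypothetical helper: turn the "dependencies" of a package.json-like object
// into a list of libraries to fetch and the directory each should land in.
function planLibInstall(manifest, targetRoot) {
  const deps = manifest.dependencies || {};
  return Object.keys(deps).sort().map((name) => ({
    name,
    version: deps[name],
    // A known, predictable location so ResourceLoader can find the files.
    target: path.posix.join(targetRoot, name),
  }));
}

// Example manifest; the package names and versions are invented.
const exampleManifest = {
  name: 'wikibase',
  dependencies: {
    'example-data-values': '1.2.0',
    'example-serialization': '3.0.1',
  },
};

const plan = planLibInstall(exampleManifest, 'lib/resources');
for (const lib of plan) {
  console.log(`${lib.name}@${lib.version} -> ${lib.target}`);
}
```

Because the versions come from a single package.json, the required version of each lib stays declared in one place, which is the main attraction of this option.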

Any other options to be considered here?

Bonus question: data-values/data-types contains both PHP and JS code ("real" PHP code, not just RL modules). How to deal with it? Split the JS part out of it and handle it like the other JS-only libs?

See also: T107561: MediaWiki support for Composer equivalent for JavaScript packages

Event Timeline

Option #3 seems easiest to me from my armchair. Submodules do not cause problems with CI or deployments--we do those all the time. The submodule update commit can--theoretically--be avoided by auto-updating submodules in Gerrit, but this may or may not work for our use case here (just throwing it out there).

  1. Move all those libraries to Wikibase git repository
    • pro: dependency problem goes away
    • con: code provided by those libs is no longer versioned as now - all only work with current Wikibase master
    • con: even more code in Wikibase.git
    • con: no way for those libs to be used by other users than Wikibase (but are there any?)
  2. Turn those libs into npm packages and check them in to a particular place in the Wikibase git repository to be picked up by RL
    • pro: easy integration with RL (know where files are)
    • pro: libs can be still released and developed independently from Wikibase/MW
    • con: libs are maintained in two places
    • con: needs manual committing after each release of a lib (Composer no longer used)

Regarding Option 1:

Having your own Git repository and being published as an npm package (e.g. for third parties) are independent of each other: a library does not need its own repository in order to be published. You can run npm publish from any directory that contains a package.json file. Whether it's a subdirectory of a Git repo or even outside any Git repo is fine from npm's perspective. (This is why, for example, in npm, contrary to Composer, the version number must be declared in package.json.)
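
As a small illustration of the point about package.json, here is a sketch of a pre-publish sanity check for a library living in a subdirectory. The function name `publishProblems` and the example manifest are made up; the check only covers the basics (npm needs a "name" and a "version", and refuses to publish packages marked "private") and is not meant to mirror everything npm validates.

```javascript
'use strict';

// Hypothetical check, run against the parsed package.json of one library.
// Returns a list of reasons why "npm publish" would be expected to refuse it.
function publishProblems(manifest) {
  const problems = [];
  if (typeof manifest.name !== 'string' || manifest.name === '') {
    problems.push('missing "name"');
  }
  if (typeof manifest.version !== 'string' || manifest.version === '') {
    problems.push('missing "version"');
  }
  if (manifest.private === true) {
    problems.push('marked "private"');
  }
  return problems;
}

// Invented example: the version lives in the manifest, not in a Git tag.
const libManifest = { name: 'example-data-values', version: '1.2.0' };
console.log(publishProblems(libManifest));
console.log(publishProblems({ name: 'wikibase', private: true }));
```

Nothing in the check looks at Git at all, which matches the argument: the directory and its package.json are what matter to npm.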

So you can consolidate the best of option 1 and option 2:

  • Option 5: Move those libraries to the Wikibase git repository:
    • pro: dependency problem goes away
    • pro: build step goes away
    • pro: easy integration with RL (know where files are)
    • pro: libs can be still released and published independently from Wikibase/MW
    • pro: libs only maintained in a single place
    • con: even more code in Wikibase.git

This would be my recommendation (as first step).

Option #3 seems easiest to me from my armchair. Submodules do not cause problems with CI or deployments--we do those all the time. The submodule update commit can--theoretically--be avoided by auto-updating submodules in Gerrit, but this may or may not work for our use case here (just throwing it out there).

If this won't cause any issues with deployments & CI then this sounds like it could be the easiest option out of everything on the table right now.
It would also allow us to think about the long-term option of npm packages / JS packages and MediaWiki in general: T107561

Thanks @Krinkle for pointing this out. This indeed is a reasonable option to consider. Added to the task description.
One concern with Option 5 I could imagine is that the separation between those libs and the "core" Wikibase is drawn less strictly than e.g. when those libs are only pulled in when "building" Wikibase & co (option 4), or when they're technically separate git repos/submodules (option 3). I realize though that it might simply be seen as a matter of personal taste (or discipline) which of those options is "simpler" or "better" (tm). Just mentioning it here; hopefully I am not setting the bikeshedding wheel into motion.

I like Krinkle's idea the most (option #5) for now, mostly because it's probably the easiest, and seems relatively straightforward to move away from in the future.

Full preference list: 5, 3, 2, 4, 1.

One concern with Option 5 I could imagine is that the separation between those libs and the "core" Wikibase is drawn less strictly than e.g. when those libs are only pulled in when "building" Wikibase & co (option 4), or when they're technically separate git repos/submodules (option 3). I realize though that it might simply be seen as a matter of personal taste (or discipline) which of those options is "simpler" or "better" (tm). Just mentioning it here; hopefully I am not setting the bikeshedding wheel into motion.

Yeah, there's an increased chance that someone will add something MW-specific in a folder that's supposed to be a library. We occasionally have people introduce MW-specific code in core's includes/libs/ directory, which is all supposed to be independent code, but it usually gets caught in CR. Theoretically there could be a CI test that looks for "mw." or something similar in those directories to prevent it.
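
A very rough sketch of what such a check could look like, assuming the library sources are plain JS files whose contents get passed in as strings. The function name `findMwReferences` and the regular expression are my own guess at a reasonable heuristic, not an existing MediaWiki or Wikibase tool, and it will also flag matches inside comments and string literals.

```javascript
'use strict';

// Matches the globals "mw" and "mediaWiki" used as objects (followed by a dot),
// but not when they are a property of something else or part of a longer name.
const MW_REFERENCE = /(^|[^\w$.])(mw|mediaWiki)\s*\./;

// Return the 1-based line numbers (and text) of lines that appear to touch MW.
function findMwReferences(source) {
  const hits = [];
  source.split('\n').forEach((line, index) => {
    if (MW_REFERENCE.test(line)) {
      hits.push({ line: index + 1, text: line.trim() });
    }
  });
  return hits;
}

// Invented sample of library code: only the second line should be reported.
const sample = [
  'var x = 1;',
  "mw.config.get('a');",
  'foo.mw.bar();',
  'var cmw = {}; cmw.x = 1;',
].join('\n');

console.log(findMwReferences(sample));
```

A CI job would then fail whenever the returned list is non-empty for any file in the library directories.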

Also, if we choose #5 (or whatever), it shouldn't be set in stone, just a holdover until T107561: MediaWiki support for Composer equivalent for JavaScript packages happens.

Addshore raised the priority of this task from Medium to High. Sep 19 2017, 12:30 AM

Marking as high, as this is the one undecided thing blocking the killing of the build right now (which itself is high prio).

If you add this code to the Wikibase.git repository, you likely make it even harder to split that repository into clean, well-defined parts. Suppose you want to put the client and repository extensions in their own git repositories; then you'll need to deal with these JS libs again.

Another downside is that the dependencies become less clear. At least this is what I suspect; I am not sure what multiple npm packages in a single repo actually look like. Even if you keep the explicit dependencies somehow, it still seems likely that CI won't enforce them unless you jump through a bunch of hoops.

One concern with Option 5 I could imagine is that the separation between those libs and the "core" Wikibase is drawn less strictly than e.g. when those libs are only pulled in when "building" Wikibase & co (option 4), or when they're technically separate git repos/submodules (option 3).

In addition to the static analysis test @Legoktm suggested, this can also be enforced through unit tests. If the libraries have working unit tests that can be run in these separate repos, they can continue to be run that way when in the Wikibase repo. There shouldn't be any need to run them via MediaWiki QUnit. Check out https://github.com/wikimedia/oojs and https://github.com/wikimedia/unicodejs for examples of simple npm-test libraries that don't involve MediaWiki. The only thing needed is to have the main package.json's test script also execute tests for the individual libraries. This could be as simple as npm install && npm test in each lib directly, or more elaborate/fine-tuned.

This way, they naturally can't depend on MW, just like now.
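
To sketch the "simple" variant: a small Node script that the main package.json's test script could call, which runs npm install && npm test in every library directory. The function names and the idea of keeping the libs as immediate subdirectories of one folder are assumptions for illustration, not how Wikibase is actually laid out.

```javascript
'use strict';
const fs = require('fs');
const path = require('path');
const { execSync } = require('child_process');

// Return the immediate subdirectories of libRoot that contain a package.json.
function findLibDirs(libRoot) {
  return fs.readdirSync(libRoot, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => path.join(libRoot, entry.name))
    .filter((dir) => fs.existsSync(path.join(dir, 'package.json')))
    .sort();
}

// Run "npm install && npm test" inside each library directory.
// With the default runner, a failing test makes execSync throw,
// which in turn fails the overall test run.
function testLibs(libRoot, run = execSync) {
  for (const dir of findLibDirs(libRoot)) {
    run('npm install && npm test', { cwd: dir, stdio: 'inherit' });
  }
}
```

Calling testLibs with the directory that holds the libs would be the whole script. Since each lib is installed and tested from its own directory with its own package.json, MediaWiki is simply not there for the tests to lean on.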

Just got back from some days off. Regarding what @Krinkle said in his last comment: I completely agree that those libs should not depend on MW, and that they should run their tests themselves without relying on MW's QUnit runner, no matter how they're included in the build step. I've actually been changing them in this manner for some weeks now, and most are already independent. There is still some work left, but hopefully it will be ready soon!

So we've decided to go with option 3 over option 5 to avoid the issues @JeroenDeDauw mentioned. There are some reservations in the Wikidata team at WMDE regarding the use of git submodules, so in case we hit any issues, we're going to fall back to option 5. That seems like a rather safe backup path.

I will be creating relevant tickets, packages and patches in the next days.

Thanks to all who shared their thoughts here!

We should check with @demon whether git submodules will be okay with CI / deployment. While I think they might work, I vaguely remember there being some issue with having them in extensions with the deployment tools.

@demon commented on this in https://phabricator.wikimedia.org/T174922#3578222:

Submodules do not cause problems with CI or deployments--we do those all the time.

So it looks like we're good :)