Page MenuHomePhabricator

Review Codex module implementation for ResourceLoader
Closed, ResolvedPublic

Description

The Design-System-Team has taken the feedback from the code splitting proposal (T344386) into an actionable implementation plan (T349423). As part of this, there is a series of open questions that we'd like assistance from the MW Platform team on answering.

From the task description of T344386:

  • How can we provide backwards compatibility for code that depends on the @wikimedia/codex module and expects that to load the full library? Or is this not feasible, and do we have to accept this as a breaking change (making that code use e.g. codex-all instead)?
    • One way we might be able to do this: add some sort of special treatment for the @wikimedia/codex module in RL so that other modules can call require( '@wikimedia/codex' ) even if that module isn't loaded, and can add things to the object that that require call returns. Then modules using CodexModule would not depend on the @wikimedia/codex module and it wouldn't be loaded. The @wikimedia/codex module would contain the full library and would only be loaded if it was explicitly depended on.
      • If we want to provide a custom error message for code that attempts to access missing components (see implementation plan step 1), then we would need a way to wrap the object returned by require( '@wikimedia/codex' ) in a Proxy. Perhaps we could accomplish this through a feature similar to skipFunction that allows a module to run a short snippet of code to provide the value require() should return when the module has not yet been loaded?
  • Is there (or could there be) a cleaner way to override the packageFiles property in ResourceLoader\FileModule? Overriding getPackageFiles doesn't work, because getDefinitionSummary calls the private expandPackageFiles method that then pollutes a cache that getPackageFiles uses. We can work around this by overriding getDefinitionSummary, but if any other methods were to call expandPackageFiles in the future we'd have to override them too, so that doesn't seem like a great approach.
  • How does a module using CodexModule add components to the collector module when it executes? Does it wrap the main script with a new file that adds the components before calling the old main file?
  • How do we ensure cache invalidation behaves correctly for a module using CodexModule? What do we need to do in getDefinitionSummary and/or elsewhere to make this work?
    • We need to invalidate the cache when one of the Codex files changes, or when the set of files that is loaded changes. The latter can happen when the manifest files change, or when the module's list of dependencies changes, or when one of the module's dependencies changes which Codex files it loads.
    • Adding the Codex files to the packageFiles and styles properties should take care of invalidating the cache when those files change. Should we just add the list of files that will be loaded to getDefinitionSummary?
  • Should we cache reading the manifest files and deriving the information we need from them? Would it make sense to build a data structure derived from the manifest files that is in a more convenient format for building module contents, and caching that in memcached, invalidated by a hash of the contents of the manifest file or something like that?

Details

Other Assignee
Krinkle

Event Timeline

Krinkle renamed this task from Provide feedback on open questions related to ResourceLoader to Review Codex module implementation for ResourceLoader.Nov 2 2023, 2:05 PM

I've posted some comments on https://gerrit.wikimedia.org/r/c/mediawiki/core/+/970908 and https://gerrit.wikimedia.org/r/c/mediawiki/core/+/972067 too.

  • How can we provide backwards compatibility for code that depends on the @wikimedia/codex module and expects that to load the full library? (…)

It seems that for now, in https://gerrit.wikimedia.org/r/c/mediawiki/core/+/970908, this was resolved by keeping the @wikimedia/codex name for the full library, and using ./codex-subset.js for the partial access.

I'm not sure how to solve it if the @wikimedia/codex module was turned into the hypothetical collector module. The solution outlined here seems like it could work. It sounds pretty complex though… I'm not sure if it would be worth it.

Also, "expos[ing] the Codex components that have already been loaded" would mean that you can access them without specifying dependencies and without loading the whole library (or components) yourself? Sounds a bit like a recipe for problems, and I'm not sure what's the use case. Presumably if you want to maximize performance, then you'll specify components anyway, and if you want to maximize convenience, then you'll load the whole library anyway.

  • Is there (or could there be) a cleaner way to override the packageFiles property in ResourceLoader\FileModule? Overriding getPackageFiles doesn't work, because getDefinitionSummary calls the private expandPackageFiles method that then pollutes a cache that getPackageFiles uses. We can work around this by overriding getDefinitionSummary, but if any other methods were to call expandPackageFiles in the future we'd have to override them too, so that doesn't seem like a great approach.

You could make $this->packageFiles private, and add a protected getter method (perhaps getPackageFilesFromDefinition) that you could override. I'm not sure if this is better.

Ultimately we can only do so much with the design to defend against future changes to FileModule causing bugs. I would just do whatever is convenient, and add a small integration test that will fail if FileModule accidentally "skips" your code. It looks like the tests proposed in https://gerrit.wikimedia.org/r/c/mediawiki/core/+/972067 already cover that.

  • How does a module using CodexModule add components to the collector module when it executes? Does it wrap the main script with a new file that adds the components before calling the old main file?

I suppose it could? There are probably lots of ways to do this, and I haven't thought about the collector module enough to suggest anything concrete. I think you and Roan know better than me at this point what is feasible to do.

  • How do we ensure cache invalidation behaves correctly for a module using CodexModule? What do we need to do in getDefinitionSummary and/or elsewhere to make this work?

It looks like you already figured it out, but just to summarize: the result of getDefinitionSummary needs to reference all files and contain all data that were used to generate the output of the module, so that when any of them changes, the module gets invalidated.

One way to do this is to not override any of the methods that generate the output (like getScript), and instead just build a complex set of options that is handed to the parent FileModule, which we trust to handle it correctly. You did exactly this, and I'm a fan of this approach. :)

If some future improvements require overriding the output directly rather than the options (e.g. if the fancy dependency deduplication requires adding some code that doesn't fit in the 'packageFiles' pattern, or something), then you'll need to also override getDefinitionSummary to add any new files/data to its result.

  • Should we cache reading the manifest files and deriving the information we need from them? Would it make sense to build a data structure derived from the manifest files that is in a more convenient format for building module contents, and caching that in memcached, invalidated by a hash of the contents of the manifest file or something like that?

It's probably not needed, we do worse things in OOUI module definitions (my fault) and they're plenty fast. I wouldn't worry about this unless you discover that it's too slow in production.