Page MenuHomePhabricator

Investigate the status and tooling of JSON-PATCH standard
Closed, ResolvedPublic

Description

  • investigate current status of JSON-PATCH (is it still state-of-the-art?)
  • investigate PHP tooling support for JSON-PATCH (compared to MERGE-PATCH)
  • investigate solution approaches
    • Three possible solutions:
      • 1. Load item -> serialize -> patch -> deserialize -> save;
      • 2. Translate patch request into something format agnostic (= events?) -> apply -> save
      • 3. Load item -> serialize -> patch -> ReplaceItemStatement use case
    • Thoughts on approach 1
      • Way less effort, opportunity to use 3rd party tooling
      • Implies the existence of some canonical json representation
      • Tricky to fit into the architecture, mixing JSON formatting and modification
    • Thoughts on approach 2
      • More effort
      • More control on changes as they happen, since we can validate each operation individually using domain objects instead comparing the original Item to the (deserialized) Item after the PATCH
      • single operations may result in inconsistent statement state, e.g. change snaktype from novalue to value and only adding the datavalue in a different operation
        • can be mitigated by using a builder object
      • requires each value type to have 1. a class that creates events from patch operations, 2. an event handler for each event type, 3. a builder for the value
      • need to figure out where to translate from JSON PATCH to events: should the use case request contain events or the raw JSON PATCH operations?
    • Thoughts on approach 3
      • can reuse validation of the PUT use case
      • gets tricky if PATCH and PUT are supposed to have different behaviors wrt edit metadata (PATCH information is lost when we get to the use case)
      • awkward, because the item gets loaded twice

Solution prototype with approach 2, and either 1 or 3:

  • replace main snak value
  • add qualifier

Event Timeline

Jakob_WMDE renamed this task from Investigate the status and tolloing of JSON-PATCH standard to Investigate the status and tooling of JSON-PATCH standard.Aug 8 2022, 8:40 AM
  • I couldn't find anything online that would be better than JSON-PATCH for a REST API.
  • The proposal git repo has some useful points on JSON-PATCH vs JSON-MERGE. I agree with the decision that JSON-PATCH is preferable over JSON-MERGE.
  • swaggest/json-diff (GitHub, Packagist) seems like the best PHP library to use
    • Most "installs" on Packagist with 4.7M
    • Seems fairly well documented
    • 95% code coverage
    • No security advisories (according to packagist), and no open bugs on GitHub
  • There are also JSON-PATCH libraries in other languages of various quality:

Change 823610 had a related patch set uploaded (by Ollie Shotton; author: Ollie Shotton):

[mediawiki/extensions/Wikibase@master] [DNM] REST: JSON-Patch Solution 1

https://gerrit.wikimedia.org/r/823610

General notes
  • In the proposal schema, the PUT request JSON body contains a patches field, which is a JSON array of objects, but described as "A list of JSONPatch objects as defined by RFC 6902". This is confusing, ambiguous, and possibly incorrect. The RFC usees the terms JSON Patch document and operation object but never "JSONPatch objects" which could be interpreted as either of those terms. IMO, this property should be named patch and described as "a JSON Patch document as defined by RFC 6902", since a JSON Patch document is "a JSON [RFC4627] document that represents an array of objects".
Notes about Solution 1 PoC
  • Previously I implemented a "Statement ID shouldn't change" check in the ReplaceItemStatementValidator. I know think this check should happen in the Data Model's StatementList:replace() method. Then it only happens in one place, rather than in the ReplaceItemStatementValidator and the UpdateItemStatement use case.
  • Wikibase\DataModel\Serializers\StatementSerializer is used instead of Wikibase\Repo\RestApi\Domain\Serializers\StatementSerializer because Wikibase\DataModel\Deserializers\StatementDeserializer only accepts (associative) arrays and not objects. This works for StatementSerializer as the only change between the DataModel and RestApi versions is objects are used instead of associative arrays; there are no changes to the structure of the serialized Statement.
  • CI for the patch is currently failing due to not finding the newly installed Swaggest/JsonDiff package. I added it to the Wikibase composer.json and can see that CI installs the package but phpunit complains. Is there something else I need to do when adding a new package?
Notes about Solution 2 PoC
  • Didn't get round to doing a PoC for Solution 2 but did have a good think about it
  • How best to translate PatchEvents into changes to DataModel objects? JSON Patch is almost too flexible to make changes to the DataModel object directly? E.g. you can't change the value of a DataValueObject, or the DataValue of a SnakObject. So to change a Statement's MainSnak DataValue value, you would have to create a new SnakObject (of the correct subclass) with a new DataValueObject (of the correct subclass) and use the Statement::setMainSnak() method to replace the whole SnakObject. To replace a Qualifier's DataValue value, you would require a similar set of operations, as well as removing the old SnakObject and adding the new SnakObject to the SnakList. Not impossible, but a lot more challenging than Solution 1 or 3.
General notes
  • In the proposal schema, the PUT request JSON body contains a patches field, which is a JSON array of objects, but described as "A list of JSONPatch objects as defined by RFC 6902". This is confusing, ambiguous, and possibly incorrect. The RFC usees the terms JSON Patch document and operation object but never "JSONPatch objects" which could be interpreted as either of those terms. IMO, this property should be named patch and described as "a JSON Patch document as defined by RFC 6902", since a JSON Patch document is "a JSON [RFC4627] document that represents an array of objects".

Reading through https://github.com/wmde/wikibase-rest-api-proposal/blob/master/DECISION-RESEARCH.md#updating-resources I am almost certain "patch object" was an oversight/mental shortcut of "JSON Patch operation objects". And array of these is in essence a "patch document".
Nonetheless I agree that our implementation should use established terms.

Yes, I would have thought so to. Just to be clear, my suggestion wasn't about changing the structure of the field (that looks fine) but the name of the field (patches -> patch) and the description (JSONPatch objects -> JSON Patch document). =)

After some discussion among the devs, we decided that option 2 is probably not the way to go mainly for the reasons already outlined in the "thoughts" in the description. We'll write an ADR explaining the decision further: T315890.

Change 826548 had a related patch set uploaded (by Ollie Shotton; author: Ollie Shotton):

[mediawiki/vendor@master] WIP: Add swaggest/json-diff:3.8.3

https://gerrit.wikimedia.org/r/826548

Change 823610 abandoned by Ollie Shotton:

[mediawiki/extensions/Wikibase@master] [DNM] REST: JSON-Patch Solution 1

Reason:

Only meant as a proof of concept patch.

https://gerrit.wikimedia.org/r/823610