Implement secondary schemas for joining Jade data to other tables
Closed, Resolved (Public)

Description

These secondary schemas are intended to help in using Jade data for filtering RecentChanges.

Event Timeline

See https://phabricator.wikimedia.org/search/query/zFcopzzmzOG./#R for a bunch of commits representing work that had already been done to develop these schemas.

Dropping my notes here on the current state of the secondary integrations.
We will need to refactor them for our new jsonschema for Jade entities and proposals, in particular by handling facets at the application level.

Also note that all of the hooks listed below are currently disabled because of failing tests related to our updated jsonschema.

Right now, there are two link tables that contain summarized data about an entity.
The link tables are currently managed via hooks:

  • DatabaseSchemaHooks
    • LoadExtensionSchemaUpdates
      • Fired when MediaWiki is updated, so that the Jade extension can update the database schema. We add two tables to the database and set a couple of indexes on each table.
      • Table: jade_diff_judgment
      • Table: jade_revision_judgment
  • LinkTableHooks
    • PageContentInsertComplete
      • Occurs after a new article is created. Updates link tables after a new entity page is inserted.
    • ArticleDeleteComplete
      • Removes link when an entity page is deleted.
    • ArticleUndelete
      • Restores link when an entity page is undeleted.
  • LinkSummaryHooks
    • PageContentSaveComplete
      • Occurs after the save page request has been processed. This mostly updates the ‘summary’ data in the link tables.
      • Summary data is the “preferred” proposal data.
  • MoveHooks
    • MovePageIsValidMove
      • Specifies whether a page can be moved for technical reasons. Right now it only checks that pages stay under the Jade namespace.
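
To make the lifecycle above easier to follow, here is a rough Python model of how the hooks keep a link table in sync with entity pages. This is only a sketch, not the extension's PHP code: the `LinkTable` class, its method names, the `"Jade:"` title prefix, and the idea of stashing a deleted row for later restoration are all assumptions made for illustration (the real undelete hook may well rebuild the row from page content instead).

```python
# Hypothetical in-memory model of the link-table lifecycle described above.
# None of these names come from the Jade extension itself.

JADE_NAMESPACE = "Jade"  # assumed namespace prefix for entity pages


class LinkTable:
    """Tracks one summary row per Jade entity page."""

    def __init__(self):
        self.rows = {}      # entity page title -> {"summary": ...}
        self.archived = {}  # rows removed on delete, kept so undelete can restore them

    def on_page_content_insert_complete(self, title):
        # New entity page: create the link row with no summary yet.
        self.rows[title] = {"summary": None}

    def on_page_content_save_complete(self, title, preferred_proposal):
        # Entity page saved: refresh the summary with the preferred proposal data.
        if title in self.rows:
            self.rows[title]["summary"] = preferred_proposal

    def on_article_delete_complete(self, title):
        # Entity page deleted: remove the link row (stashed here for undelete).
        row = self.rows.pop(title, None)
        if row is not None:
            self.archived[title] = row

    def on_article_undelete(self, title):
        # Entity page undeleted: put the link row back.
        row = self.archived.pop(title, None)
        if row is not None:
            self.rows[title] = row


def is_valid_move(old_title, new_title):
    """A page that starts in the Jade namespace must stay in it."""
    prefix = JADE_NAMESPACE + ":"
    if old_title.startswith(prefix):
        return new_title.startswith(prefix)
    return True  # assumption: moves of non-Jade pages are not restricted here
```

For example, calling `on_page_content_insert_complete`, then `on_article_delete_complete`, then `on_article_undelete` for the same title leaves the row back in `rows` with whatever summary it had before deletion.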

Adding notes from the discussion @Halfak and I had earlier today re: SQL tables, recentchanges, etc.

  • SQL tables
    • jade_facet
      • jadef_id UINT
      • jadef_entity_type VARCHAR
      • jadef_name VARCHAR
    • jade_diff_judgment --> jade_diff_label
      • jaded_id --> jadedl_id UINT
      • jaded_revision --> jadedl_rev_id UINT
      • jaded_judgment --> jadedl_page UINT
      • [need to add] --> jadedl_facet_id UINT
      • jaded_damaging --> jadedl_data (tinyint, smallint -- bigint? Let's make estimates for all.)
      • jaded_goodfaith --^
  • From recentchanges
    • Does this change have an editquality label? (Give me all the changes with no editquality label)
    • What is the label for this change? (Give me all the changes that are marked damaging/goodfaith)
    • Given this revision, what's the label?
    • Index: (jadedl_rev_id, jadedl_facet_id) -- Allows us to (1) join to recentchanges and (2) filter by data value
    • Index: (jadedl_facet_id, jadedl_data) -- Allows us to (1) filter the table for specific values and (2) join to something like revision or recentchanges
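
To sanity-check that the proposed columns and indexes can answer the three recentchanges questions, the schema can be mocked up quickly. The sketch below uses Python's `sqlite3` rather than MariaDB, so the column types are only approximations of the UINT/VARCHAR/tinyint choices above. The index names, the sample rows, the facet names, the helper function names, and the reading of `jadedl_page` as a page ID are assumptions; `recentchanges` is cut down to two columns as a stand-in for the real MediaWiki table.

```python
import sqlite3


def build_demo_db():
    """Create the proposed tables in an in-memory SQLite database with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE jade_facet (
        jadef_id INTEGER PRIMARY KEY,
        jadef_entity_type TEXT NOT NULL,
        jadef_name TEXT NOT NULL
    );
    CREATE TABLE jade_diff_label (
        jadedl_id INTEGER PRIMARY KEY,
        jadedl_rev_id INTEGER NOT NULL,
        jadedl_page INTEGER NOT NULL,      -- assumed to hold a page ID
        jadedl_facet_id INTEGER NOT NULL,
        jadedl_data INTEGER NOT NULL
    );
    -- Index names are made up for this sketch.
    CREATE INDEX jadedl_rev_facet ON jade_diff_label (jadedl_rev_id, jadedl_facet_id);
    CREATE INDEX jadedl_facet_data ON jade_diff_label (jadedl_facet_id, jadedl_data);
    -- Stand-in for MediaWiki's recentchanges table, reduced to two columns.
    CREATE TABLE recentchanges (rc_id INTEGER PRIMARY KEY, rc_this_oldid INTEGER);
    """)
    conn.executemany("INSERT INTO jade_facet VALUES (?, ?, ?)",
                     [(1, "diff", "damaging"), (2, "diff", "goodfaith")])
    conn.executemany("INSERT INTO recentchanges VALUES (?, ?)",
                     [(1, 100), (2, 101), (3, 102)])
    conn.executemany(
        "INSERT INTO jade_diff_label "
        "(jadedl_rev_id, jadedl_page, jadedl_facet_id, jadedl_data) VALUES (?, ?, ?, ?)",
        [(100, 7, 1, 1), (100, 7, 2, 0), (101, 7, 1, 0)])
    return conn


def unlabelled_revisions(conn):
    """Changes with no label at all (LEFT JOIN, keep rows without a match)."""
    rows = conn.execute("""
        SELECT rc.rc_this_oldid FROM recentchanges AS rc
        LEFT JOIN jade_diff_label AS l ON l.jadedl_rev_id = rc.rc_this_oldid
        WHERE l.jadedl_id IS NULL
        ORDER BY rc.rc_this_oldid""").fetchall()
    return [r[0] for r in rows]


def revisions_with_label(conn, facet_name, value):
    """Changes whose label for the given facet has the given value."""
    rows = conn.execute("""
        SELECT rc.rc_this_oldid FROM jade_diff_label AS l
        JOIN jade_facet AS f ON f.jadef_id = l.jadedl_facet_id
        JOIN recentchanges AS rc ON rc.rc_this_oldid = l.jadedl_rev_id
        WHERE f.jadef_name = ? AND l.jadedl_data = ?
        ORDER BY rc.rc_this_oldid""", (facet_name, value)).fetchall()
    return [r[0] for r in rows]


def labels_for_revision(conn, rev_id):
    """All (facet name, value) pairs recorded for one revision."""
    return conn.execute("""
        SELECT f.jadef_name, l.jadedl_data FROM jade_diff_label AS l
        JOIN jade_facet AS f ON f.jadef_id = l.jadedl_facet_id
        WHERE l.jadedl_rev_id = ?
        ORDER BY f.jadef_id""", (rev_id,)).fetchall()
```

With the sample rows, `unlabelled_revisions` covers the "no editquality label" question, `revisions_with_label(conn, "damaging", 1)` covers "marked damaging", and `labels_for_revision(conn, 100)` covers the per-revision lookup. Whether MariaDB's planner actually picks the two indexes for these queries would still need checking with `EXPLAIN` on realistic data.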

Also possibly relevant is: https://mariadb.com/kb/en/set-data-type/

So, it looks like these set types are essentially implemented as bitmasks. I'm not sure I understand how they get indexed though. Still digging to work out how that might work.
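
For reference while digging, the bitmask encoding itself is simple to model. The sketch below assumes a hypothetical column declared as `SET('damaging','goodfaith')`; the member list and function names are illustrative only. It mirrors the documented numbering, where the first member is bit value 1, the second is 2, and so on.

```python
# Hypothetical SET('damaging','goodfaith') column, modeled as an integer bitmask.
MEMBERS = ("damaging", "goodfaith")


def encode(values):
    """Turn a collection of member names into the stored integer."""
    mask = 0
    for v in values:
        mask |= 1 << MEMBERS.index(v)
    return mask


def decode(mask):
    """Turn a stored integer back into member names, in declaration order."""
    return [m for i, m in enumerate(MEMBERS) if mask & (1 << i)]


def contains(mask, member):
    """Membership test: a bitwise AND against the member's bit."""
    return bool(mask & (1 << MEMBERS.index(member)))
```

An ordinary index would order rows by the whole stored number, which lines up with exact-value matches (for instance "exactly damaging and goodfaith" is the value 3). A membership test such as `contains` is a bitwise operation on that number rather than a comparison of it, which is probably why the indexing story is less obvious.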

Change 591502 had a related patch set uploaded (by Accraze; owner: Accraze):
[mediawiki/extensions/Jade@master] Add new tables and indexes for secondary schema

https://gerrit.wikimedia.org/r/591502

Pushed up a WIP patch set that, for now, uses a basic ad hoc solution: the jade_diff_label table has tinyint fields for damaging and goodfaith. I still need to write a script to populate the new jade_facets table. I also need to clean up any leftovers from the older tables, which will be handled in the child patch set: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Jade/+/591503

Change 591502 abandoned by Accraze:
Add new tables and indexes for secondary schema

Reason:
squashing into 591503

https://gerrit.wikimedia.org/r/591502

Change 596066 had a related patch set uploaded (by Accraze; owner: Accraze):
[mediawiki/extensions/Jade@master] Remove old sql files

https://gerrit.wikimedia.org/r/596066

Moving to review (finally). Got a patch set ready that cleans up all the older SQL files and updates the tables used in tests.

Change 596066 merged by jenkins-bot:
[mediawiki/extensions/Jade@master] Remove old sql files

https://gerrit.wikimedia.org/r/596066