Define required database schema for Edit Recovery
Closed, Resolved · Public

Description

What do we actually need to store in the database to make the edit recovery functionality work? For example:

  • store edits
  • revision
  • page
  • text

Acceptance Criteria:

  • explore the existing schema to identify where our data might be stored
  • create a ticket to be sent to the data persistence team to request schema changes (Good example ticket) (not required for IndexedDB)
  • write the queries that the DBAs will run and include them in the ticket (not required for IndexedDB)
  • specify the predicted usage of the feature, keeping in mind the expected purge after 90 days

Related Ticket

Event Timeline

Now that we're using IndexedDB to store the data locally in the web browser, a database schema is less of a concern.

However, it's still something that we define, and it looks like the following fields will be stored:

  • Key: page name (not ID)
  • Value: JSON object with the following fields:
    • pageName (unique key; used for lookup)
    • lastModified (UTC timestamp of last change)
    • field_wpTextbox1 (main wikitext textarea)
    • field_wpHeaderTextbox (ProofreadPage textarea)
    • field_wpFooterTextbox (ProofreadPage textarea)
    • field_wpSummary (edit summary, and new-section subject)
    • field_wpWatchthis (watchlist checkbox)
    • field_wpWatchlistExpiry (watchlist expiry time)
    • field_editRevId (hidden field)
    • field_oldid (hidden field)
    • field_parentRevId (hidden field)
    • field_format (hidden field)
    • field_model (hidden field)
    • field_mode (hidden field)

The fields starting with field_ correspond to the names of the HTML edit form fields. Other form fields do not need to be saved: when a page is re-opened for editing, their default or new values are already correct and do not need to be overwritten for edit recovery to work.
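As a rough illustration of the record described above, here is a minimal TypeScript sketch. The names EditRecoveryRecord, SAVED_FIELDS and buildRecord are hypothetical and not taken from the actual implementation, and storing lastModified as milliseconds since the Unix epoch is an assumption (the list above only says "UTC timestamp").

```typescript
// Hypothetical shape of one stored value; not the actual implementation.
interface EditRecoveryRecord {
  pageName: string;     // unique key, used for lookup
  lastModified: number; // assumption: milliseconds since the Unix epoch (UTC)
  [formField: string]: string | number; // the field_* entries
}

// Edit form field names from the list above, without the "field_" prefix.
const SAVED_FIELDS: string[] = [
  "wpTextbox1", "wpHeaderTextbox", "wpFooterTextbox", "wpSummary",
  "wpWatchthis", "wpWatchlistExpiry", "editRevId", "oldid",
  "parentRevId", "format", "model", "mode",
];

// Build a record from a plain name -> value map of the edit form.
// Fields that are absent from the form (for example the ProofreadPage
// textareas on a wiki without that extension) are simply left out.
function buildRecord(
  pageName: string,
  formValues: Record<string, string>,
  now: number = Date.now()
): EditRecoveryRecord {
  const record: EditRecoveryRecord = { pageName: pageName, lastModified: now };
  for (const name of SAVED_FIELDS) {
    const value = formValues[name];
    if (value !== undefined) {
      record["field_" + name] = value;
    }
  }
  return record;
}
```

The resulting object would be what gets written to the browser store under the page name as key.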

Note that there's no user ID field. This is because a) when an anonymous user edits, we want to be able to restore their data even after they've logged in; and b) when a logged-in user logs out (e.g. so that a different user can log in), all data is deleted, so there's never any need to look up a page's data by user.
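The lifecycle described here can be sketched as follows. This is only an illustration: RecoveryStore is a hypothetical in-memory stand-in for the browser store (a plain object rather than IndexedDB), and purgeExpired assumes that the 90-day purge from the acceptance criteria still applies and that lastModified is in milliseconds.

```typescript
// Minimal record shape for this sketch; the real value has more fields.
interface StoredEdit {
  pageName: string;
  lastModified: number; // assumption: milliseconds since the Unix epoch (UTC)
}

const NINETY_DAYS_MS = 90 * 24 * 60 * 60 * 1000;

// Hypothetical in-memory stand-in for the browser-side store.
class RecoveryStore {
  // Null-prototype object so page names like "constructor" are safe keys.
  private records: Record<string, StoredEdit> = Object.create(null);

  // Keyed by page name only: there is deliberately no user ID.
  save(edit: StoredEdit): void {
    this.records[edit.pageName] = edit;
  }

  load(pageName: string): StoredEdit | undefined {
    return this.records[pageName];
  }

  // Called on logout: everything is deleted, whoever saved it.
  clearAll(): void {
    this.records = Object.create(null);
  }

  // Remove records older than 90 days; returns how many were removed.
  purgeExpired(now: number): number {
    let removed = 0;
    for (const key of Object.keys(this.records)) {
      const edit = this.records[key];
      if (edit !== undefined && now - edit.lastModified > NINETY_DAYS_MS) {
        delete this.records[key];
        removed++;
      }
    }
    return removed;
  }

  count(): number {
    return Object.keys(this.records).length;
  }
}
```

Because lookup is by page name alone, data saved while logged out remains available after logging in, and clearAll on logout is what keeps one user's drafts from being shown to the next.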

I'm not sure about the content type fields (field_format and field_model): what do we expect to happen if someone finds themselves restoring into a different content type? This is more or less fine in situations where a page has changed between text types, but maybe it should be avoided.

Also, do we need all of field_editRevId, field_oldid, and field_parentRevId? How should things work when someone is restoring over a previous revision? We need to make sure that edit conflicts work the same as if the restoring user had never left the page.

Samwilson renamed this task from Define required schema changes for auto save functionality to Define required database schema for Edit Recovery. Jul 14 2023, 9:26 AM
Samwilson updated the task description.

We talked about this at CommTech's engineering collab session today, and we think it makes sense to store almost all form fields and also to make it possible for extensions to opt out of having their fields stored (I've created T342200 for this). A few fields don't change and aren't worth storing, such as wpUnicodeCheck and wpAntispam, so these can be specifically excluded.

The fields stored will vary depending on which extensions are enabled on a wiki.
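A minimal sketch of this "store almost everything" approach is shown below. The function selectFieldsToStore and the extensionOptOuts parameter are hypothetical: the actual opt-out mechanism is the subject of T342200 and is not defined here. Only wpUnicodeCheck and wpAntispam are taken from the discussion above.

```typescript
// Fields that never change and are therefore always excluded.
const ALWAYS_EXCLUDED: string[] = ["wpUnicodeCheck", "wpAntispam"];

// Keep every form field except the fixed exclusions and any field an
// extension has opted out of (hypothetical parameter, see T342200).
function selectFieldsToStore(
  formValues: Record<string, string>,
  extensionOptOuts: string[] = []
): Record<string, string> {
  const excluded = ALWAYS_EXCLUDED.concat(extensionOptOuts);
  const stored: Record<string, string> = {};
  for (const name of Object.keys(formValues)) {
    const value = formValues[name];
    if (value !== undefined && excluded.indexOf(name) === -1) {
      // Same "field_" prefix as in the stored record.
      stored["field_" + name] = value;
    }
  }
  return stored;
}
```

With this shape, the set of stored keys follows whatever fields the enabled extensions add to the edit form, without a fixed list to maintain.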