Page MenuHomePhabricator

ApprovedRevs: indexing current state
Open, HighPublic

Description

Tested in:

  • MediaWiki 1.27.7
  • Semantic MediaWiki 2.5.7
  • PHP: 7.3.7-1+ubuntu16.04.1+deb.sury.org+1 (and PHP: 5.6-1+ubuntu16.04.1+deb.sury.org+1)

Steps to reproduce:
With SMW:

  1. Create a new page named 'Page1' with contents {{#set:test=really?}} and save it. Don't approve!
  2. Visit special:browse/Page1 and you'll see that the properties are set, even though the page was not approved.

Without SMW:

  1. Create a new page named Page2 with contents: [[Category::WhatElse]] and save it. Don't approve!
  2. Visit Category:WhatElse and see Page2 is in the list of pages, even though the page was not approved.

Debug information

  1. WikiPage::doEditUpdates is doing defferedUpdates of the latest revision, and not the approved revision.
  2. When extensions make use Revision::newFromTitle(), the function will always return the current revision of the page, even if it's not approved.

Event Timeline

YOUR1 created this task.Jul 11 2019, 4:44 PM
Restricted Application added subscribers: Liuxinyu970226, Aklapper. · View Herald TranscriptJul 11 2019, 4:44 PM
YOUR1 updated the task description. (Show Details)Jul 12 2019, 6:36 AM
YOUR1 updated the task description. (Show Details)
YOUR1 renamed this task from Broken when using SMW to ApprovedRevs: indexing current state .Jul 12 2019, 9:47 AM
YOUR1 updated the task description. (Show Details)Jul 12 2019, 11:13 AM
Rcdeboer claimed this task.Jul 12 2019, 1:49 PM
Rcdeboer added subscribers: Yaron_Koren, Rcdeboer.

Confirmed. I'm working on a patch.

YOUR1 awarded a token.Jul 12 2019, 1:51 PM

That's great to hear!

Rcdeboer reassigned this task from Rcdeboer to Yaron_Koren.Jul 12 2019, 3:31 PM

So, I have what I think to be a start of a potential fix. I cannot seem to get it completely working, though. Perhaps you can have a look, @Yaron_Koren, since you're way more familiar with the code than I am.

First, the root cause of this behavior. This appears to be twofold:
(a) MediaWiki uses DeferredUpdates to execute LinksUpdates in WikiPage::doEditContents. Because of this, the ApprovedRevs LinkUpdates (from ApprovedRevsHooks::updateLinksAfterEdit) are excuted BEFORE the regular MW LinksUpdates are (deferred) executed. MediaWiki uses the content of the latest revision, so the wiki links (categories!) are updated with the content of the non-approved, latest version. Confirmed this behavior by switching off the deferred updates and directly executing LinksUpdate::doUpdate in WikiPage::doEditContents. resulting in correct behavior w.r.t. setting categories.
(b) SMW has its own code to retrieve a revision. By doing so, it retrieves the latest - and not the approved - revision. Therefore, semantic properties show the latest, and not the approved, values.

My approach to a solution: tap into two different hooks, so that
(a) the LinksUpdates constructed by MediaWiki (with the 'wrong', latest revision content) are replaced by LinksUpdates constructed by ApprovedRevisions (with the 'right', approved revision content)
(b) the parser output for a WikiPage always contains the approved content.

For (a), I switched the hook ArticleEditUpdates to SecondaryDataUpdates. I adapted the code of ApprovedRevsHooks::updateLinksAfterEdit as follows:

        /**
	 * Hook: SecondaryDataUpdates
	 *
	 * Call LinksUpdate only on the text of this page's approved revision,
	 * if there is one.
	 */
         static public function updateLinksAfterEdit( WikiPage &$wikiPage, &$editInfo, $changed ) {
		$wikiPage = WikiPage::newFromID( $title->getArticleID() );
		if ( !ApprovedRevs::pageIsApprovable( $title ) ) {
			return true;
		}
		// If this user's revisions get approved automatically, exit
		// now, because this will be the approved revision anyway.
		$userID = $wikiPage->getUser();
		$user = User::newFromId( $userID );
		if ( self::userRevsApprovedAutomatically( $user, $title ) ) {
			return true;
		}

		$content = ApprovedRevs::getApprovedContent( $title );

		// If there's no approved revision, and 'blank if
		// unapproved' is set to true, set the content to blank.
		if ( $content == null )
		{
			global $egApprovedRevsBlankIfUnapproved;
			if ( $egApprovedRevsBlankIfUnapproved ) {
				$content = $wikiPage->getContentHandler()->makeEmptyContent();
			} else {
				// If it's an unapproved page and there's no
				// page blanking, exit here.
				return true;
			}
		}
		$editInfo = $wikiPage->prepareContentForEdit( $content );
		$u = new LinksUpdate( $title, $editInfo->output );
		$updates = array($u);

		return true;
      }

For (b), I switched the hook ArticleFromTitle to ContentGetParserOutput. I adapted the code of ApprovedRevsHooks::showApprovedRevisionas follows:

        static function showApprovedRevision( $content, $title, $revId, $options, $generateHtml, &$output ) {
		global $wgParser, $wgRequest;

                /*		
               $request = $wgRequest;

		if ( ! ApprovedRevs::isDefaultPageRequest( $request ) ) {
			return true;
		}*/
		
		if ( ! ApprovedRevs::pageIsApprovable( $title ) ) {
			return true;
		}

		$revisionID = ApprovedRevs::getApprovedRevID( $title );
		$approvedContent = ApprovedRevs::getApprovedContent( $title );
		$text = ContentHandler::getContentText( $approvedContent );
		$output = $wgParser->parse( $text, $title, $options, true, true, $revisionID );
		
		return false;// Tell MW we've parsed the page, and not to mess with it.
        }

This works reasonably well. Issues I still see are:
(1) When I enable the ApprovedRevs::isDefaultPageRequest block, the fix ceases to work, and unapproved values show up again in SMW properties. Without that block, it is not possible to display historical revisions: whatever you select, the approved revision is shown.
(2) After approving a page, the confirmation page shows the previous approved revision, not the newly approved one.
(3) I have encountered lingering semantic property values, that showed the values of no-longer-present semantic properties. Not sure why, or how to reproduce. Perhaps when you clear a page of all its semantic properties and then approve that revision?

Not quite there yet...

Rcdeboer triaged this task as High priority.Jul 12 2019, 3:33 PM

Triaging as high, since this issue directly affects core ApprovedRevs functionality.

Great. I didn't know about the SecondaryDataUpdates hook! I'm reading about it now - it looks like it was deprecated in March 2018, in favor of the RevisionDataUpdates hook:

https://phabricator.wikimedia.org/rMW7960d5385f1146459671148e2e34d3ba50e17aeb

https://www.mediawiki.org/wiki/Manual:Hooks/RevisionDataUpdates

So ideally the code will support both hooks, depending on the MediaWiki version. The easiest way to do that may be to register these functions not in ApprovedRevs.php or extension.json, but in ApprovedRevsHooks::registerExtension(), which is at /includes/ApprovedRevs.hooks.php. That way, you can put in "if" call in to register with one hook or the other. Thankfully, there was a class added at the same time that this new hook was added, so you can probably add code like the following to that function:

global $wgHooks;

if ( class_exists( 'MediaWiki\Revision\SlotRenderingProvider' ) ) {
    // MW 1.32+
    $wgHooks['RevisionDataUpdates'] = ....
} else {
    // MW 1.25 - 1.31
    $wgHooks['SecondaryDataUpdates'] = ...
}

Hopefully this is only a little bit a hack. :)

You could in theory put in support for the old ArticleEditUpdates hook in that same "if" statement -Approved Revs still currently supports MediaWiki versions going back all the way to 1.23. But I think it's fine to drop support for MW < 1.25 at this point. (That means that switching to the ContentGetParserOutput hook can also be done without any problems, since it was added in MW 1.24.)

YOUR1 added a comment.Wed, Aug 7, 1:43 PM

@Yaron_Koren are you planning to update ApprovedRevs? As of now its kinda broken..

I'm still waiting for @Rcdeboer's full patch. Or is there something I can do already?

YOUR1 added a comment.Wed, Aug 7, 1:54 PM

There are still issues with the patch, which we couldn't solve:

  • When I enable the ApprovedRevs::isDefaultPageRequest block, the fix ceases to work, and unapproved values show up again in SMW properties. Without that block, it is not possible to display historical revisions: whatever you select, the approved revision is shown.
  • After approving a page, the confirmation page shows the previous approved revision, not the newly approved one.
  • I have encountered lingering semantic property values, that showed the values of no-longer-present semantic properties. Not sure why, or how to reproduce. Perhaps when you clear a page of all its semantic properties and then approve that revision?

@Rcdeboer - what's the status of the patch? Are you still working on it?

YOUR1 added a comment.EditedTue, Aug 13, 6:21 AM

@Rcdeboer - what's the status of the patch? Are you still working on it?

Yaron, @Rcdeboer is a colleague of mine, and as mentioned before, there are still some issues we couldn't solve. So we are currently not working on a patch.

Oh, I didn't realize (or forgot) that you work together, and that "we" thus included him. I guess I'll have to look into this, then...

I don't know if that helps but DynamicPageList3 gets the right version
Maybe it has something to do with DPL being pulling data instead of the source page pushing it
As a user I would not mind if I had to start some kind of push job to get the correct version, instead of the refresh being made automatically at saving

There's no difference in terms of pull/push between DPL and the rest - the only difference is that Cargo and SMW are querying their own database tables, while DPL queries core MediaWiki database tables.

Okay, I just checked in a modified version of this patch, at 8c75f7b5aad5 . Sorry, @Rcdeboer - I forgot to credit you when I checked in the code. But I definitely thank you. This code is working a lot better for me - if there are still any issues, with SMW data or anything else, maybe it should go into a separate bug/task.