Security review WikibaseLexeme extension
Closed, ResolvedPublic


Project Information

Description of the tool/project

The WikibaseLexeme extension sits on top of MediaWiki-extensions-WikibaseRepository and implements a new entity type that allows storing lexiographical data (information about words in all languages) in a more structured way than the default Wikibase entity types could. See for this new entity type. The target audience is the Wikidata community, together with all existing Wiktionary communities.

The WikibaseLexeme extension provides all backend services required to make the new entity type work with the framework Wikibase provides, as well as a dedicated frontend for the new "Lexeme" namespace.

Description of how the tool will be used at WMF

The WikibaseLexeme extension will be activated on to be able to collect lexicographical data in a new namespace called "Lexeme". It will later be activated on existing Wiktionary wikis (one by one) to be able to use the data there.


Has this project been reviewed before?


Working test environment


The responsible team is Wikidata via @Lydia_Pintscher.

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptFeb 7 2018, 4:54 PM
Lydia_Pintscher updated the task description. (Show Details)
Bawolff claimed this task.Feb 26 2018, 4:50 PM

Just to confirm this is ready for review? The [WIP] in the title of the bug confuses me.

p.s. in regards to

Please link or describe setup process for setting up a test environment. @WMDE-leszek, can you help with this?

Don't worry about that. Its a MW extension, I only need instructions if the setup is super weird.

Bawolff moved this task from Backlog to In Progress on the Security-Reviews board.Mar 1 2018, 10:48 PM
WMDE-leszek updated the task description. (Show Details)Mar 2 2018, 8:24 AM

Removed the point on describing the setup. Sorry, I forgot to do it earlier. Having installed Wikibase, the setup is really just wfLoadExtension, unless one wants to customize stuff.

Embarrassingly, README file in the extension is empty. See adding the first draft.

Re whether this is still WIP, I take the liberty to say it is ready to be reviewed and remove the WIP.
@Lydia_Pintscher please pull the brake if what I say is not valid.

WMDE-leszek renamed this task from [WIP] Security review WikibaseLexeme extension to Security review WikibaseLexeme extension.Mar 2 2018, 8:32 AM

I added the WIP originally. The only open dependency is a separate security review of, which needs a separate Security-Reviews ticket. Everything else is resolved, so this is ready to go from my point of view. :-)

Review of WikidataLexeme & php-vuejs-templating (To be clear, I didn't look at vue.js in general, since that was already reviewed by Darian afaik).

  • "FormIdFormatter.php" line 69 & "SenseIdFormatter.php" line 73 - It looks like this is not properly escaped. However, it also looks like this is just a couple hard coded examples. Kind of confused by this class tbh.
  • "SensesView.php" line 127 - $sense->getId()->getSerialization() - should be html escaped I think.
  • "FormsView.php" line 86 - $this->textProvider->get( 'comma-separator' ) - Please avoid using messages as raw html in new code.
  • [This may be an issue with Wikidata in general] ClickJacking: Since this allows edit interaction directly on wikipage, it should take steps to prevent click jacking. Either javascript should detect when the page is being framed, and refuse to load the editing related js code (Since the editing related code only happens if js is enabled, its safe to detect this condition in JS), or the extension can call OutputPage::preventClickjacking() (Which will totally disables frames altogether for both js and non-js users).

In php-vuejs

Change 416406 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Add missing htmlspecialchars() to SensesView

thiemowmde added a comment.EditedMar 5 2018, 10:33 AM

I uploaded fixes for two issues:

  • FormIdFormatter and SenseIdFormatter are dummy implementations. The escaping is fine. I created T188899: [Task] Remove hard-coded demo data from Lexeme code base to properly deal with these later.
  • comma-separator is a message from core, and intentionally contains HTML (but no wiki syntax). As far as I can see we are using it correctly. Yes, we are aware messages with HTML are problematic, and we avoid it whenever possible.
  • Click-jacking prevention is a general issue for all of Wikidata.
    • MediaWiki does have a global $wgEditPageFrameOptions setting for the "X-Frame-Options" HTTP header that defaults to "DENY". The code using this setting is \OutputPage::getFrameOptions().
    • There are methods that should be called to make sure all pages intentionally set or don't set this HTTP header:
      • \OutputPage::preventClickjacking() and \OutputPage::allowClickjacking().
      • Alternatively, \ParserOutput::preventClickjacking( true ) can be used.
    • TODO: Create an investigation ticket to better understand clickjacking prevention, and how and where Wikibase code needs to utilize it.
thiemowmde moved this task from ready to go to in progress on the Wikidata board.
thiemowmde moved this task from Backlog to Review on the Wikidata-Sprint-2018-02-28 board.

Change 416406 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Add missing htmlspecialchars() to SensesView

thiemowmde triaged this task as Normal priority.Mar 5 2018, 1:48 PM

Thanks a lot @Bawolff for the review. We believe we've addressed all the issues pointed out, as elaborated above by @thiemowmde. Could you please have another look whether we're good now?
As mentioned above, some of the issues were not "fixed". If you think those still need to be addressed, please say so (e.g. if you are convinced this is the time Wikibase in general should solve the click-jacking issue). We would appreciate any other remedy suggestion wherever you have one!

re: FormIdFormatter and SenseIdFormatter - I thought they might later be extended to a real implementation, which is why I was concerned, but as long as its just a dummy implementation that's eventually going away, that's all cool.

re: click-jacking: Yeah, it really is a wikidata issue. It does not block deployment of wikibaselexeme. As long as someone intends to look into the issue, I'm happy.

comma-separator is a message from core, and intentionally contains HTML (but no wiki syntax). As far as I can see we are using it correctly. Yes, we are aware messages with HTML are problematic, and we avoid it whenever possible.

In MediaWiki core, that message is always used as wfMessage( 'comma-separator' )->escaped(); WikibaseLexeme should similarly escape it. (In mediawiki core, the wfMessage()->escaped() format does not double encode entities, so   stays as   after escaping).

So I'd still like to see the comma-separator thing changed (which is very a minor issue), but otherwise this looks good and is good to go.

Change 418715 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Escape HTML in comma-separator message in FormsView

Change 418715 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Escape HTML in comma-separator message in FormsView

Thanks @Bawolff. So the comma-separator escaping happened in Click jacking deserves its own ticket, so I've filed T189491 about it.

Hi @Bawolff, it's me again. With would you be able to claim the security review was done?

Bawolff closed this task as Resolved.Mar 19 2018, 1:59 PM

Yep, you are good.