Revision summaries use outdated form links
Closed, Resolved, Public

Description

T195477 changed how anchors for forms on lexeme pages are structured: they no longer contain the lexeme part.
The lexeme page history shows all revisions of that page, including the change summary. If a change added a statement to a lexeme form as a property, a link to that form is rendered in the summary.

Due to the change of format there seem to be history entries which still contain links using the old anchor format.

Acceptance criteria

  • the lexeme page history links to the forms using the correct anchor format

Info

  • I had assumed this was ruled out, because I thought the summaries were only converted to a human-readable format when displayed

Bildschirmfoto von 2018-10-31 14-57-00.png (943×1 px, 160 KB)

Event Timeline

It seems this is only a problem with historic entries. New ones are fine-ish: https://test.wikidata.org/w/index.php?title=Lexeme:L42&action=history

It turns out the summary comment is stored directly in the database as wikitext. Thus it does not seem possible to fix the old entries safely.

The screenshots are from a system reproducing the problem:

Screenshot_2018-11-07 phpmyadmin mw localhost 8080 db-master default revision phpMyAdmin 4 8 3.png (98×433 px, 15 KB)

Screenshot_2018-11-07 phpmyadmin mw localhost 8080 db-master default revision phpMyAdmin 4 8 3(1).png (29×335 px, 2 KB)

I actually noticed a different problem on L42’s history on testwikidata:

(‎Created claim: lexeme form of choice (P77771): merge target/merge source/foo (L103)) (undo)
(‎Created claim: lexeme form of choice (P77771): MediaWikis (L41-F1)) (undo) (restore)

In the older edit (below), the link [[L41#L41-F1]] is rendered as a link to the L41-F1 entity, using the representation of that form (MediaWikis – plural: the lexeme’s lemma is MediaWiki, singular). In the newer edit, however (above), the link [[L103#F1]] is rendered as a link to the L103 entity, using the lemmas of that lexeme, instead of the representations of its first form (which would be just merge sorce). So that’s two related things:

  • The component which prettifies entity links in edit summaries hasn’t been updated for the new form anchors yet, apparently.
  • Could that component also be used to update the anchor from e. g. #L41-F1 to #F1, solving this issue without requiring any database modifications?

Unfortunately, I don’t remember where that component actually is right now, and I can’t find anything relevant in WikibaseLexeme’s wiring files.

Results of the investigation so far:

The relevant pattern is in WikibaseLexeme/src/Domain/Model/FormId.php:

const PATTERN = '/^L[1-9]\d*-F[1-9]\d*\z/';

However, simply removing the first part of the pattern breaks \Wikibase\Lexeme\Domain\Model\LexemeSubEntityId::extractLexemeIdAndSubEntityId, which relies on the two-part form.
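
To make the dependency concrete, here is a minimal Python sketch (the extension itself is PHP). FORM_ID_PATTERN mirrors the constant quoted above; extract_lexeme_and_form_part is a hypothetical stand-in for extractLexemeIdAndSubEntityId, whose real implementation is not shown in this task.

```python
import re

# Python counterpart of the PHP constant quoted above.
# fullmatch() takes the place of the ^...\z anchors; re.ASCII keeps \d to 0-9.
FORM_ID_PATTERN = re.compile(r"L[1-9]\d*-F[1-9]\d*", re.ASCII)


def extract_lexeme_and_form_part(form_id: str) -> tuple[str, str]:
    """Hypothetical helper: split 'L41-F1' into ('L41', 'F1')."""
    if FORM_ID_PATTERN.fullmatch(form_id) is None:
        raise ValueError(f"not a form ID: {form_id!r}")
    # The split relies on the ID having exactly two parts joined by '-'.
    lexeme_part, form_part = form_id.split("-", 1)
    return lexeme_part, form_part


print(extract_lexeme_and_form_part("L41-F1"))  # ('L41', 'F1')
print(FORM_ID_PATTERN.fullmatch("F1"))         # None
```

If the pattern were shortened so that a bare F1 passed validation, the split on "-" would return a single part and the two-name unpacking would fail, which is roughly the kind of breakage described above.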

I looked a bit into this as well, and I think LexemeHandler::getIdForTitle needs to be updated. It currently tries to “parse” the fragment as an ID; in addition, it should also attempt to parse text + fragment as an ID.
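
A rough Python sketch of that two-step lookup, under stated assumptions: form_id_for_title is a hypothetical name, not the real LexemeHandler::getIdForTitle, and the page text is assumed to be the bare lexeme ID (for example L103) without a namespace prefix.

```python
import re
from typing import Optional

FORM_ID = re.compile(r"L[1-9]\d*-F[1-9]\d*", re.ASCII)


def form_id_for_title(text: str, fragment: str) -> Optional[str]:
    """Hypothetical helper: which form does a link like [[L41#L41-F1]] or [[L103#F1]] refer to?"""
    # Old-style links: the fragment alone is already a complete form ID.
    if FORM_ID.fullmatch(fragment):
        return fragment
    # New-style links: only page text plus fragment gives a complete form ID.
    combined = f"{text}-{fragment}"
    if FORM_ID.fullmatch(combined):
        return combined
    # Neither reading produced a form ID.
    return None


print(form_id_for_title("L41", "L41-F1"))  # L41-F1
print(form_id_for_title("L103", "F1"))     # L103-F1
```

Trying the fragment first keeps old summaries working, while the combined attempt covers links written after the anchor change.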

Change 472953 had a related patch set uploaded (by Michael Große; owner: Michael Große):
[mediawiki/extensions/WikibaseLexeme@master] Create lexeme id from fragment

https://gerrit.wikimedia.org/r/472953

Addshore triaged this task as Medium priority. Nov 12 2018, 10:59 AM

Change 472953 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Create lexeme id from fragment

https://gerrit.wikimedia.org/r/472953

  • The component which prettifies entity links in edit summaries hasn’t been updated for the new form anchors yet, apparently.
  • Could that component also be used to update the anchor from e. g. #L41-F1 to #F1, solving this issue without requiring any database modifications?

The first part of this is fixed with the above change; I’ll try to do the second part now.

Change 473586 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Don’t detect fragments of form IDs of other lexemes

https://gerrit.wikimedia.org/r/473586

Change 473585 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Fix the fragment of old-style links to subentities

https://gerrit.wikimedia.org/r/473585

Change 473587 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Fix the fragment of old-style links to forms

https://gerrit.wikimedia.org/r/473587

Alright, I’ve uploaded one possible approach:

It’s not the nicest thing in the world, but I think it’s acceptable. Alternative suggestions are welcome, though.
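
The patches themselves are not reproduced in this task. Purely as an illustration of the idea of rewriting an old-style fragment at render time, without touching the database, a Python sketch could look like the following. normalize_form_fragment is a hypothetical name, and the rule that a fragment naming a different lexeme is left alone is an assumption based on the patch titles.

```python
import re

OLD_STYLE_FRAGMENT = re.compile(r"(L[1-9]\d*)-(F[1-9]\d*)", re.ASCII)


def normalize_form_fragment(page: str, fragment: str) -> str:
    """Hypothetical helper: turn an old-style fragment (L41-F1) into the new style (F1)."""
    match = OLD_STYLE_FRAGMENT.fullmatch(fragment)
    # Leave anything that is not an old-style form fragment untouched, and
    # (assumption) do not rewrite fragments that name a different lexeme than the page.
    if match is None or match.group(1) != page:
        return fragment
    return match.group(2)


print(normalize_form_fragment("L41", "L41-F1"))   # F1
print(normalize_form_fragment("L103", "L41-F1"))  # L41-F1
```

Because the stored wikitext stays as it is, old history entries keep their original summaries and only the rendered link target changes.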

Related but independent is another fix in WikibaseLexeme.

Change 473586 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Don’t detect fragments of form IDs of other lexemes

https://gerrit.wikimedia.org/r/473586

Change 473587 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Fix the fragment of old-style links to forms

https://gerrit.wikimedia.org/r/473587

Change 473585 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Fix the fragment of old-style links to subentities

https://gerrit.wikimedia.org/r/473585