Gadget: Autosuggest linking Wikidata item after creating an article
Closed, Resolved | Public | 13 Estimated Story Points

Description

Original Wish

Problem: Someone creates an article in the Language A Wikipedia without knowing that the same article already exists in the Language B Wikipedia (often a smaller language version), so no Wikidata link is made. I have seen many such cases between Chinese Wikipedia (zh) and Cantonese Wikipedia (yue).
Proposed solution: After an article is created in Wikipedia, a popup guides the user to link a Wikidata item. The system searches for Wikidata items whose sitelinks have the same or a similar name; if any exist, the results are listed and the user judges whether one is suitable to link. If no existing Wikidata item is suitable, the user can choose to create a new Wikidata item immediately.
Who would benefit: Wikipedia article creators in all languages

Context

This gadget was developed during the Spring Community Tech WMF Hackathon. The requirements below are meant to be abstracted information without being too invasive. We do not want to betray the intricate details of accuracy of the last time someone was active, we just want to provide useful information about their level of recent engagement on the platforms.

Requirements

As a user of this gadget, I am able to:

  • See a list of auto-suggested sitelinks for the newly created article.
  • Save selected Wikidata items related to the newly created article.

Decisions

Possible Further Refinements / Next Steps

  • Indicate a maximum number of desired results from the search
  • Tag edits (in Wikidata) made by our script, in case a senior editor wants to monitor them, or a bot can track them
  • Additional testing to indicate that we are getting a high-precision set of candidate results
    • Especially testing with search results coming from real Wikidata
    • Testing in other projects besides Wikipedia (since this will be offered as a global script)
  • In the documentation page, consider:
    • Adding helpful screenshots
    • Describing Autosuggest's search strategy

Open Questions

  • What if a matching Wikidata item is already linked to a different article?
    • Don't show it to the user at all
    • Show it to the user, but with an indication that the item is already linked
      • Show for informational purposes only; not available for linking
      • Or available for linking (in which case remove the existing link)
  • Run the script for a new page only, or for every publish event?
    • If the former, need to find the best way to detect that it's a new page
  • We should automatically refresh the left sidebar after linking, so "Wikidata item" shows up under Tools
  • Should we just show candidate items that have a label in the user's language or the language of the edited article?
  • Could there be a way to easily undo a link created by mistake?

Details

Reference | Source Branch | Dest Branch | Author | Title
repos/commtech/autosuggest-sitelink!24 | fix-no-translation | main | musikanimal | Prevent erroring out when translation page is missing

Event Timeline

The script currently lives at https://test.wikipedia.org/wiki/User:HMonroy_(WMF)/AutosuggestSitelink.js

To test it, you can add
mw.loader.load('https://test.wikipedia.org/w/index.php?title=User:HMonroy_(WMF)/AutosuggestSitelink.js&action=raw&ctype=text/javascript'); to your user common.js. Here is an example. After the article has been edited, a pop-up will suggest sitelinks from Wikidata if the article doesn't have any related sitelinks yet.

JMcLeod_WMF set the point value for this task to 5. (Nov 2 2022, 7:03 PM)

Hola @HMonroy 😊 Regarding what we discussed last week, here's my UX input:

  • For the pop-up, I think it'd be great if we could have a header/title that has an explanation of the action - like "Link a Wikidata item to this article". I also added an icon for this header/title, the icon's name is link. Also as we mentioned in our meeting I added the close icon too. Both icons are here. See image below as an example:

Group 1 (1).png (194×345 px, 10 KB)

  • Regarding UX copy, I made some tweaks to the "No items found" popup.

When there's no item found, the copy can instead be: "We couldn't find a Wikidata item that matches this article, but you can create one". The "create one" part of the sentence would take the user to creating the item. See image below:

Group 2 (1).png (96×324 px, 8 KB)

Hope this helps! Feel free to reach out if you have any questions :)

Issues for further consideration:

  • Moved into the task description

Run the script for a new page only, or for every publish event?

Could it have a manual button as well? I've been using the WikidataInfo user script for years (there are various forks of it). It shows a link to a page's Wikidata item at the top in the contentSub area, and when there's no item found it shows a link to create a new item (with sitelink prefilled). I feel like it'd be great if this gadget could take its place: instead of just creating an item, it'd search for possible matches. But the main thing is that it'd be able to happen at any time, not just during or after editing/creating a page, and not just for new pages.

JMcLeod_WMF changed the point value for this task from 5 to 13. (Jan 12 2023, 6:26 PM)

@GMikesell-WMF I just want to mention that you are now able to:

The script is initialized with the postEdit hook in MediaWiki. To ease
development, you can set window.AutosuggestSitelinkDebug = true; in your
global.js before the mw.loader.load() line you just added. This will cause
AutosuggestSitelink to show the dialog on page load, rather than requiring you to first make an edit.

This is specified in https://gitlab.wikimedia.org/repos/commtech/autosuggest-sitelink/-/blob/main/README.md.
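The behaviour described in that README excerpt can be sketched roughly as follows. This is a minimal illustration, not the gadget's actual code: `initAutosuggest`, `createFakeMw`, and the `showDialog` callback are hypothetical names, and the fake `mw` object exists only so the sketch runs outside a wiki. Only the `postEdit` hook name and the `AutosuggestSitelinkDebug` flag come from the README text above.

```javascript
// Hypothetical helper: decides when the dialog is shown.
// `mwApi` is expected to look like the global `mw`, `win` like `window`.
function initAutosuggest( mwApi, win, showDialog ) {
    if ( win.AutosuggestSitelinkDebug === true ) {
        // Debug mode: show the dialog right away, without waiting for an edit.
        showDialog();
        return 'debug';
    }
    // Normal mode: wait for the postEdit hook, which fires after a successful save.
    mwApi.hook( 'postEdit' ).add( showDialog );
    return 'postEdit';
}

// Stand-in for the real `mw` object (assumption: only hook().add() is needed here).
function createFakeMw() {
    const handlers = {};
    return {
        hook: function ( name ) {
            handlers[ name ] = handlers[ name ] || [];
            return {
                add: function ( fn ) {
                    handlers[ name ].push( fn );
                },
                fire: function () {
                    handlers[ name ].forEach( function ( fn ) {
                        fn();
                    } );
                }
            };
        }
    };
}
```

On a real wiki the call would presumably be `initAutosuggest( mw, window, showDialog )`, with `showDialog` being whatever function opens the pop-up.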

@HMonroy I've been testing and so far I've come across a couple of possible issues.

Test links:

Only under macOS Ventura 13.0 with Safari, Autosuggest does not pop up, as seen in the screenshots below.

Autosuggest-Browser | macOS: Ventura 13.0 | Windows 11
Safari 16: Fail | T308059_Autosuggest_MacVentura_Safari_Fail.png (518×1 px, 214 KB) | N/A
Chrome 109: Good | T308059_Autosuggest_MacVentura_Chrome.png (838×1 px, 336 KB) | Same as Mac
Firefox 108: Good | Same as Windows | T308059_Autosuggest_Windows_FF.png (868×1 px, 313 KB)
Edge 109: Good | N/A | T308059_Autosuggest_Windows_Edge.png (940×1 px, 333 KB)

@HMonroy Another issue that popped up is that part of the last letter is being cut off in Autosuggest. It doesn't seem to be a character limit since, as you can see in the 1st and 2nd screenshots, only the last letter is affected. From the screenshots above, it also doesn't seem to matter which OS or browser is used; the same thing happens. I'll keep you posted if I come across anything else. Thanks!

Test links:

T308059_Autosuggest_Words Cutoff1.png (952×1 px, 201 KB)

T308059_Autosuggest_Words Cutoff2.png (876×1 px, 183 KB)

T308059_Autosuggest_Wikidata.png (822×1 px, 170 KB)

Only under macOS Ventura 13.0 with Safari, Autosuggest does not pop up (Safari 16: Fail).

@HMonroy I created a new ticket for this, https://phabricator.wikimedia.org/T327950

@HMonroy I also created https://phabricator.wikimedia.org/T327951 for the last letter in the description getting cut off in the popup.

Hello,

I activated the AutosuggestSitelink gadget yesterday:

https://meta.wikimedia.org/wiki/User:M2k~dewiki/global.js

and tried to connect

https://de.wikipedia.org/w/index.php?title=Wilhelm_Christian_von_Schleswig-Holstein-Sonderburg-Wiesenburg&stable=0

to an object (https://www.wikidata.org/wiki/Q75243783 already existed), but got the error message:

''Something went wrong: Unable to parse the messages page [[meta:MediaWiki:Gadget-AutosuggestSitelink-messages/de]]. There may have been a recent change that contains invalid JSON.'' (which I now get after every edit)

Also, in my opinion there should be no (error) message at all when creating a redirect instead of an article.

Are there any plans to activate the AutosuggestSitelink gadget for a broader range of users per default after a test phase or does every single user explicitly have to activate the gadget?

Also see
https://meta.wikimedia.org/wiki/Community_Wishlist_Survey_2023/Wikidata/Popup_to_link_to_or_create_a_new_Wikidata_item_after_creating_an_article

Thanks a lot!

Hello, how does the matching of possible candidate items work? Is it similar to Duplicity?

https://wikidata-todo.toolforge.org/duplicity.php?wiki=dewiki&norand=1&page=Vasco%5FCordeiro

  • Is it based only on equal strings? Is it able to handle different spellings (e.g. homophones)?
  • Or is it also able to do some translations?
  • Is it able to match based on IDs (e.g. VIAF for persons, IMDb for movies, CAS-ID for chemicals, ASN-ID for flight incidents, various country-dependent monument IDs for monuments, SANDRE-ID for rivers in France, ...) to match articles in different languages (if it is able to extract these IDs from the article, i.e. if these IDs are used in templates)?

Thanks a lot!

@M2k_dewiki the messages in de were missing. Please try again :)

Are there any plans to activate the AutosuggestSitelink gadget for a broader range of users per default after a test phase or does every single user explicitly have to activate the gadget?

After the testing phase, the plan is to advertise this gadget in Tech News and leave it up to the communities to add it to their wikis.

Also see
https://meta.wikimedia.org/wiki/Community_Wishlist_Survey_2023/Wikidata/Popup_to_link_to_or_create_a_new_Wikidata_item_after_creating_an_article

Do you think this gadget covers the solution for this proposal? Are you okay with archiving it?

Thanks a lot!

Thank you!!

Hello HMonroy,

Do you think this gadget covers the solution for this proposal?

Due to the error message

''Something went wrong: Unable to parse the messages page [[meta:MediaWiki:Gadget-AutosuggestSitelink-messages/de]]. There may have been a recent change that contains invalid JSON.'' (which I now get after every edit)

I have not yet been able to test the gadget, for example to see if the suggested objects are useful matches.

In addition, I would like to understand the matching algorithm used for the suggested objects. For example:

  • Is it based only on equal strings? Is it able to handle different spellings (e.g. homophones) / soundex / approximate search?
  • Or is it also able to do some translations for different languages and alphabets?
  • Is it able to match based on IDs (e.g. VIAF for persons, IMDb for movies, CAS-ID for chemicals, ASN-ID for flight incidents, various country-dependent monument IDs for monuments, SANDRE-ID for rivers in France, ...) to match articles in different languages (if it is able to extract these IDs from the article, i.e. if these IDs are used in templates)?

Is it possible to activate this tool for *all* users of a language version after the testing phase? If users are not aware of the existence of this tool and have to activate it on a per-user basis, the problem is the same as now, i.e. that a lot of users are not aware of the existence of Wikidata and the possibility to add interwiki links.

Is it also possible to create a new object (maybe already filled with some useful values taken from the article) if no suggested object matches?

Due to the error message

''Something went wrong: Unable to parse the messages page [[meta:MediaWiki:Gadget-AutosuggestSitelink-messages/de]]. There may have been a recent change that contains invalid JSON.'' (which I now get after every edit)

This should be fixed now for German (we'll get a proper bug fix in soon). Some translations are still missing, though. You can add them at https://meta.wikimedia.org/wiki/MediaWiki:Gadget-AutosuggestSitelink-messages
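One way to avoid this class of error is to fall back to default messages whenever a translation page is missing or contains invalid JSON. The sketch below is only an illustration of that idea and is not the fix that was merged: `parseMessages`, `FALLBACK_MESSAGES`, and the `title` message key are hypothetical, since the real message keys are not shown in this thread.

```javascript
// Hypothetical English defaults; the gadget's real message keys may differ.
const FALLBACK_MESSAGES = {
    title: 'Link a Wikidata item to this article'
};

// Returns a message map, never throwing on missing or malformed input.
function parseMessages( rawText, fallback ) {
    if ( typeof rawText !== 'string' || rawText.trim() === '' ) {
        // Translation page missing or empty: use the defaults.
        return Object.assign( {}, fallback );
    }
    try {
        const parsed = JSON.parse( rawText );
        if ( parsed === null || typeof parsed !== 'object' || Array.isArray( parsed ) ) {
            // Valid JSON, but not a key/value map.
            return Object.assign( {}, fallback );
        }
        // Translated keys override defaults; untranslated keys keep the default text.
        return Object.assign( {}, fallback, parsed );
    } catch ( e ) {
        // Invalid JSON: degrade to the defaults instead of erroring out.
        return Object.assign( {}, fallback );
    }
}
```

With this approach a partially translated page still works, because any key missing from the translation keeps its default text.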

In addition, I would like to understand the used matching algorithm for the suggested objects. For example

  • Is it only based on equal strings? Is it able to handle different spellings (e.g. equal sounding homophones) / soundex / approximative search ?

…

It is only exact matches at this time. Your ideas sound great and we'll brainstorm how to get the algorithm to handle those.

Is it possible to activate this tool for *all* users of a language version after the testing phase? If users are not aware of the existence of this tool and have to activate it on a per-user basis, the problem is the same as now, i.e. that a lot of users are not aware of the existence of Wikidata and the possibility to add interwiki links.

Once we've got all the bug fixes in, and consider the tool stable, we'll formally recommend installing as a gadget. From there, your wiki can decide for itself whether it should be default-on or off.

Is it also possible to create a new object (maybe already filled with some useful values taken from the article) if no suggested object matches?

Yes, if no item is found you should see a notification with a link to create an item, with the title and wiki pre-filled.

It is only exact matches at this time. Your ideas sound great and we'll brainstorm how to get the algorithm to handle those.

Some soundex / approximate string matching might be already implemented in "duplicity", so some code might be reused from this tool:

https://wikidata-todo.toolforge.org/duplicity.php?wiki=dewiki&norand=1&page=Vasco%5FCordeiro

https://en.wikipedia.org/wiki/Approximate_string_matching
https://en.wikipedia.org/wiki/Metaphone
https://en.wikipedia.org/wiki/Soundex
https://en.wikipedia.org/wiki/Levenshtein_distance
https://en.wikipedia.org/wiki/Agrep
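As a rough illustration of what approximate matching could look like, here is a small Levenshtein-distance sketch. It is not taken from Duplicity or from the gadget (which, per the reply above, only does exact matches at this time); `levenshtein`, `isCloseMatch`, and the `maxDistance` threshold are hypothetical names chosen for this example.

```javascript
// Classic dynamic-programming edit distance, keeping only two rows in memory.
function levenshtein( a, b ) {
    let prev = [];
    for ( let j = 0; j <= b.length; j++ ) {
        prev.push( j );
    }
    for ( let i = 1; i <= a.length; i++ ) {
        const curr = [ i ];
        for ( let j = 1; j <= b.length; j++ ) {
            const cost = a[ i - 1 ] === b[ j - 1 ] ? 0 : 1;
            curr.push( Math.min(
                prev[ j ] + 1, // deletion
                curr[ j - 1 ] + 1, // insertion
                prev[ j - 1 ] + cost // substitution
            ) );
        }
        prev = curr;
    }
    return prev[ b.length ];
}

// Case-insensitive comparison of an article title against a candidate label.
function isCloseMatch( title, label, maxDistance ) {
    return levenshtein( title.toLowerCase(), label.toLowerCase() ) <= maxDistance;
}
```

For example, `isCloseMatch( 'Meyer', 'Meier', 1 )` would accept the two spellings as candidates, while a threshold of 0 reduces to a case-insensitive exact match. Phonetic schemes such as Soundex or Metaphone would need a separate, language-aware implementation.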

After reactivating AutosuggestSitelink in

https://meta.wikimedia.org/wiki/User:M2k~dewiki/global.js

I do not get a pop-up at all, but the following error message in the Google Chrome Developer Tools:

Uncaught Error: module already implemented: ext.gadget.AutosuggestSitelink
   at Object.implement (load.php?lang=de&modules=startup&only=scripts&raw=1&skin=vector:18:142)
   at load.php?modules=ext.gadget.AutosuggestSitelink:1:11
implement @ load.php?lang=de&modules=startup&only=scripts&raw=1&skin=vector:18
(anonym) @ load.php?modules=ext.gadget.AutosuggestSitelink:1

Do you have the gadget turned on in your preferences? There's no need for that, since you're already loading it via global.js. AutosuggestSitelink should still work despite the error, though.

Currently I don't get a pop-up at all, regardless of the setting (on / off) in the preferences.

Did you make any edits? It is currently only shown after making mainspace edits, but we plan to have a link or something so you can use it at any time (T329335).

Did you make any edits? It is currently only shown after making mainspace edits

Yes, I made some edits to articles, which have not been connected to a wikidata object, but no pop-up was displayed.

I'm able to see an error: Uncaught Error: module already implemented: ext.gadget.AutosuggestSitelink. I created T329578 to track this work.

@M2k_dewiki @Lydia_Pintscher Please update your global.js and try again. Your global.js should now be:

// If the gadget is registered on this wiki, load it as a local module.
if ( mw.loader.getState( 'ext.gadget.AutosuggestSitelink' ) !== null ) {
    mw.loader.load( 'ext.gadget.AutosuggestSitelink' );
} else {
    // Otherwise, load it from Meta-Wiki.
    mw.loader.load( 'https://meta.wikimedia.org/w/load.php?modules=ext.gadget.AutosuggestSitelink' );
}

Thank you for your patience!

Hello, when clicking on "Create new item" the label should not contain underscores ("_") but spaces (" ") between words, i.e. instead of

https://www.wikidata.org/w/index.php?title=Q116780799&oldid=1834069989

(see the German label) it should be

https://www.wikidata.org/w/index.php?title=Q116780799&oldid=1834070122

Thanks!
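For illustration, the conversion requested here could be as small as the following sketch. The function name `titleToLabel` is hypothetical, and it assumes the label is derived from a page name in the underscore form (as returned by `mw.config.get( 'wgPageName' )`, for instance); how the gadget actually obtains the title is not shown in this thread.

```javascript
// Turn a page name in URL/database form into a human-readable label.
function titleToLabel( pageName ) {
    // Replace every underscore with a space and drop stray outer whitespace.
    return pageName.replace( /_/g, ' ' ).trim();
}
```

Applied to `Wilhelm_Christian_von_Schleswig-Holstein-Sonderburg-Wiesenburg`, this keeps the hyphens and only replaces the underscores.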

The tool was not able to find the existing object

https://www.wikidata.org/wiki/Q112519181

for article

https://de.wikipedia.org/wiki/Harald_Lange_(Sportwissenschaftler)

since it did not have a German label, but only an English label. In my opinion, this object should be found as well by the autosuggest gadget.
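One possible way to catch items like this is to query more than one language. The sketch below only builds request URLs for the Wikibase `wbsearchentities` API module, one per language, with a cap on the number of results; it is an assumption that the gadget uses this module at all, and `buildSearchUrls` and its parameters are hypothetical names for this example.

```javascript
// Build one wbsearchentities request URL per language to search in.
function buildSearchUrls( title, languages, limit ) {
    return languages.map( function ( lang ) {
        const params = new URLSearchParams( {
            action: 'wbsearchentities',
            search: title,
            language: lang, // language of the labels and aliases to search
            type: 'item',
            limit: String( limit ), // maximum number of candidates to return
            format: 'json',
            origin: '*' // allows anonymous cross-origin requests
        } );
        return 'https://www.wikidata.org/w/api.php?' + params.toString();
    } );
}
```

For the article above, `buildSearchUrls( 'Harald Lange (Sportwissenschaftler)', [ 'de', 'en' ], 5 )` yields two URLs. Whether the English label would actually match also depends on the search string, since the disambiguation suffix in parentheses is unlikely to be part of the label.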

Hello,

with article

https://de.wikipedia.org/wiki/Moritz_Hartlieb_von_Wallthor

the AutoSuggest-Gadget suggested only a single object

https://www.wikidata.org/w/index.php?title=Q23542343&action=history

which has already been connected to article

https://de.wikipedia.org/wiki/Wladimir_von_Hartlieb

Should the AutoSuggest-Gadget exclude objects that are already connected to other articles in the same language?

Or at least give a warning that the connection has been changed and the other article has been disconnected?
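Either option comes down to checking whether a candidate item already has a sitelink for the current wiki. The sketch below separates candidates into linkable and already-linked groups; it assumes entity data in the shape returned by the `wbgetentities` API (a `sitelinks` map keyed by site ID such as `dewiki`), and `splitByExistingSitelink` is a hypothetical name, not a function of the gadget.

```javascript
// Separate candidate items by whether they already link to a page on this wiki.
function splitByExistingSitelink( entities, siteId ) {
    const linkable = [];
    const alreadyLinked = [];
    entities.forEach( function ( entity ) {
        const link = entity.sitelinks && entity.sitelinks[ siteId ];
        if ( link ) {
            // Keep the linked title so the UI can warn the user or hide the item.
            alreadyLinked.push( { id: entity.id, linkedTitle: link.title } );
        } else {
            linkable.push( entity.id );
        }
    } );
    return { linkable: linkable, alreadyLinked: alreadyLinked };
}
```

The caller can then decide whether the `alreadyLinked` group is hidden entirely, shown for information only, or shown with a warning that linking would disconnect the other article.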