IMPORTANT: Make sure to read the [GSoC participant instructions](https://www.mediawiki.org/wiki/Google_Summer_of_Code/Participants) and [communication guidelines](https://www.mediawiki.org/wiki/New_Developers/Communication_tips) thoroughly before commenting on this task. This space is for project-specific questions, so avoid asking questions about getting started, setting up Gerrit, etc. When in doubt, ask your question on [Zulip](https://www.mediawiki.org/wiki/Outreach_programs/Zulip) first!
===Brief summary
Open data collections such as Wikidata are created and maintained by volunteers and thrive on community knowledge. Missing or new information must be added by community members; otherwise the knowledge base will stagnate and become obsolete over time. Hence, completing a knowledge base is a crucial task within any community-driven initiative. Consequently, helping people to integrate their knowledge into the knowledge base benefits the growth, correctness, and topicality of Wikidata.
In our earlier research [1], we showed that it is possible to find outliers in graph-based knowledge bases that need to be checked by experts to ensure data quality. Our recent implementation of a tool called [[ https://wikidatacomplete.org/ | Wikidatacomplete ]] (cf. [2]) shows how facts are extracted from text and offered to users for validation. After this step, the validated fact is pushed to Wikidata for later integration. Hence, identifying what is needed to complete Wikidata is possible and has already proven its value.
We propose to implement a Wikibase plugin dedicated to facilitating the Wikidata completion process. While the user navigates through Wikidata, the plugin will show facts that were extracted from textual sources and from other knowledge bases (e.g., Wikipedia) and that need to be validated. In this way, the plugin suggests facts that should be added or changed in the Wikidata knowledge base. Previously developed services will be used to compute the suggestions.
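
One piece of such a plugin can be sketched as a filter that drops suggestions already present on the entity being viewed. The sketch below is only an illustration under assumptions: the suggestion shape (`property`, `value`) and the function names are hypothetical and not taken from the Wikidatacomplete API, while the `claims` layout follows the Wikibase entity JSON format.

```javascript
// Hypothetical suggestion shape: { property: "P106", value: "Q36180" }.
// `claims` maps a property ID to an array of statements (Wikibase entity JSON).

// Reduce a statement to a comparable value, or null if it carries no value.
function statementValue(statement) {
  const snak = statement.mainsnak;
  if (!snak || snak.snaktype !== "value") return null;
  const v = snak.datavalue.value;
  if (typeof v === "string") return v; // plain strings, external IDs
  if (v.id) return v.id;               // item or property references
  return JSON.stringify(v);            // fallback for other value types
}

// Keep only suggestions whose value is not yet stated for that property.
function newSuggestions(claims, suggestions) {
  return suggestions.filter((s) => {
    const existing = (claims[s.property] || []).map(statementValue);
    return !existing.includes(s.value);
  });
}
```

A real plugin would likely also need to compare qualifiers, ranks, and value types such as dates or quantities, which this sketch ignores.
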
Additionally, a badge web service interface needs to be integrated so that users can embed their badges into their profiles (e.g., on their Wikidata user page, GitHub profile, or LinkedIn profile) to show their dedication and motivate other users to contribute, too. A rule-based system for earning badges needs to be implemented.
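
A rule-based badge system could, as a minimal sketch, be a list of named predicates evaluated against a user's contribution statistics. The badge names, thresholds, and the `stats` fields below are placeholders invented for illustration; the actual rules are part of the project work.

```javascript
// Hypothetical rules: each rule names a badge and a predicate over user stats.
const BADGE_RULES = [
  { badge: "First Fact", test: (s) => s.validatedFacts >= 1 },
  { badge: "Contributor", test: (s) => s.validatedFacts >= 50 },
  { badge: "Explorer", test: (s) => s.distinctEntities >= 10 },
];

// Return the names of all badges whose rule is satisfied by `stats`.
function earnedBadges(stats, rules = BADGE_RULES) {
  return rules.filter((r) => r.test(stats)).map((r) => r.badge);
}
```

Keeping the rules as data rather than hard-coded conditions would let new badges be added without touching the evaluation logic.
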
===Skills required
- JavaScript
- basic knowledge of PHP might be useful
===Mentors
@DD063520 @AnBo-de @Gabinguo @Aleksandr.perevalov
===Microtasks
- Get familiar with data structures available in Wikidata
- Select 3 Wikidata entities and manually find missing facts based on external data sources
- Understand the [[ https://wikidatacomplete.org/ | Wikidatacomplete ]] UI and APIs
- Activate the [[ https://www.wikidata.org/wiki/Wikidata:Recoin | Recoin Wikibase plugin ]] in your Wikidata account to see a similar suggestion mechanism
- Set up the [[ https://www.mediawiki.org/wiki/MediaWiki-Docker | MediaWiki development environment ]]
- Understand how a plugin that provides suggestions works: install the [[ https://www.wikidata.org/wiki/Wikidata:Recoin | Recoin Wikibase plugin ]] in your development environment and analyze its [[ https://www.wikidata.org/wiki/User:Vvekbv/recoin.js | source code ]]
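
For the first two microtasks, it may help to inspect an entity's statements programmatically. The sketch below builds a request URL for the `wbgetentities` module of the Wikidata Action API and checks which properties from a list have no statements on an entity. The function names and the list of expected properties are assumptions chosen for illustration; which properties count as missing for a given entity is for you to decide.

```javascript
// Build a wbgetentities URL that returns only the claims of one entity.
function entityClaimsUrl(entityId) {
  const params = new URLSearchParams({
    action: "wbgetentities",
    ids: entityId,
    props: "claims",
    format: "json",
    origin: "*", // permits anonymous cross-origin requests from a browser
  });
  return "https://www.wikidata.org/w/api.php?" + params.toString();
}

// Given entity JSON and a list of property IDs, return those with no statements.
function missingProperties(entity, expectedProperties) {
  const claims = entity.claims || {};
  return expectedProperties.filter(
    (p) => !(Array.isArray(claims[p]) && claims[p].length > 0)
  );
}
```

In a browser or a recent Node.js version, the URL can be passed to `fetch`, and the entity is then found under `entities[entityId]` in the JSON response.
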
[1] Didier Cherix, Ricardo Usbeck, Andreas Both, and Jens Lehmann (2014). Lessons learned—the case of crocus: Cluster-based ontology data cleansing. In European Semantic Web Conference (pp. 14-24). Springer, Cham.
[2] Bernhard Kratzwald, Guo Kunpeng, Stefan Feuerriegel, and Dennis Diefenbach (2020). IntKB: A Verifiable Interactive Framework for Knowledge Base Completion. In International Conference on Computational Linguistics (COLING).
//Remark: A long version of the project description is available [[ https://docs.google.com/document/d/1wP9A_4CtvfMlYxkBn-C6DunlmVKgj4LfWv6sNMQNyxA/edit?usp=sharing | here ]].
//