[GSoC 2016 Proposal] Automated Testing and Integration of IFTTT support to Wikidata
Closed, Resolved · Public

Description

Project Title

Automated Testing and Integration of IFTTT support for Wikidata

Personal Information

Name: Alangi Derick Ndimnain
Email: alangiderick@gmail.com
IRC Nick: d3r1ck
Github: Github Profile Page

Background Information

  • I am a final-year Computer Engineering major specializing in Software Engineering at the University of Buea. I was a mentor with the Wikimedia Foundation (WMF) at Google Code-In 2015 and mentored 6 projects.
  • I have been contributing to WMF for about 6 months, and here are my contributions, with patches across many different extensions.
  • I am an active member and organizer of the Google Developer Group in our community (Buea) and have participated in various GSoC (Google Summer of Code) meetups organised in our community to sensitize and mentor young, talented and motivated students to contribute to Open Source movements.
  • Elite Programmers Club (EPC) is a club founded in our University to teach programming and strengthen the programming skills of interested students; I am an administrator and mentor in the club. Here is the link on Facebook.
  • #ublab is the IRC channel on Freenode that we use for communication in the above-mentioned club (EPC), and I am a channel operator there.

Other Works

  • Worked on an Eggdrop bot for our channel #ublab and customized it to suit our needs, adding Tcl scripts to give the bot more features. Here is the code; I also wrote the documentation on how to run, use and customize the bot.
  • An Inventory Management System project, built as the evaluation for completing the CEF415 course at the University of Buea (my institution); here is the link.
  • Worked on a project in fulfillment of our second-semester course (CEF308) while I was in the second year in the Department of Computer Engineering (level 300). It was named the living website, and here is the link.
  • A web version of a phone book that implements some search algorithms using SQL, built with the Bootstrap framework and PHP. Link here.
  • In addition to the many other projects in my Github profile, here is a list of systems I have built and deployed which are running live:
    • eFarm: An agricultural platform for buying and selling of agricultural products.
    • My personal Website: My personal website which contains some basic information about myself.

Programming Background

  • Hour of Code Certified, view certificate here.
  • Participated in the ACM ICPC contest in 2014, and here is my membership card.
  • Google Code-In 2015 Mentoring - Projects Mentored.
    • T118390: Using Maniphest's advanced search Documentation Screen cast.
    • T116802: Watching a project to receive its notifications Documentation Screen cast.
    • T118389: Screencast showing how to request a project
    • T121911: Remove deprecated html elements like <font> from SemanticForms
    • T108432: Desktop: Search box inaccessible in Special:Gather
    • T122968: Add a composer.local.json-sample to MediaWiki core
  • My Contributions to WMF Code base - Extensions Worked on
    • Echo Extension
    • Gather Extension
    • Graph Extension
    • Mailgun Extension (co-authoring)
    • MobileFrontend Extension
    • Newsletter Extension
    • Semantic Forms Extension
    • Thanks Extension
    • Wikibase Extension
    • WikibaseJavaScriptApi Extension
  • Also worked on OOJs/UI and MediaWiki Core.
  • Skills
    • Languages: PHP (Excellent), Python (Intermediate), JSON (Excellent), JavaScript/jQuery (Intermediate), CSS (Excellent), SQL (Proficient), HTML (Excellent).
    • Tools: Secure Shell, Git/Github, Gerrit, IFTTT, Linux OS (Ubuntu) and derivatives, Subversion, Vagrant, Composer.
    • Frameworks: Flask (A Python Micro-framework)

Project Mentors

Synopsis/Project Summary

This project aims to integrate the IFTTT (If This Then That) feature with the Wikidata extension and to add RSS views for the existing Wikipedia triggers on IFTTT. IFTTT, one of the most popular ways of chaining web services so that they communicate with one another, will be used to let Wikidata communicate with other web services across the web. The project involves studying the Wikidata API and the internals of IFTTT, and providing a new and friendly way to communicate with Wikidata. Adding RSS views will enable users to connect to the service independently of IFTTT.

Detailed Project Description

Introduction

Wikidata is a free, collaboratively edited knowledge base used to keep information about objects (real-life entities). This information can be edited and used by both humans and machines. Wikidata stands as a central point of storage for all Wikimedia-related projects, and the information in Wikidata is available in all languages supported by the Wikimedia projects. In addition, information is stored in Wikidata in a document-oriented manner, and each Item is saved in the database under an identifier with the prefix Q. For example, Q123 can be an item in Wikidata holding information about a real-world concept such as Social Media.

IFTTT, or "If This Then That", is a web service that brings many other web applications and services together in one place and can then perform actions based on a certain set of criteria. With IFTTT, you can create a channel for an application or a web service, for example Wikidata. A Wikidata channel can be created in IFTTT and used to establish communication between IFTTT and Wikidata. IFTTT is used to automate various tasks on the internet for users: a criterion being met is the trigger ("this"), and the response to the trigger is an action ("that").

Implementation Approach

This feature has already been implemented for Wikipedia by @Slaporte (His Implementation), and it runs on Wikimedia Labs here. The link can't be accessed directly at the moment because the IFTTT web service requires credentials (such as a channel key, which is kept secret), so a plain GET request to the page just shows:

{
  "errors": [
    {
      "message": "Unauthorized"
    }
  ]
}
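
A check that produces this kind of response could look like the minimal sketch below. It is framework-agnostic rather than Flask code, and `CHANNEL_KEY`, `check_channel_key` and the success body are hypothetical names and values chosen for illustration; only the `IFTTT-Channel-Key` header name and the error body come from this proposal.

```python
import hmac
import json

# Hypothetical secret; the real channel key is issued by IFTTT and kept private.
CHANNEL_KEY = "example-channel-key"


def check_channel_key(headers):
    """Return (status, body) for a request carrying the given headers."""
    supplied = headers.get("IFTTT-Channel-Key", "")
    # compare_digest avoids leaking key contents through timing differences.
    if not hmac.compare_digest(supplied.encode(), CHANNEL_KEY.encode()):
        body = {"errors": [{"message": "Unauthorized"}]}
        return 401, json.dumps(body)
    # Placeholder success body; a real endpoint would return trigger data.
    return 200, json.dumps({"data": []})
```

A plain GET request carries no channel key, so `check_channel_key({})` yields status 401 together with the error body shown above.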

I will use a similar approach to integrate the feature for Wikidata. This will require me to write a web service application in Python (running on the Flask framework) that reads data from Wikidata on a regular basis using the MediaWiki and Wikidata APIs. The IFTTT Wikidata web service application we will develop will stand as an intermediary between the IFTTT website and Wikidata. Below is the set-up or architecture of how the proposed system should work.

When IFTTT makes a poll query/request, the IFTTT Wikidata web service application will use the MediaWiki API to poll for data that has been updated or changed in Wikidata since the previous revision of Wikidata's data. These updates can be obtained by computing the "diffs" from changes [[ https://meta.wikimedia.org/wiki/Wikidata/Archive/Notes#Revisions.2C_diffs.2C_recent_changes | [1] ]], using SQL queries on the wb_changes table for Wikidata. This will enable me to get all the recent actions needed to compute the diffs. At this level, the diffs might or might not contain triggers: if they do, IFTTT will perform actions; otherwise IFTTT will do nothing.
During the implementation phase, I plan to review other possible triggers based on the data currently available from the Wikidata API.
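
The polling step described above can be sketched with the MediaWiki API's `list=recentchanges` query. Note that this swaps in the public API in place of the SQL queries on wb_changes mentioned above, and that `build_recent_changes_url` and `extract_changes` are hypothetical helper names. The sketch only builds the URL and parses a response payload; it does not perform the HTTP request.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_recent_changes_url(limit=50, namespace=0):
    """Build a recentchanges query URL for the Wikidata API."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcnamespace": namespace,  # namespace 0 holds Wikidata items
        "rcprop": "title|ids|timestamp|user|comment",
        "rclimit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def extract_changes(payload):
    """Keep only the fields a trigger needs from a decoded API response."""
    changes = payload.get("query", {}).get("recentchanges", [])
    return [
        {"title": c["title"], "revid": c["revid"], "timestamp": c["timestamp"]}
        for c in changes
    ]
```

Keeping URL construction and payload parsing separate from the network call lets both be tested without contacting Wikidata.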

Algorithm

The Wikidata IFTTT service will be polled on a regular basis by the IFTTT service. When the Wikidata IFTTT service is polled, it will return JSON formatted according to IFTTT's specifications. It will also need a simple caching layer (Werkzeug SimpleCache)[[ http://werkzeug.pocoo.org/docs/0.11/contrib/cache/ | [2] ]] to handle the scale of traffic it may see, which will range from hundreds to thousands of users. The cache will save data locally for the Wikidata IFTTT service, and entries will expire automatically after a set period shorter than the IFTTT polling interval.

IFTTT site makes a POST request to the Wikidata IFTTT web service on a periodic basis;

When the Wikidata IFTTT web service receives a request, it makes a request to the Wikidata API to fetch changes from Wikidata if the previous response is not currently cached;

JSON_formatted_data = Response fetched from Wikidata, processed and JSON formatted according to IFTTT's specification;

if( JSON_formatted_data ) {

      new_data_set = New data returned as JSON (or other formats), cached in
			the Wikidata IFTTT service;

      new_data_set is returned to the IFTTT site, which performs the actions;

} else {

       Return an empty result set (per IFTTT specs);

       Wait for the next query/poll from the IFTTT service;
}
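
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `TTLCache` is a small stand-in written for this example, not Werkzeug's SimpleCache; `handle_poll`, `format_for_ifttt` and `iso_to_unix` are hypothetical names; and the response shape (a `data` list whose items carry a `meta` object with `id` and `timestamp`) reflects my understanding of IFTTT's trigger format and should be checked against the IFTTT documentation.

```python
import calendar
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._store = {}

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._store.get(key)
        if entry is None or entry[0] <= now:
            return None  # missing or expired
        return entry[1]

    def set(self, key, value, now=None):
        now = time.time() if now is None else now
        self._store[key] = (now + self.timeout, value)


def iso_to_unix(timestamp):
    """Convert a UTC timestamp like 2016-03-06T11:16:00Z to Unix seconds."""
    return calendar.timegm(time.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ"))


def format_for_ifttt(changes):
    """Wrap changes in the assumed IFTTT shape; no changes gives an empty list."""
    items = []
    for change in changes:
        items.append({
            "title": change["title"],
            "meta": {"id": str(change["revid"]),
                     "timestamp": iso_to_unix(change["timestamp"])},
        })
    return {"data": items}


def handle_poll(cache, fetch_changes, key="recent_changes", now=None):
    """Serve a poll from the cache, fetching from Wikidata only on a miss."""
    cached = cache.get(key, now)
    if cached is not None:
        return cached
    response = format_for_ifttt(fetch_changes())
    cache.set(key, response, now)
    return response
```

Passing `fetch_changes` in as a function keeps the sketch independent of how changes are actually retrieved, and the optional `now` argument makes cache expiry testable without waiting.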

Components of the IFTTT Wikidata web service

Based on the current implementation of this feature for Wikipedia, the following components will be used in the Wikidata IFTTT web service.

  • Core: This is the main application. The core will hold a list of the triggers that will be implemented and also some utilities used in the core of the application. It will mostly consist of function definitions to handle basic HTTP errors (401, 404, 500, etc.) and the routes that will be used to call the Wikidata API from time to time to get recent changes.
  • Database Access Layer: This is the component of the IFTTT Wikidata web service that will use the databases. Any database transaction, mostly SQL queries, will be handled in this component of the application.
  • Utilities: This part will contain functions for handling specific tasks, such as conversion from one time format to another. In a nutshell, this component will hold general-purpose functions.
  • Triggers: This will contain the classes that handle all the triggers that will be decided on and implemented, for example creating a new Item with a specific label or hashtag, editing an item, etc. This component will be responsible for reading data from Wikidata using the Wikidata API. The data will be returned as JSON so that IFTTT can consume it. The MediaWiki API is used to check, via revisions, whether the content of Wikidata has changed.
  • Views: This will contain a standardized view class used by all triggers, which will enable us to expose Wikidata (and Wikipedia) triggers in other formats too, like RSS (Rich Site Summary). This class will have methods that will be used by all the triggers implemented in this app. It's called views because it is what the IFTTT front-end uses, based on the user's account settings, to gather information and return it to the user's front-end (hence Views). An example method in this class is one that handles the POST request to the Wikidata API, which all triggers will use to get specific information from Wikidata.
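
How the Triggers and Views components might fit together is sketched below. All class and method names (`BaseTrigger`, `NewItemTrigger`, `TriggerView`, `as_json`, `as_rss`) are hypothetical, and the item fields `title` and `url` are assumptions made for the example; the existing Wikipedia implementation may be structured differently.

```python
import json
import xml.etree.ElementTree as ET


class BaseTrigger:
    """Hypothetical base class: subclasses supply items as plain dicts."""
    title = "Wikidata trigger"
    link = "https://www.wikidata.org/"

    def get_items(self):
        raise NotImplementedError


class NewItemTrigger(BaseTrigger):
    title = "New Wikidata items"

    def __init__(self, items):
        # Items are injected here; a real trigger would query the Wikidata API.
        self._items = items

    def get_items(self):
        return self._items


class TriggerView:
    """Renders any trigger's items as JSON or as an RSS 2.0 feed."""

    def __init__(self, trigger):
        self.trigger = trigger

    def as_json(self):
        return json.dumps({"data": self.trigger.get_items()})

    def as_rss(self):
        rss = ET.Element("rss", version="2.0")
        channel = ET.SubElement(rss, "channel")
        ET.SubElement(channel, "title").text = self.trigger.title
        ET.SubElement(channel, "link").text = self.trigger.link
        ET.SubElement(channel, "description").text = self.trigger.title
        for item in self.trigger.get_items():
            node = ET.SubElement(channel, "item")
            ET.SubElement(node, "title").text = item["title"]
            ET.SubElement(node, "link").text = item["url"]
        return ET.tostring(rss, encoding="unicode")
```

Because the view only depends on `get_items`, every trigger gains an RSS rendering without any per-trigger code, which is the point of a standardized view class.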

Testing and Verification

Testing

This phase will be done with a tool called "curl", a command-line tool used to transfer data from or to a server using one of the supported protocols (HTTP, HTTPS, FTP and more). Using this tool, we shall mostly be sending POST requests to the Wikidata server through the Wikidata API to pull information (recent changes) from the server, which will in turn be passed to the IFTTT Wikidata web service to push the data further to the IFTTT site.

Using the "curl" tool, we can set the Content-Type header, which declares the format of the data we send in the request; it can be JSON, XML, etc. An example of a call to the Wikipedia "article of the day" trigger using curl and the current @Slaporte implementation of the IFTTT app is below:

curl -X POST --header "IFTTT-Channel-Key: <channel-key>" --header "Content-Type: application/json" --data '{"triggerFields":{"lang":"en"}}' http://tools.wmflabs.org/ifttt-dev/v1/triggers/article_of_the_day

The above command will return the article of the day trigger data, which will be fed to the IFTTT site and sent to all users who have configured their IFTTT accounts to use the Article of the Day trigger.
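
For scripted tests, the same request can be built in Python with the standard library. This is a sketch that assumes the endpoint, header names and payload from the curl command above; `build_trigger_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import json
import urllib.request

URL = "http://tools.wmflabs.org/ifttt-dev/v1/triggers/article_of_the_day"


def build_trigger_request(channel_key, lang="en"):
    """Build the POST request that the curl command sends."""
    payload = json.dumps({"triggerFields": {"lang": lang}}).encode("utf-8")
    return urllib.request.Request(
        URL,
        data=payload,
        headers={
            "IFTTT-Channel-Key": channel_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it would be a matter of passing the result to `urllib.request.urlopen`, which requires a valid channel key and network access to the service.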

Automated unit testing shall also be used in order to ease the testing phase, with a pre-test by the Jenkins bot before human testing. This will make testing easier for constraints like syntax, etc. With a bot for automated unit testing, certain errors will easily be found and corrected before merging to the main repository. In this light, a testing bot (Jenkins) will be set up to do automated unit testing.
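
A unit test that such a bot could run might look like the sketch below, written with Python's built-in unittest module. `format_item` is a hypothetical trigger helper invented for this example, not a function from the existing application.

```python
import unittest


def format_item(change):
    """Hypothetical trigger helper under test."""
    return {"title": change["title"], "meta": {"id": str(change["revid"])}}


class FormatItemTest(unittest.TestCase):
    def test_id_is_string(self):
        item = format_item({"title": "Q42", "revid": 7})
        self.assertEqual(item["meta"]["id"], "7")

    def test_missing_title_raises(self):
        with self.assertRaises(KeyError):
            format_item({"revid": 7})


def run_tests():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FormatItemTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

On a continuous integration service the same tests would normally be discovered and run with `python -m unittest`, so `run_tests` is only needed when invoking them from other code.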

Verification

This process will involve checking whether my implementation of the Wikidata IFTTT web service functions as planned. I will verify that every single functionality of the IFTTT web app works correctly and gives the correct output to be fed into the IFTTT site for further processing. In this process, I will verify all the implemented triggers to make sure they work and return the correct values/data.

Development Schedule/Timeline

This is the plan that will be used during the GSoC period, but it can be slightly modified as GSoC proceeds.

  • April 23 - May 7 (2 weeks)
    • Community bonding period.
    • Study more docs related to the project (Wikidata API, Wikidata extension, IFTTT) in more detail
    • Check current implementation of IFTTT web app for Wikipedia Channel by @Slaporte here.
  • May 8 - May 22 (2 weeks)
    • Community bonding period continues.
    • Discuss development strategies with mentors.
    • Brainstorm with the mentors and decide on the triggers that will be implemented.
    • Research Travis CI and how to configure/set up a Github repository with the Travis CI testing bot.
    • Read docs related to IFTTT and also investigate the proposed triggers here (permission might be needed to open the file) to select the triggers that will be implemented during the coding period of the program.
  • May 23 - May 30 (1 week)
    • Set Up Travis CI for the project in my Github repository for continuous integration building and testing of the project.
    • Decide and set up the actual Github branch or repo that I shall use for development and transfer Travis settings to it, also make the branch ready for committing code for review.
  • May 31 - June 13 (2 weeks)
    • Write test suites/cases for all 8 available triggers of the current Wikipedia application, which currently has no test suites. Current application is here.
    • Run the tests against the triggers to make sure everything is working and submit code for review to the mentors.
    • Documentation and daily reporting of the work done during these weeks, alongside writing the code.
  • June 14 - June 20 (1 week)
    • After investigating with the mentors which triggers we need for Wikidata, select the ones that will be implemented.
    • Start implementing the Wikidata triggers alongside writing their test suites and documenting the code.
    • Submit work for mid-term evaluation.
  • June 21 - July 8 (2 weeks)
    • Second semester examination period. Draft of academic calendar here.
  • July 9 - July 23 (2 weeks)
    • Continue development on the triggers
    • Submit code to the mentors for review and/or testing. Write weekly report on what I have done.
    • Improve on the documentation for the project to meet the demands of the newly implemented triggers.
  • July 24 - July 30 (1 week)
    • Code cleanup for all the newly implemented triggers.
    • Debugging, bug fixes and more testing to the set of implemented triggers.
    • Weekly report development and documentation.
  • July 31 - August 13 (2 weeks)
    • Submit final code to mentors for review and get ready to deploy the project on WM Tools Labs.
    • Deploy all implemented triggers, test suites and work done to WM Tools Labs.
    • Test the whole application on Tools Labs to make sure everything is working properly and fix any bugs found.
  • August 14 - August 22 (1.5 weeks)
    • Integration of the Wikidata IFTTT web app to IFTTT and adding of the newly implemented triggers to the IFTTT website.
    • Testing the whole Wikidata IFTTT web app on IFTTT website
    • Improve and wrap up documentation.
  • August 23 - August 29 (1 week)
    • Pencils Down, Code clean up.
    • Improve and review documentation.
    • Final evaluation, Submission of code to Google.

Time Availability

I will be able to offer over 40 hours per week to the project. To meet the demands of the project, I will also be coding during weekends, regularly informing my mentors of the status of the project and regularly updating my wiki report page. During the GSoC program, I will have an exam period of about 2 weeks, from 21 June to 8 July. After this period, I will put in 40 hours over the weekends to catch up on the outstanding work. Apart from school and exams, I do not have any other commitment that would keep me from completing this project.

Why Wikimedia Foundation(WMF)?

The Wikimedia Foundation focuses on encouraging the growth, development and distribution of free, multilingual, educational content, and on providing the full content of these wiki-based projects to the public free of charge. It is an organisation worth working with, and doing so will help me make my continent (Africa), and especially Cameroon, aware of such organisations. These are projects that bring knowledge to society for free. This will go a long way to improve education and academics "for free" in my community, in Africa and in the world as a whole.

Why Wikidata Extension?

I'm aiming to focus my Final Year Project (FYP) on Web Services and Data Manipulation, and this extension gives me the opportunity to work on a project that will help me learn about the structure of such an application in practice and to build something that is used regularly by a large number of users. Integrating the IFTTT feature for the Wikidata extension is what made me focus on this extension for my GSoC 2016 project.

Why me for the project?

First and foremost, the user-facing nature of the project (IFTTT integration support for Wikidata), which hundreds of thousands of users will use, really inspires me to work on it. In addition, with the knowledge I have acquired and my deep understanding of the requirements of the project, I think I'm in the best position to execute it. After the GSoC coding period, I am going to use this project as my Final Year Project in my institution; this will be an opportunity to promote Wikidata and the MediaWiki project within the academia of my community, sensitizing a lot of users in Africa to use this product.

Work after the Summer of Code

Due to the user-facing nature and Long Term Support (LTS) of this application in Wikidata, I would like to continue this project after the summer by maintaining it and adding more functionality. This maintenance will span user contributions, my own features and further features added to the project, and the floor will be open for other contributors to work on the project and submit Pull Requests (PRs). Also, as I mentioned in one of my mails, I would like to make Africa, and specifically Cameroon (my country), aware of the Wikimedia Foundation and its projects so that people can be sensitized to contribute and be made aware of the free knowledge on the web through Wikipedia. In addition, I will use this project as my B.Eng thesis in my institution, which will hopefully start the sensitization about the Wikimedia Foundation (WMF) in my university.

My Contributions

Since I joined the Wikimedia Foundation (WMF) around September 2015, I have contributed in several ways to the improvement of this organisation, from both coding and mentoring perspectives.

  • In terms of coding and submission of patches, I have over 20 patches merged (check here), and more are still to come.
  • In terms of mentoring, as I mentioned above in my Programming Background section, I mentored in the Google Code-In 2015 program for the organisation. I mentored 6 projects.

References

[1] https://meta.wikimedia.org/wiki/Wikidata/Archive/Notes#Revisions.2C_diffs.2C_recent_changes
[2] http://werkzeug.pocoo.org/docs/0.11/contrib/cache/

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMar 6 2016, 11:16 AM
01tonythomas added a subscriber: 01tonythomas.

Looks like the proposal is still in its draft state, sorry for the change in column confusions ;)

Not a problem @01tonythomas :)

D3r1ck01 renamed this task from [GSoC Proposal] Integration of IFTTT support for WikiData to [GSoC Proposal] Integration of IFTTT support to WikiData.Mar 6 2016, 5:00 PM

@Niharika, I don't really think it's necessary now to move the proposal under Proposals Submitted. The proposal is still being drafted and I am still working on it. It will be better to leave it under Backlog, and when I am done with it, I will put it under Proposals Submitted myself.

For this reason, I am moving it back to Backlog :) Thanks for your help.

Well @Niharika, no need to even bother, let's just leave it there. It doesn't hurt. I am almost done with the proposal, so let's just leave it under Proposals Submitted. Thanks for your help. :)

D3r1ck01 renamed this task from [GSoC Proposal] Integration of IFTTT support to WikiData to [GSoC 2016 Proposal] Integration of IFTTT support to WikiData.Mar 14 2016, 8:39 PM
IMPORTANT: The deadline for submitting your proposal to the Google Summer of Code 2016 application system falls in roughly 24 hours, at Mar 25 2016, 19:00 UTC. Please make sure that you have a pdf copy of your proposal in the application system beforehand, to avoid last-minute confusion. Remember to relate your Phabricator task and associate 2 mentors in the proposal description, so that it is easy to review. Past the deadline, you should only make changes limited to fixing typos or incorporating feedback. Good luck, and check out the micro-tasks!

Thanks for the information @01tonythomas

Sumit added a subscriber: Sumit.Apr 22 2016, 7:54 PM

Congratulations @D3r1ck01 on getting selected for this project in GSoC 2016! Wishing you good luck with it. You can start discussing ideas and get up to speed with the project, as the Community Bonding period has started.

01tonythomas assigned this task to D3r1ck01.

Proposals should be self-assigned

Welcome to Google-Summer-of-Code (2016) and to the Community Bonding period! Happy to have you here; this should be a crucial time to make important decisions about how your project should take shape during this two-month internship period. You can find information about the Community Bonding period in our Life of successful doc here. To make sure everything goes as planned, please follow the instructions in T133647 and create the 'Community Bonding Evaluation for $project' task as a subtask of your proposal task. Please note that all further tasks you create for evaluation and GSoC organization purposes should be subtasks of your proposal, and not of the parent task; let's reduce the notification count. In case you are stuck, feel free to comment below T133647 or open up a conpherence task with the mentors and org admins. You can find example tasks in the task description of T133647.

Thanks very much for the information @01tonythomas. I shall kindly do as instructed. :)

Niharika removed a subscriber: Niharika.Apr 26 2016, 4:50 PM
Lydia_Pintscher moved this task from incoming to monitoring on the Wikidata board.May 3 2016, 10:36 AM
D3r1ck01 renamed this task from [GSoC 2016 Proposal] Integration of IFTTT support to WikiData to [GSoC 2016 Proposal] Integration of IFTTT support to Wikidata.May 9 2016, 6:02 AM
D3r1ck01 renamed this task from [GSoC 2016 Proposal] Integration of IFTTT support to Wikidata to [GSoC 2016 Proposal] Automated Testing and Integration of IFTTT support to Wikidata.
01tonythomas closed this task as Resolved.Sep 14 2016, 6:42 AM

End of project timeline. Closing it down. Thank you for all the efforts, see you as a mentor for Google Code in 2016 at g.co/gci.

Outcome of this 2016 round projects at https://www.mediawiki.org/wiki/Google_Summer_of_Code_past_projects#2016