= Project Title =
Wikimedia Emoji Bot for Commons Images
= Personal Information =
Name: **Alangi Derick Ndimnain**
Email: **alangiderick@gmail.com**
IRC Nick: **d3r1ck**
Github: **[[ https://github.com/ch3nkula | Github Profile Page ]]**
Meta: **[[https://meta.wikimedia.org/wiki/User:Alangi_derick | Meta User page]]**
= Project Mentors =
Primary mentors: @Dereckson and @ArielGlenn
Co-mentor: we may add a co-mentor, tbd, @MelodyKramer expressed interest
Support: @Yurik for JavaScript/Node-specific issues
= Synopsis / Project summary =
The Wikimedia Emoji Bot project aims to let Twitter users get the Wikimedia Commons image counterparts of the emojis they tweet. When a Twitter user tweets a valid emoji (say, a smile emoji :) ), the bot should reply to the user's tweet with a Wikimedia Commons image of someone smiling, randomly selected from the set of images mapped to the smile emoji. In addition, the bot should be able to interpret text-based emojis such as ( ^-^ ) and match them to a corresponding Commons image.
**Why this project?**
Wikimedia Commons hosts many beautiful pictures that people all over the world might want to see. Text/GIF-based communication with a bot like this can surface some of these pictures and make people aware of the platform, increasing the number of Wikimedia Commons users who can in turn add more pictures. In addition, because Twitter has a very large user base, this project is an opportunity to raise awareness among Twitter users of the Wikimedia movement and its projects (and of how they can use them to benefit themselves and their communities), which supports the goal of the movement.
The project can be executed in 3 months and will benefit the community: more and more users will get to know what Commons is about and will be more likely to contribute to it. Its scope therefore makes it a valid GSoC project. That the project can be completed in this time frame (3 months) does not mean that it is simple or easy; 3 months is just the time a GSoC participant has to execute such a project.
**Why me for this project?**
I have been contributing to and working with the Wikimedia community for more than 1.5 years, in roles ranging from community work to technical contributions. This project is really cool and exciting. I have expertise in developing and using APIs (Wikimedia APIs included) and in the core technologies needed to accomplish this project. With a software engineering background and my experience from GSoC 2016, I am capable of executing this project, and I am determined to learn, during the community bonding period, any additional skills the project needs that I don't yet have. Also, with over 7 years of programming experience on web applications and numerous other projects, I have what it takes to execute this project in 3 months.
My previous GSoC project was a web service application to connect the Wikidata APIs with the IFTTT (IF This Then That) API. This project is similar in the sense that I will be working with a web service to connect Twitter's API with Wikimedia Commons. That experience shows that I am able to finish a GSoC project, and this one is no different. I have already started some work on the project; see the **Project repository** section below.
In addition, I recently built a Node JS Twitter bot that retweets, from my account, tweets matching particular keywords such as “#wikimedia”, “#wikipedia” and “@mediawiki”. I made it Wikimedia related to keep the culture running. The code is available here: [[https://github.com/ch3nkula/Wiki-Retweet-Bot | Codes on Github]]. The application is deployed on Wikimedia Tool Labs (under the name "wiki-retweet-bot"), runs successfully, and retweets tweets matching the keyword(s). Having taken a small bot from development to deployment and seen it work with the Twitter APIs, I am confident I can complete this project during the Summer of Code period.
**Project repository**
* Github link to the project (as of now): [[https://github.com/ch3nkula/Wikimedia-Emoji-Bot | Wikimedia Emoji Bot]] project on Github.
* So far, some prerequisite work has been done on the repo: configs, documentation and basic setup.
* The project has been hooked up for Continuous Integration (CI) testing on Travis and builds are passing. :)
= Project goals =
* Valid Twitter account for the bot called: Wikimedia-Emoji-Bot.
* Twitter users can tweet emojis to the Bot.
* Bot replies to the user's tweet with an image from Commons.
* Bot should not reply to blocked emojis.
* Project code should have Travis continuous integration testing.
* Code base should be hosted in the Wikimedia organisation GitHub.
* Application/Bot should run on Wikimedia Tool Labs for online testing by community members.
* 300 acceptable emojis and 50 blocked emojis.
* 3000 Wikimedia Commons images, since each of the 300 acceptable emojis will be mapped to at least 10 images.
* Developer's manual on how to contribute, which will be a wiki page.
* Project report with a full description of how the project was done from beginning to end.
= Detailed project description =
== Introduction ==
This application seeks to solve the problem of getting the corresponding **Wikimedia Commons** images when their emoji counterparts are tweeted on **Twitter**. The two terms in bold are briefly explained below:
* Wikimedia Commons is one of the Wikimedia Foundation's projects; it focuses on hosting multimedia files such as pictures, audio and video. It acts as a repository of files that can be accessed from all over the world, and these files are freely licensed, for example under various versions of the `Creative Commons` licenses.
* Twitter is a social media platform for sharing information (about what is happening) worldwide using **tweets**. On this platform, information can spread rapidly as people see your tweets and your followers retweet them.
In this project, I am expected to build an application, a bot called **Wikimedia Emoji Bot**, that receives emoji-based tweets on Twitter. The bot will reply to the sender's tweet with an image that matches the properties of the emoji. For example, if the sender's emoji is a **smile**, the bot will respond with an image of a person smiling; if it is an **angry** face, the response would be an image of a person with an angry face. The key point is that the image in the response should have the properties of the sender's emoji. Another key point is that the image **must** come from Wikimedia Commons.
== Implementation Approach ==
===== Creating a Twitter developer app =====
Before creating a Twitter bot, we need to use the Twitter developer website to create an application instance that the bot will connect to before processing requests and responding on Twitter. The Twitter developer app provides various credentials that the bot will use to connect (these will be config constants taken from the Twitter app backend).
Example credential configs are `TWITTER_CONSUMER_KEY`, `TWITTER_CONSUMER_SECRET`, `TWITTER_ACCESS_TOKEN`, etc.
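As a minimal sketch of how these config constants could be loaded, the snippet below reads them from an environment-like object and fails early when one is missing. The function name `loadTwitterConfig` is hypothetical, and `TWITTER_ACCESS_TOKEN_SECRET` is an assumed fourth constant that is not listed above.

```javascript
// Credential names the bot expects. The first three come from the proposal;
// TWITTER_ACCESS_TOKEN_SECRET is an assumed name, not listed in the proposal.
const REQUIRED_KEYS = [
  "TWITTER_CONSUMER_KEY",
  "TWITTER_CONSUMER_SECRET",
  "TWITTER_ACCESS_TOKEN",
  "TWITTER_ACCESS_TOKEN_SECRET",
];

// Hypothetical helper: copy the credentials out of an environment-like
// object and throw if any of them is missing or empty.
function loadTwitterConfig(env) {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error("Missing Twitter credentials: " + missing.join(", "));
  }
  const config = {};
  for (const key of REQUIRED_KEYS) {
    config[key] = env[key];
  }
  return config;
}
```

In a Node JS process this would typically be called once at start-up as `loadTwitterConfig(process.env)`, so that a misconfigured bot stops before it tries to connect.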
===== Internal structure of the emoji to image data set =====
The implementation approach that will be used in this project is **crowd-working**: pictures will be selected and loaded into a JSON file that maps emojis to them. To visualise the idea, have a look at the snippet below:
```
{
"emoji-1": [
"image-1"
],
"emoji-2": [
"image-1"
],
"emoji-3": [
"image-1"
]
}
```
This is the case where there is just one image in each emoji's picture space (the mapping from emoji to images). Since one emoji can always have multiple images mapped to it, we will have a slightly modified JSON such as:
```
{
"emoji-1": [
"image-1",
"image-2",
"image-3"
],
"..."
}
```
We want to have many images mapped to one emoji to make it more fun: users get different images when a particular emoji is tweeted, so the bot's response is not easily predictable from one instance to the next.
The keys of the JSON structure will be the various emojis and each value will be an array of **n** images (where n >= 1). Images will be collected from [[https://commons.wikimedia.org|Wikimedia Commons]] to fill up the image space of an emoji.
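The constraint on the structure (every emoji maps to an array of at least n images) can be checked mechanically. The sketch below is only an illustration: the function name `findUnderfilledEmojis` is hypothetical, and the file names in the sample mapping are placeholders rather than real Commons files.

```javascript
// Hypothetical sample of the emoji-to-image mapping; the file names are
// placeholders, not real Commons files.
const sampleMapping = {
  "😀": ["File:Example-smile-1.jpg", "File:Example-smile-2.jpg"],
  "😠": ["File:Example-angry-1.jpg"],
};

// Return the emojis whose image space is not an array of non-empty strings
// or holds fewer than `minImages` entries.
function findUnderfilledEmojis(mapping, minImages = 1) {
  return Object.keys(mapping).filter((emoji) => {
    const images = mapping[emoji];
    return (
      !Array.isArray(images) ||
      images.length < minImages ||
      !images.every((image) => typeof image === "string" && image.length > 0)
    );
  });
}
```

Calling it with `minImages` set to 10 would list the emojis that still need more images to reach the goal of at least 10 images per emoji.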
===== Flow of the application =====
When a Twitter user tweets to the bot (assuming the bot is alive), the bot will first connect and authenticate to Twitter using the credentials of the Twitter application (created on the developer site). This ensures that the app doesn't waste time processing the request and fetching the response image only to make a response that cannot go through. Once the connection to Twitter is established before any processing, the application can go on to process the tweet and respond with an image.
The purpose of the connection and authentication is to make sure the bot uses a valid account and valid credentials to perform its operations and use the Twitter API. The credentials and keys generated on the developer website are unique.
Once the connection is established, the bot will read the emoji sent by the user and search the JSON file of emoji-to-image mappings for a match, while also making sure that the emoji concerned is not in the list of blocked emojis. If the emoji is acceptable and there is a match in the JSON file, a `getRandom()` method in the `Image` class will return a random image (an image URL, to be specific) from the image space, which is stored in a variable.
The bot then replies to the sender's tweet with the image that was randomly selected. The image URL will appear under the user's tweet; the user can click on it to be taken to Commons to view the image.
The bot should have the following feature, which is lacking in the NYPL Emoji bot:
* If 2 emojis (smile and angry) are sent, I think the application should loop through the emojis and reply to each. Since there are 2 emojis, the application will send 2 replies: Commons pictures of a person smiling and of a person who is angry respectively, in the order of the emojis.
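The lookup step could look roughly like the sketch below. Only the names `Image` and `getRandom()` come from this proposal; the constructor arguments, the `null` return value for blocked or unknown emojis, and the injectable `random` parameter are assumptions made for illustration.

```javascript
// Sketch of the lookup step; not the final implementation.
class Image {
  // mapping: { emoji: [imageUrl, ...] }; blocked: list of blocked emojis.
  constructor(mapping, blocked = []) {
    this.mapping = mapping;
    this.blocked = new Set(blocked);
  }

  // Return a random image URL from the emoji's image space, or null when
  // the emoji is blocked or has no images mapped to it.
  // `random` is injectable so that tests can make the choice deterministic.
  getRandom(emoji, random = Math.random) {
    if (this.blocked.has(emoji)) {
      return null;
    }
    const images = this.mapping[emoji];
    if (!Array.isArray(images) || images.length === 0) {
      return null;
    }
    const index = Math.floor(random() * images.length);
    return images[index];
  }
}
```

Returning `null` rather than throwing keeps the caller simple: the bot can skip the reply whenever there is nothing acceptable to send.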
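A rough sketch of the multi-emoji behaviour described above: scan the tweet text for every known emoji (including text-based ones such as `^-^`) in order of appearance, then build one reply per emoji. The function names `findKnownEmojis` and `buildReplies` and the `pickImage` callback are hypothetical, and the blocked-emoji check is left out to keep the example short.

```javascript
// Return the known emojis found in `text`, in order of appearance.
// Longer keys are tried first so that multi-character emojis and text
// emoticons are not split into shorter matches.
function findKnownEmojis(text, knownEmojis) {
  const keys = knownEmojis
    .filter((key) => key.length > 0)
    .sort((a, b) => b.length - a.length);
  const found = [];
  let position = 0;
  while (position < text.length) {
    const match = keys.find((key) => text.startsWith(key, position));
    if (match) {
      found.push(match);
      position += match.length;
    } else {
      position += 1;
    }
  }
  return found;
}

// Build one reply per emoji found, in the order the emojis were tweeted.
// `pickImage` is a hypothetical callback that chooses one image from an
// emoji's image space (for example, a random choice).
function buildReplies(text, mapping, pickImage) {
  return findKnownEmojis(text, Object.keys(mapping)).map((emoji) => ({
    emoji,
    image: pickImage(mapping[emoji]),
  }));
}
```

The bot would then send the resulting replies one after another, so a tweet with a smile followed by an angry face gets a smiling picture first and an angry one second.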
===== Characteristics of Images =====
* The images that are returned as response should have properties of their emoji counterparts.
= Development Schedule / Timeline =
This is the plan that will be used during the GSoC period, but it can be slightly modified as GSoC proceeds.
- May 4 - May 17 (2 weeks)
-- Analyse which crowd-working approach to use in implementing the bot. There are several possible approaches; the one with the greatest advantages in terms of time and space should be used.
-- Develop a report on the analysis as part of the project's reporting.
-- Search for APIs and/or libraries that might be useful for completing this project and find out what they are used for. Also study similar bots and how they are implemented.
- May 18 - May 30 (1.6 weeks)
-- Discuss with mentors on development strategies (based on analysis) of the project and the approach to use in implementing the Wikimedia Emoji bot.
-- Read docs related to Twitter API in relation to Node JS and understand the NYPL Emoji bot application implementation (code base).
- June 1 - June 10 (1.3 weeks)
-- Build a simple prototype of the application with 5 emojis and 10 commons images.
-- The prototype will implement the agreed approach for the application.
-- Documentation and reporting of the prototype and testing.
- June 11 - June 25 (2 weeks)
-- Compile a list of acceptable emojis (as many as possible) that will be used by the bot.
-- Compile a list of non-acceptable emojis (as many as possible) that the bot will not reply to.
-- Documentation and reporting.
- June 26 - June 30 (5 days)
-- First Evaluations
- June 30 - July 13 (2 weeks)
-- Compile a list of images that will be used by the bot, making sure each emoji has at least 10 images in its image space. This is to make sure that images rarely repeat in responses when the same emoji is tweeted.
-- Documentation and reporting.
- July 14 - July 23 (1.3 weeks)
-- Build the JSON structure that maps each emoji to its image counterparts, making sure the images respect the properties of the emoji.
-- Documentation and reporting.
- July 24 - July 28 (5 days)
-- Second Evaluations
- July 28 - August 10 (2 weeks)
-- Write automated tests covering the various functionalities of the application, with scripts to test `postTweet()`, `replyTweet()`, `randomImage()`, etc. All components of the application will be tested.
-- Documentation and reporting.
- August 11 - August 20 (1.4 weeks)
-- Deployment of the application on Wikimedia Tool Labs (test instance) so that members of the community can test and give feedback for improvement.
-- Create an official Wikimedia repo (in the Wikimedia organisation on GitHub) for this project so that others can contribute and development can continue after GSoC.
-- Documentation and reporting.
- August 21 - August 29 (1.1 weeks)
-- Pencils Down, Code clean up.
-- Improve and review documentation.
-- Final evaluation, Submission of code to Google.
= Work after GSoC 2017 on the project =
This is a long-term project and will need to be improved after the GSoC period, as more and more images matching emojis are added to Commons and new emojis keep appearing. As work after GSoC, emojis (new or old) that don't yet have a Commons picture counterpart will be filled in and new features will be added to the bot. In the future, some AI algorithms might also be used to populate the JSON file of emojis to Commons images.
= Time Availability during GSoC =
I will be able to offer 45 hours per week to the project. To meet the demands of the project, I will also code on weekends (occasionally), keep my mentors regularly informed of my progress, and regularly update my wiki report page or the Phabricator ticket of the project. I will mostly be programming in the evenings because of school during the day, but when working on Saturdays and/or Sundays I will probably be programming throughout (day and night). Of course, this will be adjusted based on the mentors' time zones and mine.
= Why Wikimedia Foundation (WMF)? =
The Wikimedia Foundation is an organisation that focuses on encouraging the growth, development and distribution of free, multilingual, educational content, and on providing the full content of these wiki-based projects to the public free of charge. It is an organisation worth working with to make my continent (Africa), and especially Cameroon, aware of this opportunity to access the sum of all free knowledge. This will go a long way towards improving education and academics "for free" in my community, in Africa and in the world as a whole. Since the African community is somewhat under-represented in the global movement, I decided to join the movement and help fill the gaps.
= My Contributions (technical) to Wikimedia =
- Since I joined the Wikimedia Foundation (WMF) around September 2015, I have contributed in several ways to the improvement of this organisation: in coding, in mentoring outreach programs, and from a community perspective by being one of the leaders of a Wikimedia User Group in my country.
- In terms of coding and submission of patch sets (PS), I have dozens of merged PSs, which you can check out [[ https://gerrit.wikimedia.org/r/#/q/owner:D3r1ck01+status:merged,n,z | here ]], and more are still to come :).
- In terms of mentoring, as also listed under my programming skills and qualifications below, I mentored the Google Code-In 2015 program and also Google Code-In 2016 under Wikimedia. Currently, I am coaching a team of female developers hoping to participate in the RGSoC program under Wikimedia (as mentoring org).
- My Contributions to WMF Code base - Extensions Worked on
-- Echo Extension
-- Mailgun Extension (co-authoring)
-- MobileFrontend Extension
-- Newsletter Extension
-- Wikibase Extension
-- MediaWiki Core and Wikidata.
= My works (personal and Wikimedia related) =
- I am a big fan of Github and most of my code lives there. Below is a list of projects I have built and deployed which are running live:
-- [[ http://tools.wmflabs.org/ifttt-testing/ifttt/v1/rss-feeds | Wikipedia RSS Feeds ]]: Part of my GSoC 2016 project; it lists RSS feeds for various Wikipedia triggers, which can then be used in an RSS feed reader to follow updates.
-- [[ https://github.com/ch3nkula/Wiki-Retweet-Bot | Wiki Retweet Bot ]]: Node JS application that retweets, with my Twitter account, the most recent tweets matching keyword(s) like "#wikimedia", "#mediawiki", etc. This application runs on Wikimedia Tool Labs and is functioning as expected.
-- [[ http://tools.wmflabs.org/durl-shortener/shortener.php | URL Shortener ]]: A PHP version of a simple URL shortener that is deployed on labs and can be used to shorten URLs. Check it out.
- Worked on an Eggdrop bot for our channel #ublab and customised it to suit our needs, adding Tcl scripts to give the bot more features. [[ https://github.com/EliteProgrammersClub/ublab_bot | Here ]] is the code; I also wrote the documentation on how to run, use and customise the bot.
Those are just a few; to see more of the projects I have worked on, check out my Github profile (linked in the Personal Information section of this document).
= About me =
- I'm an active member and an organiser of the Google Developer Group in our community (Buea) and have participated in various GSoC (Google Summer of Code) meet-ups organised in our community to sensitise and mentor young, talented and motivated students to contribute to Open Source movements.
- Elite Programmers Club is a club that was founded in our University to teach programming to interested students and to strengthen their skills; I am an administrator and mentor in the club. [[ https://www.facebook.com/groups/eliteprogrammingclub/ | Here ]] is the link on Facebook.
- #ublab is an IRC channel on Freenode that we use for communication in the above-mentioned club (EPC), and I am a channel operator there. Basically, this channel is used to bring together students and interested persons in our community and mentor them on how to become good software engineers and programmers in the near future.
- Holder of a Computer Engineering degree (BEng), specialising in Software Engineering, from the University of Buea, and currently a Master's student in Software Engineering.
- I have been contributing to WMF for over 1.5 years now and here are my [[ https://gerrit.wikimedia.org/r/#/q/owner:D3r1ck01+status:merged,n,z | contributions ]], where I have contributed patches across many different extensions.
- Was a GSoC participant in 2016, where I worked on a project to connect the IFTTT API and the Wikidata APIs. The project was successfully completed; here is the project: T118463, and here is my proposal: T129016. I am so happy and thankful to my mentors (@Slaporte, @Lydia_Pintscher, @hoo and @Bene) for the great effort they put in so that I could complete this project by the end of GSoC 2016. From time to time, I still do some work on the project (improving it and adding more features and documentation).
- I was fortunate to obtain a full scholarship to attend a **Wikimedia Conference (WikiIndaba 2017)** this year (2017) in Ghana, where Wikimedians are brought together to share ideas and discuss the future of Wikimedia projects. I had the opportunity to meet many WMF staff members and other great Wikimedians. We shared our experiences and ideas, and after the conference I found myself on a better path as a Wikimedian.
**My programming skills and qualifications**
* Technical Skills
** Programming Languages: JavaScript (Excellent), JSON (Excellent), PHP (Excellent), Python (Intermediate), CSS (Excellent), SQL (Proficient), HTML (Excellent), GoLang (Beginner).
** Software Tools: NPM (Node Package Manager), Secure Shell, Git/Github, Gerrit, Linux OS (Ubuntu) and derivatives, Subversion, Vagrant, Composer, Wikimedia Tool Labs & Phabricator.
** Frameworks: Node JS (JavaScript Framework), jQuery, Vue.js, Flask (A Python Micro-framework), Laravel (PHP Framework).
* Hour of Code Certified, view certificate here.
* Participated in the ACM ICPC contest in 2014, and here is my membership card.
* Google Code-In 2015 Mentor under Wikimedia, view certificate here.
* Google Code-In 2016 Mentor under Wikimedia, view certificate here.
* Google Summer of Code 2016 student participant under Wikimedia, view certificate here.