
[GSoC Proposal 2018] Build a similar to @NYPLEmoji bot for Commons images
Closed, DeclinedPublic

Description

Profile Information

Name: Balaji Ramasubramanian
IRC nickname on Freenode: Balaji030698
GitHub Profile: https://github.com/Balaji-Ramasubramanian
Location: Tamil Nadu, India
Time Zone: UTC +5:30 (IST)
Typical working hours: 8 AM to 10 PM during holidays, 5 PM to 12 AM during college days (UTC +5:30)

Synopsis

Wikimedia Commons is one of the largest collections of free-use images. These images are used across Wikipedia, Wikibooks, Wikinews, and many other Wikimedia projects. The repository contains millions of images across many categories. To promote Commons images on Twitter, this project will create a Twitter bot similar to the NYPLEmoji bot.

A user can simply tweet an emoji to our bot and get a relevant image from the Commons repository in the reply. Since emojis are language-neutral, this can introduce a lot of people to the Commons collection. The bot will also post one tweet a day containing an emoji and a relevant Commons image. The project consists of creating the Twitter bot and curating the relevant Wikidata items for the emojis. Wikidata will be the data source for crowd-sourcing the relevant images for each emoji.
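The lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `EMOJI_TO_QID` table and all function names are hypothetical, the sketch handles only single-code-point emojis, and it assumes the curated images are stored under Wikidata's image property (P18). It builds the query URL but does not send any request.

```python
import unicodedata
from urllib.parse import urlencode

# Hypothetical mapping for illustration; the real table would be curated
# during the project. Q146 is "house cat" and Q144 is "dog" on Wikidata.
EMOJI_TO_QID = {
    "\N{CAT FACE}": "Q146",
    "\N{DOG FACE}": "Q144",
}

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def find_known_emojis(text):
    """Return the mapped emojis in the order they appear in the tweet text.

    Only single-code-point emojis are handled here; flags and ZWJ
    sequences would need a proper emoji tokenizer.
    """
    return [ch for ch in text if ch in EMOJI_TO_QID]


def emoji_to_text(ch):
    """Decode an emoji to its Unicode name, e.g. for logging or fallbacks."""
    return unicodedata.name(ch, "").lower()


def build_image_query(qid):
    """Build a SPARQL query for all image (P18) values of one item."""
    return f"SELECT ?image WHERE {{ wd:{qid} wdt:P18 ?image . }}"


def build_request_url(qid):
    """Build the query service URL; sending the request is left to the caller."""
    params = {"query": build_image_query(qid), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)
```

A tweet containing no mapped emoji yields an empty list, which the bot could treat as the "user tweeted text instead of an emoji" case.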

Mentors: @ArielGlenn, @Dereckson, @D3r1ck01

I am discussing this project with them in the task "Fork NYPL emoji bot for Commons images (to give back 'similar' Commons images)".

Deliverables

  1. A Twitter bot which replies with a relevant Commons image when a user tweets it with an emoji.
  2. Curated Wikidata pages with Image property values for the emojis.

Timeline

April 23 - May 14
  • Community bonding.
  • Interact with other Wikimedia contributors and relevant Wikimedia groups like the Cloud Services channel.
  • Acquire the relevant access for my accounts on Toolforge and Wikidata.
May 14 - May 21
  • Finalize the name & icon for the Twitter bot
  • Create a Twitter handle and Twitter Developer Account for the bot.
  • Assess and explore the new options in the Twitter Account Activity API once it is out of beta. It is currently in public beta. A stable version can be expected by early May, since developers have to migrate to the new API before June 18, 2018.
  • Create a module to add inline images in the tweets.
May 22 - June 04
  • Create the Wikidata page for each emoji, listing the relevant Commons images under its image property.
  • Decode emoji to text format.
  • Create a mapping between emojis and the corresponding Wikidata QIDs.
  • Create the query module to get relevant images from Wikidata.
  • Complete a basic version of the Twitter bot.
June 05 - June 09
  • Deploy the initial Twitter bot version on Toolforge.
  • Test it for various scenarios, such as a user tweeting text instead of an emoji, and assess the relevance of the returned images and the handling of emojis that have no relevant image.
  • This period also serves as a buffer for any unforeseen issues that arise during development.
June 09 - June 10
  • Improve the documentation of the project.
  • Prepare for first evaluation.
June 11 - June 15
  • First evaluation period.
June 16 - June 25
  • Implement the feedback from first evaluation.
  • Fix the bugs in the basic version of the bot (if any).
  • Create a scheduler task to post one tweet a day using the Twitter bot.
  • Create a module to extract Commons Images details like license and contributor name.
  • Convert SVG images to PNG before sharing them as inline images on Twitter (since SVG images aren't supported by Twitter cards).
June 25 - July 02
  • Improve the curation of the emoji Wikidata pages by manually adding more images to each emoji's Wikidata item. Also, check the feasibility of automating or semi-automating the process using the image category, title, and other properties.
  • Test the improved Twitter bot for various scenarios.
  • On successful test results, deploy the bot in Toolforge.
July 06 - July 08
  • Prepare for the second evaluation.
  • Improve the documentation of the project.
  • This period also serves as a buffer for any unforeseen issues that arise during the development of the second phase.
July 09 - July 13
  • Second evaluation period.
July 14 - July 19
  • Implement the feedback from the second evaluation.
  • If there is more than one image for an emoji in Wikidata, create a module that chooses among them so that the same image is not repeated too often.
  • Create a module to check whether Twitter cards are supported for a Commons image's web page (if supported in the future) and show Twitter cards instead of inline images in the tweets.
  • Deploy the final beta version of the Twitter bot on Toolforge.
July 20 - August 02
  • Introduce the Twitter bot to the Wikimedia community for feedback.
  • Find interested community members and give them enough access to the bot's Gerrit repository to maintain it over the long run.
  • Create a Teams account in TweetDeck to maintain the Twitter profile of the bot and add interested community members to the team.
  • Implement the feedback from the community.
  • Measure and improve the performance and resource usage of the bot.
August 03 - August 05
  • Prepare for final evaluation
  • Improve the documentation of the project.
August 06 - August 14
  • Submission of the project for final evaluation.
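One possible shape for the "less repetitive image" module planned for July 14 - July 19 is sketched below. It is not a committed design: the class name, the in-memory history, and the default history size are all assumptions made for illustration. The idea is to remember the last few images sent per emoji and prefer candidates outside that history.

```python
import random
from collections import defaultdict, deque


class ImagePicker:
    """Pick an image for an emoji while avoiding recently used ones.

    Hypothetical helper: the history lives in memory only, so a deployed
    bot would need to persist it across restarts.
    """

    def __init__(self, history_size=3, rng=None):
        # One bounded history per emoji; the oldest entries fall off the left.
        self._recent = defaultdict(lambda: deque(maxlen=history_size))
        self._rng = rng or random.Random()

    def pick(self, emoji, images):
        """Return one image from `images`, or None if the list is empty."""
        if not images:
            return None
        recent = self._recent[emoji]
        fresh = [img for img in images if img not in recent]
        if fresh:
            choice = self._rng.choice(fresh)
        else:
            # Every candidate was used recently: reuse the one shown longest ago.
            choice = next(img for img in recent if img in images)
            recent.remove(choice)
        recent.append(choice)
        return choice
```

With three candidate images and a history of two, the first three picks are all different and the fourth returns to the first image, so no image is shown twice in a row.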

Flow Diagram of Twitter Bot

Participation

I will publish my source code on Gerrit and communicate with my mentors through Phabricator, IRC, and e-mail. If the mentors prefer to communicate through other means, such as Skype or Hangouts, I am fine with that.

About Me

I am Balaji Ramasubramanian from Tamil Nadu, India. I am in the pre-final year of my undergraduate degree in Computer Science Engineering. I came across Google Summer of Code at a hackathon event. I will be on holiday from May 15 until the end of June. After June I will have college, but there won't be any exams until August, so I do not expect any academic hindrance to my project. I am applying to Wikimedia through GSoC alone. As of now, I have no plans to apply to other organizations, even within GSoC.
I have been using Wikipedia for many years. It is my time to contribute back to the community through my skills.

Past Experience

Facebook Messenger chatbot - Smart Vaccine Reminder:

Improved healthcare services are an integral part of many smart city solutions. To that end, I created an open-source Facebook Messenger chatbot project that delivers vaccine reminder messages to parents before the due dates. It aims to help parents keep up with the vaccination schedule and to help hospitals provide better services.
One part of the project provides a friendlier medium of communication for the user, and the other part aims to provide better integration for the hospitals. The project contains the following two modules:

  1. Facebook Messenger chatbot for parents.
  2. Google Sheets add-on for hospitals/healthcare services/governments.

Facebook Messenger Bot: A Messenger chatbot through which parents can be notified about the vaccination days for their kids. Using this chatbot, parents can register their kids' details and check the vaccination schedule and details about the vaccines. This module auto-populates the vaccination schedule in the database based on the kid's date of birth.
Google Sheets Add-on: Using this Google Sheets add-on, the corresponding service provider can view and modify the vaccination days of the kids.
Check the demo video here.
You can find the project in my GitHub repository.
I have published this project under the Apache 2.0 License, so any healthcare service provider can make use of it under that license's terms.

UrlShorty Ruby Gem:

I developed a Ruby gem, with proper documentation, for using the Google URL Shortener service. Using this gem, any Ruby project can shorten a long URL, expand a shortened URL, and get analytics from its Google URL Shortener account. During this project, I gained experience in building Ruby gems and writing documentation for Ruby projects. You can find the gem page here and the source code in my GitHub repository.

I have also built a Twitter bot as a hobby project.

Known issues:

  1. The Twitter API rate limit for reading mentions is currently 75 requests per 15 minutes. You can refer here. This can be handled in code. One of the workarounds is discussed here on Quora.
  2. Unlike NYPL image web pages, Commons image web pages lack the relevant meta tags. Hence, Twitter won't show Twitter cards for the links, and the tweets will contain just the links to the images. Twitter cards would give the tweets better visibility. There are a few long-running discussions in Phabricator (T63487, T71941, and T157145) about supporting Twitter meta tags in Wikimedia services. There is even a third-party MediaWiki extension called TwitterCards, which wasn't integrated into the core of the Wikibase for the above-mentioned reason. There is also a reverted commit in Gerrit for Twitter cards support.
  3. The Wikimedia Commons homepage has the picture of the day with a tweet option. Due to the absence of Twitter meta tags in the Commons image web page, those tweets are not shown as Twitter cards.
  4. SVG images are not supported in Twitter cards (refer to the Twitter document here). Hence, we need to convert SVG images to PNG before sharing them in our tweets.
  5. Twitter is discontinuing its REST and Stream APIs by June 19, 2018, and has asked developers to migrate to the latest Account Activity API. The NYPL Emoji bot uses an npm module called Twit, which supports only the REST and Stream APIs and has not been updated in the last 9 months. I have raised an issue on the NYPLEmoji GitHub page and the Twit npm module GitHub page. Directly forking this project is risky, hence I am planning to build this project from scratch.
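As a rough illustration of handling the rate limit in item 1 in code, the sketch below spaces polling requests evenly: 75 requests per 15 minutes works out to at most one request every 12 seconds. The class and function names are hypothetical, the clock is injectable so the logic can be checked without waiting, and no Twitter API call is made here.

```python
import time


def min_poll_interval(max_requests, window_seconds):
    """Smallest even spacing (in seconds) that stays within the rate limit."""
    return window_seconds / max_requests


class PollThrottle:
    """Track when the next mentions request may be sent.

    Hypothetical helper: defaults mirror the limit quoted in item 1
    (75 requests per 15-minute window).
    """

    def __init__(self, max_requests=75, window_seconds=15 * 60, clock=time.monotonic):
        self.interval = min_poll_interval(max_requests, window_seconds)
        self._clock = clock
        self._next_allowed = clock()  # the first request may go out immediately

    def seconds_until_next(self):
        """How long the caller should sleep before the next request."""
        return max(0.0, self._next_allowed - self._clock())

    def record_request(self):
        """Call right after sending a request to schedule the next slot."""
        self._next_allowed = self._clock() + self.interval
```

A polling loop would call `time.sleep(throttle.seconds_until_next())`, fetch the mentions, and then call `throttle.record_request()`.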

Future Deliverables

  • Create a similar Facebook Messenger bot for the Commons Images
  • Post an analytics tweet once a week with the number of tweets during the week, the number of likes and retweets on our reply tweets, etc.

Any Other Info /* To be updated */

To showcase my ability to build this bot, I have developed a basic version of a similar Twitter bot for the Giphy website. I chose Giphy for this task because Commons image pages currently lack Twitter cards meta tags and relevant tags. You can find the source code of this project in my GitHub repository and the Twitter bot here. I have deployed this bot on my free-tier Heroku account. Please try it and let me know your feedback.

I will make the project code modular in such a way that our Commons images bot can be extended to other platforms like Facebook Messenger and Slack in the future.
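One way such a modular layout might look is sketched below, purely as an assumption about the eventual design: the platform-independent reply logic talks to a small adapter interface, so a Twitter, Messenger, or Slack adapter could be swapped in. All names here are hypothetical, and the in-memory adapter stands in for a real platform client.

```python
from abc import ABC, abstractmethod


class Platform(ABC):
    """Hypothetical adapter interface that each chat platform would implement."""

    @abstractmethod
    def send_reply(self, user, text, image_url):
        """Deliver a reply to `user`; `image_url` may be None."""


class InMemoryPlatform(Platform):
    """Stand-in adapter that records replies instead of calling a real API."""

    def __init__(self):
        self.sent = []

    def send_reply(self, user, text, image_url):
        self.sent.append((user, text, image_url))


def handle_message(platform, user, message, lookup):
    """Platform-independent core: find an image and reply through the adapter.

    `lookup` is any callable mapping the incoming message to an image URL,
    or None when no relevant image exists. Returns True if an image was sent.
    """
    image_url = lookup(message)
    if image_url is None:
        platform.send_reply(user, "Sorry, no image found for that emoji yet.", None)
        return False
    platform.send_reply(user, "Here is a Commons image for you.", image_url)
    return True
```

Keeping `handle_message` free of platform-specific calls means only a new `Platform` subclass is needed to support another service.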

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMar 16 2018, 7:25 PM
Balaji030698 updated the task description. (Show Details)Mar 16 2018, 7:38 PM
Balaji030698 updated the task description. (Show Details)Mar 16 2018, 7:41 PM
Balaji030698 updated the task description. (Show Details)Mar 16 2018, 7:45 PM
Balaji030698 updated the task description. (Show Details)Mar 16 2018, 7:49 PM
Balaji030698 updated the task description. (Show Details)
Balaji030698 updated the task description. (Show Details)Mar 17 2018, 6:05 AM
D3r1ck01 renamed this task from Build a similar to @NYPLEmoji bot for Commons images to [GSoC Proposal 2018] Build a similar to @NYPLEmoji bot for Commons images.Mar 20 2018, 8:02 AM
Balaji030698 updated the task description. (Show Details)Mar 20 2018, 1:48 PM
Balaji030698 added a subscriber: D3r1ck01.
Balaji030698 updated the task description. (Show Details)Mar 20 2018, 2:08 PM

The timeline sounds solid, except for the "publish Twitter Cards tags to Wikimedia Commons" part.

The Giphy sample shows the base mechanisms are known, and the known issues are well thought out.

Thanks for your valuable feedback. Regarding the Twitter cards, yes, I was being over-optimistic. I have discussed it with ArielGlenn on IRC as well. I am looking for alternate ways to show better tweets. If nothing works out, we can fall back to last year's plan of using inline images with the license and URL details in the tweets. I will update the proposal accordingly soon.

Balaji030698 updated the task description. (Show Details)Mar 22 2018, 5:45 PM
Balaji030698 updated the task description. (Show Details)Mar 22 2018, 6:08 PM

Some comments have been left on your google doc. Check them out!

@ArielGlenn: Thanks a lot, Ariel. I will look into them and update the doc.

Balaji030698 updated the task description. (Show Details)Mar 26 2018, 6:53 PM
Balaji030698 updated the task description. (Show Details)Apr 23 2018, 5:36 PM
srishakatux closed this task as Declined.Jun 5 2018, 11:52 PM