[GSoC Proposal 2017] Build a similar to @NYPLEmoji bot for Commons images
Closed, ResolvedPublic

Description

My name is Harsh Shah, and I am proposing to build a bot for Wikimedia Commons images.

Personal info and past contributions

Google Summer of Code 2016 Experience

I was selected as a GSoC 2016 student with KDE, working on Kopete: IRC Protocol. The task was to implement the IRC protocol plugin for Kopete in KDE, but for family reasons I stopped working on it at the midterm submission. I designed the architecture of the IRC protocol engine for Kopete, wrote and submitted one patch for it, and completed the first phase of my project timeline. Project repo: https://github.com/harshcrop/kopete

Personal Details

● Name: Shah Harsh Amitbhai
● Email: harshcrop@gmail.com
● IRC nick: harshcrop
● Gitter: harshcrop
● Github: harshcrop
● Country of residence: India
● Timezone: India (+5:30)
● Primary language: English, Hindi

Primary Mentor: @Dereckson and @ArielGlenn
Support: @Yurik for JavaScript/Node-specific issues

Abstract

Wikimedia wants to encourage reuse of the media on Wikimedia Commons, which is available
under free licenses like CC BY or CC BY-SA, or in the public domain. One creative idea for
promoting reuse of these images is a bot. Bots are a popular technology nowadays, so
Wikimedia wants to develop a Twitter bot like the @NYPLEmoji bot.

People who tweet an emoji to @NYPLEmoji get a similar image from the collections in
response. The proposed project is about creating a similar bot for Commons images. It will
introduce a lot of people to the Commons collection and because it uses emoji, not words, it
would translate for mobile/desktop users worldwide.

Current Project status:

The current status of the project is that a new Twitter bot for Wikimedia Commons images,
comparable to the NYPL emoji bot, is to be built.
My open questions about this project are how to match emoji to images and how many
emoji we have to cover in the GSoC period.

Goal

  1. The purpose of the project is to create a similar bot for Commons images. It will introduce a lot of people to the Commons collection and, because it uses emoji, not words, it should translate for mobile/desktop users worldwide.
  1. The goal is to provide a fun way to promote our Wikimedia Commons media catalog, all under free licenses like CC BY or CC BY-SA, or public domain.
  1. The main goal of the project, to take an example: people send a flower emoji, and in reply they get a rose photo. People send a smiling emoji, and they get someone smiling in reply. With interesting logic and random results, users will have more fun with the bot.

Timeline

GSoC is about 12 weeks long, with a 25-day Community Bonding Period in addition.
I will spend 40% of my time on adding features to the Twitter bot,
35% on enhancements to the bot,
15% on performance improvements to the bot,
and the remaining 10% on fixing bugs in the emoji bot.

  1. Community Bonding (May 5 - May 29): Getting in touch with the mentors. Bonding with the other members of the project. Taking suggestions from the mentors and other community members. Planning how to start work on the project and discussing the emoji-to-image query solution.
  1. Phase I (May 30 - June 15): Implementing the JavaScript part. Working on the bot framework. Integrating it to handle the server or data side. Setting up the Twitter page and a new app for the bot implementation.
  1. Phase II (June 16 - June 30): Implementing the timeline in the bot. During this period I will implement all of the Twitter app and bot framework parts and also try to get the connection between them working.
  1. Phase III (July 1 - July 15): In this phase, I start writing the bot scripts, such as the request, reply and status JS files for the Twitter bot, and also a script for image collection and the Twitter key configuration file.
  1. Phase IV (July 16 - July 30): Start work on converting API data into JSON data for the bot, using the Wikimedia Commons API sandbox. This will help to get data about Wikimedia Commons images.
  1. Phase V (July 31 - August 10): I will spend this period improving the bot's replies and the data-fetching interface. Adding a few easy recognition features. Taking feedback from the community. Working on bugs and other functions as well. Performance improvements will be implemented in this period.
  1. Pens Down: Will work on the bugs and other issues reported by community members. Improving the load time of the app.

Other

I have also contributed to various open-source repos, including the KWoC (Kharagpur
Winter of Code) project. I have worked on a lot of projects, such as the MIT Media Lab
Innovation workshop (Pee Racers). I worked with PyCon APAC 2016 to help them promote
PyCon in India, and I was the only Indian they selected. I have also worked as a freelancer and
intern at several local companies. I have been using various languages (JavaScript, Python) and
environments (WordPress, ERP project, IoT project). I also made a 3D printer from computer
waste and recycled CD/DVD drives, and programmed some microcontrollers.

I am currently working part time as a full-stack developer at an Ahmedabad-based company. I am
also a coach in RGSoC (Rails Girls Summer of Code) this year.

Event Timeline

Sumit added a subscriber: Sumit.

@harshcrop please mention names of your mentors and add T143593 as parent of this task.

This is an assigned subtask hence not under Possible-Tech-Projects or under #Outreach-Program-projects (which is only the parent task)

harshcrop updated the task description. (Show Details)May 9 2017, 4:46 AM
harshcrop added a subscriber: Yurik.

This seems to be a ticket for a specific bot, rather than a bot framework issue that can affect several bots. I've removed the Bot-Frameworks tag from this, since I don't see what framework-level issue this addresses.

Headbomb removed a subscriber: Headbomb.May 15 2017, 2:30 PM
Dereckson triaged this task as Normal priority.May 23 2017, 12:00 PM

I made a page for data gathering that lists each emoji with its categorized Wikimedia Commons image link.
Page link: https://meta.wikimedia.org/wiki/User:Harsh_crop/emojitable

I made a GitHub repo for my GSoC project, https://github.com/harshcrop/wikiemojibot, and set up all the bot code there. The project is open source, so anyone can suggest issues or anything I should update.

Dzahn removed a subscriber: Dzahn.Jun 2 2017, 5:01 PM

Follow all the Weekly Report Updates on T166861

This is just a rough architecture; I have put my idea on paper.
You can also contribute ideas to help me build better software.

Sorry I missed you on irc! I was here but did not notice the pings. What results are your api queries giving you? Do you really intend to walk through all images on commons in order to find information about the one(s) you need? Because that is what 'allimages' does, have a look here: https://commons.wikimedia.org/w/api.php?action=help&modules=query%2Ballimages
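As an alternative to walking `allimages`, a narrower query could ask only for the files in one category. The following is a minimal sketch, assuming the `list=categorymembers` module of the Commons API; the helper name `categoryFilesUrl` and the default limit of 10 are illustrative and not part of the bot's code.

```javascript
// Build a Commons API URL that lists the files in a single category,
// instead of iterating over every image with list=allimages.
const API = 'https://commons.wikimedia.org/w/api.php';

function categoryFilesUrl(category, limit = 10) {
  const params = new URLSearchParams({
    action: 'query',
    list: 'categorymembers',
    cmtitle: 'Category:' + category,
    cmtype: 'file', // only files, no subcategories or pages
    cmlimit: String(limit),
    format: 'json',
  });
  return API + '?' + params.toString();
}

console.log(categoryFilesUrl('Laughing'));
```

The returned URL can be pasted into a browser or the API sandbox to inspect the result before wiring it into the bot.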

What else have you been working on? I didn't see updates to the repos so I am guessing there is no new code yet?

We talked about having a little bit of test data manually entered in the table for testing. How is that going?

I am making this test JSON:

{
  "😀": ["https://commons.wikimedia.org/wiki/Category:Laughing"],
  "😁": ["https://upload.wikimedia.org/wikipedia/commons/4/40/Lifeless_Face_022.jpg?uselang=en-gb"],
  "😂": ["https://commons.wikimedia.org/wiki/Category:Tears_of_joy_and_sorrow?uselang=en-gb"],
  "🤣": ["https://commons.wikimedia.org/wiki/Category:Laughing"]
}
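A lookup over test data in this shape could be sketched as follows. This assumes the mapping has been loaded into a plain object; the function name `pickLink` and the injectable `random` parameter are illustrative only.

```javascript
// Test mapping in the same shape as the JSON above: emoji -> list of Commons links.
const emojiMap = {
  '😀': ['https://commons.wikimedia.org/wiki/Category:Laughing'],
  '😂': ['https://commons.wikimedia.org/wiki/Category:Tears_of_joy_and_sorrow?uselang=en-gb'],
};

// Return one link for the emoji, chosen at random, or null if it is unmapped.
// `random` is injectable so the choice can be made deterministic in tests.
function pickLink(map, emoji, random = Math.random) {
  const links = map[emoji];
  if (!Array.isArray(links) || links.length === 0) return null;
  return links[Math.floor(random() * links.length)];
}

console.log(pickLink(emojiMap, '😀'));
console.log(pickLink(emojiMap, '🚀')); // null: no entry for this emoji
```

Keeping a list per emoji leaves room for several candidate links, which is where the random choice starts to matter.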

harshcrop added a comment.EditedJun 17 2017, 11:38 AM

I am facing a new issue in the code: the bot does not reply on Twitter to the person who tweeted the emoji. I am trying to solve it.

Does it send a tweet at all? Does it see the emoji sent to it? Have you tried adding debugging statements and writing them out to a temp file to see what it's doing?
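The suggestion about debugging statements could look roughly like this. It is a minimal sketch using only Node's built-in modules; the log file name and the stage labels are made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical log location; any writable path would do.
const LOG_FILE = path.join(os.tmpdir(), 'wikiemojibot-debug.log');

// Append one timestamped line per event so the log shows how far the bot got.
function debugLog(stage, detail) {
  const line = new Date().toISOString() + ' [' + stage + '] ' + JSON.stringify(detail) + '\n';
  fs.appendFileSync(LOG_FILE, line);
  return line;
}

debugLog('mention-received', { text: '@bot 😀' });
debugLog('reply-sent', { ok: false, error: 'example only' });
```

Calling `debugLog` at each step (mention received, emoji found, reply posted) shows which step is the last one reached when a reply goes missing.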

If you upload your current code as a pull request, I can help you to debug.

Meaning, what kind of pull request do you want from me?

If you still have uncommitted code you want feedback on, or help to debug,
you can create a branch, then commit the work in progress. Then you push
the branch to your repo. GitHub will offer to create a pull request
master..your branch, that is, one showing the difference between the master
branch and your branch.

@harshcrop: Was your comment "just for our info"? Or did you actually ask for help between your lines but do not explicitly say so? Explicit communication is welcome to avoid misunderstandings and followup questions for clarification. :)

@Aklapper I am asking for help with this. The status part is working perfectly, but when we tweet at the bot, it does not respond with a reply message. That is the part I need help with.
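One thing worth checking when replies go missing is whether the bot actually finds an emoji in the text of the incoming mention. The following is a minimal sketch of that step alone, assuming a Node.js version that supports Unicode property escapes; `firstEmoji` is an illustrative name, not a function from the repo.

```javascript
// Find the first emoji in the text of an incoming mention.
// \p{Extended_Pictographic} is a Unicode property escape; it needs the `u` flag
// and a reasonably recent Node.js.
const EMOJI_RE = /\p{Extended_Pictographic}/u;

function firstEmoji(text) {
  const match = EMOJI_RE.exec(text);
  return match ? match[0] : null;
}

console.log(firstEmoji('@wikiemojibot 😀 hello')); // 😀
console.log(firstEmoji('@wikiemojibot hello'));    // null
```

If this returns null for a tweet that visibly contains an emoji, the problem is in the matching step rather than in the Twitter reply call.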

harshcrop added a comment.EditedJun 20 2017, 11:45 AM

This is the terminal output, which I am sharing here for better understanding:
a screenshot of the npm run reply command that is not working.

(For future reference, please post text as text so it can be indexed by search engines, and just add a line like "This is all the output on the terminal" or such. Thanks.)

I deployed the bot on Heroku. I chose it because it is a good fit for the bot right now. I also read and learned about the wiki tools, but I am familiar with Heroku, so I am going with it.

Also, I am adding a scheduler script in Heroku for the bot's scheduled tweets, and for the replies I am thinking of a "worker" process.

harshcrop added a comment.EditedJun 26 2017, 6:22 AM

Things Completed Before (26 June)

  • GitHub repo for wikiemojibot
  • Forked and set up the @NYPL codebase and made changes according to the needs of the project
  • Also set up the table page for emoji and Commons image links
  • Created the Twitter app and the Twitter page; you can follow it at @wikiemojjibot
  • Also wrote JavaScript code to convert the table to JSON data; check out the repo
  • Finally, put the bot on the Heroku server
  • Made a small data set for test purposes

I made some logos for WikiEmojiBot.
#GiveReview Which one is better?

#Issue The Heroku server issue is fixed. The problem was in the config vars; I finally figured out where the problem was and fixed it. The worker is finally up and the status command runs perfectly. Thanks for the proper guidance, @Dereckson and @ArielGlenn.
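A startup check can make this kind of configuration problem visible earlier. The sketch below assumes the credentials are read from environment variables; the variable names are invented for illustration and should be replaced with whatever keys the bot's config actually reads.

```javascript
// Names assumed for illustration; use whatever keys the bot's config actually reads.
const REQUIRED_VARS = [
  'TWITTER_CONSUMER_KEY',
  'TWITTER_CONSUMER_SECRET',
  'TWITTER_ACCESS_TOKEN',
  'TWITTER_ACCESS_TOKEN_SECRET',
];

// Return the names of required variables that are unset or empty.
function missingVars(env, required = REQUIRED_VARS) {
  return required.filter((name) => !env[name]);
}

const missing = missingVars(process.env);
if (missing.length > 0) {
  console.error('Missing config vars: ' + missing.join(', '));
}
```

Running this at the top of the worker prints the missing names to the dyno log instead of failing later with an authentication error.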

Also sharing screenshot of Heroku Dyno

The first logo I think is a bit boring. But tell me more about what the logo represents, how you chose these designs.

I think a logo should be simple: something connecting emoji and Wikimedia. I made these logos using "Canva", an online platform where you can make a logo from scratch; lots of ready-made templates are also there.

@ArielGlenn if you have any idea in mind, share it and I will try to make a new one based on it.

In the black and white logo, the emojis look more like regular emoticons and are not visible at first glance; after looking at the logo a little longer, they finally pop.

The second logo is visually nice, but it doesn't convey the emoticon meaning; the tree makes me think about "organic data to analyze", the foliage being a graph data structure.

I don't have anything in mind myself, it's your bot, I wouldn't presume :-) But something that makes me think of both emojis and free media in some stylized way, that's what I would go for if I knew anything about design.

#Discussion

Que: I am using Wikimedia Commons images for my GSoC bot project, so I need help understanding the license issue. How can I provide license information in the bot?

That is a good question. What are all the possibilities? Maybe you could list some and we can go through them together to see which ones are more feasible.

harshcrop added a comment.EditedJul 1 2017, 12:27 PM

After reading the licensing policy for Wikimedia Commons images, I found the following.

  1. Wikimedia Commons only accepts free content, that is, images and other media files that are not subject to copyright restrictions which would prevent them being used by anyone, anytime, for any purpose. The user may, however, be restricted by issues not related to copyright, though, see Commons: Non-copyright restrictions and the license may demand some special measures. There is also certain material, the copyrights of which have expired in one country while still applying in another. Some of the details are explained below. Wikimedia Commons tries to ensure that any such restrictions are mentioned on the image description page; however, it is the responsibility of reusers to ensure that the use of the media is according to the license and violates no applicable law.
  1. All copyrighted material on Commons (not in the public domain) must be licensed under a free license that specifically and irrevocably allows anyone to use the material for any purpose; simply writing that "the material may be used freely by anyone" or similar isn't sufficient.

  1. In particular, the license must meet the following conditions:

  • Republication and distribution must be allowed.
  • Publication of derivative work must be allowed.
  • Commercial use of the work must be allowed.
  • The license must be perpetual (non-expiring) and non-revocable.
  • Acknowledgment of all authors/contributors of a work may be required.
  • Publication of derivative work under the same license may be required.
  • For digital distribution, use of open file formats free of digital restrictions management (DRM) may be required.

These are the three things I understood after reading it.

I want to know more about the two points below:

  • Acknowledgment of all authors/contributors of a work may be required.
  • Publication of derivative work under the same license may be required.

My point is that we need to decide on these points: either we do not need to add anything, or we add a mention acknowledging the authors/contributors.

@harshcrop Some licenses may not require anything. For example, an image released into the public domain requires nothing at all; you can reuse the image as you see fit, without even giving credit. Of course, it's nice to give credit anyway, but it would not be required.
The default license for uploaders I believe is CC-BY-SA. You could check this by trying an upload on Commons if you have an account, just stop at the license selection page if you don't really want to upload the image. That license requires attribution and a reference to the license, so that viewers of the image know they have rights to share and re-use.

You'll want to see what is considered "good enough" to comply with the CC-BY-SA license. There is probably a policy page on Commons that talks about it.
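One possible way to carry attribution into a reply is to build a short credit line from the file's metadata. The sketch below assumes the `extmetadata` object returned by the Commons API for `prop=imageinfo&iiprop=extmetadata`, which includes `Artist` and `LicenseShortName` fields; the function names, the fallback strings and the sample values are illustrative. Whether such a line is "good enough" for CC-BY-SA still has to be checked against Commons policy.

```javascript
// Build a short credit line from the `extmetadata` object that the Commons API
// returns for prop=imageinfo&iiprop=extmetadata. Field values may contain HTML.
function stripTags(html) {
  return String(html).replace(/<[^>]*>/g, '').trim();
}

function creditLine(extmetadata) {
  const artist = extmetadata.Artist
    ? stripTags(extmetadata.Artist.value)
    : 'Unknown author';
  const license = extmetadata.LicenseShortName
    ? stripTags(extmetadata.LicenseShortName.value)
    : 'see file page for license';
  return artist + ' / ' + license;
}

// Sample data in the API's shape; the values are made up for illustration.
const sampleMeta = {
  Artist: { value: '<a href="//commons.wikimedia.org/wiki/User:Example">Example</a>' },
  LicenseShortName: { value: 'CC BY-SA 4.0' },
};
console.log(creditLine(sampleMeta)); // Example / CC BY-SA 4.0
```

Appending the file page link to this line would also give viewers a route to the full license text.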

For displaying images from a link on Twitter, I read the documentation on Twitter Support (link).

How to post a link in a Tweet on the web

  • Type or paste the URL into the Tweet box on twitter.com.
  • A URL of any length will be altered to 23 characters, even if the link itself is less than 23 characters long. Your character count will reflect this.
  • Click the Tweet button to post your Tweet and link.
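The 23-character rule matters for the bot when it composes a reply that contains a Commons link. The following is a rough sketch of the length check only; it treats every `http(s)` link as 23 characters and counts everything else by code point, which is a simplification of Twitter's real counting rules. The function name is illustrative.

```javascript
const TCO_LENGTH = 23; // every link counts as 23 characters, per the text above

function approxTweetLength(text) {
  const urlRe = /https?:\/\/\S+/g;
  const urls = text.match(urlRe) || [];
  const withoutUrls = text.replace(urlRe, '');
  // Array.from counts code points, so an emoji is not counted as two UTF-16 units.
  return Array.from(withoutUrls).length + urls.length * TCO_LENGTH;
}

console.log(approxTweetLength('Hi https://commons.wikimedia.org/wiki/Category:Laughing')); // 26
```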

I did this, but it does not display the image, so I will use the upload link and hope it works.

The docs you mention above are about how to post a link, but you want more than that. You want an image preview from the link, to be displayed in the twitter feed. I would look for that information and see what turns up.

harshcrop added a comment.EditedJul 11 2017, 2:06 PM

#new issue display twitter card images

  1. As per the Twitter documentation (link), to create Twitter cards for images we have to follow this:

Twitter document

For that we need to add meta tags

<meta name="twitter:card" content="summary" />
<meta name="twitter:site" content="@nytimesbits" />
<meta name="twitter:creator" content="@nickbilton" />
<meta property="og:url" content="http://bits.blogs.nytimes.com/2011/12/08/a-twitter-for-my-sister/" />
<meta property="og:title" content="A Twitter for My Sister" />
<meta property="og:description" content="In the early days, Twitter grew so quickly that it was almost impossible to add new features because engineers spent their time trying to keep the rocket ship from stalling." />
<meta property="og:image" content="http://graphics8.nytimes.com/images/2011/12/08/technology/bits-newtwitter/bits-newtwitter-tmagArticle.jpg" />

like this.

Also, Twitter’s crawler respects Google’s robots.txt specification when scanning URLs. If a page with card markup is blocked, no card will be shown. If an image URL is blocked, no thumbnail or photo will be shown.

For example, here is a robots.txt which disallows crawling for all robots, except Twitter’s fetcher:

User-agent: Twitterbot
Disallow:

User-agent: *
Disallow: /

We can also check whether a link is valid for a Twitter card at https://cards-dev.twitter.com/validator.

As a solution, we could use an npm package if one were available. I searched, but none of them work now, so this option is not available for the bot.

Also, sharing screenshot of Card link validator

This screenshot of Wikimedia Common images link

This screenshot of @NYPL images link

I have also pushed a JSON file with 400+ image links with emoji so far!

Help Needed!

I need an Amazon account to host my bot; I want to test there because I am facing some issues with the Heroku dyno.

I have used up my student Amazon credit; that is why I need someone's account to test my bot.

D3r1ck01 added a subscriber: D3r1ck01.EditedJul 11 2017, 2:54 PM

You can request access to Wikimedia Tool Labs and use their cluster services to test your bot 😊, check here: https://tools.wmflabs.org

@harshcrop: Not sure what an "Amazon account" is (do you mean AWS?). Are there specific reasons to not use Tool Labs? See https://tools.wmflabs.org/ :)

I am using Wikimedia Tool Labs now; IRC is helping me a lot.

I got Wikimedia Tool Labs access. They accepted my membership, and now I am setting up my tool account.

harshcrop added a comment.EditedJul 11 2017, 6:45 PM

I completed my setup on Tool Labs and created my Unix user: tools.toolname

Unix group: tools.toolname

I am also reading the Tool Labs/Kubernetes documentation to understand it and put my app there.

Need Help!

How can I convert Wikimedia Commons image links so that they work as Twitter card images?

I searched and mentioned all my findings in the comments above.

Also, I am following this link to understand how I can use the links: https://phabricator.wikimedia.org/T63487

Bug in Wikimedia Commons images: Bug link

#Issue Twitter Card image

Following my last conversation with my mentor @ArielGlenn about the Twitter card image problem for Wikimedia Commons images, I got a pointer from the Wikimedia Commons IRC channel: an account that uses Commons images in its tweets, https://twitter.com/aviationcommons

So I pinged that account for help and got an answer; I am sharing a screenshot of it here.

For posting an image instead of a link, I can try https://dev.twitter.com/rest/reference/post/media/upload. This helps upload media images to Twitter, but version 1 of the API is no longer in use, and version 1.1 has more parameters and is more complex now.

For POST media/upload on Twitter, I now have to try the OAuth-enabled curl for the Twitter API, a tool called "Twurl".

So, as per the discussion in today's meeting on IRC, we found out that xAuth has been closed by Twitter and that Twurl is no longer continued by Twitter; also, this idea does not work for the Twitter card images.

An idea for the Twitter card images: use a thumbnail URL as the URL. For that I refer to this link.

I am using the thumbnail URL of Wikimedia Commons images. Example: https://upload.wikimedia.org/wikipedia/commons/thumb/d/d3/Albert_Einstein_Head.jpg/220px-Albert_Einstein_Head.jpg

Example: Get the URL for a 220-pixel-wide thumbnail of File:Albert Einstein Head.jpg from Commons.

api.php?action=query&titles=File:Albert%20Einstein%20Head.jpg&prop=imageinfo&&iiprop=url&iiurlwidth=220
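The query above can also be built and read from code. This is a minimal sketch, assuming the JSON response nests results as `query.pages[<pageid>].imageinfo[0].thumburl`; the helper names are illustrative, `format=json` is added to the parameters, and the page id in the sample response is invented.

```javascript
const COMMONS_API = 'https://commons.wikimedia.org/w/api.php';

// Build the imageinfo query from the example above, with format=json added.
function thumbQueryUrl(fileTitle, width) {
  const params = new URLSearchParams({
    action: 'query',
    titles: 'File:' + fileTitle,
    prop: 'imageinfo',
    iiprop: 'url',
    iiurlwidth: String(width),
    format: 'json',
  });
  return COMMONS_API + '?' + params.toString();
}

// Pull the first thumbnail URL out of a parsed API response, or null if absent.
function extractThumbUrl(response) {
  const pages = (response.query && response.query.pages) || {};
  for (const page of Object.values(pages)) {
    if (page.imageinfo && page.imageinfo[0] && page.imageinfo[0].thumburl) {
      return page.imageinfo[0].thumburl;
    }
  }
  return null;
}

// Sample response in the API's shape; the page id "123" is made up.
const sampleResponse = {
  query: { pages: { '123': { imageinfo: [{
    thumburl: 'https://upload.wikimedia.org/wikipedia/commons/thumb/d/d3/Albert_Einstein_Head.jpg/220px-Albert_Einstein_Head.jpg',
  }] } } },
};

console.log(thumbQueryUrl('Albert Einstein Head.jpg', 220));
console.log(extractThumbUrl(sampleResponse));
```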

Output: (screenshot)

I tried this thumbnail URL in the card validator but found the same result. I am sharing a screenshot here.

We talked about this on IRC just a few minutes ago; these links are meant for retrieving scaled images. They don't have meta tags for twitter; nothing on Commons does.

As per our discussion in the meeting, my task is to write the Twitter upload code. I wrote a script for it following Twitter's POST media/upload.

Code file

I tried one crazy thing: I created a folder where I can store images and related its path to the emoji in the JSON file, but it did not work out; the tweet produced an error like "try again to tweet".

My upload.js Twitter media code is not working. I am debugging it, but I found out on Twitter that I need every file in one place and have to connect them with the Twitter POST media/upload API 1.1. Also, I had some issues in the test case with the upload.js file, so I removed it for now.

harshcrop added a comment.EditedJul 20 2017, 5:20 AM

Uploading media with the Twitter API requires storing the media in one place; that folder of images can then be linked in the code:

const PATH = path.join(__dirname, 'path/dirname');

So downloading all the Wikimedia Commons images is not a good idea in my view: it means storing images for almost 3000+ emoji links, and more if we add to them in the future. That is the part I am concerned about. I would prefer a link, which gives valuable credit to the uploader and also makes good use of the Wikimedia Commons images.
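One way to avoid keeping a local copy of every image might be to download the bytes into memory at reply time and pass them on as base64, since the `media_data` parameter of media/upload takes base64 content rather than a file path. The following is a minimal sketch using Node's built-in `https` module; the function names and the User-Agent string are illustrative, and redirects and retries are not handled.

```javascript
const https = require('https');

// Join downloaded chunks and encode them as base64, the form that the
// `media_data` parameter of media/upload accepts.
function chunksToBase64(chunks) {
  return Buffer.concat(chunks).toString('base64');
}

// Download an image into memory and hand its base64 content to the callback.
// Wikimedia asks clients to send a descriptive User-Agent; this one is a placeholder.
function fetchImageBase64(url, callback) {
  const options = { headers: { 'User-Agent': 'wikiemojibot-example/0.1' } };
  https.get(url, options, (res) => {
    if (res.statusCode !== 200) {
      res.resume(); // discard the body
      callback(new Error('Unexpected status ' + res.statusCode));
      return;
    }
    const chunks = [];
    res.on('data', (chunk) => chunks.push(chunk));
    res.on('end', () => callback(null, chunksToBase64(chunks)));
  }).on('error', callback);
}

console.log(chunksToBase64([Buffer.from('ab'), Buffer.from('c')])); // YWJj
```

With this approach only the emoji-to-link table needs to be stored; the image itself exists in memory just long enough to be uploaded.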

Also, there is one example I found not for images but for video and he is also mentioned in it that you have to store the file on the server or locally.

Example link: http://lorenstewart.me/2017/02/03/twitter-api-uploading-videos-using-node-js/

Besides this work, I am pushing more JSON data to add more links with emoji.

The Heroku scheduler is working fine; the problem is fixed.

harshcrop added a comment.EditedJul 23 2017, 7:03 PM

Finally, I solved the Twitter image upload problem through trial and error, writing my first API script for Twitter. Thanks to my mentors for encouraging me to try it by myself.

This came after many trials and reading lots of Twitter developer forums and documentation (I am lazy about reading documentation, but I have changed my habit for now).

For this, I use the Twitter OAuth 1 access token approach.

My first approach: Twurl (Twurl is like curl for the Twitter API)

Install command: $ gem install twurl

After that, use the authorize command:

$ twurl authorize -u username -p password --consumer-key key --consumer-secret secret

After that, you get a URL to paste into the browser, which gives you a PIN for API authentication.

Next, the command to POST the media file:

$ twurl -H upload.twitter.com "/1.1/media/upload.json" -f ~/path/to/filename -F media -X POST

output:

{"media_id":889199191934808065,"media_id_string":"889199191934808065","size":92485,"expires_after_secs":86400,"image":{"image_type":"image\/jpeg","w":1366,"h":768}}

Next command:

$ twurl "/1.1/statuses/update.json" -d "media_ids=889199191934808065&status=Hello"

final output:

https://twitter.com/wikiemijibot/status/889092448173084672

script:

var Twit = require('twit')
var fs = require('fs')

// Credentials from the Twitter app settings page
var T = new Twit({
  consumer_key:         'YOUR_CONSUMER_KEY',
  consumer_secret:      'YOUR_CONSUMER_SECRET',
  access_token:         'YOUR_ACCESS_KEY',
  access_token_secret:  'YOUR_ACCESS_SECRET'
})

// media/upload expects the image as a base64-encoded string
var b64content = fs.readFileSync('/path/to/img', { encoding: 'base64' })

// first we must post the media to Twitter
T.post('media/upload', { media_data: b64content }, function (err, data, response) {
  if (err) return console.error('media upload failed:', err)

  // now we can reference the media and post a tweet (media will attach to the tweet)
  var mediaIdStr = data.media_id_string
  var params = { status: 'Tweet with a photo!', media_ids: [mediaIdStr] }

  T.post('statuses/update', params, function (err, data, response) {
    if (err) return console.error('status update failed:', err)
    console.log(data)
  })
})

Good work @harshcrop. Keep it up. :)

@harshcrop Great to see your emoji bot making progress :) One thing you might want to consider before commenting on this task. Collect your notes elsewhere first, summarize them and then post them all together here. That way you will have fewer comments, and it might be super easy for a subscriber like myself to follow who know nothing about this project but really interested in staying up to date :)

I uploaded my upload.js file in a new branch (test). My code is live but has some errors that I am debugging; merging it with the bot script is my task now.

@srishakatux Okay, I will now post summarized work here. The reason I post here is so that people know what I am doing in my project, and if I am stuck anywhere they can help me out and I can get their suggestions here. My mentor told me to write everything on the ticket; this helps me take notes, and I have all the documentation in one place.

We've actually been encouraging him to comment more often for the reasons he mentioned above. Is there another place where he could summarize? Would the weekly blog posts suffice for that?

I also write a blog; check it out at http://harshcrop.me/blog

So I saw an image in the wikiemoji bot tweets, very exciting! Now we just need you to have the code in your repo that does it, so that you can start up the bot on heroku and let it run for awhile, and we can play with it, and also look at the repo before the evaluation period ends on Friday the 28th.

A side note: don't forget that your commit messages in the repo are not just for you, they are for you in a year when you forget what changes you made and when, and they are for other people who have no idea about your code. So, say a lot more, like, instead of just "test", what are you testing? Instead of just "update files", what are you updating and why? Instead of "modified worker", what change did you make and why did you need to make it?

Also, thanks for the blog post!

Sorry, I was just testing the code; that is why I used these commit messages. But you are right, and I will change this; as it is, it does not help other people. My mistake.

Hi! Is there anything remaining in this task before it can be resolved? thank you!

This task was not completed. We need the bot to start up automatically and authenticate properly without manual intervention, and to display images inline (preview) on Twitter; there also should be a community-maintained table mapping emoji to images. Or some other mechanism for that mapping should be proposed and used.

D3r1ck01 renamed this task from Build a similar to @NYPLEmoji bot for Commons images to [GSoC Proposal 2017] Build a similar to @NYPLEmoji bot for Commons images.Mar 20 2018, 8:04 AM
Dereckson closed this task as Resolved.Mar 22 2018, 1:53 PM

Marked as resolved, as the proposal is now done and there is nothing more actionable on this task.