@Tgr can you please review this and give any suggestions?
Oct 23 2017
@Tgr can you please review this and give any suggestions?
Oct 6 2017
Thanks @Tgr, it's resolved.
Oct 5 2017
Hi @Tgr !
I have installed MediaWiki (cloned it with git clone https://gerrit.wikimedia.org/r/p/mediawiki/core.git) in the vagrant folder, but I am still encountering an error: localhost:8080 says "No wiki found". Log of the vagrant up command:
Oct 3 2017
Thanks @Tgr for the suggestions.
I will try it this way.
Oct 2 2017
Hi @Tgr !
Thank you for your comments.
Actually, I am facing an issue with the local environment: the "vagrant up" command keeps failing.
It would be really helpful if you could suggest something to fix it. While cloning the repository, the mediawiki folder isn't getting cloned. I have also tried the troubleshooting steps mentioned here, but they did not help. I also tried cloning the mediawiki folder externally, but that did not resolve it either.
I am trying to sort it out and will update the code as soon as it is resolved.
Sep 10 2017
I have made a few changes. Please check it here.
I have added a module in WikimediaEvents that tracks a button click using the GuidedTourButtonClick schema. I am not sure if I am on the right track, so after reviewing the code could you tell me whether I am? I was also unsure about a few parameters of the schema, so I have marked them as "Unknown" for now.
Sep 8 2017
@Kamsuri5 these kinds of captchas are useful for small sites, where it is reasonable to assume that the bot trying to solve the captcha will be some kind of automated spambot that crawls the web and finds forms to post to randomly, without having any idea what kind of countermeasures it is going to face. For a large site like Wikimedia, where there is sufficient motivation for spambot writers to test how the form works and adapt their bots to defeat the captcha, it's not that helpful - it's easy to program a spambot to click on a div once you know the div is there.
Yes, a spambot designed specifically for cracking a particular site can break this kind of mechanism.
Actually, I was fascinated by this idea, as it would be a good countermeasure for as long as the implementation logic is not public, but it doesn't fit here.
I haven't looked at the invisible reCaptcha up close but I'd imagine it works like the checkbox one, except it uses the click on the submit button, so there is no need for a checkbox.
I have read a bit about Invisible reCAPTCHA but haven't found any clue other than mouse movements. Yes, it doesn't have checkboxes. I will update here if I find anything more about how Invisible reCAPTCHA works.
Thank you @awight for the links and explanation.
Sep 7 2017
Apr 27 2017
@Basvb can you check this code: https://github.com/kamsuri/Single-Image-Batch-Upload? It does not exactly match those categories but is able to extract the important tags. I have some doubts; could you connect with me on IRC whenever you are free?
Apr 3 2017
@Mvolz thanks for reminding me, I have submitted the proposal.
Apr 2 2017
@Basvb but the upload can be done without a wikitemplate as well; the description is the only required field, and it is fetched from the title itself. I will also try an upload where I frame a wikitemplate.
- After the 25th my internship programme will be finished and my classes will resume.
- My internship is on weekdays only and will take approximately 22-23 hours a week.
- According to my daily schedule, I will be able to contribute 50 hours a week. But if you want me to specify the schedule for only 40 hours, I will make the required changes.
@Basvb and @zhuyifei1999, I have studied the Commons licensing terms and conditions. I uploaded images via the upload script of Pywikibot, and as far as I can tell from how I configured it, it does not deal with copyright; that is why my earlier images were deleted. The image I uploaded with the Commons UploadWizard asks for copyright permission, so it was not deleted. I have researched a bit and found copyright tags at https://commons.wikimedia.org/wiki/Commons:Copyright_tags, with which I think we can deal with this problem.
@zhuyifei1999 I have made the questions bold; I hope it will now be easier for you to review the proposal.
Apr 1 2017
Mar 30 2017
Alright, I will close the GSoC '17 proposal, as this one is more descriptive.
@Basvb Actually I was about to write GLAM users; I have changed it. In deliverables I have listed the tasks involved, basically to break down the project. Should I change that heading, or keep "deliverables" and change the content? I have already given detailed information on each task in the weekly schedule, so it would be repetition. Yes, I am going through the licensing terms and conditions.
@Basvb Don't I have to submit different proposals for both programs? The project schedule is the same in both proposals; they only differ in structure, as Outreachy has its own template. So if one proposal will serve for both programs, I will close one of the tasks.
@Basvb kindly review this proposal as well, which is for GSoC '17. I have written it keeping in mind all my mistakes in the Outreachy proposal; I hope to have your feedback on it. I have included the link to my Commons account, but all my earlier images were deleted for copyright reasons and I was warned. I will study the terms and conditions and then try to upload more.
@Basvb thank you so much for such detailed and thorough feedback; I have made the required changes. Please review my updated proposal, which I hope fulfills all the requirements. I hope to have your feedback on this as well.
Mar 29 2017
Mar 28 2017
Regarding images 2 and 3, the process is:
- Display all the archives of the selected GLAM.
- Once the archive folder is selected, show its images.
But I get your point: the images could be large in number. I had thought of that earlier, but only to a limited extent.
Regarding images 6 and 7: I thought that GLAMs showcase their own work?
Your idea of showing the results of the mappings is nice; we can implement it.
Actually, I was not aware that it would be hosted on http://tools.wmflabs.org/.
Thank you for the appreciation; I will make the required changes.
@Basvb, @tom29739, @zhuyifei1999, can you please review the mockups I designed for the UI at https://github.com/kamsuri/Single-Image-Batch-Upload?
Mar 26 2017
@Capt_Swing: Thank you again for your thoughts on the project. The upcoming structured data is a good point to keep in consideration. For the past years I have personally tried to use the regular (structured) templates to convey information; these can then be easily transferred into the structured data format. Within the tool we'd have to keep in mind that in a few years the metadata mappings and some other parts will likely need to change to use structured data directly.
@Infobliss and @Kamsuri5: Very good to hear that you are interested. I'm curious to see your proposals; maybe it is a good idea to plan a (short) IRC session to talk a bit about our ideas and ask each other questions. Tom and Zhuyifei are regulars in Cloud-Services, and I'll try to be there as often as possible as well.
On the specific questions: I'll be moving the microtasks into separate tasks with some more information, allowing us to discuss them in more depth. For the link/Flask: optimally there will be a simple front end with a drop-down list; besides that, there will also be a link where the GLAM's name + ID can be used to call the tool directly (e.g. with a button from the image page at the GLAM). There will be a predetermined name per GLAM, so it's not up to the user to enter the GLAM's name as free text.
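The direct-link entry point described above can be sketched with the Python standard library alone. This is a minimal illustration, not the actual tool: the query parameter names (glam, id) and the GLAM list are assumptions.

```python
from urllib.parse import urlparse, parse_qs

# Predetermined GLAM names: the user picks from a drop-down list
# (or a GLAM site links here directly) -- never free text.
KNOWN_GLAMS = {"Nationaal Archief", "Rijksdienst voor het Cultureel Erfgoed"}

def parse_tool_link(url):
    """Extract (glam, image_id) from a direct tool link, or None if invalid."""
    query = parse_qs(urlparse(url).query)
    glam = query.get("glam", [None])[0]
    image_id = query.get("id", [None])[0]
    # Reject anything that is not a known, predetermined GLAM name.
    if glam not in KNOWN_GLAMS or image_id is None:
        return None
    return glam, image_id
```

The front end would simply build such a link from its drop-down selection; the validation step is what enforces the "no free text" rule.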
On naming: the idea of the function is that, given an ID, the title of the work (sometimes not known), a description, and the name of the GLAM, we create a standard format for the title (if there is a title: title; else: description - ID - GLAM name.ext); more on this in the specific task.
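The naming rule above can be sketched as follows. This is a minimal illustration; the exact separators and fallback behaviour are assumptions based on the format given in the comment.

```python
def make_title(glam, work_id, ext, title=None, description=None):
    """Standard file title: the work's title if known, otherwise
    "description - ID - GLAM name", plus the file extension."""
    base = title if title else f"{description} - {work_id} - {glam}"
    return f"{base}.{ext}"
```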
On APIs: I've seen quite a few GLAM collections with an API; this is where we get all the information required for the upload, so a GLAM needs such an API for us to be able to make a metadata mapping (scraping or database dumps are potentially an option, but let's not focus on those at the start). Two quick examples are http://www.gahetna.nl/beeldbank-api/opensearch/?q=2.24.14.02&count=100&startIndex=1 and http://cultureelerfgoed.adlibsoft.com/harvest/wwwopac.ashx?database=images&search=pointer%201009%20and%20BE=%22interieur%22&limit=10&xmltype=grouped . The second bullet point under Misc is to make a list of some relevant GLAMs; let me see if I can make a small start with that to give a better idea of some relevant use cases, and I'll include the relevant APIs.
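Since each GLAM API returns its own XML, the metadata mapping essentially translates API fields into a common record. A minimal sketch of that idea, assuming an invented element layout (each real GLAM would need its own mapping for its actual schema):

```python
import xml.etree.ElementTree as ET

def extract_records(xml_text):
    """Map each <record> in a (hypothetical) GLAM API response
    to a plain dict with the fields needed for an upload."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter("record"):
        records.append({
            "id": rec.findtext("id"),
            "title": rec.findtext("title"),
            "description": rec.findtext("description"),
        })
    return records
```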
Kamsuri5, those are indeed the main problems. For the first point, the issue is that there is often a large collection (hundreds of thousands of images) which could clutter Wikimedia Commons if all files were uploaded. Doing uploads one by one allows for more post-processing and for only the most relevant files to be uploaded.
Mar 25 2017
Mar 24 2017
Hello everyone, now that the mentors have been finalized I want to take up this project for GSoC '17/Outreachy (round 14). I have gone through the details of the project and have started working on the microtasks.
Mar 21 2017
@Basvb I hope you soon find your second mentor, as I am looking forward to contributing to this project in the upcoming GSoC/Outreachy.
Mar 18 2017
Alright, that's your call, but the errors in the second patch occurred because I could not tell from your comment what exactly you meant. I have devoted a lot of time to this project in the past few days to gain an in-depth knowledge of it. This is my first time contributing to open source, so maybe I did not understand a few things. But even after making errors, I was the one who stayed to correct them. It is not that I lack coding knowledge. I can only request that you reconsider my work, and I would like to thank you for all your help and time.
Mar 17 2017
- Updated code
Is this fine, or did you want something else? What I understood from your earlier comment was that we will use a regex for checking formatting and wgUrlProtocols for checking protocols.
And sir, I submitted the autoedit.php patch earlier; you haven't merged it? Was there a problem with it?
Mar 16 2017
- Updated regex to check "mailto" as well
- Updated to check for both formatting and protocols
Do you mean we should check both ways, i.e. via the regex as well as wgUrlProtocols?
Okay, this way we will be able to check for common URLs, as we are not sure whether wgUrlProtocols is working correctly.
- Updated regex expression
This regex works fine for URL validation. Should I submit a patch that validates URLs using this regex, and only for http, https, ftp, and ftps?
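For illustration, here is a rough sketch of that kind of combined check in Python; the real Page Forms patch would be PHP/JavaScript, and the exact pattern here is an assumption (hard-coding http/https/ftp/ftps where the patch would consult $wgUrlProtocols).

```python
import re

# One regex that checks both the protocol (here a fixed list standing in
# for $wgUrlProtocols) and a very rough shape for the rest of the URL.
URL_RE = re.compile(r"^(https?|ftps?)://[^\s/$.?#]+\.[^\s]*$", re.IGNORECASE)

def is_valid_url(url):
    return URL_RE.match(url) is not None
```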
Mar 13 2017
I have tested the second patch separately but could not figure out how to check it in the application, so it would be really helpful if you could guide me a bit on that.
Mar 12 2017
- Updated previous revision
I have checked includes/PF_AutoeditAPI.php but was not able to check libs/PageForms.js.
The .arcconfig file is for Arcanist; Arcanist checks whether it is in the project's root directory.
- Updated previous revision
I am sorry for the previous patch; it was not correct.