
Allow contributors to update their own details in tech metrics directly
Closed, DeclinedPublic

Description

The purpose of this task is to provide a full identities manager built on top of SortingHat. SortingHat is the technology that manages person identities (including merging of identities for the same real person) and affiliations (relationship with companies and other organizations) in the Grimoire Dashboard. Grimoire is the technology underlying Korma. When completed, this project will allow WMF contributors to manage and update the information about themselves and other contributors in Korma.

Korma is the place where all of the community software development metrics are aggregated. It also provides profile information (e.g., information about Raimond Spekking). This information is based on identifying persons in the different software development repositories, where they may use different identities (e.g., different email addresses). These identities are tracked, and activity in the different repositories is retrieved using MetricsGrimoire. The SortingHat database keeps the correspondence between all those identities and a "unique" (merged) identity, hopefully corresponding to a real person. It also keeps track of a person's affiliations over different periods of time. This later allows tracking activity per organization (by considering the activity of its affiliates).
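To make the relationship between identities, merged ("unique") identities and time-bounded affiliations concrete, here is a minimal sketch in plain Python. It is not SortingHat's actual schema or API: the class names, fields and the `merge` helper are all hypothetical and only illustrate the idea described above.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Identity:
    """One identity as seen in a single repository (hypothetical fields)."""
    source: str  # e.g. "git" or "gerrit"
    email: str


@dataclass(frozen=True)
class Enrollment:
    """Affiliation of a person to an organization over a period of time."""
    organization: str
    start: date
    end: date


@dataclass
class UniqueIdentity:
    """A merged identity, hopefully corresponding to one real person."""
    name: str
    identities: set = field(default_factory=set)
    enrollments: list = field(default_factory=list)

    def organization_on(self, day):
        """Return the organization the person was affiliated with on `day`."""
        for enrollment in self.enrollments:
            if enrollment.start <= day <= enrollment.end:
                return enrollment.organization
        return None


def merge(first, second):
    """Merge two unique identities, keeping the name of the first one."""
    return UniqueIdentity(
        name=first.name,
        identities=first.identities | second.identities,
        enrollments=first.enrollments + second.enrollments,
    )
```

With a structure like this, activity recorded under any of the merged identities can be attributed to one person, and `organization_on` gives the affiliation to use when counting activity per organization for a given date.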

Currently, SortingHat information can be managed in three ways:

  • By directly accessing the database, which is cumbersome, requires permissions to the whole database, needs knowledge of SQL and the structure of the database, and is error-prone.
  • By using a JSON file which can be exported and imported from the database using SortingHat. This requires manually editing a JSON file, and then access to the database.
  • By using the SortingHat command-line tools, which also require access to the database.

None of these methods is friendly to casual users, and all of them require unrestricted access to the database. This project is about creating a more user-friendly system for managing the SortingHat database. It consists of:

  • A REST HTTP API, written in Python on top of SortingHat, to access SortingHat databases. This will include authentication and access control.
  • An HTML5 application, based on AngularJS, to access that REST API from a browser in a user-friendly way.

The REST API will be implemented in Flask or Django:

  • It will interface with the actual actions in the database.
  • It will follow the SortingHat API.
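
As a rough illustration of what such an HTTP layer does, the sketch below maps REST-style requests onto operations on an in-memory dictionary. It is framework-independent on purpose: a real implementation would sit inside Flask or Django and call SortingHat instead. The `/identities/<id>` route, the `handle` function and the stored fields are assumptions made for this example only.

```python
import json

# In-memory stand-in for the SortingHat database (hypothetical data).
STORE = {"1": {"name": "Jane Doe", "emails": ["jane@example.org"]}}


def handle(method, path, body=None, store=STORE):
    """Map an HTTP-style request to a (status code, JSON text) pair."""
    parts = [part for part in path.split("/") if part]
    if len(parts) != 2 or parts[0] != "identities":
        return 404, json.dumps({"error": "not found"})
    uid = parts[1]

    if method == "GET":
        if uid not in store:
            return 404, json.dumps({"error": "unknown identity"})
        return 200, json.dumps(store[uid])

    if method == "PUT":
        # Create or replace the identity with the JSON document in the body.
        store[uid] = json.loads(body)
        return 200, json.dumps(store[uid])

    if method == "DELETE":
        if store.pop(uid, None) is None:
            return 404, json.dumps({"error": "unknown identity"})
        return 204, ""

    return 405, json.dumps({"error": "method not allowed"})
```

For example, `handle("GET", "/identities/1")` returns status 200 with the stored record as JSON, while an unknown path or identity returns 404. Authentication and access control would be checked before a request reaches this dispatch step.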

The HTML5 application will:

  • Provide authentication and access control, based on those provided by the REST API.
  • Allow users to manage unique identities, affiliations and countries, including add/remove/modify identities, add/remove/modify organizations, add/remove/modify domains for an organization, affiliate/remove/modify people to organizations, merge identities, merge enrollments.
  • Help to list identities, affiliations or organizations.
  • Provide a capability for searching identities, affiliations and organizations.

In addition, it will provide improvements on the current SortingHat heuristics:

  • Improve the current set of unique identities matcher. This is needed to improve the information provided to the final users in the web front-end.
  • Extend the current matching module of SortingHat.
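
To illustrate what an identities matcher does in the simplest case, the sketch below groups identities that share the same email address after normalization. This is only an assumed baseline heuristic written for this example, not SortingHat's actual matching module; the `normalize` and `match_by_email` names are hypothetical.

```python
from collections import defaultdict


def normalize(email):
    """Normalize an email address so trivial variants compare equal."""
    return email.strip().lower()


def match_by_email(identities):
    """Group (identity_id, email) pairs that share a normalized email.

    Returns a sorted list of sorted id lists, one per group that contains
    more than one identity, i.e. the candidates for merging.
    """
    groups = defaultdict(list)
    for identity_id, email in identities:
        groups[normalize(email)].append(identity_id)
    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)
```

A web front-end could present each returned group to the user as a merge suggestion, leaving the final decision to a person rather than merging automatically.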

In addition, the system is expected to have two different users: admin and anonymous. The latter would reach the tool through a link added to every profile found in Korma, which will allow users to update their identities. This user would see a subset of the information accessible to the admin user. Thus, the tool will have a simple user authentication process.
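
The two-user model can be sketched as a small permission table in which the anonymous role holds a subset of the admin's actions. The action names and the exact split between the roles are assumptions for illustration; the task does not specify them.

```python
# Hypothetical action names; the task only says that "anonymous"
# gets a subset of what "admin" can access.
ADMIN_ACTIONS = {"view", "edit_own", "edit_any", "merge", "delete"}

PERMISSIONS = {
    "admin": ADMIN_ACTIONS,
    "anonymous": {"view", "edit_own"},  # assumed subset
}


def is_allowed(role, action):
    """Return True if `role` may perform `action`; unknown roles get nothing."""
    return action in PERMISSIONS.get(role, set())
```

Keeping the table in one place makes it easy to check that the anonymous role never gains an action the admin lacks.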

URL: http://korma.wmflabs.org/
See Also:

Git Repositories Related to this ticket

angularJS templates - here
flask rest api - here

Report Related to this ticket

Details

Reference
bz58585

Related Objects

Event Timeline

Qgil added a comment.Jul 3 2015, 11:54 AM

The WMF Annual Plan 2015-16 includes a goal related to this task:

Set and monitor code review KPIs for all community-sourced contributions

Solving this task will help identifying "community-sourced" contributions.

jayvdb added a subscriber: jayvdb.

Hello!

End of GSoC is fast approaching. 17 August is "Suggested pencils down" deadline and 21 August is "Firm pencils down" deadline. It is expected that you don't dive into new features which might take longer than two weeks to complete and instead work on polishing up your project, testing thoroughly and getting your code merged into the main branch. I hope this project is almost complete so you can merge it and make it available to everyone as quickly as possible. :)

A few questions (for both mentors and student):

  • Are you confident in completing the project on time?
  • By when do you think you can merge the code, if at all?
  • Are there any major blockers or important missing features?

We are looking for projects which are (nearly) complete to feature on our post on Wikimedia and Google OSPO's blogs (for example: http://google-opensource.blogspot.in/2015/02/google-summer-of-code-wrap-up-processing.html). If you're interested in getting yours up there, hurry up and get this finished!

The hard deadline on getting code merged is September. See T101393: Goal: All completed GSoC and Outreachy projects have code merged and deployed by September for details.

We'll be asking the students to demo their projects towards the end of the program as well.

Good luck!

Nemo_bis removed a subscriber: Nemo_bis.Aug 6 2015, 6:56 PM
Niharika removed Sarvesh.onlyme as the assignee of this task.Aug 18 2015, 4:23 PM
Qgil added a comment.Sep 23 2015, 9:01 AM

@Acs @Dicortazar this project is still featured at Possible-Tech-Projects. Do you want to propose it for Outreachy-Round-11 for December-March?

QuimGil removed a subscriber: QuimGil.Sep 23 2015, 9:07 AM
Qgil added a comment.Sep 23 2015, 9:35 AM

This is a message sent to all Possible-Tech-Projects. The new round of Wikimedia Individual Engagement Grants is open until 29 Sep. For the first time, technical projects are within scope, thanks to the feedback received at Wikimania 2015, before, and after (T105414). If someone is interested in obtaining funds to push this task, this might be a good way.

Is there any chance that this project gets proposed for Outreachy-Round-11. I'd like to contribute. I've worked with PHP, Javascript, HTML5, CSS3 and I am quite familiar with Angular framework as well. I've made an app with python flask framework as well. Here's the link to my Github Repository https://github.com/simmimourya1

Is there any chance that this project gets proposed for Outreachy-Round-11.

@Dicortazar and @Acs: Could you answer this, as possible mentors? See Qgil's comment above.

Hi there,

I'm having some discussion at https://phabricator.wikimedia.org/conpherence/200/ about this.

I guess that link is public, so anyone is welcome to participate :).

@Dicortazar, that link is not public. :)

I'm having some discussion at https://phabricator.wikimedia.org/conpherence/200/ about this.

You need to explicitly add participants to allow them to access that discussion thread.

Hi! I would like to work on this project for Outreachy December 2015-March 2016.
I am familiar with data analytics and machine learning. I have worked on a project to detect skin diseases by analysing images.
I am also familiar with Django and Python. I am developing a website for integration of APIs using Django.
@jgbarah could you suggest what I should do next?

Is there any chance that this project gets proposed for Outreachy-Round-11. I'd like to contribute. I've worked with PHP, Javascript, HTML5, CSS3 and I am quite familiar with Angular framework as well. I've made an app with python flask framework as well. Here's the link to my Github Repository https://github.com/simmimourya1

Sorry for my delay in coming here. Yes, you can work on the project. But it is much more in the area of Python than PHP, I have to say... Currently the basic technology is SortingHat, which provides access to identities, merging of identities, affiliation, etc. I'm updating the description to reflect that, and be a bit more precise about the idea.

Hi! I would like to work on this project for Outreachy December 2015-March 2016.
I am familiar with data analytics and machine learning. I have worked on a project to detect skin diseases by analysing images.
I am also familiar with Django and Python. I am developing a website for integration of APIs using Django.
@jgbarah could you suggest what I should do next?

As I've just commented, I'm going to update the description a bit. Let's talk after that. I guess there is probably little need of analytics and the like, the project is mostly about implementing an HTTP API to a MySQL database in Python, and an Angular-based HTML5 front-end to it.

jgbarah updated the task description.Oct 6 2015, 11:09 PM

Proposal for GSoC 2015, by Sarvesh Gupta, moved to this comment, to make the description more clear, while preserving the interesting analysis and planning, that maybe somebody can use as inspiration.

Proposal

Name and contact information

Name: Sarvesh Gupta
Email: sarvesh.onlyme@gmail.com(primary), sarveshgpt1991@yahoo.com
IRC or IM networks/handle(s): s1991
Location: Roorkee, India
Timezone: Kolkata, INDIA, UTC+5:30
Typical working hours: 1pm to 2am until 20th July, 5pm to 2am after 20th July

Synopsis

This project is meant to give contributors a way to manage their identities and update their information through a web interface built on top of the Korma technology. It also includes developing a dashboard for the admin, so that she can control contributors' profiles and identities. Currently, a MediaWiki contributor has to go through a time-consuming process to get their profile information updated in the database, and there is no admin control because no web interface is available, so the community needs a web interface that gives contributors easy access to their profiles. Moreover, sortingHat needs a REST API and a more precise method for identity matching, so this project also aims to improve sortingHat and expose it through the Flask framework.

What it means to accomplish?

Extends the functionality of MediaWiki-Dashboard

Step 1: Develop an angularJS based UI providing contributors with authentication access to deal with their identities, affiliation.

Step 2: Presenting information to the users i.e. identities, affiliation or organizations.

Step 3: Allowing users to update their information, add new identities and so on.

Step 4: Allow contributors to search for their profile using some sort of identity and to perform actions like sync or merge with their existing multiple identities which will be handled by sortingHat.

Step 5: Work on SortingHat so that it fulfills the requirements of our project.

How it will benefit MediaWiki or Wikimedia projects such as Mediawiki Community Metrics?

  • It will help improve the information about contributors' activity in tech community metrics, for example here.
  • It will make submitting and editing their data less cumbersome for contributors by providing a web interface.
  • It will provide a search method for identities, to further sync a contributor's accounts.

Possible mentor

Deliverables

The main deliverable will be a working web application for contributors to manage community identity.

Required Deliverables

  • The angularJS technology based web interface of each and every page of the web application required (milestone 1)
  • Flask (python based web-framework) based back-end part with implementation of templates(front-end) part to it (milestone 2)
  • Optimized search facility with autocomplete feature. User can search and filter the data or information (milestone 3)
  • Working on and modeling SortingHat so that our project can take full advantage of SortingHat (milestone 4)
  • Implementation of Authentication techniques to our project (probably openstack-Keystone). (milestone 5)
  • Testing and Documentation.

Optional Deliverables

  • An admin dashboard, so that admin can manage/control contributor's activity.

Schedule

Envisioning phase (May 5 - May 24)
  • Remain in constant touch with my mentor(s) and community.
  • Familiarize myself with the community and the development environment.
  • Familiarize myself with the workings of Flask.
  • Study required docs.
  • Fix some bugs along the way and get my hands dirty.
Community bonding period (1 week, May 25 - May 31)
  • Further discussion with my mentor(s) and my community about the prototype and its implementation.
  • Make a Roadmap or workflow for development phase
AngularJS implementation (2 weeks, June 1 - June 14) - Milestone-1
  • Thinking of using ngbp repo to get kick-started with AngularJS, Bootstrap
    • Get myself familiar with repo's architecture and implementation
  • Design AngularJS UI, which will be implemented in Flask framework.
    • Methods for Login/Signup, will work on authentication at last.
    • Methods for Contributor's profile view
    • Methods to list the identities, affiliation, etc
    • Prototypical implementation of search tools.
  • This will require use of AngularJS or may be Ajax
Flask implementation part (2 weeks, June 15 - June 28) - Milestone-2
  • Familiarize myself with Flask (already familiar with Django, so it won't take too much time)
  • Setting up Flask project
  • Implement the back-end
    • Contributors to view her/his profile
    • Updating information and identities.
    • List identities, affiliation
  • In parallel to above step, syncing angularJS to Flask
  • This will require use of Flask framework, python, angularJS
Mid-term evaluation
  • For mid-term, I will be submitting the working model up-till now.
Implement search and filter (1 week, June 29 - July 5) - Milestone-3
  • Investigating search options
    • Discussing with mentor which search techniques can be implemented, such as libraries that support search, like Whoosh or Haystack.
    • Understanding the library that needs to be implemented.
  • Script for search and filter implementation in which contributor can search for his/her other existing identities.
  • Syncing it with front-end.
  • This will require use of python, Ajax and javascript.
Research Period (1 week, July 6 - July 12)
  • Getting myself familiar with the workings of sortingHat.
  • Discuss future work on the sortingHat-related milestone with mentor(s).
Implementing sortingHat (2 weeks, July 13 - July 26) - Milestone-4
  • Improving the sortingHat heuristics, so that it can be implemented to our project.
  • (Not so sure about milestone for now, need to figure out in Research Period)
Implementing Authentication Technique (1 week, July 27 - Aug 2) - Milestone-5
  • Discuss existing techniques for authentication with mentor (probably openstack-keystone).
  • Getting myself familiar with finalized technique
  • Implementing the technique to project.
Deploying phase (2 weeks, Aug 3 - August 16)
  • Code Review
    • Conduct code review by myself, then mentor(s)
    • Act on the feedback gained from code review
  • Testing
    • Further unit and integration testing
    • Conduct several rounds of testing for real-world users.
  • Documentation
    • Write approach program and functional descriptions
    • Document deployment and testing.
Pencil Down
  • August 17: Soft Pencil down - A week for final polishing
  • August 21: Firm Pencil down
  • Submission to Google

Participation

Communication of progress
  • IRC channel: I'll stay online on IRC at #metrics-grimoire in freenode, in my working hours.
  • Email: I will make several check points (within 24 hours) for reading emails and replying as soon as possible.
  • Mailing list: metrics-grimoire will be used to communicate progress
  • Blog: Though I've never blogged before, I will try to maintain a blog throughout the project.
Publishing Source code
Where I would turn for help?
  • Solve by myself: Read documentation, search online, etc
  • Seek help from the community: Discuss on IRC, mailing list or with mentor(s).
  • Seek help from outside the community: My work has a lot to do with Flask and authentication techniques, so I can turn to different communities for help.

Amenities: I understand, there have been power and internet issues with students from India in past. I've a stable and always-on internet connection and live in a housing society with 24-hours power backup, so that will not be a problem.

About you

I'm Sarvesh Gupta, a third-year student at Indian Institute of Technology - Roorkee, majoring in Computer Science. I enjoy trying out and learning new things related to web technology.

I love coding and am experienced with Python, Django, PHP, JavaScript, jQuery, MySQL, CoffeeScript and MVC architectures, and I am always working on some project; here's my Github profile.

I want to work with an organisation whose software is used by many people. I would also feel better if my project could live on beyond GSoC and be used by as many users as possible.

Lastly, I promise to work for at least 40 hr. per week. No other obligations interfere throughout GSoC project.

Past experience

I’ve worked for Oregon State University Open Source Organization in last summer during GSoC-14, here is the link of abstract of work done to last year’s GSoC.

My github link can be found here

Wikimedia will be my third encounter with Open Source. I've fixed some bugs (#1, #2, #3) for the Mozilla organization related to Automation and Tools. My commits can be seen here with username 'sarvesh-onlyme'.

For now, I'm planning to contribute for MediaWiki through GSoC.

Links

jgbarah updated the task description.Oct 6 2015, 11:44 PM
jgbarah updated the task description.Oct 7 2015, 12:14 AM

@jgbarah could you suggest what I should do next?

I would start with T114838: Microtask: Create a very simple REST API for SortingHat, which shouldn't be too time-consuming.

Is there any chance that this project gets proposed for Outreachy-Round-11. I'd like to contribute. I've worked with PHP, Javascript, HTML5, CSS3 and I am quite familiar with Angular framework as well. I've made an app with python flask framework as well. Here's the link to my Github Repository https://github.com/simmimourya1

If interested, I would start with T114838: Microtask: Create a very simple REST API for SortingHat, which shouldn't be too time-consuming.

I'll start with T114838: Microtask: Create a very simple REST API
Thanks.

Andrew removed a subscriber: Andrew.Oct 7 2015, 4:36 PM
01tonythomas added a subscriber: 01tonythomas.

I am shifting this to Outreachy-Round-11 as the project description has at least two mentors and micro-tasks, and looks ready for the 11th edition of Outreachy (Dec 2015 - Mar 2016). Potential candidates should start by submitting their proposals as a blocker for this task, by November 02.

Feel free to revert it back, if this task has some relevant issues which might block its completion in this term of Outreachy.

Hi @Aklapper, we've just included some changes in our affiliation process. Before starting to match identities we get a JSON file that you can modify in a private repo. After sortinghat (the tool that handles the identities) imports this data and gets the new people, we export that file again to the private repo. In order to give you access to that file, please send me your github user via email.

My email is lcanas at bitergia.

(this is @Lcanasdiaz from a different account due to login issues)

I am shifting this to Outreachy-Round-11 as the project description has at least two mentors and micro-tasks, and looks ready for the 11th edition of Outreachy (Dec 2015 - Mar 2016). Potential candidates should start by submitting their proposals as a blocker for this task, by November 02.
Feel free to revert it back, if this task has some relevant issues which might block its completion in this term of Outreachy.

Fine with me. Thanks.

01tonythomas added a comment.EditedOct 28 2015, 8:25 AM

@jgbarah: are you planning to mentor https://phabricator.wikimedia.org/T89135 - 'Improving MediaWikiAnalysis' too in this program, as it still has your name in the Primary mentor list?

Hello! I would like to work on this project for Outreachy Round 12 and I have also started with the first microtask T114838: Microtask: Create a very simple REST API for SortingHat. I wanted to know how to proceed further, i.e., where to send the link of my github repo of the first microtask.

Hello! I would like to work on this project for Outreachy Round 12 and I have also started with the first microtask T114838: Microtask: Create a very simple REST API for SortingHat. I wanted to know how to proceed further, i.e., where to send the link of my github repo of the first microtask.

Great to know about your interest @Kurisutina24. I am not sure yet if this task would be Featured for GSoC/Outreachy this round, but since it is still in https://phabricator.wikimedia.org/tag/possible-tech-projects/ - you are welcome to understand the problem, and come up with a proposal.

You can find the lifetime of a GSoC/Outreachy project over here - https://www.mediawiki.org/wiki/Outreach_programs/Life_of_a_successful_project, and you should update the microtask once you have done something relevant with it. In case you have some questions about it, please ask directly in the relevant micro-task (and not in the project task).

Sumit added a subscriber: Sumit.Feb 19 2016, 8:16 PM
NOTE: Outreachy round 12 applications are now open and GSoC 2016 is round the corner. This project was featured for Outreachy round 11 and has a well defined scope. Are you ready to mentor the project this season? If yes, then we'll feature this for Outreachy round 12 and GSoC 2016 as well. Please reply back in comments.
Niharika removed a subscriber: Niharika.Feb 20 2016, 5:20 AM
Qgil removed a subscriber: Qgil.Feb 22 2016, 9:31 PM
Sumit added a comment.Mar 2 2016, 1:57 PM

@jgbarah , @Dicortazar , @Acs, are you ready to push this project in this round of GSoC '16/Outreachy-12 ?

This task does not have any confirmed mentors for GSoC'16/Outreachy'12 yet: The administration team is moving this project to the (Missing Mentors) list as we do not have any confirmed mentors for this round yet. Interested in mentoring? Do add your name in the task description. A Possible-Tech-Projects task requires a minimum of one primary mentor and a co-mentor to be featured for GSoC/Outreachy. Prospective students? Do take a look at the Wikimedia mentors pool at https://www.mediawiki.org/wiki/Outreach_programs/Possible_mentors, and try connecting this project with a mentor, to get featured for this round.
Sumit added a comment.Sep 14 2016, 7:05 PM

This task featured in GSoC '15, but does it need further work? What's the current progress? Is this still a Possible-Tech-Projects candidate that we can have for an Outreachy round? Some clarifications would be great!

Aklapper lowered the priority of this task from Normal to Low.Nov 23 2016, 7:11 PM
Aklapper lowered the priority of this task from Low to Lowest.Jan 6 2017, 1:25 PM

Note: The task description is very outdated and links to technology and websites that we do not use anymore.
More important though: It is unclear to me how and why a random person would be motivated to edit "the information about themselves and other contributors" (like what exactly? Maybe the name and affiliation, but what else is even relevant?) and who/how those changes would get verified.

Aklapper changed the task status from Open to Stalled.Feb 18 2018, 7:43 PM

Note that Bitergia work on Hatstall so this task is blocked on T157898 (until technology is in place).
After it is in place the question would still be how someone who wanted to update their own details would identify themselves.

Removing Possible-Tech-Projects as we are planning on killing that workboard soon. In its current state, it is not a good fit for Outreach-Programs-Projects either.

Aklapper closed this task as Declined.Jul 11 2018, 9:55 AM

I do not see any reason to implement this, hence declining this task.

Restricted Application removed a subscriber: Liuxinyu970226.Jul 11 2018, 9:55 AM