
Proposal: [#1Lib1Ref] Build a "worklist" tool for campaigns and in-person editing events.
Closed, Declined (Public)

Description

Profile Information

Name: Aditya Jain
IRC nickname on Freenode: AdityaJ
GitHub profile: https://github.com/Jain-Aditya
Location (country or state): India (UTC +5:30)
Typical working hours (include your timezone): between 11 am and 6 pm UTC +5:30 (will give more time if required)

Synopsis

The project aims to build a tool that facilitates collaboration on worklists of articles (the articles in a single worklist share a common theme) for campaigns, in-person editing events, and similar activities. The tool imports worklists using PetScan queries and lets the creator share them with other users. Every user can view real-time information about the status of each article (whether it has been claimed by another user, and whether it is being worked on or already done).

Detailed Approach:

  1. Use a MySQL database to store the worklists and their articles, along with each article's status (free, claimed, or completed). We will also store which user is working on which article.
  2. Use MediaWiki OAuth for authentication.
  3. Each worklist will be associated with a PetScan query ID, which the user enters once when creating the worklist. The import API will use this ID to fetch the list of articles and store it in the database. The worklist table will also contain a ‘created_by’ column (the MediaWiki username), which can later be used to restrict certain actions to the worklist creator.
  4. Since PetScan results can change over time, a background scheduler (implemented as a cron job) will call the import API at regular intervals to get the updated list of articles. Some articles may be missing from the new PetScan response. In the initial phase, we will simply keep the new list of articles for a worklist and discard the old ones. (The status of an article that is still in the list will not be lost, provided its article ID is unchanged.) In future, we could keep a history of the worklist by preserving its previous states as well. Additionally, we can provide a refresh button on the worklist page to fetch the updated list on demand. Only the worklist creator will be able to trigger a manual refresh, to avoid unnecessary load on the server.
  5. Once the worklist is created, a link will be generated which can be shared with other users. The create API will return the newly created worklist ID, and JavaScript will simply append that ID to the worklist detail URL, giving something like 'http://worklist-gsoc.org/worklist/1234'. (Instead of sequential numbers, we can use randomized strings as IDs later on.) This link will open a webpage that displays the worklist details, such as its title, its creator, and the list of associated articles with their current statuses. (These details will be fetched from the database.)
  6. To show users the latest article statuses without a manual page refresh, we can use JavaScript polling to request the latest information from the server at regular intervals.
  7. For styling, I plan to use Bootstrap CSS. I will use Python for the backend, and I will explore Python frameworks such as Django or Flask during the community bonding period.
  8. If time permits, I would be creating metrics or progress reports for the results of the worklist. The basic version could display the total number of articles in a worklist and the number that are completed, in progress, and not yet claimed. These stats can be collected with GROUP BY queries over the articles table.
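A minimal sketch of the storage and refresh steps above, under stated assumptions: SQLite stands in for MySQL so the example runs on its own, and every table, column, and function name other than 'created_by' is hypothetical. The PetScan fetch itself is not shown; `fetched` is assumed to be a mapping from article ID to title that the import API has already produced.

```python
import sqlite3

# Hypothetical schema: one row per worklist, one row per (worklist, article).
SCHEMA = """
CREATE TABLE worklist (
    id         INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    petscan_id INTEGER NOT NULL,
    created_by TEXT NOT NULL            -- MediaWiki username of the creator
);
CREATE TABLE article (
    worklist_id INTEGER NOT NULL REFERENCES worklist(id),
    page_id     INTEGER NOT NULL,
    title       TEXT NOT NULL,
    status      TEXT NOT NULL DEFAULT 'free'
                CHECK (status IN ('free', 'claimed', 'completed')),
    claimed_by  TEXT,
    PRIMARY KEY (worklist_id, page_id)
);
"""


def refresh_worklist(conn, worklist_id, fetched):
    """Replace a worklist's articles with the latest fetched list.

    `fetched` maps article ID -> title (assumed output of the import API).
    Articles missing from the new list are discarded, new ones are added
    as 'free', and articles present in both keep their current status.
    """
    existing = {
        row[0]
        for row in conn.execute(
            "SELECT page_id FROM article WHERE worklist_id = ?", (worklist_id,)
        )
    }
    incoming = set(fetched)
    gone = existing - incoming
    new = incoming - existing
    with conn:  # one transaction: commit on success, roll back on error
        conn.executemany(
            "DELETE FROM article WHERE worklist_id = ? AND page_id = ?",
            [(worklist_id, page_id) for page_id in gone],
        )
        conn.executemany(
            "INSERT INTO article (worklist_id, page_id, title) VALUES (?, ?, ?)",
            [(worklist_id, page_id, fetched[page_id]) for page_id in new],
        )
```

Both the cron job and the manual refresh button could call the same function, which keeps the "statuses are not lost" rule in one place.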

UI mockups

Screenshot from 2018-03-12 16-49-31.png (322×803 px, 20 KB)

Screenshot from 2018-03-12 18-06-03.png (539×691 px, 16 KB)

Screenshot from 2018-03-12 17-22-14.png (396×680 px, 17 KB)

Mentors: @Surlycyborg, @Harej

Timelines and Deliverables

April 23 - May 10

  • Community bonding period.
  • Design the database schema and do some preliminary investigation.

May 12 - May 23 (I have my end-term exams, so I won't be able to contribute during this period)

May 24 - June 10

  • Create the basic skeleton of the project.
  • Work on worklist creation. This includes building the UI for entering worklist details (such as title and PetScan query ID) and a create API for it.

June 11 - June 25

  • Work on the worklist details page. This will fetch the list of articles along with their statuses from the database and display it.

June 26 - July 10

  • Add the feature for claiming articles and updating their statuses.
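One way the claiming feature could avoid two users claiming the same article at once is a conditional UPDATE, sketched below. This is an illustration under assumptions, not a settled design: the `article` table with `worklist_id`, `page_id`, `status`, and `claimed_by` columns and both function names are hypothetical, and SQLite stands in for MySQL.

```python
import sqlite3


def claim_article(conn, worklist_id, page_id, username):
    """Claim an article only if it is still free. Return True on success."""
    with conn:
        cur = conn.execute(
            "UPDATE article SET status = 'claimed', claimed_by = ? "
            "WHERE worklist_id = ? AND page_id = ? AND status = 'free'",
            (username, worklist_id, page_id),
        )
    # The WHERE clause makes the check and the write a single statement,
    # so a second claimant matches zero rows instead of overwriting.
    return cur.rowcount == 1


def complete_article(conn, worklist_id, page_id, username):
    """Mark an article completed, but only for the user who claimed it."""
    with conn:
        cur = conn.execute(
            "UPDATE article SET status = 'completed' "
            "WHERE worklist_id = ? AND page_id = ? "
            "AND status = 'claimed' AND claimed_by = ?",
            (worklist_id, page_id, username),
        )
    return cur.rowcount == 1
```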

July 11 - July 15

  • Implement JavaScript polling to display the updated information to the users.

July 16 - July 30

  • Implement a cron job to refresh the list of articles in a worklist.

August 1 - August 15

  • Wrap up the work done so far.
  • Bug fixes and documentation.

Participation

I will work hard on this project and bring it to completion. I will use GitHub to publish my source code. I will stay in touch with my mentors through email and IRC, give them frequent updates, and ask for help when I am stuck.

About Me

I am currently in the sophomore year of a B.Tech in Computer Science and Engineering at Bundelkhand Institute of Engineering and Technology, Jhansi (UP), India. I heard about this program in a campus session on open source development. Although the odd semester starts in the first week of August, I will be able to commit enough time to the project as there are no exams during this period.

Past Experience

I started exploring open source in February 2018 and am enjoying it. I have experience in PHP, JavaScript, MySQL, MATLAB, and C++. I have made the following contribution to MediaWiki:
T188737: https://gerrit.wikimedia.org/r/#/c/416220/ (Merged)
I have successfully completed my microtask for this project:
https://github.com/Jain-Aditya/gsoc2018-microtask
I have hosted the demo on Toolforge. The link is:
https://tools.wmflabs.org/t187305-demo/

Event Timeline

Sorry for the delay, I've left some inline comments on your Google doc. Thanks for submitting this!

AdityaJ updated the task description.

Hey, nice to see this taking shape. Here's a couple more questions / comments based on your proposal and things that came up when I last met with the other mentors.

the link will be generated which can be shared with other users

Do you have any thoughts on how this link would be generated and what it would look like?

creating metrics or progress reports for the results of the worklist.

I'm sure @Sadads will have some metrics in mind, but I'm curious as to whether you do too. What do you think would be interesting to report on a worklist? Can these things be easily reported with the data you're already planning on storing and code you're writing, or would we need more?

Also, on your third UI mock, there's a page where worklists from different users can be seen. How does a user arrive at that page? What worklists are featured there? Do we want to make it so some worklists are _not_ displayed?

Do you have any thoughts on how this link would be generated and what it would look like?

The approach I have in mind: when we create the worklist, the create API will return the newly created worklist ID, and the JavaScript will simply append that ID to the worklist detail URL. It will be something like 'http://worklist-gsoc.org/worklist/1234'. 1234 is the ID returned by the create API.

What do you think would be interesting to report on a worklist?

The basic version could be displaying the total number of articles in a worklist and the number of articles which are successfully completed, in progress, not yet claimed.
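A rough sketch of how these counts could come from a single GROUP BY query. The `article` table, its `status` column with the values 'free', 'claimed', and 'completed', and the function name are assumptions for illustration; SQLite stands in for MySQL.

```python
import sqlite3

# Assumed status values: 'claimed' is "in progress", 'free' is "not yet claimed".
STATUSES = ("free", "claimed", "completed")


def worklist_stats(conn, worklist_id):
    """Return per-status article counts plus the total for one worklist."""
    counts = dict.fromkeys(STATUSES, 0)  # statuses with no rows stay at 0
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM article "
        "WHERE worklist_id = ? GROUP BY status",
        (worklist_id,),
    )
    for status, n in rows:
        counts[status] = n
    counts["total"] = sum(counts[s] for s in STATUSES)
    return counts
```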

How does a user arrive at that page? What worklists are featured there?

This would be the home page for our application. For now, I plan to display all the worklists created so far (we can put a limit later on, such as showing only the 10 most recently created ones). Each worklist will be a hyperlink to its details page, which is shown in the 2nd UI mockup.
I think different users can view each other's worklists, right? Do we have a reason to show only the logged-in user's worklists on the home page? If yes, we are already storing 'created_by' in the worklist table. It would be easy to filter the worklists for the current logged-in user.

I think different users can view each other's worklists, right? Do we have a reason to show only the logged-in user's worklists on the home page? If yes, we are already storing 'created_by' in the worklist table. It would be easy to filter the worklists for the current logged-in user.

For now I think there isn't really a requirement either way, so I don't have strong opinions on this, but we should keep in mind that it might become a requirement along the way.

Personally, I don't expect users to join other random users' worklists and start working there very often, so I think it would make sense for the worklists to be "pseudo-private" (see my comment below about worklist IDs) and you'd only see links to your own worklists there, but I'll definitely bring it up with the other mentors soon!

1234 is the ID returned by the create API.

Cool, the flow for creating an ID makes sense, but here's a couple of extra ideas about generating the IDs themselves that you may want to think about:

  • Making the IDs random strings rather than sequential numbers to make them harder to guess. This would make worklists "pseudo-private": you need to know a hard-to-guess URL to find them.
  • A nice UX touch would be to use IDs that are random but easy to read for humans, like gfycat.com does: https://gfycat.com/DizzyCourteousErne.

Anyway, none of these are hard requirements at this stage, just some food for thought.
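Both ID ideas can be sketched with Python's standard `secrets` module. The word lists, function names, and URL helper below are made up for illustration; the base URL is the example one from the proposal. Real word lists would need to be far larger for the readable IDs to be hard to guess.

```python
import secrets

# Tiny illustrative word lists; a real deployment would need many more
# entries, since the number of possible IDs is len(ADJECTIVES)**2 * len(ANIMALS).
ADJECTIVES = ["Dizzy", "Courteous", "Brave", "Quiet", "Sunny", "Gentle"]
ANIMALS = ["Erne", "Otter", "Heron", "Lynx", "Finch", "Badger"]

BASE_URL = "http://worklist-gsoc.org"  # example host from the proposal


def random_token(nbytes=8):
    """Hard-to-guess, URL-safe ID built from `nbytes` random bytes."""
    return secrets.token_urlsafe(nbytes)


def readable_id():
    """Random but human-readable ID in the AdjectiveAdjectiveAnimal style."""
    return (
        secrets.choice(ADJECTIVES)
        + secrets.choice(ADJECTIVES)
        + secrets.choice(ANIMALS)
    )


def worklist_url(worklist_id):
    """Build the shareable details URL for a worklist ID."""
    return f"{BASE_URL}/worklist/{worklist_id}"
```

Either kind of ID can collide, so the create API would still need a unique constraint on the ID column and a retry when an insert fails.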

The basic version could be displaying the total number of articles in a worklist and the number of articles which are successfully completed, in progress, not yet claimed.

Yeah, that works for a basic implementation. One suggestion I can think of is that we might want to see how these values progressed over time for a worklist, rather than just the latest values. So, for example, if we have a long-lived worklist that gets used in different events, we can see the impact of each event on it.

What do you think? If it's a good idea, it would be nice to suggest (at a high level) ideas for how to implement that.
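At a high level, one possible approach is to store a timestamped snapshot of the per-status counts whenever the worklist is refreshed, then read the snapshots back in order. The sketch below assumes a hypothetical `stats_snapshot` table alongside an `article` table with `worklist_id` and `status` columns; all names are illustrative, and SQLite stands in for MySQL.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical history table: one row per (worklist, snapshot time, status).
SNAPSHOT_SCHEMA = """
CREATE TABLE IF NOT EXISTS stats_snapshot (
    worklist_id INTEGER NOT NULL,
    taken_at    TEXT    NOT NULL,   -- ISO 8601 timestamp in UTC
    status      TEXT    NOT NULL,
    n           INTEGER NOT NULL,
    PRIMARY KEY (worklist_id, taken_at, status)
);
"""


def record_snapshot(conn, worklist_id, now=None):
    """Copy the current per-status counts into the history table."""
    now = now or datetime.now(timezone.utc)
    with conn:
        conn.execute(
            "INSERT INTO stats_snapshot (worklist_id, taken_at, status, n) "
            "SELECT ?, ?, status, COUNT(*) FROM article "
            "WHERE worklist_id = ? GROUP BY status",
            (worklist_id, now.isoformat(), worklist_id),
        )


def progress_over_time(conn, worklist_id, status="completed"):
    """Return (taken_at, count) pairs for one status, oldest first.

    A snapshot in which no article had this status has no row here.
    """
    return conn.execute(
        "SELECT taken_at, n FROM stats_snapshot "
        "WHERE worklist_id = ? AND status = ? ORDER BY taken_at",
        (worklist_id, status),
    ).fetchall()
```

Taking the snapshot from the same scheduled job that refreshes the article list would give evenly spaced data points, so the effect of each event shows up as a jump in the completed count.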

Thanks for your suggestions. I have updated the proposal with these details.