See T120738 for details.
**Name**: Sigma WP
**Email**: sigmawp@gmail.com
**IRC**: freenode as SigmaWP
**Web Page**: N/A
**Resume**: Censored edition available on request
**Location**: the California Republic
**Typical working hours**: Afternoon on weekdays
**Title**: Article author attribution: on a Generalised Sequence of Contributors (GSoC)
**Abstract**:
Wikipedia is an online encyclopedia that ranks among the most visited websites in the world. The text of its articles is licensed under CC BY-SA and the GFDL. Both licenses require attribution to all authors whenever content is copied to other wikis or outside wikis, e.g. to PDFs. There is currently no way to get a list of all authors other than exporting the relevant page as a PDF and copying the list from there. Even then, there is no control over sorting or filtering the list, or viewing any other details about it. We propose to add this feature to MediaWiki.
**Possible mentors**: Addshore, Niharika
**Details**:
Because the question concerns the very concept of transferring text from a wiki, this feature fits best either in core or in an extension. An extension for this already exists (Extension:Contributors), as does a core feature, action=credits, which is disabled on WMF wikis. It is disabled because the current implementation does not scale, chiefly owing to a lack of caching.
We propose two new tables in the database. One will be called “contributors”, containing id, page_id, user_id, user_text, is_author (boolean), revision_count (int). The other will be called “contributors_props”, containing id, prop_name, prop_value (cf the already-existing page props table).
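A minimal sketch of the two proposed tables, written against SQLite through Python's `sqlite3` module purely for illustration. Only the column names come from the proposal. The column types, the unique index on (page_id, user_text), the reading of `contributors_props.id` as a reference to `contributors.id` (by analogy with the page props table), and the helper names `create_schema` and `table_columns` are all assumptions.

```python
import sqlite3

# Assumed column types; the proposal only names the columns.
SCHEMA = """
CREATE TABLE contributors (
    id             INTEGER PRIMARY KEY,
    page_id        INTEGER NOT NULL,
    user_id        INTEGER NOT NULL DEFAULT 0,  -- assumption: 0 for anonymous editors
    user_text      TEXT    NOT NULL,
    is_author      INTEGER NOT NULL DEFAULT 0,  -- boolean stored as 0/1
    revision_count INTEGER NOT NULL DEFAULT 0
);

-- Assumption: one row per (page, contributor), so lookups by page stay cheap.
CREATE UNIQUE INDEX contributors_page_user
    ON contributors (page_id, user_text);

-- Assumption: id points at contributors.id, one row per property name.
CREATE TABLE contributors_props (
    id         INTEGER NOT NULL REFERENCES contributors (id),
    prop_name  TEXT    NOT NULL,
    prop_value TEXT,
    PRIMARY KEY (id, prop_name)
);
"""


def create_schema(conn):
    """Create both proposed tables on an open connection."""
    conn.executescript(SCHEMA)


def table_columns(conn, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
create_schema(conn)
print(table_columns(conn, "contributors"))
print(table_columns(conn, "contributors_props"))
```

The final schema would be settled with the mentors during the first week of the schedule, so this should be read as a starting point for that discussion rather than a fixed design.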
With this in place, both the current version of Extension:Contributors and action=credits can work off the contributors table. The table can be updated either synchronously or asynchronously. We do not expect to need to recalculate everything for a given page or to run any expensive database queries. Additional columns may have to be added, but that can be discussed during the GSoC process itself.
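The point about not recalculating everything can be illustrated with a small sketch: each saved revision touches exactly one row of the contributors table. This is an illustration in Python with SQLite, not MediaWiki code. The function names `record_revision` and `contributors_for`, the column types, and the sample editors are hypothetical, and the same function could in principle be called either directly on save or from a deferred job.

```python
import sqlite3


def record_revision(conn, page_id, user_id, user_text):
    """Bump one contributor's count for one page; insert the row on a first edit."""
    cur = conn.execute(
        "UPDATE contributors SET revision_count = revision_count + 1 "
        "WHERE page_id = ? AND user_text = ?",
        (page_id, user_text),
    )
    if cur.rowcount == 0:
        # No existing row for this (page, user) pair, so this is their first edit.
        conn.execute(
            "INSERT INTO contributors (page_id, user_id, user_text, revision_count) "
            "VALUES (?, ?, ?, 1)",
            (page_id, user_id, user_text),
        )
    conn.commit()


def contributors_for(conn, page_id):
    """List (user_text, revision_count) for a page, most active contributors first."""
    return conn.execute(
        "SELECT user_text, revision_count FROM contributors "
        "WHERE page_id = ? ORDER BY revision_count DESC, user_text",
        (page_id,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE contributors (
        id             INTEGER PRIMARY KEY,
        page_id        INTEGER NOT NULL,
        user_id        INTEGER NOT NULL DEFAULT 0,
        user_text      TEXT    NOT NULL,
        is_author      INTEGER NOT NULL DEFAULT 0,
        revision_count INTEGER NOT NULL DEFAULT 0
    )
    """
)

# Three saved revisions on a hypothetical page 42.
for user_id, user_text in [(1, "Alice"), (2, "Bob"), (1, "Alice")]:
    record_revision(conn, 42, user_id, user_text)

print(contributors_for(conn, 42))  # [('Alice', 2), ('Bob', 1)]
```

Reading the attribution list is then a single indexed lookup by page_id, which is the property that the current action=credits implementation lacks.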
The second table is the more extensible one, used for non-essential details, e.g. whitespace changes, characters copied and pasted, etc.
**Schedule**:
Format is: Week starting on this day: Stuff to do
23 May: figure out the best database schema to organise the data in
30 May: write docs and create mock-ups
6 June: ^
13 June: Write basic backend and backend function tests
20 June: ^
27 June: Work on functions and UI, integrate with the current extension, etc
4 July: work on other features, e.g. sorting by a certain field or displaying miscellaneous info about users
11 July: ^
18 July: unit tests to address these modifications
25 July: update docs
1 August: tidy up loose ends and make improvements based on code review
8 August to 23 August: wrap everything up
**About**:
I have been active on the English Wikipedia since 2011. In that time I have written three bots, and I maintain several tools on Labs ( https://tools.wmflabs.org/sigma/ ).
I am motivated to work on this project because, as a Wikipedian, I know that the free movement of content is essential, so making it easier will have tangible benefits. As a programmer, I also feel that the project is a good match for my skills. My GitHub profile is available privately on request.