Wikipedias Accuracy Review Meeting 3 : 11th June 2016 UTC 4am
Closed, Resolved (Public)

Description

Date and Time

  • 11th June 2016 - UTC 4am

Format

  • Google Chat
  • Wikimedia etherpad

Agenda

  1. Discuss and resolve database issues
  2. Format of the review system
  3. License for the app

Meeting Minutes

1. Discuss and resolve database issues

  • The error I was getting was: NotImplementedError: No support for ALTER of constraints in SQLite dialect
  • A solution suggested on a forum was to switch over to SQL/Postgres, since SQLite's limited ALTER TABLE support cannot change the columns or constraints of an existing table.
  • Suggestion: for now, maybe rename the table and rebuild it, as per this answer.
  • More discussion on db format later
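
The rename-and-rebuild workaround suggested above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the trimmed-down reviewers table, the choice of adding a UNIQUE constraint on email, and the function name are assumptions made for the example, not the app's actual migration.

```python
import sqlite3


def rebuild_with_unique_email(conn):
    """Rebuild `reviewers` so that `email` gains a UNIQUE constraint."""
    with conn:  # commits on success, rolls back if any statement fails
        # 1. Move the existing table out of the way.
        conn.execute("ALTER TABLE reviewers RENAME TO reviewers_old")
        # 2. Create the table again, this time with the new constraint.
        conn.execute(
            "CREATE TABLE reviewers ("
            " id INTEGER PRIMARY KEY,"
            " email VARCHAR(64) UNIQUE,"
            " username VARCHAR(64))"
        )
        # 3. Copy the rows across, then drop the old table.
        conn.execute(
            "INSERT INTO reviewers (id, email, username)"
            " SELECT id, email, username FROM reviewers_old"
        )
        conn.execute("DROP TABLE reviewers_old")


# Hypothetical starting point: a table created without the constraint.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviewers ("
    " id INTEGER PRIMARY KEY, email VARCHAR(64), username VARCHAR(64))"
)
conn.execute(
    "INSERT INTO reviewers (email, username) VALUES ('a@example.org', 'alice')"
)
conn.commit()

rebuild_with_unique_email(conn)
```

After the rebuild, the existing row is still present and inserting a second row with the same email raises sqlite3.IntegrityError.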

2. Format of the review system

  • Suggest review items to a reviewer, who agrees or disagrees with each one. Send their review, along with any comments, to a second reviewer. If the two reviewers disagree on the item, send it to a third, tie-breaking reviewer. Accepted review items are then published on an external page (Labs?).
  • Also show a list of completed assignments on each reviewer's profile.
  • Consider whether reviewers need an option to change their mind later on.
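
The two-reviewer flow with a tie-breaker described above could be expressed roughly as follows. The function name and the use of True/False for agree/disagree are assumptions made for this sketch; how reviews are stored and routed is not decided in these minutes.

```python
from typing import Optional


def resolve_review(first: bool, second: bool,
                   tie_breaker: Optional[bool] = None) -> Optional[bool]:
    """Return True (accepted), False (rejected), or None (awaiting a tie-breaker).

    `first` and `second` are the two reviewers' verdicts on a review item;
    `tie_breaker` is the third reviewer's verdict, if one has been given.
    """
    if first == second:
        # Both reviewers share the same opinion: that opinion stands.
        return first
    # The reviewers disagree: the third reviewer decides, once they have voted.
    return tie_breaker
```

For example, resolve_review(True, False) stays None until a third verdict arrives, and only items that resolve to True would be published.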

3. License for the app

  • Apache 2.0 is the best option so far
  • But will also consider BSD, MIT, Artistic, or other Open Source Initiative-approved licenses if necessary
  • GPL and LGPL involve difficulties that will keep some people away, and they are not particularly more useful. However, if you depend on GPL software, you should keep using it.

We also looked through the workings of the DataFlowBot, which outputs low-quality articles that have high popularity. It basically combines ORES predictions of STUB articles with article view counts, as well as factors such as social impact. It's coded in PHP (GitHub page).
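
As a rough illustration of combining a stub prediction with a view count, the sketch below ranks articles by the product of the two. The product as the scoring rule, the field names, and the sample numbers are all made up for the example; the bot itself is written in PHP and also weighs factors such as social impact, which are left out here.

```python
def rank_by_need(articles):
    """Sort articles so that popular, probably-stub articles come first."""
    return sorted(
        articles,
        key=lambda a: a["stub_probability"] * a["views"],  # assumed scoring rule
        reverse=True,
    )


# Illustrative values only, not real ORES output or real view counts.
sample = [
    {"title": "A", "stub_probability": 0.9, "views": 100},
    {"title": "B", "stub_probability": 0.2, "views": 1000},
    {"title": "C", "stub_probability": 0.8, "views": 500},
]
ranked = [a["title"] for a in rank_by_need(sample)]
```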

Event Timeline

@prnk28,

How do you feel about switching to NoSQL? I can see that will solve a huge number of your schema problems. Please let me know your thoughts on http://stackoverflow.com/a/2581859

Please reply here with commented CREATE TABLE statements for the SQL schema you have so far. Moving to NoSQL will solve your schema upgrade problem. I will help you refactor to NoSQL. I promise I will put in as many hours as that takes this week and/or next.

We need to upgrade password hashing from SHA-1 to SHA-3. SHA-2 is pretty bad, and SHA-1 is worse.

I would like to minimize the number of packages we depend on. What are you using in flask that you can't get out of just werkzeug so far?

Let's not have commenters or moderators, please, just admins and reviewers. We can use ordinary wikis to comment on specific review items if they have permalinks, and it is better to have comments on an external site anyway. The reputation system should serve as the sole moderator.

Does HTML display correctly in the operators' "about" fields?

While I am working on refactoring, you can use those links about Labs and create an account there if you want.

All ok?

Best regards,
@Jsalsman

P.S. @prnk28 and @FaFlo I have updated https://etherpad.wikimedia.org/p/accuracyreview from around line 120 at present; please review there.

@prnk28 AHA! I found the perfect NoSQL database for this app: Vlermv http://pythonhosted.org/vlermv/

Please disregard my earlier recommendation of PickleDB: it has no concurrency.

Please let me know your thoughts.

@prnk28 one more thing: I was wrong about SHA-3. SHA-2 (in particular, SHA-512 such as in https://www.npmjs.com/package/krypto ) is actually still better. How do you feel about doing the password hash in the browser if possible (i.e., if JavaScript is enabled) and only sending that in the login form POST, but otherwise sending the plaintext password and using Python's hashlib for SHA-512?
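
For the server-side fallback, a minimal sketch using only Python's standard library might look like the following. It swaps in hashlib.pbkdf2_hmac with SHA-512 and a per-user salt rather than a single bare SHA-512 call, which is a choice made for this example and not something settled in the thread; the function names and the iteration count are likewise assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, not a value from the discussion


def hash_password(password, salt=None):
    """Return (salt, hex digest) using PBKDF2 with HMAC-SHA-512."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each new password
    digest = hashlib.pbkdf2_hmac(
        "sha512", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest.hex()


def verify_password(password, salt, expected_hex):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, expected_hex)
```

The hex digest is 128 characters long, so it would fit a VARCHAR(128) column; the salt would need to be stored alongside it.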

How do you feel about switching to NoSQL? I can see that will solve a huge number of your schema problems. Please let me know your thoughts on http://stackoverflow.com/a/2581859

Which format of NoSQL would be suitable? Document, columnar, key-value or graph type? And why?
I was going over the use cases of SQL and NoSQL. Apparently, NoSQL is most helpful when we need to scale out or have a lot of data. SQL, on the other hand, provides more consistency and makes it easier to carry out indexed queries. Please see http://stackoverflow.com/questions/2559411/sql-mysql-vs-nosql-couchdb

Please reply here with commented CREATE TABLE statements for the SQL schema you have so far. Moving to NoSQL will solve your schema upgrade problem. I will help you refactor to NoSQL. I promise I will put in as many hours as that takes this week and/or next.

As I had mentioned earlier, since I'm using SQLAlchemy, which implements an ORM approach to database management, I don't need to write SQL statements by hand. The class definitions in app/models.py take care of table creation. Nevertheless, I have written the SQL equivalent below for the two tables.

-- Roles that can be assigned to reviewers
CREATE TABLE roles
(
    id INT,
    name VARCHAR(64) UNIQUE,
    defaultpermission BOOLEAN DEFAULT FALSE,
    permissions INT,
    PRIMARY KEY (id)
);

-- Reviewer accounts; role_id points at the reviewer's role
CREATE TABLE reviewers
(
    id INT,
    email VARCHAR(64) UNIQUE,
    username VARCHAR(64) UNIQUE,
    password_hash VARCHAR(128),
    role_id INT,
    confirmed BOOLEAN,
    agreement FLOAT,
    reputation FLOAT,
    PRIMARY KEY (id),
    FOREIGN KEY (role_id) REFERENCES roles(id)
);

We need to upgrade password hashing from SHA-1 to SHA-3. SHA-2 is pretty bad, and SHA-1 is worse.

I would like to minimize the number of packages we depend on. What are you using in flask that you can't get out of just werkzeug so far?

The complete list of requirements can be viewed here. The 'itsdangerous' package is used only for generating tokens for account confirmation and nothing else. All it does is map the user ID to a token that is appended to a confirmation URL and expires after a certain period of time.
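
To show what such a confirmation token involves, here is a standard-library-only sketch of the same idea: sign the user ID together with an issue time, and reject the token if the signature does not match or it is too old. This is not how itsdangerous is implemented and not the app's code; SECRET_KEY, make_token, read_token, and the one-hour default are all hypothetical.

```python
import base64
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical secret; the real key lives in app config


def _sign(payload):
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()


def make_token(user_id, now=None):
    """Return a URL-safe token encoding the user ID and its issue time."""
    issued = int(time.time() if now is None else now)
    payload = f"{user_id}.{issued}".encode("ascii")
    return (base64.urlsafe_b64encode(payload).decode("ascii") + "."
            + base64.urlsafe_b64encode(_sign(payload)).decode("ascii"))


def read_token(token, max_age=3600, now=None):
    """Return the user ID, or None if the token is malformed, forged, or expired."""
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(sig_b64)
    except ValueError:
        return None
    # Check the signature before trusting anything inside the payload.
    if not hmac.compare_digest(signature, _sign(payload)):
        return None
    user_id, issued = payload.decode("ascii").split(".")
    current = time.time() if now is None else now
    if current - int(issued) > max_age:
        return None
    return int(user_id)
```

A confirmation URL would carry the token as its last path segment, and the confirmation view would call read_token on it before marking the account as confirmed.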

Let's not have commenters or moderators, please, just admins and reviewers. We can use ordinary wikis to comment on specific review items if they have permalinks, and it is better to have comments on an external site anyway. The reputation system should serve as the sole moderator.

There are no moderators in the system. The 'comment' that had been mentioned is just a permission and NOT a role. Admins and reviewers have the permission to 'comment'. I can remove the 'comment' permission altogether if you feel it's redundant. It wouldn't make much of a difference to the app.
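
The difference between a permission and a role can be sketched as below, assuming the integer permissions column on roles is treated as a set of bit flags. The flag names and values are hypothetical and are not taken from app/models.py.

```python
from enum import IntFlag


class Permission(IntFlag):
    # Hypothetical flag values; each permission occupies one bit.
    REVIEW = 1
    COMMENT = 2
    ADMINISTER = 4


# Only two roles exist; 'comment' is a permission that both of them hold.
ROLES = {
    "reviewer": Permission.REVIEW | Permission.COMMENT,
    "admin": Permission.REVIEW | Permission.COMMENT | Permission.ADMINISTER,
}


def can(role_name, permission):
    """Return True if the named role includes the given permission."""
    return (ROLES[role_name] & permission) == permission


def without_comment(roles):
    """Return a copy of the role table with the COMMENT bit cleared."""
    return {name: perms & ~Permission.COMMENT for name, perms in roles.items()}
```

Under this layout, dropping the 'comment' permission only clears one bit in each role's integer and leaves the set of roles unchanged.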

Does HTML display correctly in the operators' "about" fields?

Yes. It displays in text format, if that's what you mean.

While I am working on refactoring, you can use those links about Labs and create an account there if you want.

On it.

Also, I'm creating a new task for the refactoring part. The meetings task may not be the right place for code-related discussions :P