[RFC] Server side Design (API/Database schema)
Closed, ResolvedPublic
Actions

Assigned To

Authored By

	Hadyelsahar
	Nov 28 2019, 7:38 PM

Description

This page is to discuss the server-side design of scribe tool with respect to database schema and API calls.

Related Objects

Mentioned In: T239576: Update server database model with recent changes

Event Timeline

Hadyelsahar created this task.Nov 28 2019, 7:38 PM

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptNov 28 2019, 7:38 PM

Hi @Hadyelsahar. The current non-existing task description makes is hard for others to help or contribute, and for a triager/tester to figure out at some point in the future whether there is still a valid task. Please consider checking the recommendations on how to write a clear task. Thanks!

Hey @Aklapper, this is just a placeholder we will fill in the details later today.

Hadyelsahar assigned this task to Lucie.Nov 29 2019, 9:58 AM

Hadyelsahar updated the task description. (Show Details)

The current schema we have has the following:

table_name: edit

Field	Type	Null	Key	Default	Extra
id	int(11)	NO	PRI	NULL	auto_increment
article_name	text	NO		NULL
section_label	varchar(200)	YES		NULL
content	text	YES		NULL
url	text	YES		NULL
domain	varchar(100)	YES		NULL
lang_code	varchar(3)	YES		NULL
date	date	NO		NULL

table_name: section

Field	Type	Null	Key	Default	Extra
id	int(11)	NO	PRI	NULL	auto_increment
label	varchar(200)	NO	UNI	NULL
article_name	text	YES		NULL
lang_code	varchar(3)	YES		NULL

table_name: article

id	int(11)	NO	PRI	NULL	auto_increment
name	text	NO		NULL
wd_q_id	varchar(20)	YES	UNI	NULL
lang_code	varchar(3)	YES		NULL

@Hadyelsahar @Lucie This is the rough structure used at the moment. Could we add Wd_QID column to the section(If it will help with getting translations)?

Hadyelsahar added a comment.Dec 1 2019, 4:58 PM

This comment was removed by Hadyelsahar.

I discussed with Hady today, and we would suggest the following changes:

General the language code can be longer than three characters, e.g. zh-hans.

Article

id	int(11)	NO	PRI	NULL	auto_inctement
name	text	NO		NULL
wd_q_id	varchar(20)	NO	UNI	NULL
lang_code	varchar(6)	NO		NULL
domain	text	YES		NULL
tag	text	YES		NULL
retrieved_date	date	NO		NULL

Section

id	int(11)	NO	PRI	NULL	auto_inctement
article_id	int(11)	NO		NULL
name	text	NO		NULL
order_number	int	NO		NULL
content_selection_method	text	YES		NULL
retrieved_date	date	NO		NULL

Reference

id	int(11)	NO	PRI	NULL	auto_inctement
article_id	int(11)	NO		NULL
section_id	int(11)	YES		NULL
url	text	NO		NULL
publisher_name	text	YES		NULL
publication_title	text	YES		NULL
summary	text	YES		NULL
publication_date	date	YES		NULL
retrieved_date	date	NO		NULL
content_selection_method	text	YES		NULL

Statistics

id	int(11)	NO	PRI	NULL	auto_inctement
article_id	int(11)	NO		NULL
created	boolean	NO		NULL
sandbox	boolean	NO		NULL
timestamp	date	NO		NULL
references_used	array of reference_ids	NO		NULL
sections_used	array of section_ids	NO		NULL
mobile	boolean	NO		NULL

I would suggest the amendments for the following reasons:
Article

we will start with articles of domains, so it will be good to keep track of the domains we categorized the articles in. We also plan to create a whitelist of references, which makes it easier to match them.
the tag can be a place to keep track when and how we decided to add the article, or e.g. if we just used Wikidata to retrieve missing articles
retrieved_date lets us keep track of when we added this entry to the database (the same for the following tables)

Section

article_id instead of article_name makes sure we use ids as foreign keys across tables, rather than text which can be ambiguous (especially as Wikidata labels don't have to be unique)
order_number helps to decide which order the sections should be displayed in the article
content_selection method lets us keep track of how we created those section titles and be able to add new content selection types and e.g. test them

Reference

This table should collect all references we suggest for the article, therefore I changed the name from edit (if I understood your edit table correctly)
All other fields are based on the content we need for references (such as publication_title and its date), the summary field can be multiple sentences, summarizing the reference
The section field can be a number if the reference summary relates for a certain section, or NULL if the reference can be used across sections

Statistics

If possible, we should write to this table if someone decides to publish an article created using Scribe so we can keep track of what works well and what doesn't.
references_used and sections_used should be a list of all references and sections used from the ones suggested respectively (with that we can also track which creation methods work best)
we should also track whether people publish directly as an article or in their sandbox (given we provide them with those options)
mobile should track if the article was edited and created on mobile or desktop (given we can access that information)

Eugene233 mentioned this in T239576: Update server database model with recent changes.Dec 2 2019, 9:31 AM

Frimelle moved this task from Doing to Done on the Scribe board.Dec 11 2019, 3:39 PM

@Lucie / @Frimelle: Hi! This task has been assigned to you a while ago. Could you maybe share an update? Do you still plan to work on this task?

If this task has been resolved in the meantime: Please update the task status (via Add Action... → Change Status in the dropdown menu).
If this task is not resolved and only if you do not plan to work on this task anymore: Please consider removing yourself as assignee (via Add Action... → Assign / Claim in the dropdown menu): That would allow others to work on this (in theory), as others won't think that someone is already working on this. Thanks! :)

Removing task assignee due to inactivity, as this open task has been assigned for more than two years. See the email sent to the task assignee on February 06th 2022 (and T295729).

Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome.

If this task has been resolved in the meantime, or should not be worked on ("declined"), please update its task status via "Add Action… 🡒 Change Status".

Also see https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips how to best manage your individual work in Phabricator.

Boldy closing as resolved as this task is in the Done workboard column

[RFC] Server side Design (API/Database schema)Closed, ResolvedPublicActions

Description

Related Objects

Event Timeline

[RFC] Server side Design (API/Database schema)
Closed, ResolvedPublic
Actions