Page MenuHomePhabricator

Expose ar_content_format and ar_content_model columns of archive table on Labs replicas
Closed, ResolvedPublic

Description

The columns ar_content_format and ar_content_model of table archive should also be selectable on ToolLabs in the same way as rev_content_model and rev_content_format on table revision.

Event Timeline

Umherirrender raised the priority of this task from to Needs Triage.
Umherirrender updated the task description. (Show Details)
Umherirrender subscribed.
scfc triaged this task as Medium priority.Apr 6 2015, 11:31 AM
scfc moved this task from Backlog to Ready to be worked on on the Toolforge board.
scfc subscribed.
bd808 renamed this task from Make ar_content_format and ar_content_model available on ToolLabs to Expose ar_content_format and ar_content_model columns of archive table on Labs replicas.Mar 26 2017, 8:33 PM
bd808 edited projects, added DBA; removed Cloud-VPS, Toolforge.
bd808 moved this task from Triage to Database on the Cloud-Services board.

These two columns are replicated in labs, but missing from the view.

mysql:root@localhost [enwiki]> select @@hostname;
+------------+
| @@hostname |
+------------+
| labsdb1009 |
+------------+
1 row in set (0.00 sec)

mysql:root@localhost [enwiki]> desc archive;
+-------------------+---------------------+------+-----+---------+----------------+
| Field             | Type                | Null | Key | Default | Extra          |
+-------------------+---------------------+------+-----+---------+----------------+
| ar_id             | int(10) unsigned    | NO   | PRI | NULL    | auto_increment |
| ar_namespace      | int(11)             | NO   | MUL | 0       |                |
| ar_title          | varbinary(255)      | NO   |     |         |                |
| ar_text           | mediumblob          | NO   |     | NULL    |                |
| ar_comment        | tinyblob            | NO   |     | NULL    |                |
| ar_user           | int(5) unsigned     | NO   |     | 0       |                |
| ar_user_text      | varbinary(255)      | NO   | MUL |         |                |
| ar_timestamp      | varbinary(14)       | NO   |     |         |                |
| ar_minor_edit     | tinyint(1)          | NO   |     | 0       |                |
| ar_flags          | tinyblob            | NO   |     | NULL    |                |
| ar_rev_id         | int(8) unsigned     | YES  | MUL | NULL    |                |
| ar_text_id        | int(8) unsigned     | YES  |     | NULL    |                |
| ar_deleted        | tinyint(1) unsigned | NO   |     | 0       |                |
| ar_len            | int(8) unsigned     | YES  |     | NULL    |                |
| ar_page_id        | int(10) unsigned    | YES  |     | NULL    |                |
| ar_parent_id      | int(10) unsigned    | YES  |     | NULL    |                |
| ar_sha1           | varbinary(32)       | NO   |     |         |                |
| ar_content_model  | varbinary(32)       | YES  |     | NULL    |                |
| ar_content_format | varbinary(64)       | YES  |     | NULL    |                |
+-------------------+---------------------+------+-----+---------+----------------+
19 rows in set (0.00 sec)

mysql:root@localhost [enwiki]> use enwiki_p
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql:root@localhost [enwiki_p]> desc archive;
+---------------+---------------------+------+-----+---------+-------+
| Field         | Type                | Null | Key | Default | Extra |
+---------------+---------------------+------+-----+---------+-------+
| ar_id         | int(10) unsigned    | NO   |     | 0       |       |
| ar_namespace  | int(11)             | NO   |     | 0       |       |
| ar_title      | varbinary(255)      | NO   |     |         |       |
| ar_text       | binary(0)           | YES  |     | NULL    |       |
| ar_comment    | binary(0)           | YES  |     | NULL    |       |
| ar_user       | bigint(10) unsigned | YES  |     | NULL    |       |
| ar_user_text  | varbinary(255)      | YES  |     | NULL    |       |
| ar_timestamp  | varbinary(14)       | NO   |     |         |       |
| ar_minor_edit | tinyint(1)          | NO   |     | 0       |       |
| ar_flags      | tinyblob            | NO   |     | NULL    |       |
| ar_rev_id     | int(8) unsigned     | YES  |     | NULL    |       |
| ar_text_id    | bigint(10) unsigned | YES  |     | NULL    |       |
| ar_deleted    | tinyint(1) unsigned | NO   |     | 0       |       |
| ar_len        | bigint(10) unsigned | YES  |     | NULL    |       |
| ar_page_id    | int(10) unsigned    | YES  |     | NULL    |       |
| ar_parent_id  | int(10) unsigned    | YES  |     | NULL    |       |
| ar_sha1       | varbinary(32)       | YES  |     | NULL    |       |
+---------------+---------------------+------+-----+---------+-------+
17 rows in set (0.00 sec)

Change 363851 had a related patch set uploaded (by Umherirrender; owner: Umherirrender):
[operations/puppet@production] Add ar_content_format and ar_content_model to labs views

https://gerrit.wikimedia.org/r/363851

Change 363851 had a related patch set uploaded (by Umherirrender; owner: Umherirrender):
[operations/puppet@production] Add ar_content_format and ar_content_model to labs views

https://gerrit.wikimedia.org/r/363851

@Bawolff, security review of this addition? :)

I'm fine with it, provided that its null when ar_deleted&1=1.

This may be mildly paranoia, but I'd like to be strict as possible on treatment of revision deletion, and there should be no negative side effects since that field is not indexed anyways.

I personally am not fond of the archive table being available in general, but that's a separate discussion, however I do want to say that I may at some later date try and convince people not to expose the archive table at all, or only have a summary table of it. However, as long as archive is exposed, I'm fine with ar_conent_model/format being exposed if they are null when rev_deleted&1.

APalmer_WMF reopened this task as Open.
APalmer_WMF added a subscriber: ZhouZ.

Adding wmf-legal as suggested on the patch set

Speaking for legal, we clear this change as well.

Change 363851 merged by Rush:
[operations/puppet@production] Add ar_content_format and ar_content_model to labs views

https://gerrit.wikimedia.org/r/363851

1978Gage2001 moved this task from Triage to In progress on the DBA board.

merged but not active

What still needs to happen in this task to get it resolved? (Also wondering if the task assignee @Umherirrender is still correct for the rest to be done?)

That'd be me. The views need to be rebuilt via script.

Bstorm moved this task from Inbox to Clinic Duty on the cloud-services-team (Kanban) board.

Wait, there's no way that isn't active at this point.

This is definitely exposed on replicas, however it appears like it is actually NULL'd upstream anyway. I'm closing this.