Page MenuHomePhabricator

Audit abuse filter wikireplica view rules
Closed, ResolvedPublicSecurity

Description

T315407 suggests that the replicas might be leaking information about private filters.

Event Timeline

For the abuse_filter_history issue mentioned in parent task:

(untested)

mmartorana changed the task status from Open to In Progress.Aug 24 2022, 9:36 AM
mmartorana triaged this task as Medium priority.
mmartorana added a project: Vuln-Infoleak.
mmartorana changed Risk Rating from N/A to Medium.

Thank you for your patch. I look forward to applying it and stopping leaking.

Just testing:

select afh_id, afh_filter, afh_user, afh_user_text, afh_timestamp,
afh_pattern, afh_comments, afh_flags, afh_public_comments, afh_actions,
afh_deleted, afh_changed_fields, afh_group
from abuse_filter_history afh
left join abuse_filter af on af.af_id = afh.afh_filter 
where NOT find_in_set("hidden", afh.afh_flags) AND af.af_hidden = 0;
...
and also: where find_in_set("hidden", afh.afh_flags) OR af.af_hidden = 1;

on enwiki, the proposed patch seems correct. Would appreciate if @Daimona or another active AF user/maintainer could double-check the two where clauses for correctness.

sbassett added a project: SecTeam-Processed.

Ok, leaving to Data-Engineering for the maintain-views deploy. I think these are easiest to just push through gerrit with a deploy right after.

Hey @BTullis, this has been sitting here for a while - what can I do to move this forward?

@BTullis @odimitrijevic Given that this is an ongoing privacy leak, could we get some clarity on whether we can get this deployed soon, or how other teams may be able to help if needed?

@BTullis @odimitrijevic Given that this is an ongoing privacy leak, could we get some clarity on whether we can get this deployed soon, or how other teams may be able to help if needed?

Adding also @Gehel to the conversation.

BTullis moved this task from Incoming to In Progress on the Data-Platform-SRE board.

Apologies for the delay in deploying this patch. I will start work on it now.

I have downloaded and applied the patch to a feature branch of puppet. It applies cleanly.

(base) btullis@marlin:~/wmf/puppet$ git apply ../../Downloads/0001-maintain-views-filter-out-abuse_filter_history-for-d.patch
(base) btullis@marlin:~/wmf/puppet$ git diff
diff --git a/modules/profile/templates/wmcs/db/wikireplicas/maintain-views.yaml b/modules/profile/templates/wmcs/db/wikireplicas/maintain-views.yaml
index 8f4f99c962..ea66366b66 100644
--- a/modules/profile/templates/wmcs/db/wikireplicas/maintain-views.yaml
+++ b/modules/profile/templates/wmcs/db/wikireplicas/maintain-views.yaml
@@ -241,14 +241,14 @@ customviews:

     where:
       afl_deleted=0
   abuse_filter_history:
-    source: abuse_filter_history
+    source:
+      - abuse_filter_history
+      - abuse_filter
     view: >
       select afh_id, afh_filter, afh_user, afh_user_text, afh_timestamp,
-      if (find_in_set("hidden",afh_flags),null,afh_pattern) as afh_pattern,
-      if (find_in_set("hidden",afh_flags),null,afh_comments) as afh_comments,
-      afh_flags, afh_public_comments,
-      if (find_in_set("hidden",afh_flags),null,afh_actions) as afh_actions,
+      afh_pattern, afh_comments, afh_flags, afh_public_comments, afh_actions,
       afh_deleted, afh_changed_fields, afh_group
+    where: af_id = afh_filter AND NOT find_in_set("hidden",afh_flags) AND af_hidden = 0
   actor:
     source: actor
     view: select actor_id, actor_user, actor_name

The puppet CR is: https://gerrit.wikimedia.org/r/c/operations/puppet/+/927120

I have added @taavi and @sbassett for review of the puppet patch.

As this change requires an update to existing views across all sections, I believe that it would be safest to depool each of the wikireplica servers whilst the work is carried out.

We have a procedure outlined here: https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Wiki_Replicas#Updating_views although it may need to be updated as it is now possible to depool/pool the entire analytics/web set of wikireplca servers with conftool. This is another potential option, rather that repeatedly modifying and pushing out haproxy configurations.

Merged the patch and deployed it to all replica servers.

I depooled the analytics wikireplica cluster with the following commands:

btullis@puppetmaster1001:~$ sudo -i confctl select "service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet" set/pooled=yes
eqiad/wikireplicas-a/wikireplicas-a/dbproxy1019.eqiad.wmnet: pooled changed no => yes
WARNING:conftool.announce:conftool action : set/pooled=yes; selector: service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet
btullis@puppetmaster1001:~$ sudo -i confctl select "service=wikireplicas-a,name=dbproxy1018.eqiad.wmnet" set/pooled=no
eqiad/wikireplicas-a/wikireplicas-a/dbproxy1018.eqiad.wmnet: pooled changed yes => no
WARNING:conftool.announce:conftool action : set/pooled=no; selector: service=wikireplicas-a,name=dbproxy1018.eqiad.wmnet

Now I am about the run the following command on clouddb1017.

sudo maintain-views --all-databases --table abuse_filter_history --replace-all

The command executed successfully and rapidly.

Now I'm checking the abuse_filter_history view on the enwiki_p database.

root@clouddb1017:s1[enwiki_p]> show create view abuse_filter_history\G;
*************************** 1. row ***************************
                View: abuse_filter_history
         Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`viewmaster`@`%` SQL SECURITY DEFINER VIEW `abuse_filter_history` AS select `enwiki`.`abuse_filter_history`.`afh_id` AS `afh_id`,`enwiki`.`abuse_filter_history`.`afh_filter` AS `afh_filter`,`enwiki`.`abuse_filter_history`.`afh_user` AS `afh_user`,`enwiki`.`abuse_filter_history`.`afh_user_text` AS `afh_user_text`,`enwiki`.`abuse_filter_history`.`afh_timestamp` AS `afh_timestamp`,if(find_in_set('hidden',`enwiki`.`abuse_filter_history`.`afh_flags`),NULL,`enwiki`.`abuse_filter_history`.`afh_pattern`) AS `afh_pattern`,if(find_in_set('hidden',`enwiki`.`abuse_filter_history`.`afh_flags`),NULL,`enwiki`.`abuse_filter_history`.`afh_comments`) AS `afh_comments`,`enwiki`.`abuse_filter_history`.`afh_flags` AS `afh_flags`,`enwiki`.`abuse_filter_history`.`afh_public_comments` AS `afh_public_comments`,if(find_in_set('hidden',`enwiki`.`abuse_filter_history`.`afh_flags`),NULL,`enwiki`.`abuse_filter_history`.`afh_actions`) AS `afh_actions`,`enwiki`.`abuse_filter_history`.`afh_deleted` AS `afh_deleted`,`enwiki`.`abuse_filter_history`.`afh_changed_fields` AS `afh_changed_fields`,`enwiki`.`abuse_filter_history`.`afh_group` AS `afh_group` from `enwiki`.`abuse_filter_history`
character_set_client: utf8
collation_connection: utf8_general_ci
1 row in set (0.000 sec)

ERROR: No query specified

Hmm. At first glance that doesn't look as if it's correct, but maybe it's an interpretation. I'll compare it with the same statement on a table that hasn't had the script run against it.

This is the output from the show create view abuse_filter_history\G statement on clouddb1013, which is in the web group of wikireplica servers.

root@clouddb1013:s1[enwiki_p]> show create view abuse_filter_history\G;
*************************** 1. row ***************************
                View: abuse_filter_history
         Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`viewmaster`@`%` SQL SECURITY DEFINER VIEW `abuse_filter_history` AS select `enwiki`.`abuse_filter_history`.`afh_id` AS `afh_id`,`enwiki`.`abuse_filter_history`.`afh_filter` AS `afh_filter`,`enwiki`.`abuse_filter_history`.`afh_user` AS `afh_user`,`enwiki`.`abuse_filter_history`.`afh_user_text` AS `afh_user_text`,`enwiki`.`abuse_filter_history`.`afh_timestamp` AS `afh_timestamp`,if(find_in_set('hidden',`enwiki`.`abuse_filter_history`.`afh_flags`),NULL,`enwiki`.`abuse_filter_history`.`afh_pattern`) AS `afh_pattern`,if(find_in_set('hidden',`enwiki`.`abuse_filter_history`.`afh_flags`),NULL,`enwiki`.`abuse_filter_history`.`afh_comments`) AS `afh_comments`,`enwiki`.`abuse_filter_history`.`afh_flags` AS `afh_flags`,`enwiki`.`abuse_filter_history`.`afh_public_comments` AS `afh_public_comments`,if(find_in_set('hidden',`enwiki`.`abuse_filter_history`.`afh_flags`),NULL,`enwiki`.`abuse_filter_history`.`afh_actions`) AS `afh_actions`,`enwiki`.`abuse_filter_history`.`afh_deleted` AS `afh_deleted`,`enwiki`.`abuse_filter_history`.`afh_changed_fields` AS `afh_changed_fields`,`enwiki`.`abuse_filter_history`.`afh_group` AS `afh_group` from `enwiki`.`abuse_filter_history`
character_set_client: utf8
collation_connection: utf8_general_ci
1 row in set (0.000 sec)

ERROR: No query specified

No differences detected, so I must be doing something wrong.

Might need to revert.

The error in T338172 has something to do with the deployment process (depooling the servers), not with the patch itself.

yeah, the depooling was not done successfully (or too successfully depending on your point of view), it's not the the patch itself but the process of deployment that broke it

Sorry about this, I will undo the steps that I took to depool the analytics wikireplicas now.

btullis@puppetmaster1001:~$ sudo -i confctl select "service=wikireplicas-a,name=dbproxy1018.eqiad.wmnet" set/pooled=yes
eqiad/wikireplicas-a/wikireplicas-a/dbproxy1018.eqiad.wmnet: pooled changed no => yes
WARNING:conftool.announce:conftool action : set/pooled=yes; selector: service=wikireplicas-a,name=dbproxy1018.eqiad.wmnet
btullis@puppetmaster1001:~$ sudo -i confctl select "service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet" set/pooled=no
eqiad/wikireplicas-a/wikireplicas-a/dbproxy1019.eqiad.wmnet: pooled changed yes => no
WARNING:conftool.announce:conftool action : set/pooled=no; selector: service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet

I still can't get to the wikireplicas from toolforge though.

btullis@tools-sgebastion-10:~$ sql enwiki_p
ERROR 2013 (HY000): Lost connection to MySQL server at 'handshake: reading initial communication packet', system error: 11
btullis@tools-sgebastion-10:~$ sql zhwiki_p
ERROR 2013 (HY000): Lost connection to MySQL server at 'handshake: reading initial communication packet', system error: 11

...so I'm not sure what needs to happen.

Oho - I have found this open ticket, which seems relevant: T311588: maintain-views table filter not working for custom views on multiple tables

In that ticket the table selection code seems to be having a problem when the view is based on multiple source tables.
That ticket relates to an update of a single view in a single database centralauth_p.localuser whereas in this ticket I'm attempting to update all darabases, however, I think that the issue may be the same.

Continuing to investigate.

OK, I think that @taavi has correctly identified the issue here: T311588#8040308

It seems to be the --table option in the script, which doesn't work when the view contains multiple sources.

I tested with a run of sudo maintain-views --databases enwiki --dry-run --debug and it does come back with:

2023-06-06 14:26:13,969 INFO [enwiki_p.abuse_filter_history] 
2023-06-06 14:26:13,969 DEBUG SQL: 
            CREATE OR REPLACE
            DEFINER=viewmaster
            VIEW `enwiki_p`.`abuse_filter_history`
            AS select afh_id, afh_filter, afh_user, afh_user_text, afh_timestamp, afh_pattern, afh_comments, afh_flags, afh_public_comments, afh_actions, afh_deleted, afh_changed_fields, afh_group

            FROM `enwiki`.`abuse_filter_history`,`enwiki`.`abuse_filter`
         WHERE af_id = afh_filter AND NOT find_in_set("hidden",afh_flags) AND af_hidden = 0
;
View already exists. Replace? [y/N]

So I believe that I can deploy this change by running: sudo maintain-views --all-databases --replace-all

However, it might be better to try to fix the script in a way that was suggested. i.e.

I guess this could be fixed with something like if isinstance(meta["source"], list) and args.table in meta["source"] but that would also make --table globaluser refresh the localuser view and I'm not sure if that's expected or not.

I think that this behaviour would be fine, espeically given that the alternative in this case is to refresh all of the views.

I will check for any other cases where it might have more impact that we assume though, since there are 10 multi-source views at present.

The latest patchset looks like it should work now. I tried a dry-run with the maintain-views.py script from the CR against clouddb1021.

btullis@clouddb1021:~$ sudo ./maintain-views_new-4f818c7ee8112ca2d21538287c007a81.py --database enwiki --table abuse_filter_history --dry-run --debug
2023-06-07 10:40:42,179 DEBUG Removing 0 dbs as sensitive
2023-06-07 10:40:42,190 INFO Full views for enwiki:
2023-06-07 10:40:42,190 INFO Custom views for enwiki:
View already exists. Replace? [y/N] y
2023-06-07 10:40:45,128 INFO [enwiki_p.abuse_filter_history] 
2023-06-07 10:40:45,129 DEBUG SQL: 
            CREATE OR REPLACE
            DEFINER=viewmaster
            VIEW `enwiki_p`.`abuse_filter_history`
            AS select afh_id, afh_filter, afh_user, afh_user_text, afh_timestamp, afh_pattern, afh_comments, afh_flags, afh_public_comments, afh_actions, afh_deleted, afh_changed_fields, afh_group

            FROM `enwiki`.`abuse_filter_history`,`enwiki`.`abuse_filter`
         WHERE af_id = afh_filter AND NOT find_in_set("hidden",afh_flags) AND af_hidden = 0
;

I executed the following command on cloudddb1021

sudo maintain-views --all-databases --table abuse_filter_history --replace-all

This time the table selection worked and for every database where the abuse_filter_history view is present it reported a change. For example:

2023-06-07 11:53:23,082 INFO Custom views for jawiki:
2023-06-07 11:53:23,086 INFO [jawiki_p.abuse_filter_history]

I will now proceed to update the views on the remaining wikireplicas.

Recreating the views on the analytics-wikireplicas now with:

sudo cumin A:wikireplicas-analytics 'maintain-views --all-databases --table abuse_filter_history --replace-all'

Checking the state of enwiki_p

MariaDB [enwiki_p]> show create table abuse_filter_history\G;
*************************** 1. row ***************************
                View: abuse_filter_history
         Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`viewmaster`@`%` SQL SECURITY DEFINER VIEW `abuse_filter_history` AS select `enwiki`.`abuse_filter_history`.`afh_id` AS `afh_id`,`enwiki`.`abuse_filter_history`.`afh_filter` AS `afh_filter`,`enwiki`.`abuse_filter_history`.`afh_user` AS `afh_user`,`enwiki`.`abuse_filter_history`.`afh_user_text` AS `afh_user_text`,`enwiki`.`abuse_filter_history`.`afh_timestamp` AS `afh_timestamp`,`enwiki`.`abuse_filter_history`.`afh_pattern` AS `afh_pattern`,`enwiki`.`abuse_filter_history`.`afh_comments` AS `afh_comments`,`enwiki`.`abuse_filter_history`.`afh_flags` AS `afh_flags`,`enwiki`.`abuse_filter_history`.`afh_public_comments` AS `afh_public_comments`,`enwiki`.`abuse_filter_history`.`afh_actions` AS `afh_actions`,`enwiki`.`abuse_filter_history`.`afh_deleted` AS `afh_deleted`,`enwiki`.`abuse_filter_history`.`afh_changed_fields` AS `afh_changed_fields`,`enwiki`.`abuse_filter_history`.`afh_group` AS `afh_group` from (`enwiki`.`abuse_filter_history` join `enwiki`.`abuse_filter`) where `enwiki`.`abuse_filter`.`af_id` = `enwiki`.`abuse_filter_history`.`afh_filter` and !find_in_set('hidden',`enwiki`.`abuse_filter_history`.`afh_flags`) and `enwiki`.`abuse_filter`.`af_hidden` = 0
character_set_client: utf8
collation_connection: utf8_general_ci
1 row in set (0.002 sec)

ERROR: No query specified

MariaDB [enwiki_p]>

Looks good. Proceeding to recreate views on the wikireplicas-web servers.

sudo cumin A:wikireplicas-web 'maintain-views --all-databases --table abuse_filter_history --replace-all'

I believe that this is now complete. If anyone could verify that the fix has been successful, I would be grateful.

I don't think there's any reason to keep this task private? Any objections?

I don't think there's any reason to keep this task private? Any objections?

No objection from me.

sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)".Jun 8 2023, 1:57 PM
sbassett changed the edit policy from "Custom Policy" to "All Users".