
Set up x1 replication to Wiki Replicas
Open, Stalled, Medium, Public

Description

As discussed in T395072: Add "wikishared" database to wiki replicas. This is also a blocker for T387419: Create wiki replicas views for globaljsonlinks tables.

Two reasons why we want this (there may be more):

  1. DBAs would like to move some data to x1, but if they do, Wiki Replicas will lose access to it (see comment from @Ladsgroup below).
  2. Wiki Replicas users expect that all data from production DBs is available in Wiki Replicas, unless there are privacy concerns.

@Ladsgroup wrote in T395072#10851754:

We definitely should do that, which helps us DBAs argue for moving data out of core sections that are pressured for capacity and into x1. For example, I want to move wbc_entity_usage tables out of core dbs, but x1 not replicating to the cloud blocks the move.

And I emphasize that I agree with Manuel: the privacy subteam of security must sign off on which tables can be replicated and which can't (and what views are needed) before we can make any move.

Event Timeline

Restricted Application added a subscriber: Aklapper.

This definitely needs a review from security before we can proceed.
This will also likely need specific triggers/views.

But first, we need to get an indication on:

  • tables that can be replicated entirely as they are
  • tables that cannot be replicated at all
  • tables that need some redactions
  • tables that require some specific views

we need to get an indication on

Who is best placed to provide the answers here? I'm looping in Data-Engineering as well.

we need to get an indication on

Who is best placed to provide the answers here? I'm looping in Data-Engineering as well.

Privacy subteam of security.

@fnegri can you remind me what initiative this supports? Better yet, add the context to the description. If there are relevant deadlines or plans, please include those as well. Thank you!

fnegri updated the task description.

@VirginiaPoundstone I updated the task description with more context. I will also ask in the user-created parent tasks whether they have specific use cases for this section, or whether it's just for the sake of completeness.

A use case for having x1 in Wiki Replicas was discussed in T387419: Create wiki replicas views for globaljsonlinks tables: being able to "generate statistics and usage information about charts". The tables used by the Chart extension are in the x1 section.

I created T407485 to track the work required to add this section to an-redacteddb1001 and set up the initial replication.

I'm conscious that @Marostegui said this back in June:

This definitely needs a review from security before we can proceed.
This will also likely need specific triggers/views.

But first, we need to get an indication on:

  • tables that can be replicated entirely as they are
  • tables that cannot be replicated at all
  • tables that need some redactions
  • tables that require some specific views

When you say:

tables that cannot be replicated at all

...do you mean that certain tables might be entirely filtered out of the MariaDB replication so that they never leave the upstream x1 replica?
Or do you just mean that there is no view of these tables defined at all?

Is it safe for me to start defining a puppet resource for this x1 section on an-redacteddb1001 now, or do we still need to wait for a privacy/security review before taking any action?

It's worth noting that, thanks to the table catalog (T363581), all tables that should be fully filtered or that are fully public can now be easily determined, and the fully filtered ones will already be excluded when someone sets up the replication (we cataloged the x1 tables too: T399302). That leaves only updating maintain-views for partially public tables.
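The split described in this comment could be sketched roughly as follows. This is only an illustration: the catalog format, the visibility labels, and the table names are all hypothetical and are not taken from the real table catalog.

```python
from enum import Enum


class Visibility(Enum):
    PUBLIC = "public"    # fully public: replicate as-is
    PRIVATE = "private"  # fully filtered: excluded from replication
    PARTIAL = "partial"  # partially public: needs a maintain-views entry


# Hypothetical catalog entries; the real catalog format is not shown in this task.
CATALOG = {
    "example_public_table": Visibility.PUBLIC,
    "example_private_table": Visibility.PRIVATE,
    "example_partial_table": Visibility.PARTIAL,
}


def plan_replication(catalog):
    """Group tables by the action each visibility level implies."""
    plan = {"replicate_as_is": [], "filter_out": [], "needs_view": []}
    for table, visibility in sorted(catalog.items()):
        if visibility is Visibility.PUBLIC:
            plan["replicate_as_is"].append(table)
        elif visibility is Visibility.PRIVATE:
            plan["filter_out"].append(table)
        else:
            plan["needs_view"].append(table)
    return plan


if __name__ == "__main__":
    for action, tables in plan_replication(CATALOG).items():
        print(action, tables)
```

Under these assumptions, only the tables under `needs_view` would require manual work in maintain-views; the other two groups follow directly from the catalog.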

I created T407485 to track the work required to add this section to an-redacteddb1001 and set up the initial replication.

I'm conscious that @Marostegui said this back in June:

This definitely needs a review from security before we can proceed.
This will also likely need specific triggers/views.

But first, we need to get an indication on:

  • tables that can be replicated entirely as they are
  • tables that cannot be replicated at all
  • tables that need some redactions
  • tables that require some specific views

When you say:

tables that cannot be replicated at all

...do you mean that certain tables might be entirely filtered out of the MariaDB replication so that they never leave the upstream x1 replica?

Correct

Or do you just mean that there is no view of these tables defined at all?

We've had both cases: tables that may be okay to expose at some point, and tables that we are 100% sure will never be allowed to have views, which we therefore even exclude in the replication thread.
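To make the two cases concrete, here is a minimal sketch. It assumes exclusion is expressed with MariaDB's `replicate_ignore_table` option; the filter option actually used in production may differ. The decision labels (`never`, `not_yet`) and the table names are hypothetical.

```python
def split_private_tables(database, decisions):
    """Separate tables excluded from replication from tables merely lacking a view.

    decisions maps a table name to a hypothetical label:
      "never"   -> never allowed a view, so excluded in the replication thread
      "not_yet" -> replicated, but no view is defined for now
    """
    ignore_options = []
    replicated_without_view = []
    for table in sorted(decisions):
        decision = decisions[table]
        if decision == "never":
            # One replication filter line per excluded table.
            ignore_options.append(f"replicate_ignore_table = {database}.{table}")
        elif decision == "not_yet":
            replicated_without_view.append(table)
        else:
            raise ValueError(f"unknown decision for {table}: {decision}")
    return ignore_options, replicated_without_view


if __name__ == "__main__":
    options, no_view = split_private_tables(
        "wikishared",
        {"example_never_table": "never", "example_later_table": "not_yet"},
    )
    print("\n".join(options))
    print("no view defined:", no_view)
```

The practical difference is that data for a `never` table does not reach the replica host at all, while a `not_yet` table is present there and can be exposed later by adding a view.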

Is it safe for me to start defining a puppet resource for this x1 section on an-redacteddb1001 now, or do we still need to wait for a privacy/security review before taking any action?

I think we can start puppetizing this, but we definitely need a security review before proceeding with any data movement.