To implement the new data retention policy, we need to make sure that no
private data is currently left lying around on our various systems.
We should:
- Make an exhaustive inventory of (private) data on all systems, perhaps
  partially scripted.
- For data that is just sitting there, e.g. one-off tarballs that are not
  otherwise managed by an automatic process, either remove those backups
  entirely or clean out the private data from them.
- For data that is managed by automatic processes (e.g. backup system, log
  rotation, etc.), ensure that the data retention policy is being followed
  and that the data will automatically be removed within 5 years.
And probably we should reevaluate this periodically, say at least once a year,
to ensure we're still maintaining this standard.
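
As a rough starting point for the scripted part of the inventory, something like the sketch below could be run per host. It only illustrates the idea and is not an agreed tool: the suffix list, the function name `find_archives`, and the use of file modification time as a stand-in for the age of the data are all assumptions, and the 5-year limit is approximated as 5 * 365 days. It cannot tell whether an archive actually contains private data; it only produces a list for a human to review.

```python
import os
import sys
import time
from pathlib import Path

# Assumption: these suffixes identify one-off archives worth reviewing.
ARCHIVE_SUFFIXES = (".tar", ".tar.gz", ".tgz", ".tar.bz2", ".tar.xz", ".zip", ".sql.gz")

# The 5-year limit from the policy, approximated as 5 * 365 days.
RETENTION_SECONDS = 5 * 365 * 24 * 60 * 60


def find_archives(root, now=None):
    """Yield (path, age_in_days, past_retention) for archive-like files under root.

    Age is derived from the file's modification time, which is only a proxy
    for how old the data inside really is.
    """
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(ARCHIVE_SUFFIXES):
                continue
            path = Path(dirpath) / name
            try:
                mtime = path.lstat().st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            age = now - mtime
            yield str(path), int(age // 86400), age > RETENTION_SECONDS


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, age_days, expired in find_archives(root):
        flag = "PAST-RETENTION" if expired else "ok"
        print(f"{flag}\t{age_days}d\t{path}")
```

Saved under a hypothetical name such as `inventory.py`, it would be invoked as `python3 inventory.py /srv` and print one tab-separated line per archive found. The same listing could be rerun for the yearly reevaluation.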
--
Mark Bergsma <mark at wikimedia>
Lead Operations Architect
Wikimedia Foundation
Description
Details
- Reference
- rt6855
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | None | T83531 Implement Data Retention Guidelines
Invalid | | None | T83522 Make inventory of (private) data backups on all systems