Determine suitable data retention for Scrive
Closed, Resolved, Public

Description

In order to finalise the Privacy Policy implications of using Scrive we need to:

  • Determine how long we keep various documents on Scrive.com (as opposed to downloading and storing them).
  • Determine which documents we download and store, and how often this is done.

The types of documents that Scrive lists are:

  • Drafts
  • Signed documents
  • Recalled documents
  • Expired documents [sent but never signed]
  • Rejected documents
  • Broken documents [presumably incorrectly set-up documents that were still sent; alternatively a subcategory of drafts]

For each of these you can set the number of days the data is retained before being moved to the trash can.
Since the trash can is not the same as an actual deletion, there are also options to:

  • Empty the trash can every 24 hours
  • Set how long to keep documents in the trash can before they are deleted permanently

Event Timeline

@Jopparn @Jenny_Brandt_WMSE It would be good to find a time to chat about this too, since it partly depends on how you have been using Scrive and how the offboarding of documents has worked so far.

@Jopparn @Jenny_Brandt_WMSE I've set a meeting date on 22 June. Details are in the invite.

Admins can access all sent documents (incl. recalled ones) with the exception of drafts. These should not be an issue since drafts do not need to be retained.

Per meeting discussions 2022-06-22

Difference between Scrive copy and locally saved copies:

  • For unsigned documents (or only partially signed ones) neither the log of events nor any partial signatures are included in the pdf. To document the chain of events itself (if necessary) the log page needs to be printed out separately.
  • For signed documents the signatures (and related evidence) are incorporated into the pdf of the contract itself. The list of events (and related evidence appendices, see below) is however not included and will be lost once the document is deleted from the platform.

Screenshot from 2022-06-27 08-22-36.png (331×911 px, 37 KB)

Document type analysis and recommended retention:

  • Drafts:
    • Only need to be saved long enough to ensure they are not actively worked on.
    • Recommended retention: 60 days
    • Doesn't need to be saved locally
  • Recalled documents
    • Don't need to be saved once recalled. But it might be polite to send a separate e-mail to the recipient, since they do not get any info about the document having been recalled (the link just dies).
    • Recommended retention: 10 days
    • Doesn't need to be saved locally
  • Expired documents [sent but never signed]
    • An expired document can be extended (or restarted as a new document), so there is an interest in keeping it long enough that it could be extended if needed. If some recipients have signed it but others haven't, the online copy is their only access to the document.
    • Recommended retention: 60 days
    • There might be documents where the expiration status is important to document in case of disputes.
  • Rejected documents
    • There is an interest in keeping these for long enough that the sender notices that it was rejected. The rejected document can also be used to restart a signing process.
    • Recommended retention: 60 days
    • There might be documents where the rejection status is important to document in case of disputes.
  • Broken documents
    • There is an interest in keeping these for long enough that the sender notices that it was broken.
    • Recommended retention: 60 days
  • Signed documents
    • The most important ones. It is therefore important that they don't get purged before being stored locally. But the recipient has a signed copy (attached to the response e-mail), so there is no need to keep it online for their sake.
    • Recommended retention: 90 days

Any of these documents can be deleted by the sender whenever they wish since the recipient gets an attached copy after the signing stage.
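
The recommended retention periods listed in this comment can be summarised as a small lookup table. The following is a minimal Python sketch, not a Scrive configuration: the dictionary keys and function names are hypothetical labels, and it assumes the retention period is counted from the date the document reached its current status, which the discussion does not confirm.

```python
from datetime import date, timedelta

# Retention periods in days, as recommended in this comment.
# The keys are hypothetical labels, not Scrive setting names.
RETENTION_DAYS = {
    "draft": 60,
    "recalled": 10,
    "expired": 60,
    "rejected": 60,
    "broken": 60,
    "signed": 90,
}


def trash_date(doc_type: str, status_date: date) -> date:
    """Return the date on which a document would be moved to the trash can.

    Assumes the period is counted from status_date (an assumption).
    """
    return status_date + timedelta(days=RETENTION_DAYS[doc_type])


def is_due(doc_type: str, status_date: date, today: date) -> bool:
    """Return True if the retention period has run out by `today`."""
    return today >= trash_date(doc_type, status_date)
```

Under these assumptions, a document recalled on 2022-06-22 would be moved to the trash can on 2022-07-02, i.e. `trash_date("recalled", date(2022, 6, 22))` gives `date(2022, 7, 2)`.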

Recommended routine:
The organisational assistant checks the status of the documents bi-weekly and can then:

  • Inform the sender of any expired documents
    • The sender is responsible for acting on these by extending them, issuing a new document, or asking the organisational assistant to record that the document expired without a signature.
    • The log is not included in the pdf when the document isn't signed, so a separate pdf of the log itself must be created.
  • Inform the sender of any rejected documents
    • The sender is responsible for acting on these by e.g. issuing a new document or asking the organisational assistant to record that the document was rejected.
    • The log is not included in the pdf when the document isn't signed, so a separate pdf of the log itself must be created.
  • Inform the sender of any broken documents
    • The sender is responsible for acting on these
  • Signed documents are saved in Drive
    • In the case of personnel files the ED/COO are requested to save them due to access restrictions on the target directories in Drive.
    • It is yet to be decided if these should then be deleted manually from Scrive, which would make it easier to see which have been handled but at the cost of losing the events log.

@Jopparn any objections to this? Otherwise @Jenny_Brandt_WMSE can move forward with handling the documents in Scrive today and I'll update our notes on data retention.

Everything sounds reasonable to me. Please go ahead.

Suggested texts in this document

Asked @Historiker to take a look

@Historiker has had a look and considers the suggestions to be ok.

Great! Please move ahead to finalize this.

An update on this. When we were almost ready to implement this, we discovered that deleting the documents from the Scrive platform also means you can no longer use the links in the contracts to validate that a file is an actual signed copy (and not a cut-and-pasted pdf). This therefore increases the need for retaining fully signed documents in Scrive.

The suggested rules for the other document types can be kept, but for signed documents we probably want to subdivide these into categories: those which can be kept for only 90 days, those which have to be kept a while longer (because they e.g. cover still ongoing activities), and any which need to be retained indefinitely.

To get this ball moving again, I'm proposing a new setup whereby signed documents get saved permanently (to ensure they remain fully valid). Work on narrowing this down is handled in T341867: Create basic categories for Scrive documents and document how each of these should be archived.

I'll contact Scrive to set up the automated pruning of the non-signed documents.

This was requested 2023-07-14

I sent in an order form for the data retention policy today.

Note that there was some additional information about whether deleted documents are still verifiable. I'm requesting some additional clarification around this.

Note that even if it's possible to delete signed documents without affecting their validity, we would first need to offboard all currently stored documents before we could extend the data retention policy to these documents.

Data retention plan has now been activated. It affects all non-signed documents.

I have now received more information about verification of sealed documents (in the bottom of this document).
TL;DR:

  1. We don't need to store signed and sealed documents in the E-archive for them to be verifiable
  2. Storing them in the E-archive for an initial 40 days gives added verifiability possibilities
  3. The documents sent out as e-mail attachments may be modified by anti-virus scanners or e-mail clients, at which point they stop being verifiable. It is therefore always best to download the file straight off Scrive or test that the file verifies before storing it locally.
  4. We have a backlog of signed documents (oldest is 2020-07-02) that we need to manually offboard (see T341867 for where) before we can activate a retention plan for signed documents
  5. We need to have a fully established routine for following up on documents before we can activate a retention plan for signed documents.

Based on the above I'm moving forward with the following:

  • Set up a follow up task for activating a retention plan for signed documents once the above blockers are cleared. - T342522
  • Suggesting we set the retention time for signed documents in our internal policy to max 1 year (T256962#9037010) and that this pruning is done manually.
    • 1 year is chosen for practical reasons. It's reasonable to think we can get the backlog down to that by October and it gives us time to ensure routines are working before starting to affect non-backlog documents
    • 365 days is the longest time that is possible to set in a Scrive data retention policy, so once we are ready to activate that we should already be in compliance and can then have a think if we want to set it shorter.
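
As a rough illustration of the backlog check implied above, here is a minimal Python sketch that lists signed documents older than the 365-day limit. The document names, the `backlog` helper and the idea of working from an exported list of signing dates are all assumptions for illustration; nothing here uses a Scrive API.

```python
from datetime import date

# Longest period a Scrive data retention policy allows, per the comment above.
MAX_RETENTION_DAYS = 365


def backlog(signed_dates: dict[str, date], today: date) -> list[str]:
    """Return the names of signed documents older than the limit, oldest first.

    `signed_dates` maps a (hypothetical) document name to its signing date.
    """
    overdue = [
        (signed, name)
        for name, signed in signed_dates.items()
        if (today - signed).days > MAX_RETENTION_DAYS
    ]
    return [name for _, name in sorted(overdue)]
```

Sorting oldest first means the documents that have been waiting longest for offboarding are handled first.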

If I understand you, and the information page, correctly, all sealed documents are verifiable (via https://scrive.com/verify or https://verifier.guardtime.com/) regardless of whether they are kept in the E-archive and regardless of how much time has passed since signing.

Correct, they are verifiable through Guardtime no matter how much time has passed, as long as the Signed and Sealed copy is not tampered with (modified) in any way, which will of course break the seal/integrity of the file and make any future verification fail. However, the Scrive e-archive will contain the original, pure, untampered sealed document indefinitely, or until it is manually deleted.

Keeping the document in the E-archive for an initial 40 days grants the keyless seal but does otherwise not affect the ability of verifying the documents using https://scrive.com/verify or https://verifier.guardtime.com/. Is this understanding correct?

Correct, adding the Keyless Digital Signature will not affect Guardtime sealing/verification.
You can also read more on this topic in our Help Center on this page.

I also wanted to check if some mechanism in the verification process has changed in the last 2.5 years. The reason I'm wondering is that when I try to verify a document from early 2021 using the file that was e-mailed out from Scrive back then, it fails to verify. But if I download the same file fresh off the Scrive E-archive, it verifies. The two files are visually identical.

There should have been no such change, and usually this "phenomenon" is simply caused by the document (pdf) having been tampered with or modified in some way. Note that e.g. anti-virus software, archiving software etc. may sometimes modify files, which can lead to this (the file then not being verifiable).
You can manually check the Attachments in the PDF file to confirm that the Verification and Evidence Appendices are all there and that your stored file is indeed correct. However, any "invisible" change that alters the file will of course still break the seal.
Note from André: The document in question was attached to an e-mail which was forwarded to me by @Jopparn (2021-01-07). That version failed to verify. Comparing that file to the downloaded one using something like diffpdf shows them as identical but they do have different checksums (using e.g. sha256sum)
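
The checksum comparison mentioned in the note can also be done in Python with only the standard library, as an alternative to `sha256sum`. This is a minimal sketch; the function names are made up for illustration, and matching checksums only show that two files are byte-identical, not that they verify against Scrive or Guardtime.

```python
import hashlib


def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(path_a: str, path_b: str) -> bool:
    """Return True if both files have the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```

A call such as `same_bytes("emailed.pdf", "downloaded.pdf")` (hypothetical file names) returning `False` would indicate that the e-mailed copy was altered somewhere along the way, even if the two files look identical on screen.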

Lokal_Profil claimed this task.
Lokal_Profil moved this task from Soon to Done on the User-Jenny_Brandt_WMSE board.

Suggested routine was accepted