Consider CiviCRM Log Export extension
Open, Needs Triage, Public

Description

https://github.com/australiangreens/au.org.greens.logexport

Does this cover all the kinds of exports that we would be interested in keeping an eye on?

Use case: Sometimes during staff departures we want to double-check that the departing user has not made any unauthorized export of donor data.

Event Timeline

AKanji-WMF added subscribers: Dwisehaupt, AKanji-WMF.

This looks valuable. What would be the lift to install and maintain it? @Dwisehaupt, is there anything that we currently use to monitor and log exports? Are there storage or data retention implications we need to consider?

@AKanji-WMF Easy to install and maintain if we use it as is. This is a very simple extension, really just 7 lines of code. We would need to think about whether our current log retention policies cover our needs here, but I suspect they do. The extension includes all the ids in the log file, which could be large in our case. We could change it to offer an option to include only a row count, as we probably don't care about the specific ids included in the export, either with a setting in the extension or just by implementing the hook in wmf_civicrm ourselves instead. Even with ids included, we don't consider them PII, so that is not an issue from that angle.

However, this only covers traditional exports, not downloads of data from SearchKit (SK). We could set this up now as a start, but in the end we'd want to cover those SK exports as well. To do that, we'd need to add a hook to core for SK downloads (likely around here) and then add support for SK downloads to the extension. Still not a huge lift, but a little bit more work.

@AKanji-WMF We have reviewed access logs, including the request string and sizing, but that is hampered by many requests being POSTs, which don't carry fine-grained detail. This kind of post-request analysis is harder to do, and something within the application would be best.

As for sizing, that depends on the volume of requests logged, the approximate average size, and the retention timeframe. It's tough to estimate without this info, but if we could get some samples over a day or a week, that would give us something to work from.
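To make that dependency concrete, here is a rough back-of-the-envelope sketch in Python. The function name and every number in it are placeholders for illustration, not measurements from our logs; real inputs would have to come from the day/week samples.

```python
def estimate_log_storage_bytes(exports_per_day, avg_entry_bytes, retention_days):
    """Rough storage needed: volume x average entry size x retention window."""
    return exports_per_day * avg_entry_bytes * retention_days


# Placeholder inputs only (assumptions, not observed values).
total = estimate_log_storage_bytes(
    exports_per_day=50, avg_entry_bytes=20_000, retention_days=90
)
print(f"{total / 1_000_000:.0f} MB")  # prints "90 MB"
```

Swapping in sampled values for the three arguments would give a first-order figure to compare against current log capacity.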

@greg Would we want to store a list of all the entity ids exported (which could make for very long log lines), or would just the number of entities exported be enough?

Hmmm, thinking about this through the lens of the use case I put in the description, I'm not sure!

Is there a way to selectively log the entity ids if the export is smaller than some size? My thinking is: big exports happening in that use case are by nature deserving of investigation (someone calling them to find out why it was done), but smaller exports of more sensitive donors are also deserving of investigation.
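A minimal sketch of what that selective logging could look like, written in Python purely to show the logic. It is not the extension's code, and the function name, field names, and threshold value are all hypothetical.

```python
import json


def build_export_log_entry(user_id, entity, ids, id_threshold=1000):
    """Always record the row count; attach the ids only for small exports.

    id_threshold is a hypothetical cutoff, not a value anyone has agreed on.
    """
    ids = list(ids)
    entry = {"user_id": user_id, "entity": entity, "row_count": len(ids)}
    if len(ids) <= id_threshold:
        entry["ids"] = ids
    return entry


# A small export keeps its ids; a large one is reduced to a count.
print(json.dumps(build_export_log_entry(42, "Contact", [101, 102, 103])))
print(json.dumps(build_export_log_entry(42, "Contact", range(5000))))
```

The row count is kept in every entry so that large exports still stand out, while the threshold only controls whether the id list is written alongside it.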

That's just my first pass thinking though.

But in the worst-case scenario (a big export by disgruntled staff), we might have legal requirements to notify affected individuals, so we might want the list in that case as well!

We should consider adding a message saying that all exports are logged. That might head it off at the pass.

Sounds like we should try logging all the ids and see where we end up. I don't have a good sense of how often we have big exports happening, but we should definitely expect some lists of ids that would extend into a few MB for an individual export.
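For a sense of scale, here is a quick Python sketch of how large a comma-separated id list gets. The export size and id range are made-up assumptions, not figures from our data.

```python
def id_list_log_bytes(ids):
    """Bytes needed to write the ids as one comma-separated ASCII string."""
    return len(",".join(str(i) for i in ids).encode("ascii"))


# Hypothetical export: 500,000 contacts with 7-digit ids.
size = id_list_log_bytes(range(1_000_000, 1_500_000))
print(size)  # prints 3999999, i.e. roughly 4 MB for this one log line
```

Under those assumptions a single export of half a million rows lands at about 4 MB, which is consistent with the "few MB" expectation.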

@greg How soon do you want us to investigate this? In the next few sprints, this FY, or next FY?

No rush. We can totally park it for later. The use case has been long-standing and intermittent, so there is no immediate driving force, just Something We Should Do(TM).