It would be good to have our security response to a few common incident scenarios documented, so we don't always have to figure things out on the fly. If this is already documented somewhere, awesome, I just haven't found it yet. My guess is that it's mostly a matter of compiling what we already have documented in different places into one place.
Specific scenarios:
- Compromised user account - Where are all of the places we should check to revoke access? What is the process to kill all current SSH sessions for someone with shell access? If someone has access to root passwords, what is the process for changing those passwords? If someone had access to the private mediawiki repo, what's the process for rotating all of the secrets stored there?
- Compromised server - What tools should we use for taking full memory dumps and getting a forensic copy of the data? Where can we store forensic images safely for analysis? What are the options, and what is the process, for isolating the server on the network between the time we suspect a compromise and the time the disk and memory dumps have been saved? Where should we document the incident timeline during an ongoing investigation?
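As a possible starting point for the SSH-session question in the compromised-account scenario, here is a minimal Python sketch. It is not our documented process, just an illustration of the kind of step the runbook could spell out. It assumes a Linux host where `who` lists login sessions and `pkill` is available; the function names `sessions_for_user` and `kill_command` are hypothetical helpers made up for this example. The account would still need to be locked and its keys revoked first, otherwise the user can simply log back in.

```python
def sessions_for_user(who_output: str, username: str) -> list[str]:
    """Return the terminals (e.g. 'pts/0') of login sessions owned by username.

    Expects text in the format printed by `who`: the login name is the
    first whitespace-separated field and the terminal is the second.
    """
    ttys = []
    for line in who_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == username:
            ttys.append(fields[1])
    return ttys


def kill_command(username: str) -> list[str]:
    """Build (but do not run) a command that kills the user's processes.

    Note: `pkill -KILL -u <user>` signals every process owned by the user,
    not only their SSH sessions, so it is deliberately blunt.
    """
    return ["pkill", "-KILL", "-u", username]


# Example with made-up `who` output (hypothetical users and addresses).
sample = (
    "alice    pts/0        2024-01-01 10:00 (10.0.0.1)\n"
    "bob      pts/1        2024-01-01 10:05 (10.0.0.2)\n"
    "alice    pts/2        2024-01-01 10:07 (10.0.0.1)\n"
)
print(sessions_for_user(sample, "alice"))
print(" ".join(kill_command("alice")))
```

On a real host the `who` text would come from something like `subprocess.run(["who"], capture_output=True, text=True).stdout`, and the command would be run as root on every server where the user has shell access. Keeping the command construction separate from execution makes it easy to review before anything is killed.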
Once this is documented, I'd like to run a fire drill to test the response process sometime in Q3.