As a data scientist, I would like to have a way of sharing Jupyter notebooks containing confidential data with members of my team and others in the Foundation.
There is currently no good way to do this:
- Google Colab isn't part of Google Workspace (formerly G Suite)
- Google Drive doesn't support/render ipynb files
- Exporting notebook to PDF via JupyterLab's interface produces horrible output
- Exporting notebook to HTML doesn't work since Google Drive doesn't support HTML files
- Exporting notebook to HTML and converting to PDF via web browser produces bad output (although not quite as bad as direct to PDF via Jupyter)
- Uploading notebook to a private GitHub repo involves cumbersome permission management and adding users as collaborators
Since we already have a system in place for gated access (superset.wikimedia.org requires logging in through Wikimedia Developer SSO and membership in 'wmf' or 'nda' groups), we could have a gated and internally-hosted nbviewer service.
Perhaps the stat100X hosts could have a directory similar to /srv/published, to which these notebooks can be manually copied or rsynced. Instead of becoming publicly accessible via analytics.wikimedia.org/published/, they would be accessible via a (gated) nbviewer-internal.wikimedia.org (for example).
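To make the proposal more concrete, here is a minimal sketch of the two pieces described above: the manual copy step and the group-based gate. It is only an illustration under stated assumptions. The function names `publish_notebooks` and `may_view` are hypothetical, any target directory passed in is a placeholder (no such internal directory exists yet), and the real gate would be enforced by the SSO layer rather than by application code like this.

```python
import shutil
from pathlib import Path

# Groups named in this task as having gated access (as with Superset).
ALLOWED_GROUPS = {"wmf", "nda"}


def may_view(user_groups):
    """Return True if the user belongs to at least one allowed group."""
    return bool(ALLOWED_GROUPS & set(user_groups))


def publish_notebooks(source_dir, internal_dir):
    """Copy every .ipynb file under source_dir into internal_dir.

    Relative paths are preserved, Jupyter checkpoint copies are skipped,
    and the list of destination paths is returned.
    Both directories are hypothetical placeholders for this sketch.
    """
    source = Path(source_dir)
    target = Path(internal_dir)
    copied = []
    for notebook in sorted(source.rglob("*.ipynb")):
        relative = notebook.relative_to(source)
        # Skip autosave copies that Jupyter keeps in .ipynb_checkpoints.
        if ".ipynb_checkpoints" in relative.parts:
            continue
        destination = target / relative
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(notebook, destination)
        copied.append(destination)
    return copied
```

In practice the copy step could just as well stay a plain rsync; the sketch only shows that publishing would work the same way as for /srv/published, with the access check happening in front of the viewer.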
Alternative, interim solutions
- people.wikimedia.org (see https://wikitech.wikimedia.org/wiki/People.wikimedia.org#How_To_add_SSO_Authentication)
- Google Sites (see https://www.ericekholm.com/posts/2021-04-26-publishing-rmarkdown-to-google-sites/)
And as Martin wrote in T290693#7344878:
The Google Site works as well, but it can only be used within the Foundation (for instance, you wouldn't be able to share information with NDA'ed volunteers or WMDE staff).