I want to schedule a cron job to update and publish a Jupyter notebook on notebook1003/1004 daily, but the current publishing solutions cannot publish automatically. As a workaround, we could set up something on the notebook hosts like /srv/published-datasets on the stat* boxes, so that HTML files generated from Jupyter notebooks can be copied to this directory and then shared with the public. This would make automatic publishing from the same notebook host possible, because no password is required.
Description
Details
Subject | Repo | Branch | Lines +/- |
---|---|---|---|
Sync /srv/published-datasets from SWAP hosts | operations/puppet | production | +119 -71 |
Event Timeline
Change 494501 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Sync /srv/published-datasets from SWAP hosts
Change 494501 merged by Ottomata:
[operations/puppet@production] Sync /srv/published-datasets from SWAP hosts
@chelsyx you should now be able to put files in /srv/published-datasets on notebook hosts. Files there will eventually show up at https://analytics.wikimedia.org/datasets. You should be able to manually run the sync by running published-datasets-sync on a notebook (or stat) host.
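For illustration, here is a minimal sketch of what a daily job could do under this setup. It assumes `jupyter nbconvert` is available on the notebook host. The function names, the notebook path, and the `report` subdirectory are hypothetical placeholders, not anything confirmed in this task.

```shell
#!/bin/bash
# Sketch only: render a notebook to HTML and copy it into the published directory.
# All paths in the usage example at the bottom are hypothetical placeholders.
set -euo pipefail

# Re-execute the notebook and write the resulting HTML into a scratch directory.
render_notebook() {
    local notebook="$1" out_dir="$2"
    jupyter nbconvert --to html --execute "$notebook" --output-dir "$out_dir"
}

# Copy every rendered HTML file from the scratch directory to the published one.
publish_html() {
    local src_dir="$1" dest_dir="$2"
    mkdir -p "$dest_dir"
    cp "$src_dir"/*.html "$dest_dir"/
}

# Example usage (not run here):
#   render_notebook "$HOME/report.ipynb" /tmp/report-html
#   publish_html /tmp/report-html /srv/published-datasets/report
```

Once the files are in /srv/published-datasets, the existing sync should pick them up, so the job itself would not need to handle any credentials.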
Awesome! @chelsyx, would you be so kind as to document this here: https://wikitech.wikimedia.org/wiki/SWAP#Sharing_Notebooks?
Hi @Ottomata, published-datasets-sync works well when I (user chelsyx) execute it. But when I execute it via a cron job, I get "published-datasets-sync: command not found". Do you have any idea why this is happening?
It is in /usr/local/bin, perhaps your PATH doesn't contain it!
But! You shouldn't need to execute it in a cron, it already runs every 15 minutes in another cron.
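The PATH explanation above can be reproduced with a small sketch. It uses a stub executable in a temporary directory rather than the real published-datasets-sync, and it assumes the job runs with a short PATH such as /usr/bin:/bin, which is a common cron default but not something verified on these hosts. The names `fake-sync` and `lookup` are illustrative only.

```shell
#!/bin/bash
# Sketch: reproduce a "command not found" caused by a short PATH, using a stub.
# The stub name and temporary directory are illustrative only.

# Create a throwaway directory holding a fake executable.
stub_dir=$(mktemp -d)
printf '#!/bin/sh\necho synced\n' > "$stub_dir/fake-sync"
chmod +x "$stub_dir/fake-sync"

# Look up a command under an explicit PATH, the way a job started with a
# minimal environment would see it.
lookup() {
    local path_value="$1" name="$2"
    env -i PATH="$path_value" /bin/sh -c 'command -v "$1"' sh "$name"
}

# With a short, cron-like PATH the stub is not found...
lookup "/usr/bin:/bin" fake-sync || echo "fake-sync: not found on /usr/bin:/bin"

# ...but it resolves once its directory is on PATH.
lookup "$stub_dir:/usr/bin:/bin" fake-sync
```

In a crontab, the equivalent fix would be either a `PATH=/usr/local/bin:/usr/bin:/bin` line above the job entries or calling /usr/local/bin/published-datasets-sync by its full path. As noted above, though, the sync already runs every 15 minutes, so a daily job only needs to write its files into /srv/published-datasets.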