Paste P13011

ezachte's crontab

Authored by elukey on Oct 16 2020, 5:59 AM.
****ezachte*****
# For more information see the manual pages of crontab(5) and cron(8)
# in December/January: have mediacounts file productionized, or ask for a manual mkdir/chmod of ../2018; see Phabricator T173724
# Q&D overview of scripts and html files, August 2017, to be further documented on meta (now in crontab, as this will be the file to be examined when I'm hit by a bus)
# cgi-bin scripts
# http://www.infodisiac.com/cgi-bin/WikimediaDownload.pl Concise version of the Wikimedia database dump service report (internal use)
# https://stats.wikimedia.org/cgi-bin/search_portal.pl?search=views Search Wikistats portal
#
# Visualizations:
# https://stats.wikimedia.org/wikimedia/animations/growth/index.html Animated growth figures per Wikimedia project
# https://stats.wikimedia.org/wikimedia/animations/requests/ Wikipedia edits for a normal day in May 2011
# https://stats.wikimedia.org/wikimedia/animations/pageviews/wivivi.html ~/wikistats/traffic
#
# Manually started scripts
# http://infodisiac.com/Wikimedia/Visualizations/ ~/wikistats/image_sets ????? explore
# https://stats.wikimedia.org/wikimedia/squids/SquidReportsCountriesLanguagesVisitsEdits.htm
# ~/wikistats/squids
# https://dumps.wikimedia.org/other/media/ WLA/WLE/WLM contest winners ~/wikistats/image_sets
#
# plain html
# https://stats.wikimedia.org/ ~/wikistats/portal
# http://infodisiac.com/Wikimedia/Visualizations/ ~/wikistats/viz_gallery
MAILTO=ezachte@wikimedia.org
SHELL="/bin/sh"
LANG='C'
LC_CTYPE="C"
LC_MESSAGES="C"
LC_ALL="C"
WIKISTATS_SCRIPTS=/home/ezachte/wikistats
WIKISTATS_DATA=/home/ezachte/wikistats_data
WIKISTATS_BACKUP=/home/ezachte/wikistats_backup
# m h dom mon dow command
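# example: an entry like "15 1,7,13,19 * * *" fires at minute 15 of hours 1, 7, 13 and 19 every day,
# and "0 4 * * 1" would fire at 04:00 on Mondays only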
# === ACTIVE AFTER MOVE stat1002 -> stat1005 ===
# don't lose all of this after next server re-install (again)
0 0 * * * nice /home/ezachte/wikistats/backup/backup_sysfiles.sh
10 1 * * * nice /home/ezachte/wikistats/backup/backup_portal.sh >/dev/null 2>&1
20 1 * * * nice /home/ezachte/wikistats/backup/backup_scripts.sh >/dev/null 2>&1
# collect newest projectcounts files (hourly page view stats per wiki), add to tar, and publish
# using DammitSyncProjectCounts.pl + update WikiCountsJobProgress[Current].html (see above)
0 2 * * * nice /home/ezachte/wikistats/dammit.lt/bash/dammit_sync.sh >/dev/null 2>&1
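# (illustrative sketch only, not the actual DammitSyncProjectCounts.pl logic; the source URL and tar name are assumptions:
#    rsync -t rsync://dumps.wikimedia.org/pagecounts-all-sites/2020/2020-10/projectcounts-* "$WIKISTATS_DATA/dammit.lt/"
#    tar -rf "$WIKISTATS_DATA/dammit.lt/projectcounts-2020.tar" "$WIKISTATS_DATA"/dammit.lt/projectcounts-202010*
#  i.e. pull the newest hourly files, append them to the running tar, then publish the tar)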
# generate view counts per day/week/month, based on project[counts|views]-yyyymmdd-hhnnss files
# see https://stats.wikimedia.org/EN/TablesPageViewsSitemap.htm
0 3 * * * nice /home/ezachte/wikistats/dammit.lt/bash/dammit_projectviews_monthly.sh >/dev/null 2>&1
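# e.g. input files for the job above look like projectcounts-20201016-130000 and projectviews-20201016-130000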
# compact hourly pagerequest files into a daily file, then once a month into a monthly file; publish output
# the daily script invokes the monthly script, which only actually runs once a month (see sketch below)
0 4 * * * nice /home/ezachte/wikistats/dammit.lt/bash/dammit_compact_daily.sh >/dev/null 2>&1
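# (illustrative sketch of that daily -> monthly hand-off; the monthly script name and date test are assumptions:
#    [ "$(date +%d)" = "01" ] && nice "$WIKISTATS_SCRIPTS/dammit.lt/bash/dammit_compact_monthly.sh"
#  i.e. the monthly compaction only actually runs on the first day of the month)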
# update mail stats (several times per day, for moderators)
15 1,7,13,19 * * * nice /home/ezachte/wikistats/mail-lists/bash/report_mail_lists_counts.sh >/dev/null 2>&1
# daily minor maintenance tasks for scripts that process dumps
# sort dblists (one for each project), smallest wikis first
10 13 * * * /home/ezachte/wikistats/dumps/bash/sort_dblists.sh >/dev/null 2>&1
# wipe temporary files from the Wikistats count job after x days
10 23 * * * /home/ezachte/wikistats/dumps/bash/wipe_counts_temp.sh >/dev/null 2>&1
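# (illustrative equivalent; the temp dir and the 14-day cutoff are assumptions:
#    find "$WIKISTATS_DATA/dumps/tmp" -type f -mtime +14 -delete)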
# for each wiki query API to find which namespaces are deemed content (and hence countable)
20 13 * * * /home/ezachte/wikistats/dumps/bash/collect_countable_namespaces.sh >/dev/null 2>&1
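# e.g. per wiki this boils down to a siteinfo query roughly like
#    https://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=namespaces&format=json
# where namespaces flagged as content carry a "content" attribute in the reply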
# note: the main jobs to process dumps are run manually
# /home/ezachte/wikistats/dumps/bash/count_report_publish.sh wp # Wikipedia project
# /home/ezachte/wikistats/dumps/bash/count_report_publish.sh nonwp # all other projects
# after counts have been collected, English reports are generated and written to the ../draft/.. location
# after manual vetting, reports are written in all supported languages to the final location
# with /home/ezachte/wikistats/dumps/bash/report_all.sh
# produce daily zip file with top 1000 viewed media files (one csv file per column)
# for public download https://dumps.wikimedia.org/other/mediacounts/daily/
5 14 * * * nice /home/ezachte/wikistats/mediacounts/bash/mediacounts_rankings_add_days.sh >/dev/null 2>&1
# prep http://stats.wikimedia.org/WikiCountsJobProgress[Current].html
# refresh often
0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,48,51,54,57 * * * * /home/ezachte/wikistats/dumps/bash/progress_wikistats.sh >/dev/null 2>&1
# refresh less often
# 0,15,30,45 * * * * /home/ezachte/wikistats/dumps/bash/progress_wikistats.sh >/dev/null 2>&1
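# (note: with Vixie cron the two minute lists above can also be written as */3 and */15)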
# === NOT YET ACTIVE AFTER MOVE stat1002 -> stat1005 ===
# backup scripts and data
#- 0 3 * * * /home/ezachte/wikistats/dumps/bash/backup_daily.sh >/dev/null
#- 0 4 * * 1 /home/ezachte/wikistats/dumps/bash/backup_weekly.sh >/dev/null
#- 0 5 7 * * /home/ezachte/wikistats/dumps/bash/backup_monthly.sh >/dev/null
#* * * * * /a/wikistats_git/squids/bash/backup.sh >> /a/wikistats_git/squids/bash/backup.log
# === NOT YET ACTIVE AFTER MOVE stat1002 -> stat1005 / NOT SO URGENT ===
# generate daily updates for WLA contest (and later WLM, WLE)
# 0 * * * * /home/ezachte/wikistats/dammit.lt/bash/dammit_list_articles_WLA.sh >/dev/null 2>&1 # till end of November (contest ends Nov 30)
# weekly scripts
#- 0 5 * * 1 perl /a/wikistats_git/dumps/perl/WikiCountsRankPageHistory.pl >/dev/null
# monthly update animations at http://stats.wikimedia.org/wikimedia/animations/growth/AnimationProjectsGrowth??.html (?? = Wb Wk Wn Wo Wp Wq Ws Wv Wx)
#- 0 7 28 * * /a/wikistats_git/dumps/bash/count_prep_animations.sh
#- 0 8 1 * * /a/wikistats_git/dumps/bash/merge_editors_all_wikis.sh
# collect stats on wikistats site itself
#30 4 * * * hive < /home/ezachte/wikistats.hql| sed 's/,/;/g' | sed 's/\t/,/g' > /home/ezachte/wikistats-hql.csv
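# (the sed pipeline above turns hive's tab-separated output into csv; existing commas are changed to semicolons first so they don't collide with the csv delimiter)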
# hourly scripts
#- 0 0,2,4,6,8,10,12,14 * * * /home/ezachte/wikistats/traffic/bash/download_scripts.sh # copy zip file with newest scripts and data from thorium # now at every even hour during likely working hours, CE(S)T
# get dummy page from Wikipedia; log timestamp, duration and result
#- * * * * * perl /home/ezachte/wikistats/squids/perl/SquidMonitorTraffic.pl >/dev/null 2>&1
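# (illustrative shell equivalent; probe URL and log path are assumptions:
#    curl -so /dev/null -w "$(date -u +%FT%TZ) %{http_code} %{time_total}\n" https://en.wikipedia.org/wiki/Main_Page >> "$HOME/squid_probe.log")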
# === NO LONGER ACTIVE ===
# daily run of wikistats for staff wikis
# 0 7 * * * /home/ezachte/wikistats/dumps/bash/count_report_publish_wmf.sh >/dev/null 2>&1
# === DEPRECATED ===
# generate input for Limn, deprecated - https://analytics.wikimedia.org/dashboards/reportcard/
# 0 6 * * 1 /home/wikistats/analytics/bash/analytics_upd.sh >/dev/null # see ../analytics/read.me
# 0 11 * * * /home/ezachte/wikistats/dammit.lt/bash/dammit_project_medicin_progress_report.sh