startingDeadlineSeconds is a CronJob setting that defines how long after the scheduled time the controller may still start a run (see Deadline for delayed Job start - Kubernetes documentation).
This affects key aspects of running CronJobs:
Setting concurrencyPolicy to Forbid
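Without startingDeadlineSeconds, every scheduled run that is skipped because the previous Job is still running counts as a missed schedule, and the controller counts missed schedules from the last scheduled time until now. According to the Kubernetes documentation, once more than 100 schedules have been missed, the controller stops starting Jobs for that CronJob and logs an error. A Job that overruns by more than 100 schedule intervals therefore blocks all further runs of its CronJob.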
If startingDeadlineSeconds is set, the controller only counts the schedules missed within the last startingDeadlineSeconds seconds, rather than everything missed since the last scheduled time.
If the controller can't start the Job within startingDeadlineSeconds, it skips that execution. I am unsure whether it creates a Job and marks it as failed, or only records a failed execution at the CronJob level (my gut says the latter, but we need to verify that).
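The counting rule described above can be sketched as follows. This is a simplified model of the behaviour described in the Kubernetes documentation, not the controller's actual code: it assumes a fixed interval instead of a cron expression, and the names missed_schedules and gives_up are made up for illustration.

```python
from datetime import datetime, timedelta

MAX_MISSED = 100  # threshold quoted in the Kubernetes documentation


def missed_schedules(last_schedule, now, interval, starting_deadline_seconds=None):
    """Count schedule times strictly before `now` inside the inspected window.

    Assumption: the schedule is a fixed interval starting at `last_schedule`.
    """
    window_start = last_schedule
    if starting_deadline_seconds is not None:
        # With a deadline, only the last `starting_deadline_seconds` are inspected.
        deadline_start = now - timedelta(seconds=starting_deadline_seconds)
        window_start = max(last_schedule, deadline_start)

    missed = 0
    t = last_schedule + interval
    while t < now:
        if t > window_start:
            missed += 1
        t += interval
    return missed


def gives_up(last_schedule, now, interval, starting_deadline_seconds=None):
    """True if the modelled controller would refuse to start the Job."""
    return missed_schedules(last_schedule, now, interval, starting_deadline_seconds) > MAX_MISSED


# Every-minute CronJob, nothing scheduled between 08:29 and 10:21.
last = datetime(2025, 1, 1, 8, 29)
now = datetime(2025, 1, 1, 10, 21)
every_minute = timedelta(minutes=1)

print(missed_schedules(last, now, every_minute))       # 111
print(gives_up(last, now, every_minute))               # True
print(missed_schedules(last, now, every_minute, 200))  # 3
print(gives_up(last, now, every_minute, 200))          # False
```

In this model, a 200 second deadline keeps the count far below the limit even after a long gap, which is why setting startingDeadlineSeconds matters when concurrencyPolicy is Forbid.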
Suspending CronJobs without setting startingDeadlineSeconds
From Schedule suspension - Kubernetes documentation:
Executions that are suspended during their scheduled time count as missed Jobs. When .spec.suspend changes from true to false on an existing CronJob without a starting deadline, the missed Jobs are scheduled immediately.
The documentation does not make clear whether this means that all the Jobs missed during the suspension are scheduled, or only the most recent one. If startingDeadlineSeconds is set, the controller will presumably have skipped the earlier executions, and will only start a Job immediately if the current time is still within the (last scheduled start time + startingDeadlineSeconds) window.
A related consequence: if a CronJob without startingDeadlineSeconds is suspended for longer than 100 times its schedule interval, it will not resume by itself and will only restart if the CronJob object is deleted and recreated.
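To make the resume cases concrete, here is a rough sketch of the reasoning above. It encodes our current understanding, including the parts marked "presumably", so it needs to be checked against a real cluster. The function resume_outcome and its return strings are hypothetical, the schedule is assumed to be a fixed interval, and the 100-miss check inside the deadline window is left out for brevity.

```python
from datetime import datetime, timedelta

MAX_MISSED = 100  # threshold quoted in the Kubernetes documentation


def resume_outcome(last_schedule, resumed_at, interval, starting_deadline_seconds=None):
    """Guess what happens when .spec.suspend goes from true to false."""
    # Schedule times that fell at or before the moment of resuming.
    missed = (resumed_at - last_schedule) // interval
    if missed == 0:
        return "nothing missed"

    if starting_deadline_seconds is None:
        # No deadline: everything since the last scheduled time is counted.
        return "blocked" if missed > MAX_MISSED else "start now"

    # With a deadline: only the most recent missed time can still be started.
    most_recent = last_schedule + missed * interval
    late_by = resumed_at - most_recent
    if late_by <= timedelta(seconds=starting_deadline_seconds):
        return "start now"
    return "wait for next schedule"


hourly = timedelta(hours=1)
last = datetime(2025, 1, 1, 0, 0)

# Suspended for a bit over 3 days: 72 missed runs, under the limit.
print(resume_outcome(last, datetime(2025, 1, 4, 0, 10), hourly))       # start now
# Suspended for a bit over 5 days: 120 missed runs, over the limit.
print(resume_outcome(last, datetime(2025, 1, 6, 0, 10), hourly))       # blocked
# Same suspension with a 15 minute deadline, resumed 10 minutes late.
print(resume_outcome(last, datetime(2025, 1, 6, 0, 10), hourly, 900))  # start now
# Same suspension with a 5 minute deadline.
print(resume_outcome(last, datetime(2025, 1, 6, 0, 10), hourly, 300))  # wait for next schedule
```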
Current situation
Currently, we do not set startingDeadlineSeconds, and the default concurrencyPolicy is Replace.
With these settings, a Job that overruns its schedule interval does not complete: it gets replaced by a new instance (see, for instance, T394018: Link Recommendation Task pool data missing for some wikis).
If the Job doesn't do checkpointing, it will presumably never complete.
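The effect of Replace on an overrunning Job can be illustrated with a toy model. It assumes each run is killed after exactly one schedule interval and that a checkpointing Job resumes exactly where the previous run stopped. The function runs_until_done and its parameters are invented for this illustration.

```python
def runs_until_done(work_seconds, interval_seconds, checkpointing, max_runs=100):
    """Return how many runs the work needs under Replace, or None if it never finishes.

    Each run gets at most `interval_seconds` before the next run replaces it.
    """
    progress = 0
    for run in range(1, max_runs + 1):
        if not checkpointing:
            # Without checkpoints, every replacement starts from scratch.
            progress = 0
        progress += interval_seconds
        if progress >= work_seconds:
            return run
    return None


# 90 minutes of work on an hourly schedule.
print(runs_until_done(5400, 3600, checkpointing=False))  # None
print(runs_until_done(5400, 3600, checkpointing=True))   # 2
# 30 minutes of work fits in a single run either way.
print(runs_until_done(1800, 3600, checkpointing=False))  # 1
```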
T394409: Add a way to suspend CronJobs allows suspending a job regardless of whether startingDeadlineSeconds is set.