Yesterday I rebooted cache_text as part of T155401 and observed a few 503 spikes, even though the servers were depooled and the reboots were spaced well apart. Today I tracked down the issue: the errors are reproducible with a simple backend VCL reload performed with /usr/share/varnish/reload-vcl. Note that no actual VCL change is needed to reproduce the behavior; a noop reload also makes the varnish backend throw 503s for about one second.
The issue is known (https://github.com/varnishcache/varnish-cache/issues/2195) and is due to the dynamic backends introduced in 4.1. Before 4.1, different VCLs shared the same backends, including the health probes and their status. Starting with 4.1, each VCL has its own backends, meaning that a newly loaded VCL starts with all probes reset. Our probe settings are such that backends are considered sick upon startup and need one successful poll to become healthy.
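For illustration, a probe definition along these lines (hypothetical values, not our actual config) makes a backend start out sick, because .initial is below .threshold and one good poll is enough to cross it:

```vcl
# Illustrative only: with .initial below .threshold, a freshly
# loaded VCL considers this backend sick until one poll succeeds.
probe be_probe {
    .url = "/check";
    .interval = 1s;
    .timeout = 500ms;
    .window = 3;
    .threshold = 2;
    .initial = 1;    # 1 < 2: sick at load, healthy after one good poll
}

backend be1 {
    .host = "127.0.0.1";
    .port = "3128";
    .probe = be_probe;
}
```

Pre-4.1 this didn't matter on reload, since the new VCL inherited the old probe state instead of starting from .initial.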
This might shed some light on T154801 too: all backends being reset on reload (and thus returning 503s at the same time) could have triggered the vslp error that made varnishd crash.
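The probe-state change described above can be sketched as a toy model (plain Python, not Varnish code; names and structure are mine):

```python
# Toy model of probe state across VCL reloads. Pre-4.1, backends
# (and their probe state) are shared across VCLs, so a reload keeps
# the backend healthy. From 4.1 on, each VCL gets fresh backends
# whose probes start sick until one successful poll.

class Backend:
    def __init__(self, healthy=False):
        self.healthy = healthy  # our probe settings: sick at startup

    def poll_ok(self):
        self.healthy = True     # one successful poll marks it healthy


def reload_vcl(old_backend, shared_backends):
    """Return the backend the newly loaded VCL will use."""
    if shared_backends:          # pre-4.1 behaviour
        return old_backend       # probe state carried over
    return Backend()             # 4.1+: fresh backend, starts sick


be = Backend()
be.poll_ok()  # steady state before the reload: healthy

assert reload_vcl(be, shared_backends=True).healthy       # pre-4.1: no 503s
assert not reload_vcl(be, shared_backends=False).healthy  # 4.1+: ~1s of 503s
```

The window of 503s lasts until the new VCL's probes complete their first successful poll, which with our settings is roughly one probe interval.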