We experimented with vagrant-spec for functional testing of our plugin, but it proved a bit too unstable (it is marked as highly experimental, after all). A combination of low-level, atomic unit tests (see T76627) for the plugin's core classes/modules and a select set of high-level acceptance tests written with Cucumber may actually yield better coverage with less fragility.
Ideally, we'd be able to run the acceptance tests as a gate-and-submit job.