(@JScherer-WMF please add a relevant illustration to this task)
Per T402964, the Visual Table of Contents component will switch to a 2-column layout on larger viewports (above 640px wide). This grid is not a series of equal-height rows. Instead, the design calls for 2 columns in which variable-height items are stacked in an alternating pattern. Breaking out of a strict boxy pattern is the design goal, and we are explicitly embracing the fact that items will vary in height depending on how much text accompanies each image.
To achieve this effect, we probably want to rely on CSS Grid to define a grid with two vertical "tracks" into which items of varying height can be placed. This may be possible with Flexbox as well. Ideally the solution is achievable in pure CSS.
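
As a starting point for discussion, here is a minimal sketch of the two-track grid idea. The class name `.visual-toc`, the `16px` gap, and the exact form of the breakpoint (`min-width: 640px`, which includes 640px itself) are assumptions for illustration, not taken from the design or the existing component.

```css
/* Hypothetical selector; the real component's class names may differ. */
.visual-toc {
	display: grid;
	/* Single column on narrow viewports. */
	grid-template-columns: 1fr;
	/* Assumed spacing, not specified in this task. */
	gap: 16px;
}

/* Assumes the 2-column layout begins at 640px; adjust if the
   breakpoint is meant to be strictly above 640px. */
@media ( min-width: 640px ) {
	.visual-toc {
		/* Two vertical tracks of equal width. */
		grid-template-columns: repeat( 2, minmax( 0, 1fr ) );
		/* Let each item keep its own height instead of
		   stretching to fill its grid row. */
		align-items: start;
	}
}
```

One caveat with this sketch: each pair of items still shares a grid row, so a short item leaves empty space beneath it until the next row starts. If the design needs items packed tightly within each column, this alone will not be enough. A CSS multi-column layout (`column-count: 2` on the container, `break-inside: avoid` on the items) packs items tightly, but it fills the first column before the second rather than alternating, so it would change the visual order.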