Similar to T305662, to get the most confidence out of our visual regression tests, our database should contain a subset of actual data from a production database, including articles, templates, and gadgets.
Back in 2015, great work was done in T120345 for visual regression testing, which generated XML dumps from a number of wikis: https://dumps.wikimedia.org/other/testfiles/20160405/ . That work may be relevant to our efforts to set up a production-like environment for our visual regression tests.
=== Question we are trying to answer
- Meet with @ssastry to determine how feasible it is to use the dump files from 2015: do they contain all the article text, templates, etc. needed to show a production-like page when imported into a MariaDB database? Compare this approach against using the content provider (or perhaps some other strategy).
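
As a possible first step before the meeting, a rough check of what a dump file contains could look like the sketch below. It assumes the files follow the standard MediaWiki XML export format (`<page>` elements with an `<ns>` child) and uses the standard namespace numbers (0 for articles, 8 for MediaWiki pages where gadget code usually lives, 10 for templates, 828 for modules). The function name and the inline sample are illustrative only and are not taken from T120345.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Standard MediaWiki namespace numbers relevant to rendering a page.
NAMESPACES = {0: "article", 8: "MediaWiki (incl. gadgets)", 10: "template", 828: "module"}


def local(tag):
    """Drop the XML namespace prefix, which varies with the export schema version."""
    return tag.rsplit("}", 1)[-1]


def count_pages_by_namespace(source):
    """Stream a MediaWiki XML export and count <page> elements per namespace number."""
    counts = Counter()
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local(elem.tag) == "page":
            for child in elem:
                if local(child.tag) == "ns":
                    counts[int(child.text)] += 1
                    break
            elem.clear()  # release the finished page to keep memory flat on large dumps
    return counts


# Tiny hypothetical sample standing in for a real dump file.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>Example</title><ns>0</ns></page>
  <page><title>Template:Infobox</title><ns>10</ns></page>
  <page><title>Template:Cite</title><ns>10</ns></page>
</mediawiki>"""

counts = count_pages_by_namespace(io.BytesIO(SAMPLE))
for ns, label in NAMESPACES.items():
    print(f"{label}: {counts.get(ns, 0)}")
```

To run this against a real file, pass an opened (and, if needed, decompressed) dump file object instead of the sample. A zero count for templates or modules would suggest the dump alone cannot render production-like pages, which is worth raising when comparing it against the content provider approach.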