To set appropriate resource limits for the Parsoid parser, we first need to understand how Parsoid's CPU and memory usage scale with various wikitext measures.
The first step is probably to construct synthetic benchmarks — plain text, lists, tables, figures, etc, each at a range of sizes — so we can measure how resource usage scales with each construct and determine where to set limits. These would give us 'clean' scaling numbers.
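A minimal sketch of such a benchmark generator is below. The specific wikitext shapes and the size sweep are illustrative assumptions, not an existing Parsoid benchmark suite; in a real run each generated input would be fed to Parsoid while recording CPU time and peak heap.

```python
def synthetic_wikitext(kind, n):
    """Generate synthetic wikitext of one construct `kind` at size `n`.
    (Illustrative shapes only — a real suite would cover figures,
    templates, nesting depth, etc.)"""
    if kind == "text":
        return "lorem ipsum " * n
    if kind == "list":
        return "\n".join("* item %d" % i for i in range(n))
    if kind == "table":
        rows = "\n".join("|-\n| cell %d || cell %d" % (i, i) for i in range(n))
        return "{|\n%s\n|}" % rows
    raise ValueError("unknown kind: %r" % kind)

# Sweep each construct over several sizes; here we just report input
# size, where a real benchmark would time Parsoid on each input.
for kind in ("text", "list", "table"):
    for n in (10, 100, 1000):
        wt = synthetic_wikitext(kind, n)
        print(kind, n, len(wt))
```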
Alternatively, we could start collecting statistics on real-world wikitext and run a "big data" numerical analysis on the 'dirty' data to determine (a) which properties of input wikitext are strong predictors of CPU time and memory usage, and (b) what the typical ranges of those properties are for existing articles.
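The analysis could start with something as simple as extracting candidate features from each article and ranking them by correlation with observed resource usage. The feature regexes below are rough approximations of wikitext syntax (assumptions, not Parsoid's actual tokenizer), and Pearson correlation is just one plausible first-pass metric:

```python
import re

def wikitext_features(wt):
    """Extract candidate predictor features from raw wikitext.
    The patterns are crude approximations of list/table/template
    syntax, good enough for a first statistical pass."""
    return {
        "bytes": len(wt.encode("utf-8")),
        "list_items": len(re.findall(r"^[*#:;]", wt, re.MULTILINE)),
        "table_cells": len(re.findall(r"^\||\|\|", wt, re.MULTILINE)),
        "templates": wt.count("{{"),
    }

def pearson(xs, ys):
    """Plain Pearson correlation, for ranking each feature against
    measured CPU time or heap size across the corpus."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

print(wikitext_features("* a\n* b\n{{cite}}"))
```

Features whose correlation with CPU time or heap size is consistently high would be the ones worth turning into explicit limits.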
The goal is to gain some confidence that we can (say) limit input wikitext to X total bytes, Y lists, Z table cells, etc, and have Parsoid almost certain to complete processing in less than A CPU seconds and B MB of heap, with 99.<some number of nines>% of existing wikitext content falling under these limits.
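Choosing each limit then reduces to picking a high percentile of the observed corpus distribution for that property. A sketch, assuming we have per-article measurements collected (the 0.999 coverage target is illustrative; the actual number of nines is still open):

```python
def percentile_limit(values, coverage=0.999):
    """Pick a limit for one wikitext property so that at least
    `coverage` of the observed corpus falls at or under it."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int(coverage * len(ordered)))
    return ordered[idx]

# Stand-in for per-article byte counts gathered from the corpus.
sizes = list(range(1, 1001))
print(percentile_limit(sizes))
```

The same function would be applied independently to each candidate property (bytes, list items, table cells, ...), after which the resulting limits get validated against the clean scaling numbers to confirm the A-seconds/B-MB bounds hold.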