Currently, each pipeline creates its own document and, at various stages, the results of those parses need to be adopted by the main document.
We expect some performance gains by eliminating that work.
See https://gerrit.wikimedia.org/r/#/c/385312/
Also, there are a few places where we create a dummyDoc to avoid string concatenation; those should use the shared document as well.
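
To make the adoption cost concrete, here is a toy sketch of the two approaches. It is not Parsoid code: `ToyNode`, `ToyDocument`, `buildFragment`, `perPipelineDocuments`, and `sharedDocument` are hypothetical names, and the model simply assumes that adopting a subtree means visiting every node in it to repoint its owner document.

```typescript
// Toy model only: hypothetical classes, not Parsoid or any DOM library API.
class ToyNode {
  tag: string;
  ownerDocument: ToyDocument;
  children: ToyNode[] = [];

  constructor(tag: string, ownerDocument: ToyDocument) {
    this.tag = tag;
    this.ownerDocument = ownerDocument;
  }

  appendChild(child: ToyNode): ToyNode {
    this.children.push(child);
    return child;
  }
}

class ToyDocument {
  // Counts how many nodes this document has had to adopt.
  adoptedNodeCount = 0;

  createElement(tag: string): ToyNode {
    return new ToyNode(tag, this);
  }

  // Assumed cost model: adoption walks the whole subtree and
  // repoints ownerDocument on every node it visits.
  adoptNode(node: ToyNode): ToyNode {
    const stack: ToyNode[] = [node];
    while (stack.length > 0) {
      const current = stack.pop();
      if (current === undefined) break;
      current.ownerDocument = this;
      this.adoptedNodeCount++;
      for (const child of current.children) stack.push(child);
    }
    return node;
  }
}

// Stand-in for a pipeline's output: one <p> with `size` <span> children.
function buildFragment(doc: ToyDocument, size: number): ToyNode {
  const root = doc.createElement("p");
  for (let i = 0; i < size; i++) {
    root.appendChild(doc.createElement("span"));
  }
  return root;
}

// Current approach: the pipeline builds in its own document,
// then the main document adopts the result.
function perPipelineDocuments(mainDoc: ToyDocument, size: number): ToyNode {
  const pipelineDoc = new ToyDocument();
  return mainDoc.adoptNode(buildFragment(pipelineDoc, size));
}

// Proposed approach: the pipeline builds directly in the main document,
// so no adoption pass is needed.
function sharedDocument(mainDoc: ToyDocument, size: number): ToyNode {
  return buildFragment(mainDoc, size);
}
```

In this model, `perPipelineDocuments(mainDoc, 3)` leaves `mainDoc.adoptedNodeCount` at 4 (the root plus three children), while `sharedDocument(mainDoc, 3)` leaves it at 0. How much the real adoption step costs depends on the DOM implementation, so this only shows where the extra work comes from, not how large the gain would be.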