Text that has been processed (cleaned and segmented) should be cached and not processed again unless it has been altered. It is possible that this type of caching is already done by MediaWiki (see https://www.mediawiki.org/wiki/Manual:Caching).
First thoughts on a test:
- Two different browsers visit the same page one after another. Only the first visit should trigger the log message in the cleaner/segmenter.
- The second user then updates the page content. The log message should trigger once for the second user.
If the test passes, all is well. If not, investigate the various caching settings and mechanisms in MediaWiki and set up a task for implementation.
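
The expected behaviour in the test above can be sketched as a small simulation. This is only an illustration, not how MediaWiki or the cleaner/segmenter actually work: `SegmentCache`, `clean_and_segment` and the `process_calls` counter are hypothetical names, the counter stands in for the log message, and keying the cache on a hash of the page text is an assumption.

```python
import hashlib


class SegmentCache:
    """Hypothetical cache mapping a hash of the page text to its processed segments."""

    def __init__(self, process):
        self._process = process  # the expensive clean-and-segment step
        self._store = {}
        self.process_calls = 0  # stands in for the cleaner/segmenter log message

    def get(self, text):
        # Assumption: unchanged text gives the same key, altered text a new one.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.process_calls += 1
            self._store[key] = self._process(text)
        return self._store[key]


def clean_and_segment(text):
    # Placeholder for the real cleaner/segmenter: split on full stops.
    return [part.strip() for part in text.split(".") if part.strip()]


cache = SegmentCache(clean_and_segment)
page = "First sentence. Second sentence."

cache.get(page)  # first browser: text is processed
cache.get(page)  # second browser: served from the cache
calls_after_two_visits = cache.process_calls

edited = page + " Third sentence."  # second user updates the page
cache.get(edited)  # altered text is processed again
calls_after_edit = cache.process_calls
```

In the real test the counter would be replaced by counting log lines from the cleaner/segmenter after each step.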