From @Esanders:
With the small set of checks currently set up on enwiki, this is already causing typing lag on long articles, e.g.
https://en.wikipedia.org/w/index.php?title=Moon&veaction=edit&ecenable=2
Moon - Wikipedia (325 kB). It takes about 500ms for all the checks to run on my machine, which is too high for an onDocumentChange check.
I think forEachRunOfContent is slowing us down a little, but it isn't the bottleneck: 700ms locally, vs. 500ms if I just search the entire document text. If we just reject matches containing \n, that would have the same effect (disallowing multi-paragraph matches).
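The whole-document variant could look something like this sketch (findMatches is a hypothetical helper, not actual VE code, and it assumes the document is available as one flat string with \n at paragraph boundaries):

```javascript
// Run a check regex over the whole flat document text in one pass,
// dropping any match that spans a paragraph break.
function findMatches( docText, re ) {
	// re must have the 'g' flag, or exec() will loop forever on one match
	const results = [];
	let m;
	while ( ( m = re.exec( docText ) ) !== null ) {
		if ( m[ 0 ] === '' ) {
			// Guard against zero-length matches stalling the loop
			re.lastIndex++;
			continue;
		}
		// \n separates paragraphs in the flat text; rejecting matches that
		// contain it disallows multi-paragraph matches, giving the same
		// effect as iterating per content run.
		if ( !m[ 0 ].includes( '\n' ) ) {
			results.push( { index: m.index, text: m[ 0 ] } );
		}
	}
	return results;
}

// e.g. findMatches( 'one two\nthree', /two[\s\S]three|one/g )
// keeps 'one' and drops the match spanning the newline
```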
But the main problem is that doing 10 million Set.has calls is slow.

Possible approaches:
- Go back to building a regex: it might be hard to support locale-aware case insensitivity, and we will need to do some escaping (a RegExp.escape polyfill)
- Use Aho-Corasick: https://github.com/BrunoRB/ahocorasick. It seems to build character-based tables, so it could possibly be modified for case/locale case insensitivity
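The regex approach could be sketched roughly like this (escapeRegExp stands in for the proposed RegExp.escape, which is not available everywhere yet; the function names here are made up for illustration):

```javascript
// Escape regex metacharacters in a literal word (RegExp.escape polyfill)
function escapeRegExp( str ) {
	return str.replace( /[.*+?^${}()|[\]\\]/g, '\\$&' );
}

// Build one alternation regex from the word list, so the document is
// scanned once instead of doing millions of Set.has lookups.
function buildWordRegex( words ) {
	// The 'i' flag gives simple case insensitivity only; it is not
	// locale-aware (e.g. Turkish dotted/dotless i), which is the caveat
	// with this approach.
	return new RegExp( words.map( escapeRegExp ).join( '|' ), 'gi' );
}

const re = buildWordRegex( [ 'foo', 'a+b' ] );
const matches = 'Foo bar a+b'.match( re );
// matches → [ 'Foo', 'a+b' ]
```

A real implementation would also need word-boundary handling so that list entries don't match inside longer words.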
Testing on the same example as above (a local 450k doc with 100 words):
- Regex: ~20ms
- Aho-Corasick: ~50ms
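For reference, a toy harness along these lines (not the one used for the numbers above; the Set-scan loop, word list and document are invented) illustrates why the per-position Set.has pattern loses to a single compiled regex pass:

```javascript
// Naive scan: one Set.has per (position, distinct word length). On a
// long document this is the "millions of Set.has calls" pattern.
function countWithSet( text, words ) {
	const set = new Set( words );
	const lens = [ ...new Set( words.map( ( w ) => w.length ) ) ];
	let count = 0;
	for ( let i = 0; i < text.length; i++ ) {
		for ( const len of lens ) {
			if ( set.has( text.slice( i, i + len ) ) ) {
				count++;
			}
		}
	}
	return count;
}

// One pass with a compiled alternation regex. The invented words here
// contain no metacharacters; real input would need RegExp.escape-style
// escaping as noted above.
function countWithRegex( text, words ) {
	const re = new RegExp( words.join( '|' ), 'g' );
	return ( text.match( re ) || [] ).length;
}

const words = Array.from( { length: 100 }, ( _, i ) => 'word' + i );
const text = 'filler word7 more word42 text '.repeat( 15000 ); // ~450k chars

for ( const [ label, fn ] of [ [ 'Set scan', countWithSet ], [ 'regex', countWithRegex ] ] ) {
	const start = Date.now();
	const n = fn( text, words );
	console.log( label + ':', Date.now() - start, 'ms,', n, 'matches' );
}
```

The two counters report different totals because the Set scan also counts overlapping hits (e.g. 'word4' inside 'word42'); the point is only the relative timing.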