Several busy ('hanging') workers in production were backtracking when parsing pathological tables in http://el.wikipedia.org/wiki/%CE%A0%CE%BF%CF%81%CE%B5%CE%AF%CE%B1_%CF%84%CF%89%CE%BD_%CE%BA%CF%85%CF%80%CF%81%CE%B9%CE%B1%CE%BA%CF%8E%CE%BD_%CE%BF%CE%BC%CE%AC%CE%B4%CF%89%CE%BD_%CF%83%CF%84%CE%B1_%CE%BA%CF%8D%CF%80%CE%B5%CE%BB%CE%BB%CE%B1_%CE%95%CF%85%CF%81%CF%8E%CF%80%CE%B7%CF%82
I tracked this down by attaching the node debugger to those workers.
Backtracking when parsing table cells with optional attributes is hard to avoid, but in this case there might also be a bug in how the cache key for memoization is constructed. The large number of quote characters on this page additionally slows down the parsing of potential attributes.
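
To illustrate why the cache key matters, here is a minimal sketch of packrat-style memoization. It is not the actual parser code: makeMemo, memoized, and the flags string are hypothetical names. The assumption is that a rule's result depends on the rule name, the input position, and any parser state; if the key varies more than it needs to, lookups miss and the same span is re-parsed on every backtrack.

```javascript
// Hypothetical packrat-style memoizer; names and key layout are assumptions.
function makeMemo() {
  const cache = new Map();
  let misses = 0;

  function memoized(ruleName, pos, flags, parseFn) {
    // The key covers everything the result depends on: rule, position,
    // and parser state (modelled here as a plain string of flags).
    const key = ruleName + ':' + pos + ':' + flags;
    if (cache.has(key)) {
      return cache.get(key);
    }
    misses += 1;
    const result = parseFn(pos);
    cache.set(key, result);
    return result;
  }

  return { memoized, getMisses: () => misses };
}

// Usage: the second call with the same rule, position, and flags is a hit.
const memo = makeMemo();
const parseAttr = (pos) => ({ end: pos + 5 });
memo.memoized('table_attribute', 10, 'inTable', parseAttr);
memo.memoized('table_attribute', 10, 'inTable', parseAttr);
console.log(memo.getMisses());
```

If a value that changes between otherwise identical attempts ended up in the key, every retry would count as a miss, which would match the behaviour seen in the debugger.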
I have some WIP code that speeds things up a lot by not attempting to parse attributes with clearly invalid names, but I get some test failures in cases where the PHP parser simply strips invalid attribute names. This needs further investigation.
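
A rough sketch of the kind of early check meant here. This is not the WIP patch itself: isPlausibleAttributeName and the pattern are assumptions for illustration and do not reflect the real grammar or the PHP parser's rules.

```javascript
// Hypothetical pre-check; the pattern is an assumption, not the real grammar.
// A name must start with a letter, underscore, or colon, followed by
// letters, digits, underscores, colons, dots, or hyphens.
const PLAUSIBLE_ATTR_NAME = /^[A-Za-z_:][A-Za-z0-9_:.-]*$/;

function isPlausibleAttributeName(name) {
  return PLAUSIBLE_ATTR_NAME.test(name);
}

// Usage: skip the expensive attribute parse when the name cannot be valid.
for (const candidate of ['style', 'data-sort', '"Anorthosis', '']) {
  console.log(JSON.stringify(candidate), isPlausibleAttributeName(candidate));
}
```

Rejecting the whole attribute run on the first bad name is cheap but differs from stripping only the invalid name and keeping the rest, which is presumably where the failing tests come from.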