Most (but not all) of these errors were found to correlate with pages that have either high image counts (> 1000) or high heading counts (> 1000). Therefore, if we want these pages to stop timing out, we could try skipping the MobileFormatter altogether in these cases.
- Each of the cases I looked at had extremely high total element counts (> 50,000 total elements on the page). The suggested fix may remedy the majority of the timeout errors; however, client-side performance on mobile devices could still be very poor (e.g. slow/unusable repaints, scroll freezing, etc.). Therefore, it's probably not wise to spend a lot of time implementing this fix, as it's really just a band-aid for the majority of cases.
- Per Timo, image count might already be accessible from the ParserOutput object. This would be preferred over a regex since it would be a cheaper check.
- For the heading (section) count, using preg_match_all on the HTML string might work well. The downside of the regex is that it adds a cost even for pages that are within the limits (the majority of pages). Be sure to measure the cost of adding this regex! If the cost is too high, consider checking only the image count and forgoing the heading check.
- When a page has > 1000 images or > 1000 headings, don't run the MobileFormatter. Be conservative. Check these numbers with Nick; make them as high as possible to start with.
- The image and heading limits should be configurable.
- Document the execution time saved on big pages.
- Click through some pages on mobile on the beta cluster. Check that pages with headings still have collapsible headings. After verifying this for a few pages, move to sign off.
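
As a rough illustration of the check described above, here is a minimal sketch in Python. It is not the MobileFrontend implementation: `re.findall` stands in for PHP's `preg_match_all`, and the names `MAX_IMAGES`, `MAX_HEADINGS`, `count_headings` and `should_skip_formatter` are hypothetical. It assumes the image count is supplied by the caller (for example from the ParserOutput object, if it is available there) so that the cheaper check runs first and the regex only runs when needed.

```python
import re
import timeit

# Hypothetical defaults; the task asks for both limits to be configurable.
MAX_IMAGES = 1000
MAX_HEADINGS = 1000

# Matches opening <h1>..<h6> tags, with or without attributes.
# It does not match <hr> or <header>.
HEADING_RE = re.compile(r"<h[1-6][\s>]", re.IGNORECASE)


def count_headings(html):
    """Count heading tags in an HTML string with a single regex pass."""
    return len(HEADING_RE.findall(html))


def should_skip_formatter(html, image_count,
                          max_images=MAX_IMAGES,
                          max_headings=MAX_HEADINGS):
    """Return True when the page exceeds either limit."""
    # Cheap check first: the image count is assumed to be precomputed.
    if image_count > max_images:
        return True
    # Regex check second: this is the cost paid by every page under the
    # image limit, so it is the part worth measuring.
    return count_headings(html) > max_headings


if __name__ == "__main__":
    # Rough timing of the heading count on a synthetic page.
    page = "<h2>Section</h2><p>text</p>" * 500
    runs = 100
    seconds = timeit.timeit(lambda: count_headings(page), number=runs) / runs
    print(f"{seconds * 1e6:.1f} microseconds per heading count")
```

The timing at the bottom only shows the shape of the measurement; real numbers would need to come from the actual PHP regex run against production-sized page HTML.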
Sign off steps
- A developer needs to check logstash for the impact on "time out" errors after this has been deployed. If it's significant, report it to wikitech.