Text should be split into segments at certain tags, even if the text before the tag doesn't end in a full stop. Currently these should include:
- Headers (<h1>, <h2> ...)
- <p>
- <br />
Later, if and when the following tags are no longer removed, the list should also include:
- <li>
- <td>/<th>
The set of segmenting tags should be configurable, like the set of removed tags.
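
A minimal sketch of the requested behaviour, using only Python's standard `html.parser`. The names `DEFAULT_BREAK_TAGS`, `TagSegmenter` and `segment_html` are hypothetical and not part of any existing code; the sketch only splits at tag boundaries and leaves sentence splitting inside each segment to whatever segmenter is already in place.

```python
from html.parser import HTMLParser

# Hypothetical default set; the real option name and format are not decided yet.
DEFAULT_BREAK_TAGS = frozenset({"h1", "h2", "h3", "h4", "h5", "h6", "p", "br"})


class TagSegmenter(HTMLParser):
    """Collects text and closes the current segment at every configured tag."""

    def __init__(self, break_tags=DEFAULT_BREAK_TAGS):
        super().__init__(convert_charrefs=True)
        self.break_tags = frozenset(break_tags)
        self.segments = []
        self._buffer = []

    def flush(self):
        # Collapse whitespace and keep only non-empty segments.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.segments.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <br />.
        if tag in self.break_tags:
            self.flush()

    def handle_endtag(self, tag):
        if tag in self.break_tags:
            self.flush()

    def handle_data(self, data):
        self._buffer.append(data)


def segment_html(html, break_tags=DEFAULT_BREAK_TAGS):
    """Return the text of `html` split at every tag in `break_tags`."""
    parser = TagSegmenter(break_tags)
    parser.feed(html)
    parser.close()
    parser.flush()  # emit whatever text follows the last break tag
    return parser.segments


print(segment_html("<h1>Title</h1><p>No full stop here<br />next line</p>"))
# → ['Title', 'No full stop here', 'next line']
```

Passing a larger set, for example `DEFAULT_BREAK_TAGS | {"li", "td", "th"}`, would cover the later case where list items and table cells are kept rather than removed.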