Both Tesseract and the Google OCR perform badly on "old" printing styles such as those often found around the 18th century. This is usually due to a combination of:
- Fairly distinctive fonts which aren't common for bulk text in later years
- Frequent use of long-s and ligatures that confuse OCR engines (long-s especially looks like 'f')
- Poor printing quality due to the technology of the times
- Strong paper color, caused by the paper technology of the time and by aging, which lowers contrast
However, the text from this period is actually all rather similar, so a modified OCR model may be able to improve recognition for all of it.
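When comparing OCR output against a hand-corrected transcription, it can help to fold long-s and ligature characters to their modern equivalents first, so that differences in those characters alone are not counted as errors. The following is only a minimal sketch of that idea: the name `fold_archaic` and the mapping table are illustrative assumptions, not part of Tesseract or the Google OCR. It also does nothing for a long-s that the engine has already misread as a plain 'f'.

```python
# Hypothetical helper (not part of any OCR engine): fold long-s and the
# common Latin ligature code points to their modern letter sequences.
ARCHAIC_FOLDS = str.maketrans({
    "\u017f": "s",    # long s
    "\ufb00": "ff",   # ff ligature
    "\ufb01": "fi",   # fi ligature
    "\ufb02": "fl",   # fl ligature
    "\ufb03": "ffi",  # ffi ligature
    "\ufb04": "ffl",  # ffl ligature
    "\ufb05": "st",   # long-s + t ligature
    "\ufb06": "st",   # st ligature
})


def fold_archaic(text: str) -> str:
    """Return text with long-s and ligature characters replaced."""
    return text.translate(ARCHAIC_FOLDS)


# "Congre\u017fs" is "Congress" spelled with a long s before the final s.
print(fold_archaic("Congre\u017fs"))
```

An explicit table is used here rather than a general Unicode normalization so that only these characters are changed and everything else in the transcription is left as it is.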
An example of a page that OCRs fairly badly is this (it's not the worst example, but it's typical):