When domains expire, they are sometimes "squatted" by domain reselling pages. Such pages are unsuitable as references for encyclopaedia articles (except possibly in an article about domain squatting, but that's a niche case).
Where possible, IABot should:
- detect a site as a reselling page and mark it as dead
- not link to archives of the domain reselling page
In my experience, the following strings as the page title reliably indicate that the page is a domain reselling page:
- "This website is for sale"
- "Deze website is te koop" [Dutch: "This website is for sale"]
- "HugeDomains.com"
- "Denna sida är till salu" [Swedish: "This page is for sale"]
- "available at DomainMarket.com" [this is the tail end of the string which typically includes the domain name]
The following strings as the page title indicate that the page is not the original content, although it is not necessarily a domain reselling page:
- "主婦が消費者金融に対して思う事" [Japanese: roughly "What housewives think about consumer finance"]
- "page not found"
- "ACTUAL ARTICLE TITLE BELONGS HERE"
- "Website disabled"
These lists are obviously not exhaustive. False positives are possible, but should be very rare.
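
For illustration, the title checks could be sketched roughly as below. This is not IABot's actual code: the function names `extract_title` and `classify_title` and the labels `"reseller"`, `"not_original"` and `"unknown"` are made up for this example. It assumes titles are compared as whole strings after trimming whitespace and ignoring case, except for the DomainMarket string, which is matched as a suffix because the title starts with the domain name.

```python
from html.parser import HTMLParser

# Titles that indicate a domain reselling page (strings from the lists above).
RESELLER_TITLES = tuple(s.casefold() for s in (
    "This website is for sale",
    "Deze website is te koop",
    "HugeDomains.com",
    "Denna sida är till salu",
))
# Matched as a suffix, since the title normally begins with the domain name.
RESELLER_SUFFIXES = tuple(s.casefold() for s in (
    "available at DomainMarket.com",
))
# Titles that indicate the original content is gone, reseller or not.
NOT_ORIGINAL_TITLES = tuple(s.casefold() for s in (
    "主婦が消費者金融に対して思う事",
    "page not found",
    "ACTUAL ARTICLE TITLE BELONGS HERE",
    "Website disabled",
))


class _TitleParser(HTMLParser):
    """Collects the text of the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.parts.append(data)


def extract_title(html):
    """Return the page title with runs of whitespace collapsed."""
    parser = _TitleParser()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def classify_title(title):
    """Return 'reseller', 'not_original' or 'unknown' (hypothetical labels)."""
    t = " ".join(title.split()).casefold()
    if t in RESELLER_TITLES or any(t.endswith(s) for s in RESELLER_SUFFIXES):
        return "reseller"
    if t in NOT_ORIGINAL_TITLES:
        return "not_original"
    return "unknown"
```

Under this sketch, both `"reseller"` and `"not_original"` would lead to the URL being treated as dead, and the same check could be run on an archive snapshot's title before linking to it. Whole-title matching is deliberately strict to keep false positives rare; substring matching would catch more pages at a higher risk of misclassifying genuine articles.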