Authored By
BBlack
May 1 2019, 1:40 PM
Based on a short but probably representative sample, these are all the hostnames that rounded to >=1% of all address (A/AAAA) queries. All samples were collected during one brief window of time. The prevalence of language subdomains probably follows the sun, but that's not the kind of data we're looking for here (we care more about domain roots and common targets vs. all the project/lang entries). Between them, these hostnames totaled ~57%, leaving ~43% for the long tail:
1% pt.wikipedia.org.
1% ja.wikipedia.org.
1% es.wikipedia.org.
1% fr.wikipedia.org.
1% de.wikipedia.org.
1% maps.wikimedia.org. # Note upload-addrs, not text-addrs
1% ru.wikipedia.org.
1% commons.wikimedia.org.
1% zh.wikipedia.org.
2% wikimedia.org.
3% wikipedia.org.
7% meta.wikimedia.org.
8% login.wikimedia.org.
8% www.wikipedia.org.
9% en.wikipedia.org.
11% upload.wikimedia.org. # Note upload-addrs, not text-addrs
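
As a quick sanity check on the totals quoted above, the rounded shares can be summed directly. This is only a sketch: the percentages are the rounded values from the list, so the results are approximate, and the names `SHARES`, `UPLOAD_ADDRS`, and `summarize` are made up for illustration rather than taken from any real tooling.

```python
# Rounded shares (percent of all A/AAAA queries), copied from the list above.
SHARES = {
    "pt.wikipedia.org.": 1,
    "ja.wikipedia.org.": 1,
    "es.wikipedia.org.": 1,
    "fr.wikipedia.org.": 1,
    "de.wikipedia.org.": 1,
    "maps.wikimedia.org.": 1,
    "ru.wikipedia.org.": 1,
    "commons.wikimedia.org.": 1,
    "zh.wikipedia.org.": 1,
    "wikimedia.org.": 2,
    "wikipedia.org.": 3,
    "meta.wikimedia.org.": 7,
    "login.wikimedia.org.": 8,
    "www.wikipedia.org.": 8,
    "en.wikipedia.org.": 9,
    "upload.wikimedia.org.": 11,
}

# Hostnames the list marks as resolving to upload-addrs rather than text-addrs.
UPLOAD_ADDRS = {"maps.wikimedia.org.", "upload.wikimedia.org."}


def summarize(shares, upload_names):
    """Return (listed total, long-tail remainder, upload-addrs share) in percent."""
    listed = sum(shares.values())
    upload = sum(pct for name, pct in shares.items() if name in upload_names)
    return listed, 100 - listed, upload


listed, long_tail, upload = summarize(SHARES, UPLOAD_ADDRS)
print(f"listed: ~{listed}%  long tail: ~{long_tail}%  upload-addrs: ~{upload}%")
```

Because each entry is already rounded to a whole percent, the sum can drift from the true total by a few points in either direction.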
