Temporary measurement of outbound citation link clicks
Closed, Resolved; Public; Feature

Description

As part of analyzing the impact of archive.today on readers of Wikipedia, following the enwiki blacklisting of that source, and other community efforts to respond, we would like to better understand the rate of outbound traffic through citation links.

To do this, we plan to collect data, for a limited period of time, on which citation links logged-out readers click, on enwiki and other wikis. We'll update this ticket with the time period and the specific wikis.

Our current plan is to do this through client-side instrumentation (JS) rather than a server-side redirector, since the measurement doesn't need to be perfect and shouldn't change what happens when a reader clicks a link.

Prior research

https://meta.wikimedia.org/wiki/Research:Characterizing_Wikipedia_Citation_Usage
T171231: [Objective 11.1.2] Research on citation/external link usage

Proposed implementation

Use Prometheus instead of Event Logging, so that events are not associated with an individual user. We do not need or want to connect clicks with any individual session; we are only interested in aggregate data.

  • Use StatsFactory to emit events to Prometheus, bucketed by domain
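A minimal sketch of the "bucketed by domain" step, shown in Python for brevity even though the planned instrumentation is client-side JS. The function name `bucket_for_url`, the `TRACKED_DOMAINS` allow-list, and the rule of matching the hostname or any parent domain are illustrative assumptions, not the WikimediaEvents implementation.

```python
from typing import Optional, Set
from urllib.parse import urlsplit

# Hypothetical allow-list; the real set of domains would come from configuration.
TRACKED_DOMAINS = {"web.archive.org", "archive.today", "doi.org"}


def bucket_for_url(href: str, tracked: Set[str] = TRACKED_DOMAINS) -> Optional[str]:
    """Return the tracked domain that href falls under, or None if untracked."""
    host = (urlsplit(href).hostname or "").lower()
    labels = host.split(".")
    # Try the full hostname first, then each parent domain, so that
    # "www.doi.org" is counted under "doi.org".
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in tracked:
            return candidate
    return None
```

Under these assumptions only the bucket name would be attached to the counter, so neither the full URL nor anything identifying the reader is recorded.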

Event Timeline

In The Wikipedia Library program, publishers who we partner with often ask us how much traffic we're seeing to their content, or how much they could expect. It would be helpful if, as part of this project, we could also collect data for some TWL partner URLs. That is, unless this would be blanket data collection and we could just get the chance to do some analysis on it regardless :)

Yeah, to be clear the idea here is to collect data on all (or at least above some popularity threshold) outbound citation links, not just ones to a particular service.

Change #1251274 had a related patch set uploaded (by Mszwarc; author: Mszwarc):

[mediawiki/extensions/WikimediaEvents@master] [WIP] Instrument clicks on external links to selected domains

https://gerrit.wikimedia.org/r/1251274

I gathered a list of the 1000 most common external link domains in enwiki articles. We may use it to decide which domains are worth instrumenting:

15480448 web.archive.org
6138892 google.com
3313764 toolforge.org
3179105 doi.org
2430749 nih.gov
1455787 search.worldcat.org
1189828 archive.org
1075699 jstor.org
1042983 viaf.org
1011549 nasa.gov
937985 minorplanetcenter.net
921510 loc.gov
846312 newspapers.com
763442 api.semanticscholar.org
745567 bbc.co.uk
737235 nytimes.com
714243 harvard.edu
624515 isni.org
583596 youtube.com
574096 id.oclc.org
550150 d-nb.info
520717 gbif.org
518187 imdb.com
488661 yale.edu
488172 billboard.com
421089 catalogueoflife.org
417292 nli.org.il
412492 theguardian.com
408802 tree.opentreeoflife.org
405024 musicbrainz.org
372359 nla.gov.au
359019 census.gov
357798 nps.gov
345733 data.bnf.fr
343960 catalogue.bnf.fr
341874 idref.fr
327131 irmng.org
311280 inaturalist.org
296377 amigo.geneontology.org
286499 eol.org
270016 allmusic.com
258949 soccerway.com
258761 id.worldcat.org
258423 espncricinfo.com
246575 data.bibliotheken.nl
233721 historicengland.org.uk
233164 archive.today
232539 aleph.nkp.cz
203398 espn.com
201650 pro-football-reference.com
196375 openlibrary.org
195229 itis.gov
193817 baseball-reference.com
191730 indiatimes.com
191137 bbc.com
187586 deutsche-biographie.de
183714 iucnredlist.org
178500 twitter.com
167970 washingtonpost.com
165775 variety.com
165228 latimes.com
159361 hdl.handle.net
157246 uefa.com
155173 marinespecies.org
153922 nhl.com
152561 deutsche-digitale-bibliothek.de
151695 arxiv.org
150933 bn.org.pl
142984 officialcharts.com
142449 deadline.com
141923 thegazette.co.uk
141624 sports-reference.com
140543 snaccooperative.org
138329 biolib.cz
138319 facebook.com
135112 worldcat.org
134971 discogs.com
130791 reuters.com
127365 cnn.com
124971 legislation.gov.uk
124418 nationalmap.gov
124189 animenewsnetwork.com
122316 telegraph.co.uk
122242 ipni.org
121994 science.kew.org
121421 bench.boldsystems.org
120295 authority.bibsys.no
115506 datos.bne.es
115475 ghostarchive.org
113514 independent.co.uk
113375 hollywoodreporter.com
112365 apple.com
112001 proquest.com
110998 nhm.ac.uk
110092 olympedia.org
104583 list.worldfloraonline.org
103560 tropicos.org
103064 biodiversity.org.au
102268 nii.ac.jp
101556 insee.fr
101413 nba.com
99913 cricketarchive.com
98707 opac.sbn.it
94241 thehindu.com
93707 abc.net.au
93129 biodiversitylibrary.org
91957 cbc.ca
90396 deepl.com
86520 soccerbase.com
85446 olympics.com
84739 mlb.com
84678 ebi.ac.uk
82175 basketball-reference.com
81527 usatoday.com
81009 instagram.com
79610 worldradiohistory.com
78746 zap2it.com
77674 rollingstone.com
76781 amar.org.ir
74503 researchgate.net
73682 rsssf.org
72588 britishnewspaperarchive.co.uk
71641 opac.kbr.be
70681 metacritic.com
70021 ndl.go.jp
69638 rottentomatoes.com
69544 noaa.gov
69514 fcc.gov
68636 theplantlist.org
68130 naver.com
67778 apnews.com
65873 forbes.com
65686 amazon.com
65418 mathscinet.ams.org
62979 natlib.govt.nz
62485 worldfootball.net
60838 smh.com.au
60647 npr.org
60170 kopkatalogs.lv
60093 babel.hathitrust.org
58519 insecta.pro
58514 yahoo.com
58406 ign.com
58365 observation.org
57564 iaaf.org
56998 getty.edu
56919 ultratop.be
56788 national-football-teams.com
56333 nfl.com
54739 bloomberg.com
54598 rcsb.org
54490 britannica.com
54421 wsj.com
53765 uniprot.org
53564 libris.kb.se
53459 showbuzzdaily.com
53423 congress.gov
52696 ew.com
52687 nga.mil
52621 pitchfork.com
52568 portal.historicenvironment.scot
52528 stat.gov.pl
52102 paleobiodb.org
52000 indianexpress.com
51949 oricon.co.jp
51232 academia.edu
50696 time.com
49661 cantic.bnc.cat
49610 chicagotribune.com
49483 usgs.gov
49004 aljazeera.com
48443 nl.go.kr
48359 racingpost.com
48025 itftennis.com
47913 nme.com
47564 tools.wmflabs.org
47260 irishtimes.com
47097 x.com
46852 boxofficemojo.com
46813 ars-grin.gov
46607 oxforddnb.com
46590 worldathletics.org
46542 stuff.co.nz
46285 independent.ie
46272 explorer.natureserve.org
45982 racing-reference.info
45818 fiba.basketball
45227 fifa.com
45046 justia.com
44679 cbsnews.com
42606 rkd.nl
42559 bugguide.net
42404 species.nbnatlas.org
42368 hindustantimes.com
41639 rte.ie
40998 globo.com
40750 kicker.de
40569 nbcnews.com
40323 nzherald.co.nz
40218 skysports.com
40175 gd.eppo.int
40051 censusindia.gov.in
40025 retrosheet.org
39289 nzor.org.nz
38623 sciencedirect.com
38418 efloras.org
38152 animaldiversity.org
37829 tufts.edu
37681 nlb.gov.sg
37467 gamespot.com
37183 mtv.com
37146 people.com
37142 indiatoday.in
36573 bgee.org
36048 fauna-eu.org
35989 dx.doi.org
35828 thetimes.com
35401 archives.gov
35352 offiziellecharts.de
35328 openstreetmap.org
35217 ed.gov
35004 genenames.org
34606 gaonchart.co.kr
34504 bnportugal.gov.pt
34199 data.gouv.fr
33948 bizjournals.com
33909 dutchcharts.nl
33643 intersportstats.com
33563 huffingtonpost.com
33560 go.com
33546 ft.com
33408 philstar.com
33283 findagrave.com
33255 scopus.com
33096 playbill.com
33040 british-history.ac.uk
33015 afi.com
32919 eu-nomen.eu
32843 riaa.com
32837 natalie.mu
32683 springer.com
32308 cbssports.com
31991 politico.com
31988 hls-dhs-dss.ch
31657 cassi.cas.org
31621 si.com
31220 thedailystar.net
31189 theglobeandmail.com
31160 eea.europa.eu
30941 orcid.org
30812 gutenberg.org
30751 sfgate.com
30699 obis.org
30590 gov.uk
30588 gallica.bnf.fr
30504 tvtonight.com.au
30471 mlssoccer.com
30414 commonchemistry.cas.org
30319 dawn.com
30228 tandfonline.com
29979 nsw.gov.au
29373 issuu.com
29350 eliteprospects.com
29274 ucsc.edu
29225 newindianexpress.com
29124 abs.gov.au
29010 cagematch.net
28961 catalogue.nlg.gr
28770 fda.gov
28627 baseball-almanac.com
28442 atptour.com
28250 wiley.com
28228 procyclingstats.com
28038 afltables.com
28015 digitalspy.com
28008 ensembl.org
27762 thestar.com
27532 newyorker.com
27416 mathgenealogy.org
27406 ibdb.com
27373 flightglobal.com
27358 cambridge.org
27293 theage.com.au
27293 wsc.nmbe.ch
27264 github.com
27256 epa.gov
27199 scmp.com
27187 psu.edu
27155 katalog.nsk.hr
27083 pbs.org
26924 australianfootball.com
26883 zbmath.org
26845 legacy.com
26825 hockeydb.com
26633 goal.com
26563 cyclingnews.com
26561 psa.gov.ph
26559 fivb.org
26558 catholic-hierarchy.org
26556 aria.com.au
26552 indexfungorum.org
26481 anu.edu.au
26245 straitstimes.com
26242 allmovie.com
26071 elpais.com
25970 abs-cbn.com
25850 performing-arts.eu
25810 snl.no
25792 eci.nic.in
25774 swisscharts.com
25720 aviation-safety.net
25716 avclub.com
25707 gmanetwork.com
25636 flickr.com
25621 speciesfungorum.org
25590 c-span.org
25429 nydailynews.com
25228 iihf.com
25122 wwe.com
25044 newspaperarchive.com
24961 stolaf.edu
24919 blabbermouth.net
24781 timesofisrael.com
24711 govinfo.gov
24657 paralympic.org
24582 mycobank.org
24542 australian-charts.com
24537 11v11.com
24485 eu-football.info
24459 newsbank.com
24262 upi.com
24248 thefutoncritic.com
24101 businessinsider.com
24056 courtlistener.com
24048 archive.ensembl.org
23907 foxnews.com
23866 haaretz.com
23760 uboat.net
23647 fishbase.org
23625 wikipedialibrary.wmflabs.org
23469 jpost.com
23423 nic.funet.fi
23201 usda.gov
23076 irishexaminer.com
23065 chemspider.com
22889 bac-lac.gc.ca
22713 spotify.com
22592 screenrant.com
22340 ourcampaigns.com
22264 cnbc.com
22250 bpi.co.uk
22246 bdfutbol.com
22245 tvguide.com
22227 spacedys.com
22206 xeno-canto.org
22193 treccani.it
22180 the-afc.com
22172 bfs.admin.ch
22144 thehill.com
22136 vice.com
21987 wired.com
21855 publishersweekly.com
21787 techcrunch.com
21782 omegatiming.com
21732 baltimoresun.com
21694 indiewire.com
21679 theverge.com
21653 aotearoamusiccharts.co.nz
21570 americanradiohistory.com
21569 omim.org
21536 nj.com
21501 api.parliament.uk
21420 rappler.com
21401 digitalspy.co.uk
21333 osmaps.com
21295 austriancharts.at
20949 highbeam.com
20816 rugbyleagueproject.org
20694 polygon.com
20616 programminginsider.com
20609 nature.com
20598 rnz.co.nz
20552 ukwhoswho.com
20513 news.com.au
20480 grammy.com
20352 biogps.org
20328 ucsb.edu
20294 mapdata.ru
20266 treatment.plazi.org
20079 cornell.edu
20070 whc.unesco.org
20021 tuik.gov.tr
19998 eurogamer.net
19998 culture.gouv.fr
19954 ballotpedia.org
19954 oup.com
19890 theatlantic.com
19873 e-icisleri.gov.tr
19663 statcan.gc.ca
19548 reptile-database.reptarium.cz
19524 tms.fih.ch
19468 wikisky.org
19463 lemonde.fr
19344 seattletimes.com
19343 hockey-reference.com
19333 popmatters.com
19318 historyofparliamentonline.org
19306 fishbase.ca
19303 unicode.org
19208 legaseriea.it
19174 wtatennis.com
19152 syr.edu
19114 charts.nz
19089 afl.com.au
19071 broadwayworld.com
19012 pwtorch.com
18978 sealifebase.ca
18897 archives.parliament.uk
18847 rism.online
18836 tribune.com.pk
18801 standard.co.uk
18707 vulture.com
18678 afromoths.net
18625 eurovision.tv
18610 navy.mil
18500 swedishcharts.com
18424 billboard-japan.com
18400 citypopulation.de
18382 mobygames.com
18340 prnewswire.com
18334 dw.com
18311 thestar.com.my
18293 allroutes.ru
18284 economist.com
18199 boxrec.com
18177 pep.ph
18117 radiotimes.com
18103 collider.com
18074 nobelprize.org
18023 ndtv.com
17971 bollywoodhungama.com
17936 emmys.com
17800 zoobank.org
17760 amazon.co.uk
17641 catalog.hathitrust.org
17604 premierleague.com
17536 taipeitimes.com
17488 rediff.com
17426 cbr.com
17369 lescharts.com
17292 heraldscotland.com
17230 africanplantdatabase.ch
17222 un.org
17193 fis-ski.com
17186 tff.org
17133 historicplaces.ca
17108 mlbtraderumors.com
17029 zenodo.org
17019 kulturnav.org
16986 spiegel.de
16985 jhu.edu
16875 nypost.com
16868 marca.com
16852 newsweek.com
16851 mmajunkie.com
16793 dfb.de
16760 exclaim.ca
16738 boston.com
16698 familysearch.org
16690 oregonlive.com
16633 urn.fi
16593 gematsu.com
16567 news18.com
16534 milb.com
16520 japantimes.co.jp
16447 msstate.edu
16400 france24.com
16349 thefreelibrary.com
16296 stereogum.com
16264 lpsn.dsmz.de
16184 complex.com
16128 bostonglobe.com
16115 business-standard.com
15900 moma.org
15873 kirkusreviews.com
15683 ctvnews.ca
15675 scribd.com
15488 slate.com
15452 scotsman.com
15449 visionofbritain.org.uk
15444 mirror.co.uk
15427 statbunker.com
15410 nj.gov
15367 arbitron.com
15362 vimeo.com
15298 allafrica.com
15254 structurae.net
15242 oscars.org
15091 sherdog.com
15087 cdc.gov
15034 bleacherreport.com
15015 collectionscanada.gc.ca
14986 hindu.com
14986 screendaily.com
14974 thewrap.com
14945 equibase.com
14943 linkedin.com
14924 ec.europa.eu
14900 startribune.com
14892 tshaonline.org
14850 90minut.pl
14845 top40.nl
14823 elsevier.com
14685 news24.com
14677 telegraphindia.com
14606 datazone.birdlife.org
14575 globalnews.ca
14567 consequence.net
14476 ssb.no
14427 theaustralian.com.au
14411 myspace.com
14393 circlechart.kr
14390 gale.com
14377 librarything.com
14332 isfdb.org
14320 newsinfo.inquirer.net
14288 tvline.com
14124 sarugby.co.za
14027 mechon-mamre.org
14009 teara.govt.nz
14008 mlive.com
13971 lamtakam.com
13970 virginiadot.org
13959 prowrestling.net
13938 metro.co.uk
13913 dnaindia.com
13890 pqarchiver.com
13847 walesonline.co.uk
13839 isuresults.com
13794 pressreader.com
13788 findarticles.com
13763 tribuneindia.com
13735 encyclopedia.com
13734 chron.com
13726 bible.oremus.org
13725 itv.com
13704 wnba.com
13686 cafonline.com
13679 dropbox.com
13648 msn.com
13634 genecards.org
13610 tcm.com
13564 europarl.europa.eu
13552 taicol.tw
13532 wa.gov.au
13528 rhs.org.uk
13508 thenews.com.pk
13465 peakbagger.com
13464 spin.com
13458 birdsoftheworld.org
13424 ucr.edu
13414 csu.gov.cz
13402 musiccanada.com
13361 nrl.com
13294 autosport.com
13244 webcitation.org
13243 altpress.com
13209 hrw.org
13200 daviscup.com
13194 boe.es
13173 heraldsun.com.au
13164 protennislive.com
13136 skyscrapercenter.com
13125 mindat.org
13049 usnews.com
13022 pcgamer.com
12972 legifrance.gouv.fr
12956 monitorlatino.com
12922 politicalgraveyard.com
12916 amazon.co.jp
12915 omabrowser.org
12868 snepmusique.com
12839 svenskfotboll.se
12829 vanityfair.com
12820 dailyrecord.co.uk
12736 efl.com
12728 fimi.it
12691 hiphopdx.com
12684 huffpost.com
12664 adatbank.mlsz.hu
12662 ifpi.fi
12637 state.nj.us
12637 comicbook.com
12612 cfl.ca
12596 sec.gov
12586 sky.com
12585 pastemagazine.com
12566 deccanherald.com
12553 businesswire.com
12547 fangraphs.com
12528 ethnologue.com
12473 f4wonline.com
12471 uci.org
12468 arstechnica.com
12460 statscrew.com
12440 austlii.edu.au
12438 ala.org.au
12434 deccanchronicle.com
12433 rogerebert.com
12413 musee-orsay.fr
12408 ligue1.com
12376 runeberg.org
12366 radioinsight.com
12341 tophit.com
12331 kotaku.com
12322 gamesradar.com
12311 gcatholic.org
12296 insidethegames.biz
12273 lambiek.net
12266 tampabay.com
12228 firstpost.com
12219 destatis.de
12208 slagerlistak.hu
12199 eci.gov.in
12195 vatican.va
12108 reviewjournal.com
12086 nbl.com.au
12058 avibase.bsc-eoc.org
12050 sagepub.com
12043 post-gazette.com
12026 speciesplus.net
12023 channelnewsasia.com
11961 cleveland.com
11930 nomisweb.co.uk
11921 informatics.jax.org
11910 cyclingarchives.com
11884 orlandosentinel.com
11857 bailii.org
11848 iol.co.za
11847 hoganstand.com
11818 echa.europa.eu
11802 mmafighting.com
11773 rferl.org
11752 metmuseum.org
11751 thedailybeast.com
11746 foxsports.com.au
11658 globiz.pyraloidea.org
11644 denverpost.com
11613 boxingscene.com
11608 ibiblio.org
11605 norwegiancharts.com
11563 ebird.org
11550 cia.gov
11522 abcmedianet.com
11518 sefaria.org
11497 dailytelegraph.com.au
11493 librivox.org
11487 eiga.com
11455 hotnewhiphop.com
11425 statistics.gov.uk
11422 koreatimes.co.kr
11411 elmundo.es
11382 nationalrail.co.uk
11378 euroleague.net
11365 manchestereveningnews.co.uk
11360 sverigetopplistan.se
11350 engadget.com
11330 resolver.kb.nl
11309 belfasttelegraph.co.uk
11274 aeroroutes.com
11273 artuk.org
11263 army.mil
11236 who.int
11213 iranicaonline.org
11201 worldrowing.com
11173 behindthevoiceactors.com
11162 vanguardngr.com
11161 tsn.ca
11160 bucknell.edu
11131 livemint.com
11103 robertchristgau.com
11102 sportsnet.ca
11098 formula1.com
11083 voanews.com
11076 lequipe.fr
11051 goodreads.com
11026 ine.es
11026 cinemaexpress.com
10976 glottolog.org
10933 hancinema.net
10881 sify.com
10870 dblp.org
10859 navsource.net
10829 enzyme-database.org
10814 geonames.org
10779 pic.nypl.org
10768 newadvent.org
10763 reliefweb.int
10743 persee.fr
10739 biblegateway.com
10721 nymag.com
10718 thecanadianencyclopedia.ca
10652 infobae.com
10651 mb.com.ph
10633 icc-cricket.com
10629 laliga.com
10619 gbrathletics.com
10590 jayski.com
10585 concacaf.com
10570 mercurynews.com
10565 motorsport.com
10542 marinetraffic.com
10495 seqco.de
10480 globalsportsarchive.com
10464 eur-lex.europa.eu
10446 freep.com
10425 gameinformer.com
10417 ajc.com
10412 cwgc.org
10412 eonline.com
10410 olympic.org
10262 justice.gov
10248 nationalpost.com
10205 al.com
10167 the42.ie
10161 mg.co.za
10156 fide.com
10141 oursportscentral.com
10111 postalhistory.com
10109 pmc.gov.au
10101 espacenet.com
10088 joins.com
10078 timesonline.co.uk
10066 environment.gov.au
10015 spin.ph
10012 cnet.com
10011 fotball.no
10011 punknews.org
10006 azcentral.com
10002 nascar.com
10000 globalsecurity.org
9977 hist.uzh.ch
9976 dallasnews.com
9967 medium.com
9954 marcinwrochna.github.io
9929 yle.fi
9919 cricinfo.com
9912 skyscraperpage.com
9902 ghanaweb.com
9898 nrk.no
9865 espn.co.uk
9857 bacdive.dsmz.de
9856 mapress.com
9855 loudwire.com
9851 issn.org
9850 ub.edu
9839 westerncriminology.org
9830 uchicago.edu
9819 amazonaws.com
9815 science.org
9803 fightful.com
9795 soundcloud.com
9794 timeout.com
9793 merriam-webster.com
9765 deseret.com
9758 nationalarchives.gov.uk
9707 comics.org
9688 the-numbers.com
9688 wlu.edu
9677 britishlistedbuildings.co.uk
9664 sbs.com.au
9662 bbfc.co.uk
9629 ssrn.com
9618 ifpicr.cz
9614 j-league.or.jp
9592 loudersound.com
9582 247sports.com
9577 breednet.com.au
9573 lanacion.com.ar
9565 theconversation.com
9549 radioscope.co.nz
9534 vox.com
9533 conmebol.com
9505 sundaytimes.lk
9505 fcb-archiv.ch
9498 researcharchive.calacademy.org
9491 press.vatican.va
9465 fortune.com
9463 newcastlefans.com
9457 bfi.org.uk
9453 nintendolife.com
9451 ynetnews.com
9445 repubblica.it
9415 miamiherald.com
9409 csmonitor.com
9352 fussballdaten.de
9315 seamheads.com
9290 vdb.czso.cz
9281 naturalengland.org.uk
9273 npg.org.uk
9268 london2012.com
9265 cadwpublic-api.azurewebsites.net
9233 salon.com
9226 datacube.statistics.sk
9224 bloody-disgusting.com
9219 thenationalnews.com
9217 scitanie.sk
9185 sina.com.cn
9156 rncan.gc.ca
9152 smithsonianmag.com
9137 google.co.uk
9111 tepapa.govt.nz
9094 rcdb.com
9087 genius.com
9087 fbref.com
9075 sandiegouniontribune.com
9065 nola.com
9060 biografischportaal.nl
9043 virginia.gov
9024 hitparade.ch
9016 curlingzone.com
9008 awm.gov.au
9008 lasvegassun.com
8997 irishcharts.ie
8986 thefa.com
8986 eurohockey.com
8970 pdc.tv
8962 clashmusic.com
8958 uslchampionship.com
8952 bangkokpost.com
8899 gks.ru
8894 icd9data.com
8886 slantmagazine.com
8886 simbad.u-strasbg.fr
8868 nbcsports.com
8866 ftp.funet.fi
8862 oxfordreference.com
8859 atpworldtour.com
8856 ozfootball.net
8855 trove.scot
8847 fao.org
8844 fedcup.com
8804 worldaquatics.com
8785 ussoccer.com
8780 marxists.org
8760 gulfnews.com
8747 elections.ca
8743 indiarailinfo.com
8718 wisconsinhistory.org
8691 legislink.org
8664 allaboutjazz.com
8658 patch.com
8650 crunchyroll.com
8639 india.com
8637 sportscar365.com
8625 venturebeat.com
8614 euroleaguebasketball.net
8585 uselectionatlas.org
8564 biographi.ca
8563 synchronkartei.de
8507 senate.gov
8506 chessgames.com
8435 calflora.org
8429 profootballarchives.com
8422 uol.com.br
8418 aa.com.tr
8404 thejakartapost.com
8368 siliconera.com
8368 rio2016.com
8344 politico.eu
8343 footballfacts.ru
8339 thefader.com
8336 aftonbladet.se
8332 parlement.com
8332 denofgeek.com
8311 timarit.is
8303 wikiwix.com
8280 sabr.org
8261 govtrack.us
8260 demoscope.ru
8258 rc.majlis.ir
8244 house.gov
8233 jta.org
8232 amphibiaweb.org
8231 istat.it
8219 stanford.edu
8218 cagesidepress.com
8214 chinadaily.com.cn
8212 parliament.uk
8203 wrestleview.com
8141 minorplanet.info
8135 tournamentsoftware.com
8113 sportingnews.com
8105 koreaherald.com
8089 stltoday.com
8077 judoinside.com
8076 lefigaro.fr
8047 barryhugmansfootballers.com
8030 marvel.com
8017 eurohandball.com
8014 state.tx.us
8006 amphibiansoftheworld.amnh.org
8002 newsday.com
8001 maryland.gov
7998 coludata.co.uk
7994 cabi.org
7994 gov.bc.ca
7988 dhakatribune.com
7935 cam.ac.uk
7926 gymnastics.sport
7913 space.com
7902 nst.com.my
7896 questia.com
7868 express.co.uk
7867 nemzetisport.hu
7857 baseballamerica.com
7854 foxsports.com
7853 ihf.info
7851 vogue.com
7844 britishmuseum.org
7842 nikkansports.com
7837 channel4.com
7835 rfc-editor.org
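A rough sketch of how a list like this could feed the popularity threshold mentioned earlier in the thread, assuming rows of the form `count domain`. The helper names `parse_counts` and `domains_above`, and the cutoff of 100000 links, are hypothetical; the ticket does not state which threshold was used.

```python
from typing import List, Tuple


def parse_counts(text: str) -> List[Tuple[str, int]]:
    """Parse 'count domain' rows into (domain, count) pairs, skipping other lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit():
            rows.append((parts[1], int(parts[0])))
    return rows


def domains_above(rows: List[Tuple[str, int]], min_links: int) -> List[str]:
    """Keep domains with at least min_links occurrences, preserving input order."""
    return [domain for domain, count in rows if count >= min_links]


# A few rows taken from the enwiki list; the cutoff below is an assumption.
SAMPLE = """\
15480448 web.archive.org
6138892 google.com
233164 archive.today
7835 rfc-editor.org
"""

selected = domains_above(parse_counts(SAMPLE), min_links=100000)
```

With this sample and cutoff, `selected` keeps `web.archive.org`, `google.com` and `archive.today`, and drops `rfc-editor.org`.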

Change #1251274 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] Instrument clicks on external links to selected domains

https://gerrit.wikimedia.org/r/1251274

Change #1253566 had a related patch set uploaded (by Mszwarc; author: Mszwarc):

[operations/mediawiki-config@master] Configure external link tracking on 12 wikis (167 ext. domains)

https://gerrit.wikimedia.org/r/1253566

Change #1253572 had a related patch set uploaded (by Kosta Harlan; author: Mszwarc):

[mediawiki/extensions/WikimediaEvents@wmf/1.46.0-wmf.19] Instrument clicks on external links to selected domains

https://gerrit.wikimedia.org/r/1253572

Change #1253573 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/WikimediaEvents@master] externalLinks: Add QUnit tests

https://gerrit.wikimedia.org/r/1253573

Change #1253577 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/WikimediaEvents@master] externalLinks: Add QUnit tests

https://gerrit.wikimedia.org/r/1253577

Change #1253577 abandoned by Kosta Harlan:

[mediawiki/extensions/WikimediaEvents@master] externalLinks: Add QUnit tests

https://gerrit.wikimedia.org/r/1253577

And here are the most common external domains on the top 10 Wikipedias:

38483987 web.archive.org
10629013 toolforge.org
10358923 google.com
5775513 doi.org
4316863 viaf.org
4159235 nih.gov
3237218 loc.gov
3055926 bnf.fr
2876067 d-nb.info
2624196 wmflabs.org
2458986 archive.org
2036859 isni.org
1781293 worldcat.org
1739642 youtube.com
1667251 insee.fr
1660038 minorplanetcenter.net
1629370 nasa.gov
1557026 search.worldcat.org
1526645 nli.org.il
1440656 imdb.com
1344222 openstreetmap.org
1261284 jstor.org
1173470 google.co.jp
1168801 idref.fr
1140328 gbif.org
1126955 nytimes.com
1099628 bbc.co.uk
1091435 gouv.fr
1065371 harvard.edu
1028198 ndl.go.jp
955501 musicbrainz.org
898194 newspapers.com
897497 billboard.com
870274 nii.ac.jp
837345 api.semanticscholar.org
827436 aleph.nkp.cz
775653 id.oclc.org
759839 explore.gnd.network
747887 catalogueoflife.org
728009 prometheus.lmu.de
725873 swb.bsz-bw.de
723529 lobid.org
693599 yandex.ru
688063 census.gov
680168 data.bibliotheken.nl
677887 eol.org
677655 inaturalist.org
656866 theguardian.com
649659 discogs.com
648325 allmusic.com
634203 nla.gov.au
632939 yale.edu
608931 britannica.com
606869 webcitation.org
588294 archive.today
583977 itis.gov
579737 deutsche-biographie.de
577976 datos.bne.es
565491 irmng.org
526744 gov.pl
524402 tree.opentreeoflife.org
511500 soccerway.com
510449 portal.issn.org
466788 iucnredlist.org
464555 bn.org.pl
463828 uefa.com
459994 olympedia.org
442713 opac.sbn.it
423810 nps.gov
419476 id.worldcat.org
418497 amigo.geneontology.org
418417 facebook.com
415715 x.com
407228 openlibrary.org
404986 bbc.com
400897 twitter.com
398712 authority.bibsys.no
390817 nationalmap.gov
387777 google.de
384883 treccani.it
365308 marinespecies.org
349607 ipni.org
337476 allmovie.com
336877 classinform.ru
332345 iabotmemento.invalid
327242 nukat.edu.pl
326048 biolib.cz
317583 instagram.com
316892 variety.com
316834 espncricinfo.com
313746 deadline.com
308439 rottentomatoes.com
302303 europa.eu
301179 espn.com
300055 tropicos.org
299796 sports-reference.com
295492 jst.go.jp
294460 olympics.com
288891 archive.is
288217 arxiv.org
285096 jpsearch.go.jp
284877 dlib.jp
280973 baseball-reference.com
280331 animenewsnetwork.com
278812 globo.com
278730 natalie.mu
274483 nba.com
272576 national-football-teams.com
266587 wikiwix.com
265058 snaccooperative.org
262238 science.kew.org
260366 getty.edu
259733 officialcharts.com
258352 washingtonpost.com
258099 deutsche-digitale-bibliothek.de
253777 reuters.com
252792 google.fr
252046 capes.gov.br
250249 historicengland.org.uk
249087 latimes.com
247295 metacritic.com
246996 pro-football-reference.com
242180 gks.ru
240354 worldfootball.net
236914 ine.es
236894 bench.boldsystems.org
229054 hollywoodreporter.com
228149 biodiversitylibrary.org
224861 allocine.fr
222760 boxofficemojo.com
219485 apple.com
218629 footballdatabase.eu
218031 basketball-reference.com
217989 indiatimes.com
214789 cnn.com
211740 portal.dnb.de
208806 lemonde.fr
208505 telegraph.co.uk
207892 hdl.handle.net
207517 nhl.com
206311 cantic.bnc.cat
205353 oricon.co.jp
200801 usgs.gov
196364 pwn.pl
195643 theplantlist.org
193181 fifa.com
188759 zdb-katalog.de
187972 kopkatalogs.lv
186817 weltfussball.de
186687 independent.co.uk
185513 transfermarkt.com
183489 scorebar.com
173897 worldathletics.org
172888 thegazette.co.uk
169129 efloras.org
165310 zap2it.com
165031 catholic-hierarchy.org
164488 creativecommons.org
163254 list.worldfloraonline.org
162954 googleusercontent.com
161383 researchgate.net
161203 ebi.ac.uk
159911 enciclopedia.cat
158940 soccerbase.com
156934 opac.kbr.be
155768 rollingstone.com
155481 libris.kb.se
150608 animaldiversity.org
149032 ars-grin.gov
147731 google.it
147131 transfermarkt.it
143850 kicker.de
142171 spiegel.de
142101 mlb.com
141761 biodiversity.org.au
139990 gamespot.com
139007 persee.fr
136019 filmaffinity.com
135916 findagrave.com
135874 elpais.com
135520 nl.go.kr
135317 nhm.ac.uk
133670 ultratop.be
131655 academia.edu
131096 simbad.u-strasbg.fr
131048 uol.com.br
129802 usatoday.com
129425 rkd.nl
128926 eu-football.info
128585 ign.com
125330 snl.no
125035 hls-dhs-dss.ch
125033 nfl.com
124971 legislation.gov.uk
124338 cbc.ca
124123 procyclingstats.com
124016 rsssf.org
123876 ghostarchive.org
123187 afi.com
122229 forbes.com
122118 naver.com
121984 lequipe.fr
121803 fis-ski.com
121777 wikisky.org
121671 abc.net.au
120695 cancer.gov
120366 citypopulation.de
118809 fiba.basketball
118506 denkmalpflege.sachsen.de
117398 fbref.com
116930 proquest.com
116591 timetravel.mementoweb.org
115916 themoviedb.org
115754 itftennis.com
115390 commonchemistry.cas.org
115203 whc.unesco.org
110274 geonames.org
108438 opac.vatlib.it
107105 lingvarium.org
106933 thehindu.com
105516 worldradiohistory.com
105155 noaa.gov
105127 yahoo.com
104078 google.ru
103074 icm.edu.pl
102343 google.es
102057 ew.com
101630 nikkansports.com
101533 filmweb.pl
101216 apnews.com
100921 rsssf.com
100911 iaaf.org
100808 denkmalatlas.niedersachsen.de
99913 cricketarchive.com
98635 transfermarkt.fr
96272 sciencedirect.com
95420 amazon.com
94257 npr.org
93103 filmportal.de
92489 warheroes.ru
91171 paleobiodb.org
90674 rcsb.org
90529 bloomberg.com
90396 deepl.com
90225 time.com
89253 ofdb.de
89207 data.europeana.eu
88932 bdfutbol.com
88693 wwe.com
88681 pitchfork.com
88628 csfd.cz
88024 worldpostalcodes.org
87980 katalog.nsk.hr
87684 nme.com
86722 lefigaro.fr
85800 babel.hathitrust.org
85796 universalis.fr
84746 archive.ph
83971 fivb.org
83743 aljazeera.com
83283 showbuzzdaily.com
83018 oxforddnb.com
82587 cassini.ehess.fr
82485 atptour.com
82293 smh.com.au
81943 observation.org
81259 offiziellecharts.de
80736 eliteprospects.com
80285 mathscinet.ams.org
80141 wsj.com
79960 congress.gov
78854 gcatholic.org
78784 avibase.bsc-eoc.org
78115 sina.com.cn
77898 amar.org.ir
77812 ibdb.com
77214 catalogue.nlg.gr
76430 bing.com
76392 gd.eppo.int
76190 ameblo.jp
75937 inpn.mnhn.fr
75773 riaa.com
75666 britishnewspaperarchive.co.uk
75453 archives.gov
75351 mtv.com
74556 go.com
73806 wiley.com
73785 boe.es
73711 ibge.gov.br
73586 caltech.edu
73007 rosstat.gov.ru
72243 xeno-canto.org
72208 fishbase.org
72011 sapere.it
71610 as.com
71584 catalogo.bne.es
71415 rada.gov.ua
71289 gaonchart.co.kr
71198 springer.com
70922 nikkei.com
70804 species.nbnatlas.org
70634 iihf.com
69954 stolaf.edu
69514 fcc.gov
69446 cbsnews.com
69419 people.com
69174 fishbase.ca
69151 asahi.com
68802 tufts.edu
68646 daten.digitale-sammlungen.de
68554 austriancharts.at
68551 bnportugal.gov.pt
68291 nature.com
68237 obs.coe.int
68212 mymovies.it
68094 fauna-eu.org
68001 chemspider.com
67904 allcinema.net
67819 aviation-safety.net
67732 dutchcharts.nl
67571 scopus.com
67313 transfermarkt.de
67270 sponichi.co.jp
67026 gutenberg.org
67025 wtatennis.com
66598 wsc.nmbe.ch
66292 inventaris.vioe.be
66176 protectedplanet.net
65970 github.com
65909 meteofrance.com
65606 uniprot.org
65032 eu-nomen.eu
64513 kotobank.jp
64273 nhk.or.jp
63830 mobygames.com
63796 spider.seds.org
63654 orcid.org
62979 natlib.govt.nz
62641 ne.se
62535 portal.historicenvironment.scot
62466 nbcnews.com
62061 obis.org
61683 spotify.com
60859 irishtimes.com
60749 sandre.eaufrance.fr
60675 gesetze-im-internet.de
60127 vatican.va
59887 hockeydb.com
59714 usda.gov
59667 nzor.org.nz
59407 kinopoisk.ru
58831 thefutoncritic.com
58829 oadoi.org
58639 docs.cntd.ru
58519 insecta.pro
58366 munzinger.de
57928 jreast.co.jp
57654 genius.com
57638 racingpost.com
57529 indianexpress.com
57354 ouest-france.fr
57225 chicagotribune.com
57203 legaseriea.it
57119 goal.com
57074 amazon.co.jp
56314 e-stat.go.jp
56109 vizier.u-strasbg.fr
56101 realgm.com
56064 mlssoccer.com
55788 rism.online
55733 explorer.natureserve.org
55371 enciklopedija.hr
55056 bpi.co.uk
54860 isfdb.org
54646 unicode.org
54522 deadurl.invalid
54467 stuff.co.nz
54426 cagematch.net
54315 flickr.com
54279 nga.mil
53914 hitparade.ch
53774 mathgenealogy.org
53706 iabotdeadurl.invalid
53574 bazhum.muzhp.pl
53251 dw.com
53148 rcin.org.pl
53114 leparisien.fr
53029 huffingtonpost.com
52178 tandfonline.com
52140 90minut.pl
52104 issuu.com
51945 billboard-japan.com
51850 transfermarkt.pl
51730 kommersant.ru
51677 ft.com
51598 ebird.org
51527 racing-reference.info
51481 boxrec.com
51452 google.com.br
51419 geoportal.bayern.de
51292 onb.ac.at
51080 elmundo.es
50954 blabbermouth.net
50758 marca.com
40550654 press.vatican.va
40650420 cyclingnews.com
40750363 uboat.net
40850320 bugguide.net
40950315 skysports.com
41050150 consultant.ru
41150057 synchronkartei.de
41249780 demo.istat.it
41349692 the-afc.com
41449682 swisscharts.com
41549540 repubblica.it
41649527 infobae.com
41749495 nobelprize.org
41849337 protennislive.com
41949256 poczta-polska.pl
42049240 myspace.com
42148997 liberation.fr
42248827 eurobasket.com
42348682 zbmath.org
42448487 flightglobal.com
42548469 bfs.admin.ch
42648045 naco.org
42747996 nndb.com
42847852 vle.lt
42947720 un.org
43047701 justia.com
43147651 nta.go.jp
43247590 nzherald.co.nz
43347495 demoscope.ru
43447482 hindustantimes.com
43547428 olympic.org
43647406 zeit.de
43747353 last.fm
43847302 australian-charts.com
43947187 hist.uzh.ch
44047117 censusindia.gov.in
44147083 aria.com.au
44246839 denkxweb.denkmalpflege-hessen.de
44346777 goodreads.com
44446771 eiga.com
44546610 eurovision.tv
44646582 premierleague.com
44746573 datazone.birdlife.org
44846314 cambridge.org
44946288 thesaurus.cerl.org
45046285 independent.ie
45146097 economy.gov.ru
45245771 timesofisrael.com
45345770 wikitree.com
45445584 retrosheet.org
45545220 si.com
45644740 city-data.com
45744653 faz.net
45844487 indexfungorum.org
45944479 screenrant.com
46044443 mondefootball.fr
46144274 sankei.com
46244137 zoobank.org
46343902 cbssports.com
46443694 sueddeutsche.de
46543456 fr.distance.to
46643404 ethnologue.com
46743373 welt.de
46843217 bigenc.ru
46943023 rte.ie
47042969 politico.com
47142935 africanplantdatabase.ch
47242911 statcan.gc.ca
47342855 polygon.com
47442853 pbs.org
47542844 plus-legacy.cobiss.net
47642800 lanacion.com.ar
47742798 omim.org
47842598 footballfacts.ru
47942458 rateyourmusic.com
48042444 oup.com
48142437 genenames.org
48242263 stats.gov.cn
48342174 podvignaroda.ru
48442075 reptile-database.reptarium.cz
48541971 netkeiba.com
48641938 oscars.org
48741886 nlb.gov.sg
48841834 mca.gov.cn
48941797 acnpsearch.unibo.it
49041788 newyorker.com
49141777 bibleserver.com
49241697 playbill.com
49341674 mycobank.org
49441650 j-league.or.jp
49541628 intersportstats.com
49641346 fide.com
49741340 theverge.com
49841145 library.sh.cn
49941128 pop-stat.mashke.org
50041108 lescharts.com
50141029 france24.com
50240972 eurogamer.net
50340965 mainichi.jp
50440809 cairn.info
50540181 swedishcharts.com
50640174 pwtorch.com
50739884 sfgate.com
50839873 plus.cobiss.net
50939628 haaretz.com
51039557 inegi.org.mx
51139366 dbe.rah.es
51239357 research.amnh.org
51339221 bucknell.edu
51439164 fimi.it
51539081 kinenote.com
51638967 fussballdaten.de
51738944 eurohandball.com
51838902 grammy.com
51938813 journals.openedition.org
52038622 wa.gov.au
52138438 mindat.org
52238395 dialnet.unirioja.es
52338362 ltn.com.tw
52438305 isuresults.com
52538240 indiatoday.in
52638226 epa.gov
52738213 ria.ru
52838155 donneespubliques.meteofrance.fr
52938126 theglobeandmail.com
53038067 indiewire.com
53138046 prtimes.jp
53237961 elsevier.com
53337928 destatis.de
53437841 collider.com
53537827 fda.gov
53637789 bgee.org
53737778 soumu.go.jp
53837724 kodansha.co.jp
53937686 wbc.poznan.pl
54037494 wired.com
54137491 treatment.plazi.org
54237477 kib.ac.cn
54337458 worldbirdnames.org
54437333 tournamentsoftware.com
54537272 jpost.com
54637232 google.co.uk
54737157 tass.ru
54837103 spacedys.com
54937025 psa.gov.ph
55036961 biblegateway.com
55136941 id.sbn.it
55236871 digitalspy.com
55336824 britishmuseum.org
55436795 xinhuanet.com
55536791 cyberleninka.ru
55636735 businessinsider.com
55736570 paralympic.org
55836476 elib.shpl.ru
55936462 icd.who.int
56036225 soundcloud.com
56136124 dailymail.co.uk
56235900 worldrowing.com
56335847 mlit.go.jp
56435828 thetimes.com
56535747 viewer.rusneb.ru
56635722 musee-orsay.fr
56735562 lenta.ru
56835423 nbn-resolving.de
56935374 scmp.com
57035339 atpworldtour.com
57135217 ed.gov
57235198 icd9data.com
57335159 performing-arts.eu
57435065 muziekweb.nl
57535044 nydailynews.com
57634913 imslp.org
57734911 judoinside.com
57834546 lubw.baden-wuerttemberg.de
57934510 foxnews.com
58034433 birdsoftheworld.org
58134326 sinica.edu.tw
58234242 avclub.com
58334225 datacube.statistics.sk
58434180 speciesplus.net
58534124 urn.bn.pt
58634093 kadokawa.co.jp
58734060 bdfa.com.ar
58834045 historicplaces.ca
58934014 british-history.ac.uk
59033950 the-sports.org
59133948 bizjournals.com
59233882 tvguide.com
59333876 brockhaus.de
59433862 api.parliament.uk
59533794 dfb.de
59633713 taicol.tw
59733688 verum.icu
59833507 kulturnav.org
59933474 qq.com
60033408 philstar.com
60133369 sherdog.com
60233174 cnbc.com
60333112 ihf.info
60433053 metal-archives.com
60533052 sanspo.com
60632988 tamu.edu
60732958 worldaquatics.com
60832899 yahoo.co.jp
60932750 remonterletemps.ign.fr
61032636 omegatiming.com
61132609 gov.uk
61232597 formula1.com
61332512 fallingrain.com
61432511 npg.org.uk
61532455 tff.org
61632429 ucsc.edu
61732351 firenze.sbn.it
61832304 jbis.or.jp
61932173 antoniogenna.net
62032156 hk01.com
62132112 tvtonight.com.au
62232101 newadvent.org
62332070 polona.pl
62432055 snepmusique.com
62531975 thestar.com
62631956 prowrestling.net
62731788 amphibiaweb.org
62831708 denstoredanske.lex.dk
62931703 vice.com
63031660 c-span.org
63131657 cassi.cas.org
63231528 baseball-almanac.com
63331464 assemblee-nationale.fr
63431353 cseligman.com
63531314 abs.gov.au
63631220 thedailystar.net
63731113 sohu.com
63831111 yomiuri.co.jp
63931042 czso.cz
64030976 istat.it
64130964 bn.gov.br
64230872 charts.nz
64330862 statistik.at
64430849 degruyter.com
64530824 tcdb.com
64630662 thecanadianencyclopedia.ca
64730651 ricerca.repubblica.it
64830627 ensembl.org
64930609 rfi.fr
65030546 urn.fi
65130536 proballers.com
65230527 shueisha.co.jp
65330461 legacy.com
65430392 speciesfungorum.org
65530319 dawn.com
65630309 straitstimes.com
65730291 psu.edu
65830178 soompi.com
65930106 techcrunch.com
66030045 nic.funet.fi
66129979 nsw.gov.au
66229977 sports.ru
66329805 cia.gov
66429774 tvline.com
66529753 digitale-sammlungen.de
66629536 bac-lac.gc.ca
66729502 guardian.co.uk
66829345 cbr.com
66929289 structurae.net
67029240 lesechos.fr
67129225 newindianexpress.com
67229160 lnb.libis.lt
67329143 economist.com
67429062 vulture.com
67528973 cnc.fr
67628938 digitalspy.co.uk
67728815 radiofrance.fr
67828759 musiccanada.com
67928758 who.int
68028597 gazzetta.it
68128461 mmajunkie.com
68228416 chessgames.com
68328336 cinematheque.qc.ca
68428297 moma.org
68528282 cornell.edu
68628233 ucsb.edu
68728224 weibo.com
68828124 globalsecurity.org
68928038 afltables.com
69027943 post.japanpost.jp
69127865 cna.com.tw
69227844 google.pl
69327681 cafonline.com
69427644 trackfield.brinkster.net
69527580 vimeo.com
69627544 uci.org
69727506 tagesspiegel.de
69827472 sealifebase.ca
69927396 lepoint.fr
70027365 rg.ru
70127293 theage.com.au
70227272 lavanguardia.com
70327215 daviscup.com
70426993 abc.es
70526958 dfi.dk
70626937 unifrance.org
70726924 australianfootball.com
70826921 playmakerstats.com
70926896 reader.digitale-sammlungen.de
71026856 tokyo.lg.jp
71126802 euroleaguebasketball.net
71226785 umd.edu
71326659 theatlantic.com
71426615 popmatters.com
71526589 filmdienst.de
71626521 taz.de
71726497 udn.com
71826481 anu.edu.au
71926390 worldfloraonline.org
72026331 kalliope-verbund.info
72126330 bncatalogo.cl
72226329 conmebol.com
72326273 mlbtraderumors.com
72426255 circlechart.kr
72526180 mlol.link
72626149 thepeerage.com
72726076 programminginsider.com
72826032 interno.gov.it
72925970 abs-cbn.com
73025916 comicbook.com
73125792 eci.nic.in
73225772 emmys.com
73325707 gmanetwork.com
73425638 lexpress.fr
73525615 upi.com
73625570 thewrap.com
73725495 thehill.com
73825492 fossilworks.org
73925468 highbeam.com
74025423 nypost.com
74125413 euroleague.net
74225398 hochi.news
74325333 eurohockey.com
74425225 vk.com
74525185 championat.com
74625046 hockey-reference.com
74725044 newspaperarchive.com
74825038 gsi.go.jp
74924992 behindthevoiceactors.com
75024989 msn.com
75124900 norwegiancharts.com
75224900 archiwum.nauka-polska.pl
75324868 japantimes.co.jp
75424745 ladepeche.fr
75524711 govinfo.gov
75624701 uj.edu.pl
75724671 rugbyleagueproject.org
75824537 11v11.com
75924524 plus.cobiss.si
76024514 amphibiansoftheworld.amnh.org
76124473 bioinfo.cn
76224459 newsbank.com
76324427 derstandard.at
76424406 milb.com
76524265 cinematoday.jp
76624235 gamesradar.com
76724218 iranicaonline.org
76824099 famitsu.com
76924060 stat.gov.rs
77024058 etomesto.ru
77124056 courtlistener.com
77224048 archive.ensembl.org
77324025 uchicago.edu
77423974 fishbase.se
77523968 complex.com
77623955 cyclingarchives.com
77723877 laliga.com
77823874 zgbk.com
77923820 lfp.fr
78023730 wnba.com
78123713 filmpolski.pl
78223616 pcgamer.com
78323534 linkedin.com
78423477 lagis-hessen.de
78523457 tms.fih.ch
78623450 the-numbers.com
78723438 top40.nl
78823427 cqranking.com
78923362 letterboxd.com
79023347 esu.com.ua
79123220 kotaku.com
79223188 slagerlistak.hu
79323147 ifpicr.cz
79423120 rnz.co.nz
79523076 irishexaminer.com
79623030 linternaute.com
79723025 firstcycling.com
79822999 gematsu.com
79922956 clarin.com
80022919 olimpbase.org
80122886 polskawliczbach.pl
80222880 siteducyclisme.com
80322876 cdc.gov
80422802 americanradiohistory.com
80522776 svenskfotboll.se
80622707 motogp.com
80722605 ordnancesurvey.co.uk
80822571 people.com.cn
80922525 fangraphs.com
81022497 navy.mil
81122496 n2yo.com
81222485 gbrathletics.com
81322479 fernsehserien.de
81422479 mundodeportivo.com
81522461 163.com
81622451 ballotpedia.org
81722447 cervantesvirtual.com
81822423 terra.com.br
81922371 francetvinfo.fr
82022340 ourcampaigns.com
82122333 deces.matchid.io
82222328 biografischportaal.nl
82322274 newsweek.com
82422272 pravo.gov.ru
82522225 scribd.com
82622219 broadwayworld.com
82722143 chinatimes.com
82822141 news.com.au
82922118 nzz.ch
83022110 hancinema.net
83122104 mdpr.jp
83222042 4gamer.net
83321954 dyntaxa.se
83421947 encyclopedia.com
83521940 whocc.no
83621884 bfi.org.uk
83721882 glottolog.org
83821862 historyofparliamentonline.org
83921855 publishersweekly.com
84021784 films.bifi.fr
84121737 amazon.co.uk
84221732 baltimoresun.com
84321695 sverigetopplistan.se
84421653 aotearoamusiccharts.co.nz
84521632 mapdata.ru
84621611 fmg.ac
84721605 bleacherreport.com
84821590 leballonrond.fr
84921536 nj.com
85021520 impress.co.jp
85121503 peakbagger.com
85221501 larousse.fr
85321480 mobot.org
85421438 douban.com
85521420 rappler.com
85621376 prnewswire.com
85721333 osmaps.com
85821319 recensement.insee.fr
85921313 std.gmcrosstata.ru
86021272 tcm.com
86121242 radiotimes.com
86221213 pamyat-naroda.ru
86321135 motorsport.com
86421097 arstechnica.com
86521054 zipdatamaps.com
86620963 standard.co.uk
86720961 tbs.co.jp
86820941 daily.co.jp
86920806 laparola.net
87020760 biographien.ac.at
87120692 familysearch.org
87220646 thestar.com.my
87320588 ams.org
87420552 ukwhoswho.com
87520458 kremlin.ru
87620440 runeberg.org
87720352 biogps.org
87820326 statsf1.com
87920231 kstyle.com
88020215 e-newspaperarchives.ch
88120173 slate.com
88220155 hrw.org
88320125 data.nlg.gr
88420118 rogerebert.com
88520115 mantan-web.jp
88620110 ici.radio-canada.ca
88720104 wikimapia.org
88820021 tuik.gov.tr
88919995 spin.com
89019976 songkick.com
89119944 fao.org
89219873 e-icisleri.gov.tr
89319869 zenodo.org
89419866 sport-express.ru
89519823 search.rsl.ru
89619811 ifpi.fi
89719801 lpsn.dsmz.de
89819732 archiviolastampa.it
89919628 books.google.be
90019570 metro.tokyo.jp
90119511 mirror.co.uk
90219500 concacaf.com
90319477 actu.fr
90419470 thetv.jp
90519436 france3-regions.francetvinfo.fr
90619397 nodak.edu
90719362 denkmaldatenbank.berlin.de
90819344 seattletimes.com
90919328 altomfotball.no
91019327 whatsmat.uww.org
91119287 rbc.ru
91219264 metmuseum.org
91319252 realsound.jp
91419204 cinematografo.it
91519177 estadao.com.br
91619157 animeclick.it
91719155 rtve.es
91819152 syr.edu
91919127 gameinformer.com
92019112 ns.gis-bldam-brandenburg.de
92119090 skyscrapercenter.com
92219089 afl.com.au
92319082 stanford.edu
92419075 barks.jp
92519061 artuk.org
92619058 rtbf.be
92718947 afpbb.com
92818936 gouv.qc.ca
92918916 footballfakts.ru
93018897 archives.parliament.uk
93118896 thepaper.cn
93218858 vanityfair.com
93318836 tribune.com.pk
93418772 lesarchivesduspectacle.net
93518733 collectionscanada.gc.ca
93618679 autosport.com
93718678 afromoths.net
93818652 jhu.edu
93918648 20minutes.fr
94018636 fotball.no
94118601 ksh.hu
94218546 mrqe.com
94318538 cyclebase.nl
94418487 datencenter.dfb.de
94518481 espacenet.com
94618479 tepapa.govt.nz
94718434 wrestleview.com
94818430 tagesschau.de
94918382 news.mynavi.jp
95018312 sudouest.fr
95118293 allroutes.ru
95218225 tshaonline.org
95318196 screendaily.com
95418177 pep.ph
95518175 mmafighting.com
95618169 npb.jp
95718169 memoria.bn.br
95818107 thedraftreview.com
95918067 worldstatesmen.org
96018034 elibrary.ru
96118023 ndtv.com
96218005 f4wonline.com
96317971 bollywoodhungama.com
96417918 boston.com
96517789 usra.edu
96617759 eltiempo.com
96717752 shogi.or.jp
96817679 dropbox.com
96917679 nouvelobs.com
97017679 tv-tokyo.co.jp
97117641 catalog.hathitrust.org
97217627 heise.de
97317612 itmedia.co.jp
97417560 stereogum.com
97517555 allafrica.com
97617536 taipeitimes.com
97717529 francebleu.fr
97817488 rediff.com
97917481 postalhistory.com
98017448 verwaltungsdaten-informationsplattform.de
98117437 marvel.com
98217429 visionofbritain.org.uk
98317400 abendblatt.de
98417314 davidemaggio.it
98517306 corriere.it
98617292 heraldscotland.com
98717286 pagina12.com.ar
98817274 ibiblio.org
98917228 hkpl.gov.hk
99017213 fightful.com
99117212 tv-asahi.co.jp
99217206 findarticles.com
99317197 abcmedianet.com
99417179 ansa.it
99517132 espn.co.uk
99617129 zeno.org
99717102 metro.co.uk
99817090 iopscience.iop.org
99917090 lavoixdunord.fr
100017073 gazeta.ru

Change #1253573 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] externalLinks: Add QUnit tests

https://gerrit.wikimedia.org/r/1253573

Change #1253572 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@wmf/1.46.0-wmf.19] Instrument clicks on external links to selected domains

https://gerrit.wikimedia.org/r/1253572

Mentioned in SAL (#wikimedia-operations) [2026-03-16T20:10:26Z] <catrope@deploy2002> Started scap sync-world: Backport for [[gerrit:1248665|Enable passwordless login in production (T419198)]], [[gerrit:1253572|Instrument clicks on external links to selected domains (T419837)]]

Mentioned in SAL (#wikimedia-operations) [2026-03-16T20:12:16Z] <catrope@deploy2002> kharlan, catrope: Backport for [[gerrit:1248665|Enable passwordless login in production (T419198)]], [[gerrit:1253572|Instrument clicks on external links to selected domains (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-03-16T20:17:09Z] <catrope@deploy2002> Finished scap sync-world: Backport for [[gerrit:1248665|Enable passwordless login in production (T419198)]], [[gerrit:1253572|Instrument clicks on external links to selected domains (T419837)]] (duration: 06m 43s)

Change #1253566 merged by jenkins-bot:

[operations/mediawiki-config@master] Configure external link aggregate usage on 12 wikis for top domains

https://gerrit.wikimedia.org/r/1253566

Mentioned in SAL (#wikimedia-operations) [2026-03-16T20:37:05Z] <kharlan@deploy2002> Started scap sync-world: Backport for [[gerrit:1253566|Configure external link aggregate usage on 12 wikis for top domains (T419837)]]

Mentioned in SAL (#wikimedia-operations) [2026-03-16T20:38:55Z] <kharlan@deploy2002> kharlan, mszwarc: Backport for [[gerrit:1253566|Configure external link aggregate usage on 12 wikis for top domains (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-03-16T20:44:04Z] <kharlan@deploy2002> Finished scap sync-world: Backport for [[gerrit:1253566|Configure external link aggregate usage on 12 wikis for top domains (T419837)]] (duration: 06m 59s)

Change #1254875 had a related patch set uploaded (by Mszwarc; author: Mszwarc):

[mediawiki/extensions/WikimediaEvents@master] Normalize external domain names in click analysis

https://gerrit.wikimedia.org/r/1254875

Change #1254876 had a related patch set uploaded (by Mszwarc; author: Mszwarc):

[operations/mediawiki-config@master] Tweak configuration of external link aggregate usage analysis

https://gerrit.wikimedia.org/r/1254876

Change #1254916 had a related patch set uploaded (by Mszwarc; author: Mszwarc):

[mediawiki/extensions/WikimediaEvents@wmf/1.46.0-wmf.20] Normalize external domain names in click analysis

https://gerrit.wikimedia.org/r/1254916

Change #1254917 had a related patch set uploaded (by Mszwarc; author: Mszwarc):

[mediawiki/extensions/WikimediaEvents@wmf/1.46.0-wmf.19] Normalize external domain names in click analysis

https://gerrit.wikimedia.org/r/1254917

Change #1254875 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] Normalize external domain names in click analysis

https://gerrit.wikimedia.org/r/1254875

Change #1254916 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@wmf/1.46.0-wmf.20] Normalize external domain names in click analysis

https://gerrit.wikimedia.org/r/1254916

Change #1254917 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@wmf/1.46.0-wmf.19] Normalize external domain names in click analysis

https://gerrit.wikimedia.org/r/1254917

Mentioned in SAL (#wikimedia-operations) [2026-03-18T13:41:42Z] <mszwarc@deploy2002> Started scap sync-world: Backport for [[gerrit:1254916|Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917|Normalize external domain names in click analysis (T419837)]]

Mentioned in SAL (#wikimedia-operations) [2026-03-18T13:43:44Z] <mszwarc@deploy2002> mszwarc: Backport for [[gerrit:1254916|Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917|Normalize external domain names in click analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Change #1254876 merged by jenkins-bot:

[operations/mediawiki-config@master] Tweak configuration of external link aggregate usage analysis

https://gerrit.wikimedia.org/r/1254876

Mentioned in SAL (#wikimedia-operations) [2026-03-18T13:49:05Z] <mszwarc@deploy2002> Finished scap sync-world: Backport for [[gerrit:1254916|Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917|Normalize external domain names in click analysis (T419837)]] (duration: 07m 23s)

Mentioned in SAL (#wikimedia-operations) [2026-03-18T13:49:44Z] <mszwarc@deploy2002> Started scap sync-world: Backport for [[gerrit:rEDOI12548760e11c|Tweak configuration of external link aggregate usage analysis (T419837)]]

Mentioned in SAL (#wikimedia-operations) [2026-03-18T13:51:52Z] <mszwarc@deploy2002> mszwarc: Backport for [[gerrit:rEDOI12548760e11c|Tweak configuration of external link aggregate usage analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-03-18T13:56:25Z] <mszwarc@deploy2002> Finished scap sync-world: Backport for [[gerrit:rEDOI12548760e11c|Tweak configuration of external link aggregate usage analysis (T419837)]] (duration: 06m 41s)

Hi everyone!

Following up on @FRomeo_WMF's suggestion, I’m very interested in this work and would like to explore how my PhD research on citation patterns might complement or leverage this new data-collection effort.

My work focuses on enriching our understanding of the top cited domains in different language Wikipedias. You can see interactive visualizations of the top 1000 domains for English, Spanish, French, and German here:

EN: https://silviaegt.github.io/Wikipedia-citations/treemap_top1000_filtered_en.html
ES: https://silviaegt.github.io/Wikipedia-citations/treemap_top1000_filtered_es.html
FR: https://silviaegt.github.io/Wikipedia-citations/treemap_top1000_filtered_fr.html
DE: https://silviaegt.github.io/Wikipedia-citations/treemap_top1000_filtered_de.html

A key part of my analysis involves adding a layer of information to these domains. I've been using the Media Bias Fact Check (MBFC) API to enrich these top domains with metadata about their political bias, factual reporting rating, and media type. This allows us to see, for example, not just that nytimes.com is frequently cited, but that a newspaper (media type) from the USA (geographical provenance) is among the top 10 most-referenced domains in multiple Wikipedias.

So far, I've only been able to attach metadata to a small fraction of the top domains. The coverage is very incomplete, and that's part of why I'm so excited about this collaboration. My hope is that by connecting with others who see the value in this approach—whether through shared datasets, different enrichment sources, or simply fresh perspectives—we can build something far better than what any of us could do alone. That's the wiki way I love: everyone brings a piece, and together we create something that's more than the sum of its parts.

I've documented some of this work on Meta: Research:Untangling Wikipedia's Sources: Mapping References To Reveal Global Knowledge

I'm hoping this approach could align with your work here in a few ways:

  • Enriching the Analyzed Domains: The task plans to instrument clicks on a selected list of domains (the top ~167), and I hope we could potentially add metadata to the analysis, allowing us to answer questions such as: Do readers click links from newspapers rather than from journals? Are they compelled to look more into sources from certain countries?
  • Providing a Baseline: My current treemaps show the presence of these domains in articles (citation frequency). The new instrumentation would measure the reader engagement with those same domains (click-through rate). Combining these two datasets could be powerful, revealing, for instance, if a domain is cited often but rarely clicked, or vice-versa.
  • Expanding the Language Scope: While the current task is wisely starting with a set of wikis, my work on the ES, FR, and DE Wikipedias could help inform the selection or prioritization of domains for future expansion, or provide a comparative framework for analyzing results across those language editions.
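
As a rough illustration of the "Providing a Baseline" idea, the sketch below divides click counts by citation counts per domain, so that a domain cited often but rarely clicked shows up with a low ratio. The function names, the input shape (plain dicts keyed by domain), and the sample numbers are all assumptions made for illustration; none of them come from the instrumentation or from either dataset.

```python
def clicks_per_citation(citations, clicks):
    """Return clicks divided by citations for each cited domain.

    citations: dict of domain -> number of citations (hypothetical input)
    clicks:    dict of domain -> number of clicks (hypothetical input)
    Domains with zero citations are skipped to avoid dividing by zero;
    domains with no recorded clicks get a ratio of 0.
    """
    ratios = {}
    for domain, cited in citations.items():
        if cited > 0:
            ratios[domain] = clicks.get(domain, 0) / cited
    return ratios


def rank_by_engagement(citations, clicks):
    """Domains ordered from most to least clicks per citation (ties by name)."""
    ratios = clicks_per_citation(citations, clicks)
    return sorted(ratios, key=lambda d: (-ratios[d], d))


if __name__ == "__main__":
    # Made-up numbers, for demonstration only.
    citations = {"a.example": 1000, "b.example": 10, "c.example": 0}
    clicks = {"a.example": 50, "b.example": 5}
    for domain in rank_by_engagement(citations, clicks):
        print(domain, clicks_per_citation(citations, clicks)[domain])
```

A ratio like this is only comparable across domains if both counts cover the same wikis and the same time window, which would need to be checked before drawing conclusions.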

I see a great opportunity to collaborate. I would be happy to:

  • Share the enriched dataset I've built for the top domains across these languages.
  • Help brainstorm how MBFC or similar metadata could be integrated into the analysis of the collected clickstream data.
  • Explore whether my findings on domain popularity across languages might help refine the list of domains to be tracked.

Looking forward to hearing your thoughts and seeing how this valuable data takes shape.

Change #1266866 had a related patch set uploaded (by Mszwarc; author: Mszwarc):

[operations/mediawiki-config@master] Disable external link analysis

https://gerrit.wikimedia.org/r/1266866

Change #1266866 merged by jenkins-bot:

[operations/mediawiki-config@master] Disable external link analysis

https://gerrit.wikimedia.org/r/1266866

Mentioned in SAL (#wikimedia-operations) [2026-04-02T07:50:39Z] <mszwarc@deploy1003> Started scap sync-world: Backport for [[gerrit:1266866|Disable external link analysis (T419837)]]

Mentioned in SAL (#wikimedia-operations) [2026-04-02T07:52:40Z] <mszwarc@deploy1003> mszwarc: Backport for [[gerrit:1266866|Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-04-02T08:00:53Z] <mszwarc@deploy1003> Finished scap sync-world: Backport for [[gerrit:1266866|Disable external link analysis (T419837)]] (duration: 10m 13s)

I have disabled the measurements and exported the click data from Grafana into a Google Sheet: https://docs.google.com/spreadsheets/d/14F4Ar5Nt7ITpv7J30RnEnzQXYesfsvrdA6_kTbLIUlI/edit

The most frequently clicked domain on the instrumented wikis was archive.org, with 174k clicks per day on average. Other popular domains (>10k clicks per day) were:

  • imdb.com
  • toolforge.org
  • youtube.com
  • instagram.com
  • x.com
  • google.com
  • doi.org
  • nytimes.com
  • wmflabs.org
  • espncricinfo.com
  • fifa.com

Another website of potential interest, Archive Today (combined across its domains), received 3.5k clicks per day.
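
For anyone repeating this kind of summary on the exported data, here is a minimal sketch of combining per-domain counts under one label and then listing the domains above a threshold. It is not a reproduction of the normalization done in the merged patches, which this thread does not describe. The alias table, the function names, and the numbers are assumptions for illustration only.

```python
from collections import defaultdict

# Assumption: an illustrative alias table mapping mirror domains to one label.
# This is not the configuration that was deployed.
ALIASES = {
    "archive.ph": "archive.today",
    "archive.is": "archive.today",
}


def canonical(host):
    """Lowercase a hostname, drop a trailing dot and a leading 'www.',
    then map known mirrors to a single label."""
    host = host.strip().lower().rstrip(".")
    if host.startswith("www."):
        host = host[len("www."):]
    return ALIASES.get(host, host)


def combine(rows):
    """Sum click counts per canonical domain.

    rows: iterable of (hostname, clicks) pairs (hypothetical input shape).
    """
    totals = defaultdict(int)
    for host, clicks in rows:
        totals[canonical(host)] += clicks
    return dict(totals)


def above_threshold(totals, threshold):
    """Domains with more than `threshold` clicks, most clicked first."""
    selected = (d for d, c in totals.items() if c > threshold)
    return sorted(selected, key=lambda d: (-totals[d], d))


if __name__ == "__main__":
    # Made-up numbers, for demonstration only.
    rows = [("archive.ph", 2), ("WWW.Archive.today", 1),
            ("archive.is.", 4), ("example.org", 20)]
    totals = combine(rows)
    print(totals)
    print(above_threshold(totals, 10))
```

Grouping before thresholding matters here: a service spread over several mirror domains can fall below a per-domain cutoff even when its combined total would not.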