I looked at a list of links on this page,
$ mech-dump --links ...miraheze... | sort -u > list.txt
and noticed a combination of absolute links and relative links.
Shouldn't all the absolute links that point to the local wiki be relative links instead, for efficiency?
I.e.,
https://radioscanningtw.miraheze.org/w/index.php ... should be just
/w/index.php , no?
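To make the question concrete, here is a minimal Python sketch of how both forms resolve to the same target. It assumes the link sits on a page of the same wiki; the page path `/wiki/Example` is hypothetical, and only the host name comes from the report above.

```python
from urllib.parse import urljoin

# Hypothetical page on the wiki; only the host is taken from the report.
page_url = "https://radioscanningtw.miraheze.org/wiki/Example"

absolute_href = "https://radioscanningtw.miraheze.org/w/index.php"
relative_href = "/w/index.php"

# A browser resolves each href against the URL of the page it appears on.
resolved_absolute = urljoin(page_url, absolute_href)
resolved_relative = urljoin(page_url, relative_href)

# Both forms should name the same resource.
print(resolved_absolute == resolved_relative)

# The only difference in the markup is the repeated scheme and host.
saved = len(absolute_href) - len(relative_href)
print(saved, "characters saved per link")
```

The saving per link is just the length of the scheme plus host, so it only adds up on pages with many such links.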
Event Timeline
The first one says absolute links may be better when using a CDN, which both Wikimedia and Miraheze do, but it is talking about loading resources.
The second one refers to slight changes in page size and browser rendering time, which aren't going to be noticeable for href links.
(My guess is the authors of MediaWiki made the conscious decision to use relative links.
But some extension authors didn't know how to make them, so just used absolute links.
Just like we know when editing Wikipedia to use [[Football]] instead of [https://en.wiki..../Football Football].)
Unclear how to reproduce (and no idea about mech-dump's internals); where to find this in Wikimedia code?
https://github.com/libwww-perl/WWW-Mechanize/blob/master/script/mech-dump
Note
$ lynx -dump -listonly https://...
converts all links to absolute, so is useless for debugging this issue.
Indeed you can simply do:
$ GET https://... > /tmp/e
Then:
$ perl -nwle 'print for /href=............./g' /tmp/e|sort|uniq -c|sort -nr
     54 href="https://radi
     19 href="/wiki/%E7%89
     19 href="/wiki/%E4%BD
     11 href="javascript:v
     10 href="/w/index.php
      4 href="https://meta
      4 href="https://crea
      2 href="/wiki/%E9%A6
      1 href=\"https://cre
      1 href='https://radi
      1 href="https://www.
      1 href="/w/opensearc
      1 href="/w/load.php?
      1 href="/favicon.ico
      1 href="/apple-touch
      1 href="//login.mira
      1 href="#searchInput
      1 href="#mw-head"
So we observe 54 + 1 = 55 absolute links (the `href="https://radi` and `href='https://radi` rows) that point back to the local wiki and could have been plain relative links.
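The same tally can be made without truncating each href to a fixed width. The following is a rough Python sketch, not a full HTML parser: the function name `classify_hrefs`, the category labels, and the sample markup are all invented for illustration, and hrefs escaped inside scripts (such as the `href=\"` row above) are not matched.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

def classify_hrefs(html, local_host):
    """Count quoted href values by kind, relative to local_host."""
    counts = Counter()
    # Matches href="..." or href='...'; a rough regex, not an HTML parser.
    for match in re.finditer(r"""href=(["'])(.*?)\1""", html):
        parts = urlsplit(match.group(2))
        if parts.netloc == local_host:
            counts["absolute-local"] += 1   # could have been relative
        elif parts.netloc:
            counts["other-host"] += 1       # genuinely external
        elif parts.scheme:
            counts["other-scheme"] += 1     # e.g. javascript:, mailto:
        else:
            counts["relative"] += 1         # path-only or fragment-only
    return counts

# Invented sample markup; page titles are hypothetical.
sample = """
<a href="https://radioscanningtw.miraheze.org/w/index.php?title=A">a</a>
<a href='https://radioscanningtw.miraheze.org/wiki/B'>b</a>
<a href="/wiki/C">c</a>
<a href="#mw-head">d</a>
<a href="javascript:void(0)">e</a>
<a href="https://meta.miraheze.org/">f</a>
"""

counts = classify_hrefs(sample, "radioscanningtw.miraheze.org")
for kind, n in counts.most_common():
    print(n, kind)
```

To run it against the real page, one could read the file saved by the `GET` command above (`/tmp/e`) and pass its contents in place of `sample`.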