Remove worse-than-useless <meta> keywords feature
Closed, Resolved (Public)

Description

The feature that adds meta keywords to the header of each page implementation of this feature is insufficient, and the very concept of selected keywords is ultimately flawed. It should be removed.

The idea of meta keywords was to give search engines a hint about what a page is about. From HTML 4.01 (http://www.w3.org/TR/REC-html40/struct/global.html#h-7.4.4.2):
"A common use for META is to specify keywords that a search engine may use to improve the quality of search results."
And from the notes in the same specification (http://www.w3.org/TR/REC-html40/appendix/notes.html#recs):
"Some indexing engines look for META elements that define a comma-separated list of keywords/phrases ... Search engines may present these keywords as the result of a search."

So what's wrong? To start with, MediaWiki's implementation of this feature is worse than useless. If we are to believe it, the most relevant terms for Wikipedia's main page are:
"Main Page,1769,1945,1994,1999,2.5D,2004,2009,2D computer graphics,622,Aiea, Hawaii"

For the current featured article, they are:
"Domitian,Articles with unsourced statements from July 2009,Flavian Dynasty,The Triumph of Titus Alma Tadema.jpg,Special:Search/Domitian,69,79,81,96,Abdication,Abortion"

The theory is that these topics are important because they are either within the title, or linked in the article. This misses the point, as search engines _already look_ for text that is in the title or linked in the article.
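To make the problem concrete, here is a rough sketch of the kind of selection that would produce lists like the two above. It is an assumption inferred from the sample output (page title first, then the first ten link targets in string-sorted order), not the actual MediaWiki code; the function name and the limit parameter are hypothetical.

```python
from html import escape


def naive_meta_keywords(title, linked_titles, limit=10):
    """Build a keywords tag from the page title plus the first few link targets.

    Assumption: links are de-duplicated and taken in plain string-sorted
    order, which would explain why years and numbers crowd out real topics.
    """
    picked = sorted(set(linked_titles))[:limit]
    # Titles that contain commas (e.g. "Aiea, Hawaii") become ambiguous
    # once everything is joined into one comma-separated list.
    content = ",".join([title] + picked)
    return '<meta name="keywords" content="{}" />'.format(escape(content, quote=True))


if __name__ == "__main__":
    links = ["Aiea, Hawaii", "2D computer graphics", "1769", "622", "1945"]
    print(naive_meta_keywords("Main Page", links, limit=3))
```

Nothing in a selection like this looks at how relevant a link is to the page, so whatever sorts first wins.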

As shown above, the algorithm fails spectacularly to select sensible keywords. These keywords would do more harm than good if they did anything. Fortunately, they do not. Modern search engines do not use these keywords because of their propensity for spam/SEO (see http://en.wikipedia.org/wiki/Keyword_stuffing). They simply cannot be relied on.

Google went out of its way to talk about every other meta tag, then denied using keywords when asked:
http://googlewebmastercentral.blogspot.com/2007/12/answering-more-popular-picks-meta-tags.html
(John Mueller: 'You're right in that we generally ignore the contents of the "keywords" meta tag.')

Unfortunately, the 80-100 gzipped bytes (250-350 uncompressed bytes) these keywords occupy per response still negatively impact users. Coming as they do before other headers, the keywords delay the load of linked CSS and JS files, as well as the content itself, occupy cache space, and take CPU to generate.
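For anyone who wants to check the per-response cost on their own pages, a minimal sketch follows. It assumes the page HTML is available as a string containing a literal "<head>" tag; the function name is hypothetical, and the figures it returns will vary with the page and the keywords, so no particular numbers are claimed here.

```python
import gzip


def keyword_overhead(page_html, meta_tag):
    """Return (uncompressed, gzipped) extra bytes caused by adding meta_tag.

    The tag is inserted right after the opening <head>, mirroring the
    "before other headers" placement described above.
    """
    if "<head>" not in page_html:
        raise ValueError("page_html has no literal <head> tag")
    with_tag = page_html.replace("<head>", "<head>" + meta_tag, 1)
    before = page_html.encode("utf-8")
    after = with_tag.encode("utf-8")
    raw = len(after) - len(before)
    zipped = len(gzip.compress(after)) - len(gzip.compress(before))
    return raw, zipped
```

Calling keyword_overhead(html, tag) on a saved copy of a page, with the tag copied from its source, gives both figures for that page.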

Providing a way to specify appropriate keywords is not a solution, as it would distract attention from actually useful editing. The standard itself has been abandoned by those intended to use it. This feature should be removed altogether, or at the very least made optional and turned off by default.

(Ironically, the meta tag that *is* sometimes used - "description" - is not supported by MediaWiki, probably because search engines do a good enough job at figuring out appropriate text without it.)


Version: unspecified
Severity: minor

bzimport added a project: MediaWiki-Parser. Via Conduit, Nov 21 2014, 10:39 PM
bzimport set Reference to bz19761.
GreenReaper created this task. Via Legacy, Jul 16 2009, 3:30 AM
GreenReaper added a comment. Via Conduit, Jul 16 2009, 3:32 AM

Whoops - please read the first paragraph as "The implementation of the feature that adds meta keywords to the header of each page is insufficient, and the very concept of selected keywords is ultimately flawed. It should be removed." :-)

demon added a comment. Via Conduit, Jul 16 2009, 2:34 PM

Cf bug 570 and bug 7614. Can probably be marked INVALID/WONTFIX if this bug is done.

brion added a comment. Via Conduit, Jul 19 2009, 5:00 PM

Could have sworn I removed this misfeature a year or two ago. :) Gettin' out the scissors...

brion added a comment. Via Conduit, Jul 19 2009, 5:07 PM

Done in r53482.

Note the functionality could be replicated in an extension easily enough should someone actually want such things. :) But modern search engines don't use the field, and the values were poorly selected, so I'm happy to kill it.

Jidanni added a comment. Via Conduit, Jul 20 2009, 11:02 PM

(In reply to comment #0)

Unfortunately, the 80-100 gzipped bytes (250-350 uncompressed bytes) these
keywords occupy per response still negatively impacts users. Coming as they do
before other headers, the keywords delay the load of linked CSS and JS files,
as well as the content itself, occupy cache space, and take CPU to generate.

How about accessibility bug 19453: it's as if your keyword bad dream
jumped out of the <head> and into a rendered nightmare at the top of the <body>
for several screenfuls, doubling total page bytes too.

bzimport added a comment. Via Conduit, Dec 10 2010, 1:01 AM

pzimmerm wrote:

Aargh! Our MediaWiki implementation requires a few specific meta tags to work with our company's search system, and with 1.16 it's all broken... it worked great previously. The metakeywordstag extension no longer works, and the metatag extension requires protected pages, which does not suit our collaborative environment. Any suggestions for a workaround?

Ilmari_Karonen added a comment. Via Conduit, Dec 10 2010, 2:33 AM

As Brion notes, turning the code he removed into a hook should be trivial. In fact, here it is: [[mw:User:Ilmari_Karonen/MetaKeywords]]. Stick that in a .php file somewhere, require_once() it from LocalSettings.php and you should be fine.

Note: I haven't actually tested the code, besides running php -l on it. If you find a bug, let me know.

Ilmari_Karonen added a comment. Via Conduit, Dec 10 2010, 2:34 AM

Pfft... stupid fake wikilink syntax. Here it is:

http://www.mediawiki.org/wiki/User:Ilmari_Karonen/MetaKeywords

bzimport added a comment. Via Conduit, Dec 10 2010, 2:40 AM

pzimmerm wrote:

Sweet... thanks a ton!!

Nemo_bis added a comment. Via Conduit, Jun 7 2014, 4:19 PM

I noticed around 2009 or so that the keywords were less useful for Wikiquote than for Wikipedia. 5 years later I went to check if that was still the case and I discovered this bug. :)

Anyway, it seems "keywords" is basically replaced by "description" now (bug 12196) and I think there is a remote chance of this happening in an extension instead: bug 66325.
