Description

https://en.wikipedia.org/robots.txt is kind of messy: some lines are commented out, and some lines are clearly not applicable to en.wikipedia.org. We should clean it up at some point.

Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
Tidy robots.txt | operations/mediawiki-config | master | +176 -152
Remove redundant entries from robots.txt | operations/mediawiki-config | master | +2 -60

Status | Subtype | Assigned | Task
---|---|---|---
Open | Feature | None | T16720 robots.txt (tracking)
Open | | None | T104251 Move wiki-specific robots.txt out of the global file to Mediawiki:Robots.txt on specific wikis
Resolved | | Mdann52 | T104949 Fix or remove robots.txt code for Internet Archive exclusion of user pages
Event Timeline
Change 239403 had a related patch set uploaded (by Glaisher):
Remove redundant entries from robots.txt
https://en.wikipedia.org/robots.txt looks better, but we're still including lots of directives that aren't applicable to the English Wikipedia.
Now that we have https://en.wikipedia.org/wiki/MediaWiki:Robots.txt and friends, I think it would be nice to clean out the global robots.txt as much as possible. Whether we track this work using this task or a new task, I don't really care.
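
For context, here is a minimal sketch of how the served file is presumably assembled, assuming the setup simply appends the wiki's MediaWiki:Robots.txt page to the shared global robots.txt; the path and the page-fetching helper below are made up for illustration:

```python
# Minimal sketch (assumption): the served robots.txt is the shared global
# file followed by the wiki's own MediaWiki:Robots.txt page. The path and
# the page-fetching helper are hypothetical, for illustration only.
from pathlib import Path

GLOBAL_ROBOTS = Path("robots.txt")  # hypothetical path to the global file


def fetch_local_robots_page(wiki: str) -> str:
    """Stand-in for fetching the wikitext of MediaWiki:Robots.txt on `wiki`.

    A real implementation would ask MediaWiki for the page; this just returns
    an example fragment so the sketch runs on its own.
    """
    examples = {
        "en.wikipedia.org": (
            "# en.wikipedia.org-specific rules\n"
            "Disallow: /wiki/Wikipedia:Articles_for_deletion/\n"
        ),
    }
    return examples.get(wiki, "")


def build_robots_txt(wiki: str) -> str:
    """Global directives first, then whatever the wiki adds locally."""
    global_part = (
        GLOBAL_ROBOTS.read_text() if GLOBAL_ROBOTS.exists() else "User-agent: *\n"
    )
    return global_part.rstrip("\n") + "\n\n" + fetch_local_robots_page(wiki)


if __name__ == "__main__":
    print(build_robots_txt("en.wikipedia.org"))
```

Under that assumption, any directive that only matters to one wiki can live on that wiki's MediaWiki:Robots.txt page, and the global file only needs the rules shared by every site.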
I think that should be done as part of this task and I've updated the title accordingly. We need:
- A global interface admin to add the directives for the specific sites to those sites' MediaWiki:Robots.txt pages;
- A developer to remove the directives for the specific sites from the global robots.txt (a rough sketch of this step follows below).
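
To illustrate the second step, here is a rough sketch, under the same assumption as above, of how the lines already copied to a wiki's local page could be identified before deleting them from the global file in operations/mediawiki-config; both paths are hypothetical local copies used only for illustration:

```python
# Rough sketch of the follow-up config change: list the global robots.txt
# lines that already appear verbatim on a wiki's MediaWiki:Robots.txt page
# and can therefore be dropped from the global file. Both paths are
# hypothetical local copies.
from pathlib import Path

GLOBAL_ROBOTS = Path("robots.txt")                # global file from operations/mediawiki-config
LOCAL_PAGE = Path("enwiki-MediaWiki-Robots.txt")  # export of the wiki's MediaWiki:Robots.txt


def redundant_global_lines(global_path: Path, local_path: Path) -> list[str]:
    """Return the non-empty global directives duplicated in the local page."""
    local_lines = {
        line.strip() for line in local_path.read_text().splitlines() if line.strip()
    }
    return [
        line
        for line in global_path.read_text().splitlines()
        if line.strip() and line.strip() in local_lines
    ]


if __name__ == "__main__":
    if GLOBAL_ROBOTS.exists() and LOCAL_PAGE.exists():
        for line in redundant_global_lines(GLOBAL_ROBOTS, LOCAL_PAGE):
            print(line)
```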