
Phabricator should sustain crawling by internet search engines
Closed, Invalid
Public

Description

https://gerrit.wikimedia.org/robots.txt : Gerrit is currently well crawled, e.g. https://www.google.it/search?q=site%3Agerrit.wikimedia.org .

It seems most instances don't allow crawling of the code review portion; we would need to confirm that it is possible.
https://secure.phabricator.com/robots.txt
https://developer.blender.org/robots.txt
http://reviews.llvm.org/robots.txt

Details

Reference
fl252

Event Timeline

flimport raised the priority of this task from to Low. Sep 12 2014, 1:33 AM
flimport set Reference to fl252.

qgil wrote on 2014-04-29 23:12:15 (UTC)

I guess in a worst case scenario we can patch the robots.txt of our instance, applying the same policy as Gerrit, Bugzilla, etc?

mattflaschen wrote on 2014-04-30 02:10:30 (UTC)

I think this is actually just a consequence of T262: Review holders of commit rights in WMF deployed extensions (if anonymous users can't view, neither can GoogleBot).

The current robots.txt excludes only /diffusion/ , the repository browser (e.g. http://fab.wmflabs.org/diffusion/MW/repository/master/ ). It does not exclude Differential reviews, e.g. http://fab.wmflabs.org/D1 .
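
To illustrate the point above, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt body is an assumption reconstructed from the description (a single Disallow rule for /diffusion/), not a copy of the file actually served by the instance, and is_crawlable is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: simplified robots.txt matching the description above,
# not the real file served by fab.wmflabs.org.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /diffusion/
"""

parser = RobotFileParser()
# parse() takes the file as a list of lines, so no network access is needed.
parser.parse(ASSUMED_ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the assumed rules let user_agent fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://fab.wmflabs.org/diffusion/MW/repository/master/",
        "http://fab.wmflabs.org/D1",
    ):
        print(url, "->", "allowed" if is_crawlable(url) else "blocked")
```

Under these assumed rules the repository browser URL is blocked and the Differential URL is allowed. This only checks robots.txt; a page that requires login stays invisible to crawlers regardless.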

We're currently forbidding crawling of the code at https://git.wikimedia.org/robots.txt too. It might be nice to allow crawling the code somewhere (if it can be done performantly), but since our other repository viewer does the same, I see no regression here.

Differential (code review) entries such as D1 should show up in search, though. They do not because of T262.

Qgil closed this task as Invalid. Sep 24 2014, 2:57 PM
Qgil claimed this task.

I just searched for "Phabricator should sustain crawling by internet search engines" in Google, and this task appeared as the first result. Closing as Invalid. Please reopen if there is any content in this instance that should be crawled and is not.