Our datasets get crawled all the time, and some of them are several MB. We could disallow all crawling of datasets to help reduce bandwidth usage. But is there any reason it's good for them to be crawled? We can still link to specific folders from the wikis if we want them to be searchable on the web, right?
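Disallowing crawler access like this is typically a two-line robots.txt addition, which matches the +2 -0 patch below. A minimal sketch, assuming the datasets are served under a `/datasets/` path (the actual path is an assumption here):

```
# Hypothetical robots.txt rule: block all well-behaved crawlers
# from the datasets directory (path is illustrative).
User-agent: *
Disallow: /datasets/
```

Note this only deters crawlers that honor the Robots Exclusion Protocol; it is a bandwidth-saving measure, not access control.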
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
Prevent datasets from being crawled | analytics/analytics.wikimedia.org | master | +2 -0
Related Objects
Event Timeline
@Peachey88 not particularly, this is low priority, but it just seems like a bad idea to waste bandwidth for no reason, especially on larger files like datasets. The crawler downloads the whole file just to go: oh, not HTML, moving on.
Change 345634 had a related patch set uploaded (by Milimetric):
[analytics/analytics.wikimedia.org@master] Prevent datasets from being crawled
Change 345634 merged by Milimetric:
[analytics/analytics.wikimedia.org@master] Prevent datasets from being crawled