
Select location for Asia Cache DC
Closed, Resolved · Public

Description

This depends primarily on examining network and legal issues, but there are also cost considerations, which touch on available vendors.

Event Timeline

Restricted Application added a subscriber: Aklapper.
BBlack claimed this task.

Singapore approved by Legal and selected, which is pretty much our ideal candidate on a range of issues.

a range of issues

It would be useful to document what aspects were considered beyond network, legal and cost. For instance, was environmental impact considered (cf. http://www.greenpeace.org/international/en/publications/Campaign-reports/Climate-Reports/clicking-clean-2017/ )?

Given that most commercial data centers in Singapore seem to be based on 0% renewable energy, I would really like to know whether environmental concerns were taken into consideration when this decision was made.

Hi, where can we have a good discussion about the need to choose a datacenter that runs on renewable energy? I suppose that this bug is not the ideal location. Thanks for any pointers!

This probably isn't the ideal location, but I can speak to the issue here since it's obvious that some will come looking here for that answer. The TL;DR is that environmental considerations were not a major factor in this decision. The basics of this decision go something like this:

  1. The primary concern is global internet connectivity. Given the aim of the project and our current network structure, only a select few cities within SE/E Asia are viable candidates in terms of network topology at all. On this front we're looking at how the cities fare on being well-connected to a large number of submarine cables, hosting open and widely-used peering exchanges, having multiple carrier-neutral datacenter vendor options, and overall latency to populations across the region. Basic infrastructure issues like utility reliability are a concern here as well. On these fronts alone, the top two ideal locations were Singapore and Hong Kong. Tokyo was a third-place option with some caveats. Taipei and Seoul were possible last-resort backup choices if all else failed, but significantly less ideal than even Tokyo (to the point that we might have to rationally re-evaluate many aspects of the project).
  2. The next concern is basic Legal issues that have a direct impact on our ability to safely host servers. I can't speak to all of those, but our legal team does a great deal of detailed and privileged analysis on this front, looking primarily at issues that affect things like privacy and censorship on an operational level. Before even speaking to Legal, within Operations we had pretty much ruled out Hong Kong as an option due to it being under the ultimate control of the PRC and thus having "obvious" issues for us. Legal probably would have approved other backup cities on our list, but the critical choice here was whether Legal was willing to sign off on Singapore or not. If not, we'd be forced to consider second- and third-tier locations.
  3. Cost is obviously a concern, and continues to be one throughout the remainder of this project when looking at specific datacenter and network vendors as well as equipment purchasing. Cost considerations tend to put an extra nail in the coffin of considering Tokyo over SG, in the case that SG is legally-viable at all (which it did turn out to be).

With these other major factors in play, there's not a lot of room for environmental considerations to play a decisive role. In general, one of the ways in which we control both cost and environmental footprint within Operations is by being efficient in the first place. We deploy far fewer servers than most other top-tier web sites in the world when viewed on a per-user or per-traffic basis. Our cache sites in particular are extremely efficient in this sense, and are getting more efficient in their design with every iteration. The current plan for this Asia Cache DC is a total of 17 physical servers and 2-3 network devices. When counted against the rest of our global deployment, this site will comprise approximately 1.5% of our global server totals. Even if the primary concerns above allowed more room to make an environmentally-based decision, it doesn't make much rational sense to look too deeply into the environmental impact of a decision that affects such a small percentage of our global power needs.
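As a rough sanity check on the "approximately 1.5%" figure, the arithmetic can be sketched as below. Only the 17 servers and the 1.5% share come from the comment above; the implied global total is a derived estimate, not a reported number, and the function name `implied_global_total` is purely illustrative.

```python
# Back-of-the-envelope check of the "approximately 1.5%" figure.
# Only the 17 servers and the ~1.5% share come from the comment above;
# the implied global total is derived here and is NOT a reported number.

ASIA_CACHE_SERVERS = 17   # planned physical servers at the new site
SHARE_OF_GLOBAL = 0.015   # "approximately 1.5%" of global server totals


def implied_global_total(site_servers: int, share: float) -> int:
    """Return the global server count implied by one site's share of the total."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    return round(site_servers / share)


if __name__ == "__main__":
    total = implied_global_total(ASIA_CACHE_SERVERS, SHARE_OF_GLOBAL)
    print(f"Implied global total: roughly {total} servers")
```

Under these assumptions the implied global fleet is on the order of 1,100 servers. Because the 1.5% is itself approximate, this is only an order-of-magnitude estimate.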

Thank you for this information, Brandon. While your points are understandable, this does not mean that we should not try to find a vendor that uses renewable energy for their servers. So again: Where can we discuss this?

I think you can discuss that anywhere you like (within reason!).

To clarify re: your language above: we deploy our own server hardware as opposed to using virtual hosting, so the environmental issues there are separated - one for server hardware vendor selection (green issues in manufacturing of the servers, and also product operational efficiency) and one for datacenter vendor (green issues for the operational power and cooling for the servers).

On the hardware front, however, we don't tend to make arbitrary vendor choices for every deployment, as this would cause us a great deal of chaos in terms of operational knowledge, complexity, planning, and reliability. The very short list of available vendors for this project's servers is already set in stone at a higher level in that sense, and thus it's not really in scope for this project specifically. It's a broader and longer-term planning issue that you'd want to take up with Operations and Finance if you'd like the green-ness of our vendors re-evaluated for future purchases.

The datacenter vendor choice within Singapore is something where there might be a real and actionable green impact that's in-scope for this project, keeping the "1.5%" caveat above in mind in terms of efforts and benefits. As with the location decision, there are a lot of other major factors in choosing the datacenter vendor, but we may end up with a viable near-tie on the other issues between 2-3 vendors in this case, and thus may have the flexibility to consider green issues at that stage.

Thank you for your reply, Brandon. Maybe I should clarify my question: Where can Wikipedians have a discussion with you and your team about running Wikipedia's servers on renewable energy? The information you provided in your last two comments is very interesting, and I would like to address them at the appropriate venue. Thanks!

I really don't mean to be overly facile here, but if you're interested in having a discussion, we can have that at any usual public discussion venue. The wikitech mailing list might be a good start. Our IRC channels (e.g. Freenode #wikimedia-operations) work as well for informal, async discussion. Setting up a BOF or other sort of session on the topic at Wikimania might make sense as well. It would be helpful to understand better what sort of discussion you'd like to have. Is this a policy discussion about how the organization or Operations (or Finance) specifically weights green issues in various decisions in general? Or do you want to bring some information to the table that we might not be aware of about the evolving metrics of vendor green-ness and how to evaluate them? Or...?

Oh, I've already tried writing to wikitech-l, which did not lead anywhere. I also asked to be added to ops-l, which was denied. I submitted a presentation to Wikimania 2016, which was not accepted (although it received top scores from some reviewers). At both the 2015 and 2016 Wikimanias, I sat down with a bunch of people and received many smiles and nods, but no commitments. Most of all, I set up a page on Meta back in 2015, and have been updating it since, but until now, the WMF Operations team has yet to weigh in on it, even though more than 200 community members have expressed their support for the initiative. This maybe helps you understand why I am asking for the right venue to have an actual discussion with your team about this.

I think the wikitech discussion seems like it was, in fact, a good discussion of the issue. So if your goal is discussion, I don't see the issue here. The metawiki page does contain a lot of information that comes from the Operations team.

In that discussion, Tim pointed out that in our last core DC site selection (codfw), environmental impact was considered, but that it didn't make cost sense. I think he made the point that while codfw was not great for CO2 emissions, buying offsets would have been cheaper than going with an alternative vendor. At that point the conversation is purely in the ethical and financial realm rather than the operational one. Should the WMF use a portion of our donation income to buy those carbon offsets? Any other option makes less financial sense and at best the same environmental sense.

"Did not lead anywhere" sounds like you're looking for some actionable outcomes and changes from the discussions that didn't happen, perhaps at a policy level? What is the goal you're trying to achieve? Perhaps the purchase of the codfw carbon offset with donor funds? That seems like the simplest step with the most impact available to us today, at least of Operations-related things. Obviously there are other areas to target, like Travel. A concern (e.g. environmental) can be advanced "for free" only to a certain limited extent, where it's one of our lower-priority considerations when all else is equal. Once you get past that basic level, furthering that aim is going to come at a real cost to some of our other aims, and so that becomes a policy and mission tradeoff.

The goal is to have Wikipedia's servers run on renewable energy. It's as simple as that. In Europe, this is a no-brainer, while I understand that it is not so much in the U.S. But Google, for example, has announced that it will reach this goal in 2017. Apple has been doing this since forever. And yes, we're not as big as these companies, but we share the same responsibility of saving this planet for our grandchildren. Buying carbon offsets is nonsense by the way, we have gone back and forth on this with help from Greenpeace before. And despite my continuous efforts in this regard, it seems that your team is planning to open yet another datacenter running on conventional energy. Isn't there anything we can do about this? Also, nobody has found the time for a phone call to Equinix and CyrusOne to ask if it is possible to have the existing servers switch to renewable energy (I have tried calling them, but they wouldn't talk to me). Tradeoff yes - but the planet is dying. It's on Wikipedia.

The goal is to have Wikipedia's servers run on renewable energy. It's as simple as that.

I don't think that's a realistic goal anytime soon.

In Europe, this is a no-brainer, while I understand that it is not so much in the U.S.

As much as it might be difficult in the US, it's even harder elsewhere, like Asia. Perhaps I've been presumptuous in assuming anyone understands the goal of this project - but the goal is to bring the edge of our network closer to billions of users in Asia specifically. We're not deploying in Asia for some kind of cost reasons out of a menu of many other global options. The goal of this project is to deploy in Asia specifically, so that users there can benefit from increased reliability and reduced latency to our content. Asia is not a place where renewable energy is a priority locally, yet. We can (and will!) ask about efficiency and power sourcing issues when talking to DC vendors there, but with a very small footprint we don't have a lot of influence.

But Google, for example, has announced that it will reach this goal in 2017. Apple has been doing this since forever. And yes, we're not as big as these companies,

The bigness argument is a big one here. Google operates on a completely different scale than we do by orders of magnitude. They build their own datacenters, and can negotiate or even install their own power generation. They're one of the most powerful entities on the planet at present. We lease small spaces within existing shared commercial datacenters. I'm no expert on Apple, but what little I know is that they also construct their own datacenters, and last I heard only in the US and Europe. It's not a minor point that these are big companies. They operate at a scale where they can do whatever they wish and set policy for everyone else involved, and we don't.

but we share the same responsibility of saving this planet for our grandchildren.

Of course, I empathize with this viewpoint personally. But that's neither here nor there on a pragmatic level.

Buying carbon offsets is nonsense by the way, we have gone back and forth on this with help from Greenpeace before.

That would be an excellent thing to outline in the wikitech thread or the meta page. I wasn't personally aware that buying offsets was a nonsense idea!

And despite my continuous efforts in this regard, it seems that your team is planning to open yet another datacenter running on conventional energy.

We're not opening yet another datacenter. The semantics are important here. Google opens whole new datacenters. We lease very small amounts of space in existing shared commercial facilities. We have an operational need to have a good edge presence in Asia to serve faster and more reliable free knowledge to billions. The locations for doing so (Asia in general, and good network cities on top of that) happen to be mostly conventionally-powered. The very efficient deployment we're considering there consumes roughly the same magnitude of power as a single average US household. A 100% commitment to renewables-only might mean we can't operate there at all. That seems like a big loss for the free knowledge movement over a very small amount of unclean power.

I think it's pretty easy to make a logical inference that one US household's worth of unclean power purchasing in Asia bringing lower wiki latency to billions of potential eyeballs is a net win for the environment. Latency drives readership, readership of our projects brings more open, unbiased knowledge, and informed voters (or protesters, as the case may be!) tend to save the planet.

Environmental discussion seems to be better at https://meta.wikimedia.org/wiki/Sustainability_Initiative (mentioned already in this task) and now https://wikimediafoundation.org/wiki/Resolution:Environmental_Impact - best to continue over there...

The goal is to have Wikipedia's servers run on renewable energy. It's as simple as that.

I don't think that's a realistic goal anytime soon.

The goal as such is no longer open to discussion; it has become mandatory. On the same day you wrote your reply, the Foundation board released its Environmental Impact Resolution (https://wikimediafoundation.org/wiki/Resolution:Environmental_Impact). And you as ops need to implement this resolution. The only question is how and when.

So I ask you to prepare and publish a roadmap, determining when ops can run all servers and all offices on green energy.

Henning

Feel free to bring up any further discussion topics on the talk page of https://meta.wikimedia.org/wiki/Sustainability_Initiative which is the centralized place.

No, moving this discussion to another forum on Meta is not an acceptable option. The board's resolution demands action by the ops, so this issue needs to be discussed with the ops, and this is the place for it.

So, I ask the ops involved to please reply to the question at hand:
How will you implement the board resolution and get green energy for all server hosting spaces? Please collect the necessary information and create a roadmap for transition to green energy for every hosting location.

@H-stt a bit more context. Sustainability was one of the important selection criteria for the previous datacenter. See the request for comment from 2013 at https://wikimediafoundation.org/wiki/RFP/2013_Datacenter#Primary_Requirements :

The environmental impact of the facility (cooling efficiency, reclaimed water, etc) will be an important consideration for final site selection.

It is pretty much the same for the Asia Cache selection which this task is about. One can thus see that the Operations team has been proactive on that topic, and I am quite happy to see the Wikimedia Foundation board has followed the lead with a formal resolution.

You are picking the wrong fight here. The resolution is already being implemented.

This is the last time I'll respond to trolling on this ticket.

The goal is to have Wikipedia's servers run on renewable energy. It's as simple as that.

I don't think that's a realistic goal anytime soon.

The goal as such is no longer open to discussion; it has become mandatory. On the same day you wrote your reply, the Foundation board released its Environmental Impact Resolution (https://wikimediafoundation.org/wiki/Resolution:Environmental_Impact). And you as ops need to implement this resolution. The only question is how and when.

I'll try to break this down into pieces:

The board's role is to set the vision and the mission, and this resolution is similar in scope. The resolution intentionally leaves it open to the organization to reasonably interpret and implement policy. It does not actually set explicit policies within (e.g. X% renewable energy use for Y by target date Z), as that's not the board's role. It is up to our C-Levels / Executives in coordination with Ops to come up with more-concrete plans about how and when, within the constraints of financial prudence set out in the resolution...

So I ask you to prepare and publish a roadmap, determining when ops can run all servers and all offices on green energy.

... but it is not up to you to set our agenda and demand a roadmap from Ops directly. Also, Ops has nothing to do with our offices.

No, moving this discussion to another forum on Meta is not an acceptable option. The board's resolution demands action by the ops, so this issue needs to be discussed with the ops, and this is the place for it.

So, I ask the ops involved to please reply to the question at hand:
How will you implement the board resolution and get green energy for all server hosting spaces? Please collect the necessary information and create a roadmap for transition to green energy for every hosting location.

These tickets are Ops' method of tracking task inter-dependencies related to the Asia Cache DC project, and this ticket in particular is a closed and resolved one about selecting a particular host city (which was Singapore). This ticket is not a general-purpose communications pipeline to the Ops group. It's also not a place where we receive random commands from arbitrary third parties dictating how we use our very constrained time and resources. I indulged the off-topic wandering earlier in this ticket in an attempt to be earnestly transparent and helpful, but that's clearly backfired and led to these sorts of inappropriate responses.