
Public entry point for RESTBase
Closed, Resolved · Public

Description

To make RESTBase useful to the public, we need to expose a public endpoint. Requirements:

  • HTTPS termination, ideally SPDY support (nginx 1.6?)
  • a sensible subdomain

It is not clear whether caching would improve performance overall, so it's not part of the requirements. We can investigate that later. For now, we could set up nginx to listen on 80 & 443, using the existing *.wikimedia.org wildcard cert.

Candidates for the domain:

  • api.wikimedia.org: simple and obvious domain, unused so far
  • content.wikimedia.org: possibly less contentious; downsides: less obvious, would likely change in the longer term, longer, less accurate since not only content is exposed
  • rest.wikimedia.org: simple and relatively general, if not *completely* obvious; provides a nice way to distinguish 'the action api' vs. 'the rest api' that we can keep for a while


Event Timeline

GWicke raised the priority of this task from to Needs Triage.
GWicke updated the task description.
GWicke added a project: RESTBase.
GWicke changed Security from none to None.
GWicke updated the task description.
GWicke edited subscribers, added: BBlack, faidon, mark, Catrope; removed: Aklapper.
GWicke subscribed.

What would be the disadvantages of using api.wikimedia.org?

What would be the disadvantages of using api.wikimedia.org?

To some it implies that the RESTBase API is now the definitive and exclusive Wikimedia API, which it is not at this point. I can understand that point of view, but I also think that we should not shy away from using obvious domain names for fear of their subtle implications.

I don't want to block on this though. Getting this out of the door is more important than the name.

Thanks Gabriel. We can get started on an LVS / Varnish / HTTPS layer for this.

However we can't offer SPDY at this time. We'd really like to offer SPDY (and/or HTTP 2) in the near future, but we're not nearly ready for this, and won't be for at least another quarter. We need to figure out what we're gonna do with our HTTPS layer first, and have just started work on this. This will also determine what we can do around SPDY & HTTP2. (And no, it's not a straightforward upgrade to a newer nginx right now. :) As SPDY doesn't appear to be a strict requirement for RESTbase at all, I'd like to decouple it from this project, and revisit it later.

What would be the disadvantages of using api.wikimedia.org?

To some it implies that the restbase API is now the definitive and exclusive Wikimedia API, which it is not at this point.

Count me as part of "some". :-) I think we made a mistake during the years that we referred to what we now call the Action API as simply "the API". Having "api.wikimedia.org" puts us right back in that spot.

I can understand that point of view, but I also think that we should not shy away from using obvious domain names for fear of their subtle implications.

The most obvious domain name to me is restbase.wikimedia.org . Any reason not to use that?

What would be the disadvantages of using api.wikimedia.org?

To some it implies that the restbase API is now the definitive and exclusive Wikimedia API, which it is not at this point.

Count me as part of "some". :-) I think we made a mistake during the years that we referred to what we now call the Action API as simply "the API". Having "api.wikimedia.org" puts us right back in that spot.

I do think that we should eventually re-unify the API, but that is not a short-term project.

The most obvious domain name to me is restbase.wikimedia.org . Any reason not to use that?

I'm not a huge fan of calling interfaces after their current implementation. We don't use 'mediawiki.wikimedia.org', 'varnish.wikimedia.org' or 'nginx.wikimedia.org' either.

I'm happy with 'content' for now, as that describes what this interface (at least initially) focuses on.

In T78194#948874, @mark wrote:

Thanks Gabriel. We can get started on an LVS / Varnish / HTTPS layer for this.

That would be great! For performance we probably only want a single varnish layer, as RESTBase is really intended to replace the second layer for most purposes.

However we can't offer SPDY at this time. We'd really like to offer SPDY (and/or HTTP 2) in the near future, but we're not nearly ready for this, and won't be for at least another quarter. We need to figure out what we're gonna do with our HTTPS layer first, and have just started work on this. This will also determine what we can do around SPDY & HTTP2. (And no, it's not a straightforward upgrade to a newer nginx right now. :) As SPDY doesn't appear to be a strict requirement for RESTbase at all, I'd like to decouple it from this project, and revisit it later.

Yeah, it's not a requirement. However, maybe we can spin up a box with jessie soon and give nginx 1.6 a try. The requirements for restbase & small services are pretty straightforward (no need for udplog for example), and our first real users will be all internal and thus should not even hit nginx. This could be a good opportunity to get some real-world data from a simplified use case before looking into a wider roll-out.

Regarding logging, we could also consider logging directly from restbase to kafka. @Ottomata helpfully provided an example log line:

{"hostname":"amssq48.esams.wikimedia.org","sequence":703148431,"dt":"2014-12-30T18:06:37","time_firstbyte":0.483491659,"ip":"X.X.X.X","cache_status":"miss","http_status":"200","response_size":50366,"http_method":"GET","uri_host":"fr.wikipedia.org","uri_path":"/wiki/Oblast_de_Moscou","uri_query":"","content_type":"text/html; charset=UTF-8","referer":"http://fr.wikipedia.org/wiki/Alexe%C3%AF_Navalny","x_forwarded_for":"-","user_agent":"NonyaBiznaaaz","accept_language":"fr-FR","x_analytics":"php=hhvm","range":"-"}

The downside is that it would not log small services until they are hooked up with restbase. We'd also need to keep up with any changes in the varnishkafka config.
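
As a rough illustration of what logging directly from RESTBase might involve, here is a minimal sketch that builds a record using a subset of the field names from the example log line and serializes it as compact JSON. The function names, the choice of fields, and the idea of emitting one JSON object per request are assumptions for illustration only; actually producing the line to Kafka is left out.

```python
import json
from datetime import datetime, timezone


def build_log_record(hostname, sequence, method, host, path,
                     status, size, first_byte_s, dt=None):
    """Build a dict that mirrors part of the example log line.

    Only field names that appear in the example are used. Which fields a
    RESTBase logger would really need is an open question, so this subset
    is a guess.
    """
    dt = dt or datetime.now(timezone.utc)
    return {
        "hostname": hostname,
        "sequence": sequence,
        # The example uses second resolution with no timezone suffix.
        "dt": dt.strftime("%Y-%m-%dT%H:%M:%S"),
        "time_firstbyte": first_byte_s,
        # The example carries the status as a string, the size as a number.
        "http_status": str(status),
        "response_size": size,
        "http_method": method,
        "uri_host": host,
        "uri_path": path,
        "uri_query": "",
    }


def serialize(record):
    """Serialize without extra whitespace, like the example line."""
    return json.dumps(record, separators=(",", ":"))
```

Keeping the field names identical to the existing format is what would let downstream consumers treat both sources alike, which is also why any change in the varnishkafka config would have to be tracked.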

The downside is that it would not log small services until they are hooked up with restbase. We'd also need to keep up with any changes in the varnishkafka config.

How about separating logging into a separate RESTBase module/service which exposes a (public) URI endpoint? That way both services hooked on RESTBase and external ones (but maybe IP-whitelisted) could send their logs directly through it.

However we can't offer SPDY at this time. We'd really like to offer SPDY (and/or HTTP 2) in the near future, but we're not nearly ready for this, and won't be for at least another quarter. We need to figure out what we're gonna do with our HTTPS layer first, and have just started work on this. This will also determine what we can do around SPDY & HTTP2. (And no, it's not a straightforward upgrade to a newer nginx right now. :) As SPDY doesn't appear to be a strict requirement for RESTbase at all, I'd like to decouple it from this project, and revisit it later.

Yeah, it's not a requirement. However, maybe we can spin up a box with jessie soon and give nginx 1.6 a try. The requirements for restbase & small services are pretty straightforward (no need for udplog for example), and our first real users will be all internal and thus should not even hit nginx. This could be a good opportunity to get some real-world data from a simplified use case before looking into a wider roll-out.

Yes, maybe, but I really don't want to commit to that yet. So let's keep the requirements for this deployment simple & standard, ship it, and then see how we can improve on it for this and all other clusters. We'd really like to see SPDY/HTTP2 improvements as well, but other SSL/HTTPS related work is more urgent at this point.

What would be the disadvantages of using api.wikimedia.org?

To some it implies that the restbase API is now the definitive and exclusive Wikimedia API, which it is not at this point.

Count me as part of "some". :-) I think we made a mistake during the years that we referred to what we now call the Action API as simply "the API". Having "api.wikimedia.org" puts us right back in that spot.

I do think that we should eventually re-unify the API, but that is not a short-term project.

Yeah. That needs additional discussion (hopefully some at the dev summit as well), and can't be decided here.

The most obvious domain name to me is restbase.wikimedia.org . Any reason not to use that?

I'm not a huge fan of calling interfaces after their current implementation. We don't use 'mediawiki.wikimedia.org', 'varnish.wikimedia.org' or 'nginx.wikimedia.org' either.

I'm happy with 'content' for now, as that describes what this interface (at least initially) focuses on.

I too think that content.wikimedia.org is a reasonable compromise for now. We could also think about a subdomain under api.wikimedia.org, such as content.api.wikimedia.org, although that might suggest that we're against reunifying the API.

How about rest.wikimedia.org or restapi.wikimedia.org?

rest.wikimedia.org would wfm too.

big +1 to rest.wikimedia.org or rest.api.wikimedia.org

gerritbot subscribed.

Change 188537 had a related patch set uploaded (by Filippo Giunchedi):
public entry point for restbase

https://gerrit.wikimedia.org/r/188537


Change 188537 merged by Filippo Giunchedi:
public entry point for restbase

https://gerrit.wikimedia.org/r/188537

@akosiaris, @fgiunchedi: Thanks for setting up the domain. For RESTBase, I believe the main missing bits are:

  • set up an LVS for RESTBase
  • Point rest.wikimedia.org to it

Anything else I'm forgetting?

We'd need to point the Parsoid Varnishes to restbase.svc.eqiad.wmnet too; that's part of this task, I think.

@fgiunchedi yup, just a simple forward from rest.wikimedia.org to restbase@eqiad, port 7231, should be enough.
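
To spell out what such a forward amounts to, here is a minimal sketch of the URL mapping only, not of the Varnish configuration itself. It assumes that "restbase@eqiad" refers to the restbase.svc.eqiad.wmnet service mentioned above, and that the backend is reached over plain HTTP behind the HTTPS layer; `backend_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit

PUBLIC_HOST = "rest.wikimedia.org"
# Assumption: "restbase@eqiad" is the service name mentioned in this thread.
BACKEND_HOST = "restbase.svc.eqiad.wmnet"
BACKEND_PORT = 7231


def backend_url(public_url):
    """Map a public request URL onto the internal RESTBase backend.

    Path and query string pass through unchanged; only the scheme and
    host change. Plain HTTP to the backend is an assumption.
    """
    parts = urlsplit(public_url)
    if parts.hostname != PUBLIC_HOST:
        raise ValueError(f"not a {PUBLIC_HOST} URL: {public_url}")
    netloc = f"{BACKEND_HOST}:{BACKEND_PORT}"
    return urlunsplit(("http", netloc, parts.path, parts.query, ""))
```

Since nothing in the path is rewritten, the per-domain prefix of each request reaches RESTBase as-is.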

Change 191061 had a related patch set uploaded (by Alexandros Kosiaris):
Reuse parsoid varnish for restbase

https://gerrit.wikimedia.org/r/191061


Change 191061 merged by Alexandros Kosiaris:
Reuse parsoid varnish for restbase

https://gerrit.wikimedia.org/r/191061

This is now live: https://rest.wikimedia.org/en.wikipedia.org/v1/?doc

Thank you, @akosiaris and @fgiunchedi!

Let's track further improvements (SPDY, bypassing the Parsoid Varnishes) in separate tickets.
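
For reference, the live URL above has the shape `https://rest.wikimedia.org/{domain}/{version}/`. The sketch below builds such URLs. Only the en.wikipedia.org documentation URL is confirmed in this task; that other project domains follow the same pattern is an assumption, and `api_root` and `doc_url` are hypothetical helper names.

```python
from urllib.parse import quote

BASE = "https://rest.wikimedia.org"


def api_root(domain, version="v1"):
    """Return the API root for one project domain.

    The domain is percent-encoded defensively so it cannot inject extra
    path segments; ordinary host names pass through unchanged.
    """
    return f"{BASE}/{quote(domain, safe='')}/{version}/"


def doc_url(domain):
    """Return the documentation URL, as in the link posted above."""
    return api_root(domain) + "?doc"
```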