Paging support
Closed, Resolved (Public)

Description

We currently don't have any paging support in restbase or restbase-cassandra. It would be great to add generic support.

A standard way to do this is via an opaque token in a 'next' property in the result, next to 'items':

{ 
  items: [...],
  next: 'sometoken_deadbeef'
}

The Cassandra driver supports paging with such a token as well.

We could consider returning a URI instead of a token that we would expect clients to send back in a JSON GET body (or POST). We should also consider what happens if malicious users pass in random (or not so random) tokens.

Event Timeline

GWicke raised the priority of this task from to Medium.
GWicke updated the task description.
GWicke set Security to None.
GWicke subscribed.

I think the best way would be to use the same GET URI for the next page, but with an extra query parameter containing the next token ID. This way, we'd have the same route. As for malicious users with random tokens, we could keep the generated response tokens in a cache for a (limited) period of time and consult it when a request token comes in. If it's there, it's consumed, otherwise a 4xx client error is returned. That, however, means that tokens are not reusable, but that shouldn't be a problem AFAIK.
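The single-use token cache described above could look roughly like the following. This is only a sketch under stated assumptions: `TokenCache`, `issue` and `consume` are hypothetical names, not restbase APIs, and the store is a per-process `Map`, so a multi-instance deployment would need a shared cache instead.

```javascript
'use strict';
const crypto = require('crypto');

// Hypothetical in-memory store of issued paging tokens (not part of restbase).
class TokenCache {
    constructor(ttlMs) {
        this.ttlMs = ttlMs;
        this.tokens = new Map(); // token -> { state, expires }
    }

    // Issue a random token that stands for some opaque paging state.
    issue(state, now = Date.now()) {
        const token = crypto.randomBytes(16).toString('hex');
        this.tokens.set(token, { state, expires: now + this.ttlMs });
        return token;
    }

    // Consume a token: returns the state once, or null if unknown or expired.
    consume(token, now = Date.now()) {
        const entry = this.tokens.get(token);
        if (!entry) { return null; }
        this.tokens.delete(token); // tokens are not reusable
        return entry.expires > now ? entry.state : null;
    }
}

const cache = new TokenCache(60 * 1000);
const token = cache.issue({ offset: 100 });
console.log(cache.consume(token)); // the stored state on first use
console.log(cache.consume(token)); // null: already consumed, so answer with a 4xx
```

A `null` result maps to the 4xx client error mentioned above; expired entries that are never presented again would still need periodic cleanup, which this sketch leaves out.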

Another thing to decide on is how we would initiate the paging request:

  1. We could automatically page any request with more than, say, 1000 results and insert a next property in the response.
  2. We could also let the client specify "paging = 100" or "paging = 1000" in the GET request, to support paging at the user's choice.

Let me know what you think.

I just found this comment in the tests, which suggests using "from" in the GET request:
https://github.com/wikimedia/restbase-cassandra/blob/master/test/index.js#L535

> I think the best way would be to use the same GET URI for the next page, but with an extra query parameter containing the next token ID. This way, we'd have the same route. As for malicious users with random tokens, we could keep the generated response tokens in a cache for a (limited) period of time and consult it when a request token comes in. If it's there, it's consumed, otherwise a 4xx client error is returned. That, however, means that tokens are not reusable, but that shouldn't be a problem AFAIK.

Another option to avoid token tampering would be to add a hash of the parameters with a secret salt to the token, and then verify that signature on the way back. This secret salt would need to be known to all instances of restbase. The good thing is that this avoids the need for a shared cache.
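A minimal sketch of the signed-token idea, using Node's built-in `crypto` module. It uses HMAC-SHA256 instead of a plain salted hash, since HMAC is the usual construction for keyed signatures. The token layout (`base64url(value) + '.' + base64url(signature)`) and the helper names `signToken` and `verifyToken` are assumptions for illustration, not something the thread has settled on.

```javascript
'use strict';
const crypto = require('crypto');

// base64url without padding, so the token is safe in a query string.
function b64url(buf) {
    return buf.toString('base64')
        .replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

function fromB64url(str) {
    return Buffer.from(str.replace(/-/g, '+').replace(/_/g, '/'), 'base64');
}

function hmac(buf, secret) {
    return crypto.createHmac('sha256', secret).update(buf).digest();
}

// Hypothetical token format: base64url(value) + '.' + base64url(HMAC(value)).
function signToken(value, secret) {
    const raw = Buffer.from(value, 'utf8');
    return b64url(raw) + '.' + b64url(hmac(raw, secret));
}

// Returns the original value, or null if the token is malformed or tampered with.
function verifyToken(token, secret) {
    const parts = String(token).split('.');
    if (parts.length !== 2) { return null; }
    const raw = fromB64url(parts[0]);
    const given = fromB64url(parts[1]);
    const expected = hmac(raw, secret);
    // timingSafeEqual throws on length mismatch, so check the length first.
    if (given.length !== expected.length ||
            !crypto.timingSafeEqual(given, expected)) {
        return null;
    }
    return raw.toString('utf8');
}

// The secret would have to be shared by all restbase instances.
const token = signToken('from=someKey&limit=100', 'shared-secret');
console.log(verifyToken(token, 'shared-secret')); // the original value
console.log(verifyToken(token, 'wrong-secret'));  // null
```

Because verification only needs the shared secret, any instance can validate a token issued by any other instance without consulting a cache.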

> Another thing to decide on is how we would initiate the paging request?

Basically, any request that has more results than provided in the first page (so with a limit higher than the page size) should have the next link. The typical way to figure out whether that is the case is to select one more result than needed for the first result page. We would need to make sure that our 'next' points to the proper next page, with that extra result as the first entry.
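The "select one more result than needed" check can be sketched as follows. `getPage` and `fetchRows` are hypothetical names, and a plain numeric offset stands in for the opaque next token purely to keep the example short.

```javascript
'use strict';

// Hypothetical helper: `fetchRows(start, count)` stands in for the real storage query.
function getPage(fetchRows, start, pageSize) {
    // Ask for one row more than the page holds.
    const rows = fetchRows(start, pageSize + 1);
    const hasMore = rows.length > pageSize;
    const result = { items: rows.slice(0, pageSize) };
    if (hasMore) {
        // The extra row becomes the first entry of the next page.
        result.next = start + pageSize;
    }
    return result;
}

// Toy data source: 7 rows held in an array.
const data = [1, 2, 3, 4, 5, 6, 7];
const fetchRows = (start, count) => data.slice(start, start + count);

console.log(getPage(fetchRows, 0, 3)); // { items: [ 1, 2, 3 ], next: 3 }
console.log(getPage(fetchRows, 6, 3)); // { items: [ 7 ] }
```

The extra row is never returned to the client; it only decides whether a next pointer is emitted, which avoids handing out a link to an empty final page.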

> Basically, any request that has more results than provided in the first page (so with a limit higher than the page size) should have the next link

@GWicke: How do we decide on the page size? Can a user ask for a paged response as well?

Say a user wants to get results with paging; they would send a simple request like:

{
    url: '/v1/restbase.cassandra.test.local/simpleTable/',
    method: 'get',
    body: {
        table: 'simpleTable',
        paging: 5, // a paged request returning only five results at a time
        attributes: {
            key: 'testing'
        }
    }
}

Do we want to support this functionality as well?

I just opened a pull request which currently handles most of this: https://github.com/wikimedia/restbase-cassandra/pull/52

Another question I have: the nodejs driver seems to emit a byte array instead of a token. Does anyone have an idea how to convert the byte array into a token?
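One possible answer, as a sketch: treat the bytes as opaque and encode them as a URL-safe base64 string, then decode the string back into a Buffer before handing it to the driver again. This assumes the paging state arrives as a Node `Buffer` (or plain array of bytes); how the driver exposes it and accepts it back is not shown here, and `pageStateToToken` / `tokenToPageState` are hypothetical helper names.

```javascript
'use strict';

// Encode raw paging-state bytes as a URL-safe string token.
function pageStateToToken(pageState) {
    return Buffer.from(pageState).toString('base64')
        .replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

// Decode the token back into the original bytes.
function tokenToPageState(token) {
    const b64 = token.replace(/-/g, '+').replace(/_/g, '/');
    return Buffer.from(b64, 'base64');
}

// Stand-in for a paging state returned by the driver.
const pageState = Buffer.from([0x00, 0xfb, 0xff, 0x10]);
const token = pageStateToToken(pageState);
console.log(token);
console.log(tokenToPageState(token).equals(pageState)); // true
```

Encoding alone does not protect against tampering, so a token built this way would still need to be validated as discussed earlier in this thread.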

GWicke reopened this task as Open. Edited Feb 22 2015, 9:32 PM

Let's reopen this, as this hasn't actually been implemented for restbase yet.

In the public interface, I think it makes sense to consider using HAL for the next pointer:

{
    "_links": {
        "next": { "href": "?next=aGVsbG8gd29ybGQ=" }
    },
    "items": [ ... ]
}

The public next token should be opaque and not user-editable. This example is just a base64-encoded 'hello world', but in real life we might want to use a signed string. One option that comes to mind is a JWT with a simple salted hash signature (though this might be a bit heavy); the other is a simple custom base64(value) + '.' + base64(salted_hash) scheme.

Example with jsonwebtoken:

$ npm install jsonwebtoken
npm WARN package.json yaml@0.2.3 No repository field.
npm WARN engine jsonwebtoken@3.2.2: wanted: {"npm":">=1.4.28"} (current: {"node":"0.10.29","npm":"1.4.21"})
jsonwebtoken@3.2.2 node_modules/jsonwebtoken
└── jws@1.0.1 (jwa@1.0.0, base64url@1.0.4)
$ node
> var jwt = require('jsonwebtoken');
undefined
> jwt.sign('pagingparameter', 'secret')
'eyJhbGciOiJIUzI1NiJ9.cGFnaW5ncGFyYW1ldGVy.P7WRYem2GVH99EWsvOw-dqVPM4x_eTqTi3i4FYIeSes'

Not terribly compact, but using sha256 & a standard algorithm.

This is now merged. Should go out to production later this week.