Paginate API requests

Pagination is a common method used for handling large datasets and responses in web sites. It is also a useful method for API driven applications. We recently applied this method to our mobile app, SDK2, because we were experiencing crashes in the field.

We were able to identify the cause of the crashes. It turned out that the server was timing out when the app was downloading a list of records. Some of the record lists for our users are large datasets (hundreds of thousands of rows), and it was when one of these larger lists was downloading, we would see a crash. The solution was obvious; paginate the requests.

The basic idea is that the client will make a request to the endpoint that serves the records. The request will pass some extra parameters, "offset" and "limit", as a query string to the endpoint. The server is configured to accept these parameters from the request, and only responds with the limited amount of records. In order to get all of the records from the server, the client will need to make multiple requests, increasing the offset until all of the records have been downloaded. Take a look below at some sudo code.

fetchRemoteRecords(dataset) {
let offset = 0;
let limit = 150;
let fetchedRecords = [];
while(offset < dataset.record_count){
let response = await fetch(apiUrl + "/api/datasets/" + dataset.id + "/records?"+"offset="+offset+"&limit="+limit, {
method: 'GET'
});
fetchedRecords.push(response);
offset += limit;
}
await AsyncStorage.setItem('survey-' + survey.id + '-records', JSON.stringify(fetchedRecords));
}

This is how the client can paginate it's requests to the server. We start with an 'offset' of zero, which will let the server know to start the database query at the first index of the table. Next we set the variable 'limit' to 150. This will let the server know how many records we want per request. We then set an empty array as the value of the variable 'fetchedRecords'. This will collect the records after each request. The requests are then made within the 'while' loop that will run as long as the value of 'offset' is less than the total amount of records in the dataset. We make this easy by having the record count as a property of the dataset object itself. Inside the loop, we make a request with the 'limit' and 'offset' passed as parameters. The response returns, and the records are pushed into the fetchedRecords array. The last thing that happens is that the offset gets increased by the value of limit, and the loop repeats. When all of the records have been downloaded, the loop will end, and all of the records will be set in the local storage of the device. This only works if the client is making a request to a server that is configured to accept these parameters, and knows what to do with them. Let's see how we did that. Here is a bit of how our server works for these requests.

def index
limit = params[:limit] || 150
limit = 150 if limit.to_i > 150
@records = @survey.records.limit(limit).offset(params[:offset])
respond_with @records
end


Pretty straight forward and simple. You can see here that we set a hard limit of 150 records, to avoid any requests that might be too large and cause a crash, defeating the entire purpose of paginating. The limit and offset parameters are then passed as arguments to the query for the records. This endpoint will get hit with requests repeatedly until all of the records have been sent.

If we could listen to the conversation between the client and server during the process, it would sound something like this:
Client: "Give me the first 150 records from this dataset."
Server: "Ok. Here you go."
Client: "Give me next 150 records from this dataset."
Server:"Ok. Here you go."
Client: "Give me next 150 records from this dataset."
Server:"Ok. Here you go."
... and so on until the client's offset for the request exceeds the total amount of records.

In conclusion, paginating your API requests is a great idea, and it is pretty simple to pull off. Now, while it won't really give you any performance gain as far as speed, it will prevent your requests from causing long running queries that will ultimately time out and cause the client to crash.

Matt
Nov. 09 2018