The Ultimate Tutorial For Django Rest Framework: Pagination (Part 4)

Dominik Kozaczko

6 February 2019, 5 min read

I’m back with another part of my tutorial for Django REST framework.

Be sure to catch up with the work we’ve completed in the other parts of the series.

Today, I wanted to take a closer look at another issue: pagination.

Note 1: Throughout this article, I refer to this excellent source of best practices for a pragmatic RESTful API.

Note 2: You can find the project code we’re working on in this series in this repository.

Let’s now delve into the problem of pagination in the Django REST Framework.

Why pagination?

Have a look at the standard answer at one of our endpoints.

$ curl http://127.0.0.1:8000/api/v1/friends/

[{"id":1,"name":"John Doe","has_overdue":true},{"id":2,"name":"Frank Tester","has_overdue":false}]

We got a standard list of objects. So far, so good.

But what if our endpoint returns thousands of objects? Serializing and transmitting that much data may take long enough for the application user to notice a delay.

We can solve this problem with pagination: dividing the results into pages of a fixed size. The best strategy here is the limit + offset method, where the `limit` parameter passed in the GET query string specifies the number of elements per page, and `offset` specifies how many elements to skip from the beginning of the list.

That’s the universal way to handle the more traditional transition between subpages, as well as the sometimes desirable method called "infinite scroll."
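
To make the two parameters concrete, here is a minimal sketch in plain Python, independent of Django. The `paginate` helper and the sample data are hypothetical and only illustrate the slicing semantics; they are not part of the project code.

```python
def paginate(items, limit, offset=0):
    """Return one page: at most `limit` items, skipping the first `offset`."""
    if limit < 1 or offset < 0:
        raise ValueError("limit must be positive and offset non-negative")
    return items[offset:offset + limit]


friends = ["John Doe", "Frank Tester", "Jane Roe"]  # made-up sample data

first_page = paginate(friends, limit=2)             # items at index 0 and 1
second_page = paginate(friends, limit=2, offset=2)  # the remaining item
```

An infinite-scroll client would simply keep increasing `offset` by `limit` until it receives a page shorter than `limit`.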

The documentation recommends adding the following entries to the REST_FRAMEWORK settings in settings.py:

REST_FRAMEWORK = {
    # ...
    'DEFAULT_PAGINATION_CLASS': 'rest_framework.pagination.LimitOffsetPagination',
    'PAGE_SIZE': 100,
    # ...
}

That results in the following (I used the limit parameter and formatted the result for readability using json_pp):

$ curl http://127.0.0.1:8000/api/v1/friends/?limit=1 | json_pp
{
   "previous" : null,
   "next" : "http://127.0.0.1:8000/api/v1/friends/?limit=1&offset=1",
   "count" : 2,
   "results" : [
      {
         "id" : 1,
         "has_overdue" : true,
         "name" : "John Doe"
      }
   ]
}

As you can see, the results were enveloped. This practice is justified when the client can’t handle HTTP headers, but it’s slowly becoming obsolete these days.

The most recent guides to best practices recommend transmitting metadata in headers while allowing enveloping on demand. We will implement this solution below.

We follow these assumptions:

The endpoint must return a list of objects in the same structure as before.
Pagination is carried out using the `limit` and `offset` parameters.
Additional metadata is included in the appropriate headers.
The code supports enveloping on demand (enabled by a query parameter).
The response always contains links in the headers, even when the enveloped version has been requested.

Note: There exists a django-rest-framework-link-header-pagination library, but it doesn’t implement the limit / offset mechanism which is of interest to us here.

The simplest solution is to subclass rest_framework.pagination.LimitOffsetPagination, because most of the logic is already implemented there.

To begin, let's handle the parameter that turns enveloping on:

from collections import OrderedDict
from rest_framework.pagination import LimitOffsetPagination
from rest_framework.response import Response
from rest_framework.utils.urls import replace_query_param, remove_query_param


class HeaderLimitOffsetPagination(LimitOffsetPagination):
    def paginate_queryset(self, queryset, request, view=None):
        self.use_envelope = False
        if str(request.GET.get('envelope')).lower() in ['true', '1']:
            self.use_envelope = True
        return super().paginate_queryset(queryset, request, view)
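
To see which query values switch the envelope on, the same check can be pulled out into a standalone function. This is only an illustrative sketch: `wants_envelope` is a hypothetical helper, and a plain dict stands in for `request.GET`.

```python
def wants_envelope(query_params):
    """Mirror the check above: only 'true' or '1' (case-insensitive) enable it."""
    # A missing parameter yields None, which str() turns into 'None',
    # so the envelope stays off by default.
    return str(query_params.get('envelope')).lower() in ['true', '1']


enabled = wants_envelope({'envelope': 'True'})
disabled_by_default = wants_envelope({})
unknown_value = wants_envelope({'envelope': 'yes'})
```

Any value other than `true` or `1` leaves the plain list response in place, so existing clients are unaffected.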

Next, we write the method that returns the data:

def get_paginated_response(self, data):
    next_url = self.get_next_link()
    previous_url = self.get_previous_link()

    links = []
    for url, label in (
        (previous_url, 'prev'),
        (next_url, 'next'),
    ):
        if url is not None:
            links.append('<{}>; rel="{}"'.format(url, label))
    headers = {'Link': ', '.join(links)} if links else {}
    if self.use_envelope:
        return Response(OrderedDict([
            ('count', self.count),
            ('next', self.get_next_link()),
            ('previous', self.get_previous_link()),
            ('results', data)
        ]), headers=headers)
    return Response(data, headers=headers)
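
The header-building part of this method can be tried in isolation. The sketch below assumes a hypothetical `build_link_header` helper that takes a mapping of relation names to URLs; it follows the same `<url>; rel="name"` format used above and skips missing links.

```python
def build_link_header(urls_by_rel):
    """Join the available links into a single Link header value."""
    parts = [
        '<{}>; rel="{}"'.format(url, rel)
        for rel, url in urls_by_rel.items()
        if url is not None  # e.g. there is no previous page on the first page
    ]
    return ', '.join(parts)


header = build_link_header({
    'prev': None,
    'next': 'http://127.0.0.1:8000/api/v1/friends/?limit=1&offset=1',
})
```

On the first page only the `next` entry survives, so the header carries a single link.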

To bring it all in line with best practices, we only need to add links to the first and last pages.

Let's add these two methods:

def get_first_link(self):
    if self.offset <= 0:
        return None
    url = self.request.build_absolute_uri()
    return remove_query_param(url, self.offset_query_param)

def get_last_link(self):
    if self.offset + self.limit >= self.count:
        return None
    url = self.request.build_absolute_uri()
    url = replace_query_param(url, self.limit_query_param, self.limit)
    offset = self.count - self.limit
    return replace_query_param(url, self.offset_query_param, offset)
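
The offset arithmetic behind these two methods is easier to follow with concrete numbers. The functions below are a hypothetical, Django-free restatement of the same conditions, not part of the paginator itself.

```python
def first_offset(offset):
    """Offset of the first page, or None when we are already on it."""
    return None if offset <= 0 else 0


def last_offset(offset, limit, count):
    """Offset of the last page, or None when the current page reaches the end."""
    if offset + limit >= count:
        return None
    return count - limit


# Five objects, two per page, currently on the first page.
on_first_page = (first_offset(0), last_offset(0, limit=2, count=5))
# Same collection, currently showing the final two objects.
on_last_page = (first_offset(3), last_offset(3, limit=2, count=5))
```

Note that with this formula the last link always points at a full page of `limit` items (offset 3 in the example), rather than at a shorter page aligned to multiples of `limit`.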

All that remains is to bring the `get_paginated_response` method to its final form:

def get_paginated_response(self, data):
    next_url = self.get_next_link()
    previous_url = self.get_previous_link()
    first_url = self.get_first_link()
    last_url = self.get_last_link()

    links = []
    for label, url in (
        ('first', first_url),
        ('next', next_url),
        ('previous', previous_url),
        ('last', last_url),
    ):
        if url is not None:
            links.append('<{}>; rel="{}"'.format(url, label))
    headers = {'Link': ', '.join(links)} if links else {}
    if self.use_envelope:
        return Response(OrderedDict([
            ('count', self.count),
            ('first', first_url),
            ('next', next_url),
            ('previous', previous_url),
            ('last', last_url),
            ('results', data)
        ]), headers=headers)
    return Response(data, headers=headers)

Where to put all that code?

The best place to put this code is a separate file that can be easily imported from anywhere in the project.

Let's assume that we create a `pagination.py` file containing the above class in our book rental application. We will change the REST_FRAMEWORK configuration to this:

REST_FRAMEWORK = {
    # ...
    'DEFAULT_PAGINATION_CLASS': 'rental.pagination.HeaderLimitOffsetPagination',
    'PAGE_SIZE': 100,
}

You can also use the library I prepared with the code above: install it with `pip install hedju`, then set DEFAULT_PAGINATION_CLASS to 'hedju.HeaderLimitOffsetPagination'.

Since everything is ready, all that’s left is API testing; curl with the -v parameter will show us headers (I’ve removed irrelevant information):

$ curl "http://127.0.0.1:8000/api/v1/friends/?limit=1" -v
*   Trying 127.0.0.1...
...
< Content-Type: application/json
< Link: <http://127.0.0.1:8000/api/v1/friends/?limit=1&offset=1>; rel="next", <http://127.0.0.1:8000/api/v1/friends/?limit=1&offset=1>; rel="last"
[{"id":1,"name":"John Doe","has_overdue":true}]

$ curl "http://127.0.0.1:8000/api/v1/friends/?limit=1&envelope=true"
{
   "last" : "http://127.0.0.1:8000/api/v1/friends/?envelope=true&limit=1&offset=1",
   "next" : "http://127.0.0.1:8000/api/v1/friends/?envelope=true&limit=1&offset=1",
   "first" : null,
   "results" : [
      {
         "id" : 1,
         "has_overdue" : true,
         "name" : "John Doe"
      }
   ],
   "previous" : null,
   "count" : 2
}

Done!

As an aside, it's worth mentioning that the requests library supports navigation through Link headers:

In [1]: import requests
In [2]: result = requests.get('http://127.0.0.1:8000/api/v1/friends/?limit=1')
In [3]: result.links
Out[3]:
{'next': {'url': 'http://127.0.0.1:8000/api/v1/friends/?limit=1&offset=1',
  'rel': 'next'},
 'last': {'url': 'http://127.0.0.1:8000/api/v1/friends/?limit=1&offset=1',
  'rel': 'last'}}
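
Building on `result.links`, a client could walk the whole collection by following the `next` link until it disappears. The generator below is a rough sketch under a few assumptions: `iter_items` is a hypothetical helper, `fetch` is any callable that returns a requests-style response (such as `requests.get`), the endpoint returns the non-enveloped list, and error handling is left out.

```python
def iter_items(fetch, url):
    """Yield every object from a header-paginated endpoint, page by page."""
    while url is not None:
        response = fetch(url)
        # Without the envelope, the response body is a plain list of objects.
        yield from response.json()
        # `links` maps each rel value to a dict that holds the target URL;
        # on the last page there is no 'next' entry, which ends the loop.
        url = response.links.get('next', {}).get('url')


# Possible usage, assuming the requests library is installed:
#   import requests
#   for friend in iter_items(requests.get, 'http://127.0.0.1:8000/api/v1/friends/?limit=1'):
#       print(friend['name'])
```

Passing the fetching function in as a parameter keeps the helper easy to exercise with a stub instead of a live server.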

That's all, folks! In the next part, I’ll discuss the subject of filtering the data list.
