This presentation is a part of our weekly talks. This week, our CTO Przemek talked to us about the use of PostgreSQL and JSON in combination with Python.
Check out the content of the presentation below or watch the video (in Polish). Przemek’s presentation slides are available here.

Why use JSON in our databases at all?

Database experts want order in their databases. JSON comes in handy for managing data that has no schema but we still want to include it in our database for later organization.

Sometimes we might have data where the schema is unknown – for example, when we download undocumented data from API. That’s also where JSON can help.

It’s also a good solution for storing documents we need in a single place to avoid expensive joins later on.

In some applications, the user establishes their own schema – for example, the ecommerce context the user may want to define product characteristics. There exist solutions for making that work in a relational database, but doing it in JSON simply makes our lives easier.

PostgreSQL types

PostgreSQL has been introducing novel solutions on a regular basis. Since version 8.2, it introduced the possibility of storing XML and HStore documents in databases.

XML: PostgreSQL just checks whether XML documents are correct and validates them. Users can also search through XML.

HStore: it’s a simple key value storage, both keys and values that end up in HStore become strings.

JSON: available since version 9.2, JSON ended up being quite similar to XML – JSON type is a string and the only thing users can do is validate whether its syntax is correct.

JSONB: available since version 9.4, full implementation of JSON, data is recorded binarily.

HStore vs JSON vs JSONB

Essentially, these 3 data types are continuous steps of the evolution of PostgreSQL’s ability to store unstructured data. There are several thing you should consider when using either of these options.

HStore:

    • Accepts only simple key/value pairs,
    • Includes only strings,
    • Doesn’t allow nesting (at least not directly, but you can always use another key as value since they’re all strings),
  • Allows GiST index which is suitable for traversing hierarchical data.

JSON:

    • Behaves like a text type that validates whether the text has an adequate structure.
  • Retains the exact structure of input document, including unsorted and duplicated keys.

JSONB:

    • The most sophisticated option out of these three,
    • Operates on all JSON data types,
    • Allows nesting,
  • Allows GIN indexing, suitable for traversing key/value documents.

JSON can be cast to JSONB and vice versa, so choosing between them is also a matter of deciding if you need faster writes (JSON) or faster queries (JSONB).

JSONB

JSONB is a full JSON implementation that offers a binary data storage. Note that it doesn’t provide developers with a key order or duplicate preservation. There’s also no date type. On the other hand, JSONB allows quick access operations and easy indexing.

SQLAlchemy

SQLAlchemy is a Python library for database service that allows using the three types of data mentioned above. It supports developers in writing JSON/JSONB queries with the help of Python. Have a look at this documentation to learn more about SQLAlchemy.

Django ORM

So how does the situton look like in Django? For a long time, developers had no support for PostgreSQL. There were some libraries available like Django HStore that were external applications. The community created support for PostgreSQL and these data types. At some point, the Django community decided to create something similar to SQLAlchemy. In Django 1.8, we had HStoreField and since 1.9, JSONField which uses JSONB.

Have you got any questions about making PostgreSQL and JSON work with Python?

Leave us a comment; we’re always happy to share our knowledge and start a conversation about best practices in the community.

Przemek Lewandowski
Przemek
CTO

Przemek is a co-founder and CTO of Sunscrapers. He’s a graduate of Warsaw University of Technology which lead him to become a software consultant. At Sunscrapers, Przemek is a technical leader who supervises high-quality service delivery, helps to solve problems and mentors other team members. Przemek is a passionate community activist who organises and speaks at tech events and leads open source projects.

LOAD COMMENTS

arrow

Growth & culture

What Are the Benefits of Outsourcing IT Services?

Whether it’s a large enterprise or a small startup, businesses of all kinds are now choosing to outsource IT services for many different reasons. Naturally, outsourced services offer many [...]

Growth & culture Project management Startups

Agile software development: 7 best practices

At Sunscrapers, we believe that Agile is more than a collection of software development techniques or ceremonies. For us, Agile is a group of methodologies teams can mix as [...]

Join our newsletter.

Scroll to bottom

Hi there, we use cookies to provide you with an amazing experience on our site. If you continue without changing the settings, we'll assume that you're happy to receive all cookies on the Sunscrapers website. You can change your cookie settings at any time.

Learn more