In this post, I want to share a different approach to organizing business logic code in Django projects that I’ve been toying with for a while now. It’s still a work in progress, so I’m looking forward to hearing your thoughts and comments.
Let’s start with the basics: the business logic layer (as I will be using the term in this post) is a layer in a piece of software that encodes real-world rules that determine how data is created, displayed, stored, and changed. It’s linked to the primary business problem your software aims to solve.
The business logic layer prescribes how business objects interact with one another and defines the methods by which these objects can be updated.
So where do we put business logic when coding in the Django framework?
Here are 4 possible solutions with their pros and cons.
Idea #1: Fat Models
I believe that the “default” idea supported by the Django documentation is to make your models “fat.” To put it colloquially, this approach suggests that if you don’t know where some of the code should go and it somehow relates to the object (and what doesn’t?), then it’s best to put it in the model.
The idea – often presented as a common pattern in MVC-style programming – is to build thick/fat models and thin controllers. In Django, that translates to thick models with lots and lots of small methods and thin views which use those methods and keep their logic as minimal as possible. There are lots of benefits to this approach.
Why adopt this approach? Here are 3 reasons:
- It keeps your code DRY – rather than repeating code in views or forms, you put everything into the model, and since you might have many views or forms manipulating the same models, you avoid duplication. That’s the common opinion, but I’m not sure about it – more on that later on.
- It’s testable – breaking up logic into small methods in the model makes your code easier to unit test.
- Readability – because it’s testable, object-related code is also easy to comprehend. By giving your methods friendly names, you can abstract ugly logic into something that is easy to read and understand.
Here’s an example:
Let’s say your task is building a company blog. Here’s what your code could look like:

    from django.db import models
    from django.utils.translation import gettext_lazy as _


    class Post(models.Model):
        # Workflow states a post moves through, in order.
        STATUS_DRAFT, STATUS_APPROVED, STATUS_PUBLISHED = range(3)
        STATUS_CHOICES = (
            (STATUS_DRAFT, _('Draft')),
            (STATUS_APPROVED, _('Approved')),
            (STATUS_PUBLISHED, _('Published')),
        )

        title = models.CharField(max_length=255)
        text = models.TextField()
        status = models.PositiveSmallIntegerField(
            choices=STATUS_CHOICES, default=STATUS_DRAFT
        )

Note the statuses that appear in the code (DRAFT, APPROVED, PUBLISHED). Let’s consider the business logic now: someone may ask you to implement a feature that prevents blog posts that haven’t been approved by a supervisor from being published.
So you’ll probably write something like this:
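
    class PostInvalidStatusError(Exception):
        """Raised when a post is published without prior approval."""


    class Post(models.Model):
        ...

        def approve(self):
            self.status = Post.STATUS_APPROVED
            self.save()

        def publish(self):
            # Only approved posts may be published.
            if self.status != Post.STATUS_APPROVED:
                raise PostInvalidStatusError()
            self.status = Post.STATUS_PUBLISHED
            self.save()
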
It’s small and pretty straightforward – just right for a blog. And look how readable it is. If someone new came on board and had to deal with that code base, your intention would be perfectly clear to that person.
But the method has its cons as well: in a bigger system, it may lead to very large models in your Django apps. Such models are hard to maintain in large codebases and can result in badly optimized solutions (like sub-optimal database queries or network requests, which I will talk about later on).
Idea #2: Putting Business Logic in Views/Forms
Another idea is to place business logic in forms or views.
Though I’ve seen it being seriously suggested, I don’t think that putting business logic in forms, views, or serializers is a clean idea. I don’t mean that you shouldn’t put any logic there at all: for example, logic related to dynamic order modification or user preferences works fine here. But logic that relates to placing an order, saving a blog post, or validating the status of a blog post inside a form is a silly idea.
The main reason is the minuscule chance that the form, view, or serializer will remain the only entry point for that particular scenario. While at the moment you might only have one form and view for editing a blog post, you’re making it unnecessarily difficult to design a future API, or to access and manipulate your blog posts through management commands, background tasks, or anything else for that matter.
Just don’t do that.
Idea #3: Services
Another commonly advocated idea is that it’s worth adding a separate layer of code, called services, between views and models. That is quite unusual in a vanilla Django system, but some projects successfully employ that method.
The idea is that you divorce your models, which keep only the definitions of your objects and their relations to other objects, from your services (sometimes called actions or behaviours), where you put what the objects can do or what you can do with them.
An immediate positive consequence of that approach is that one object of a particular kind is no longer the default domain or scope, if you will, of an action. So instead of approve_post(post), we could equally easily have approve_posts(posts) accepting a set of posts, rather than one post.
Or, we could have a bundle_sold_products() behaviour that takes a subset of products bought by one client, and bundles them in one shipment. The idea is that you no longer need to worry whether a function is more naturally defined as a method on model A or model B, because there might be real-world examples that are neither.
Here’s an example:

    def approve_post(post):
        post.status = Post.STATUS_APPROVED
        post.save()


    def publish_post(post):
        # Only approved posts may be published.
        if post.status != Post.STATUS_APPROVED:
            raise PostInvalidStatusError()
        post.status = Post.STATUS_PUBLISHED
        post.save()

You’ve got two simple functions in front of you that capture different actions. In our case, it’s post approval and the business logic behind publishing a post.
However, there are still some disadvantages in this approach.
First, the cognitive load involved in coming up with that code is considerable and the process may not be clear to everyone. The only advantage of that approach is taking a long file and cutting it into a number of medium/short files. It won’t force you to consider sub-optimal solutions – and that’s something I care about when putting business logic into code.
To sum up:
I think the idea isn’t that bad. It’s testable and, depending on how you do it, readable. However, that comes at a price: a lack of structure that comes with large code bases – there’s simply not much agreement right now about how to structure services in Django.
Idea #4: QuerySets/Managers
The main idea that I want to present here is very simple:
I think that we’re mistaken in implicitly thinking of what we called actions/behaviours as functions of just one object.
I believe that to be a false and very limiting assumption. I guess it stems from the object-oriented paradigm, of which I’m a huge advocate, but it is nonetheless false.
The inspiration for this idea came from my realisation that more often than not we end up implementing batch actions, be that an MVP for a startup or a new feature for a big legacy system.
So returning to our blog example, imagine that your corporate blog is now a success and more people start using your app. The supervisors request the possibility to approve posts in batches, rather than clicking the approve button on each individual post.
You could implement that in a number of ways: you can implement a MultipleObjectUpdateView, you can write a serializer for a set of objects if your frontend talks to the backend through an API, or you can use Django Admin actions. None of that matters. What matters is that implementing the methods on querysets/managers rather than on models allows for more efficient queries. If you don’t, you either run into a lot of repeated code or you run unoptimised queries.
The idea is to implement the approve() and publish() methods on the queryset/manager rather than on the model, and to treat approving or publishing 1 blog post simply as a special case of approving or publishing n blog posts.
The reason why I’m saying querysets/managers is that in Django you can easily get one from the other, e.g. by defining a queryset and then calling as_manager() on it. For other reasons, I think that querysets are a slightly better solution, but it’s a minor thing.
So here’s what it looks like:
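
    class PostQuerySet(models.QuerySet):
        def approve(self):
            # One UPDATE for the whole set; returns the number of rows matched.
            return self.update(status=Post.STATUS_APPROVED)

        def publish(self):
            # Only approved posts are published; the rest are left untouched.
            return self.filter(status=Post.STATUS_APPROVED).update(
                status=Post.STATUS_PUBLISHED
            )


    class Post(models.Model):
        ...
        objects = PostQuerySet.as_manager()
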
Remember how I had some doubts about whether putting methods in models really gives us DRY code? Well, I think that in most projects – whether you’re building a startup or joining a large legacy system – sooner or later someone will come up with a request to enable batch actions. If you use the model approach, you can either:
- leave the code in the models and duplicate the code on the queryset level;
- loop over instances and save them individually.
Now, if you started working with queryset/managers, you’re covered right from the start.
Looking at the criteria I employed to evaluate previous approaches, this idea comes out pretty well because it’s:
- Testable – if you know how to test models and you know how to use querysets, you know how to test querysets, period;
- Readable – I believe that the readability is on a par with fat models: when I read a PostManager class or a PostQuerySet class, I know I’m looking at things to do with filtering and/or manipulating collections of Posts;
- DRY – that’s why I like the idea better than completely custom service classes. Querysets are used very often in Django. You use querysets when limiting the scope of generic views, working with permissions, or writing Django admin actions! They’re everywhere, which is why I strongly believe in this solution;
- Optimized – querysets are built with collections of objects in mind. Rather than looping over a set of Posts, setting each status and saving them individually, which would take n updates, you can issue just one query. This is possible in the vast majority of data models, provided you use a relational database.
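To make the “n updates versus one query” point concrete, here is a minimal sketch that uses Python’s built-in sqlite3 module as a stand-in for the ORM. The table layout and the helper names (make_db, publish_in_loop, publish_in_bulk) are assumptions made for illustration only; in Django the bulk variant roughly corresponds to what a filter(...).update(...) call sends to the database.

```python
import sqlite3

STATUS_DRAFT, STATUS_APPROVED, STATUS_PUBLISHED = range(3)


def make_db(statuses):
    """Create an in-memory 'post' table with one row per given status."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE post (id INTEGER PRIMARY KEY, status INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO post (status) VALUES (?)", [(s,) for s in statuses]
    )
    conn.commit()
    return conn


def publish_in_loop(conn):
    """Per-instance style: one SELECT, then one UPDATE per approved post."""
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM post WHERE status = ?", (STATUS_APPROVED,)
        )
    ]
    for post_id in ids:
        conn.execute(
            "UPDATE post SET status = ? WHERE id = ?",
            (STATUS_PUBLISHED, post_id),
        )
    conn.commit()
    return len(ids)  # number of UPDATE statements issued


def publish_in_bulk(conn):
    """Queryset style: a single UPDATE ... WHERE covers every approved post."""
    cursor = conn.execute(
        "UPDATE post SET status = ? WHERE status = ?",
        (STATUS_PUBLISHED, STATUS_APPROVED),
    )
    conn.commit()
    return cursor.rowcount  # rows changed by that one statement
```

Both functions leave the table in the same state; the difference is that the loop issues one UPDATE per approved post, while the bulk version issues exactly one statement regardless of how many posts are approved.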
But no approach is perfect and this one has some disadvantages as well – it makes checking business rules a bit more complicated and integrations may become more difficult.
First, checking business logic when bearing only one Post in mind is simple: you either publish the post if it’s approved, or raise an exception if it’s still a draft. With sets of Posts, that becomes complicated, because you might get a set of objects with different statuses.
In principle, you will want to employ one of two general strategies:
- In some cases you will want to only update those objects that conform to the rule. For example, there is no harm in publishing only 5 posts even if a set of 10 was passed on a corporate blog. Just inform the user that you published 5 out of 10 posts they asked for, say they need to work on the remaining ones, and you’re good.
- In some cases, you will want to fail early if not all passed objects conform to the rule. For example, there is no point in booking the user’s journey if only 2 out of 3 airplane tickets are available.
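The two strategies can be sketched in plain Python, without Django, to show the difference in behaviour. The Post dataclass below is only a stand-in for the model, and the function names publish_conforming and publish_all_or_nothing are made up for this illustration.

```python
from dataclasses import dataclass

STATUS_DRAFT, STATUS_APPROVED, STATUS_PUBLISHED = range(3)


class PostInvalidStatusError(Exception):
    """Raised when a post in the batch is not in the required status."""


@dataclass
class Post:
    # Plain stand-in for the Django model, just enough to show the rule.
    title: str
    status: int = STATUS_DRAFT


def publish_conforming(posts):
    """Strategy 1: publish what can be published and report the rest."""
    published = [p for p in posts if p.status == STATUS_APPROVED]
    skipped = [p for p in posts if p.status != STATUS_APPROVED]
    for post in published:
        post.status = STATUS_PUBLISHED
    return published, skipped


def publish_all_or_nothing(posts):
    """Strategy 2: check the whole batch first; change nothing on failure."""
    if any(p.status != STATUS_APPROVED for p in posts):
        raise PostInvalidStatusError("not every post is approved")
    for post in posts:
        post.status = STATUS_PUBLISHED
    return len(posts)
```

With a real queryset, the first strategy maps naturally onto a filter followed by update(), whose return value (the number of rows matched) can be used to tell the user how many posts were published. The second would probably check the set and perform the changes inside a database transaction, so that nothing is left half-done.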
A general rule that I came up with is that case 1 applies to cases which only update the individual passed objects without modifying or creating a related object. For example, you only change the status field of n Posts but it does not affect anything else.
Case 2 will generally apply when the action modifies or creates some related objects. There is no point in compiling a subset of passed Posts if you are building an e-book out of them.
Still, your mileage will vary.
Second point – integrations. As I said previously, this approach works nicely with atomicity and transactions provided by relational databases. But what if you need to communicate with an external service?
Again, you might run into one of 3 general scenarios:
- If you’re integrating with a service that has a mature, robust API that allows for batch actions, you’re in luck. Not only is there very little you need to do, but the queryset/manager approach will potentially save you both from unoptimised DB queries and from unnecessary network requests.
A lot of APIs do that. You can submit a batch of payments to PayPal. You can submit a batch of emails to Sendgrid or Mailgun, etc.
- If the external service does not accept batch actions but a rollback is possible and is not costly, then you can try and request the individual actions, and if all of them succeed, you can go ahead and update your DB in one go. If one of them fails, you can roll back the previous ones.
- If neither batch actions nor a cheap rollback is available, you’re doomed. But at least you’re not doomed by the queryset approach: you simply fall back to calling the action in a loop for every object.
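The second scenario (no batch endpoint, but a cheap rollback) could be sketched as follows. Everything here is hypothetical: ServiceError stands for whatever error the external client raises, and perform and undo are callables you would write around that client. It is a sketch of the control flow, not of any particular service’s API.

```python
class ServiceError(Exception):
    """Hypothetical error raised by the external client."""


def run_with_rollback(items, perform, undo):
    """Call perform(item) one by one; undo the completed calls if one fails.

    Returns the list of processed items when every call succeeds.
    On failure, rolls back in reverse order and re-raises the error.
    """
    done = []
    try:
        for item in items:
            perform(item)
            done.append(item)
    except ServiceError:
        for item in reversed(done):
            undo(item)
        raise
    return done
```

Only once run_with_rollback returns normally would you go ahead and update the database in one go. This sketch assumes that undo itself does not fail; a production version would need to decide what to do when it does.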
That sums up my take on different approaches to organising business logic code in Django projects.
While I have been successfully using the queryset approach in the projects that I work on, I’m genuinely interested in any shortcomings that I might have overlooked and in your experiences in the matter. Let me know in the comments!
The post is based on the video presentation for our weekly tech talks. You will find it here.