This is the third and last post in our series about data warehouses. If you missed the previous ones, check them out here:

Data warehouses – what they are and how to classify them (Part 1)

Data warehouses – what they are and how to classify them (Part 2)

In our series, we have showed you:

What data warehouses are
Who needs a data warehouse
Tips about implementing and designing data warehouses

But bow to make sure that your data warehouse is safe? Here’s the answer.

Security in data warehouses

They say there are two types of people dealing with databases: those who do backups and those who will do them. This well-known phrase is still circulating because it’s actually quite accurate. One of the most significant aspects of data warehouses that often gets overlooked or downplayed is the management of historical data and backups.
In this part of our series about data warehouses, I wanted to share some security essentials.
But before I explain how to take care of security in data warehouses, there’s one thing we need to make clear first: never delete anything permanently. Just don’t. In theory, everyone who has some experience with Big Data knows that, but you may come across cases where reconstructing the current state of the database or analyzing the process of data creation is impossible.

Data warehousing disasters

One of the most common scenarios that can lead to a disaster is a loss of some or all of the data. That might happen for many different reasons, from a disk array failure to a poorly designed functionality or an error made by a developer.
It may also happen that data in our warehouse loses coherence. The recovery of data integrity without any historical data, a changelog, or at least a backup will be challenging – in some cases it might not even be possible!
The data we store in our warehouse is going to be constantly updated. In time, we might find ourselves in need of historical data that no longer exists. When creating a data warehouse, remember that the data stored there may be used in the future for other purposes than those we planned initially.
It may happen that we want to generate analytical data showing the changes that took place over several months – if we don’t record information about such changes, this won’t be possible.
Consider the following examples:

We have a website that manages sports bets and we would like to check how users behave in relation to the information published about a team or player.
Another example is where we need to check the details of invoices issued during a given moth or when it turns out that something isn’t right in our quarterly settlement.
It might happen that we need information about users who logged into our system to verify and resolve any ambiguities that may arise because of their activity.

So how do we protect our data warehouses?

There are basically two good solutions you can choose from when ensuring the safety of data stored in your data warehouse.

Regular backups (Dumps)

Backup copies are something that should be part of the definition of a data warehouse. Depending on how often our database is updated, we should make automatic backups. It’s common to perform a database dump once a day and kept for several months. But in some situations, we may need to perform it more often and keep backups for a more extended period. There are many tools available for creating and storing database dumps – most of them contain compression features that help to save space on the hard drive.


In addition to regular backups, it’s a good idea to keep a record of changes for individual atomic information. That allows making corrections to our system and then reproducing the current state of the database with changes taken into account. The downside of this solution is the need to store quite a large amount of data redundancy. Also, the implementation of such a mechanism is not easy, so before we decide to create a changelog, we should ask ourselves whether we really need it.
I hope this guide to data warehousing helps you to make the right decision regarding the methods for storing data at your organization.

Have you got any questions about data warehouses? Give us a shout out in the comments; we’re looking forward to hearing from you.
Got a data science project on your mind? Hire us to get top expertise in the field.

Data science Web development

Single Responsibility Principle in React applications – Part 2

This is the second part of my blog post series about dealing with React components that have too many responsibilities. Have a look at the first part of the [...]

Data science Python

Developing a recommendation system? Here’s why you should choose Python

It’s safe to say that recommendation systems (also called recommendation engines) have become one of the most essential elements guiding users throughout the web. Consider websites like Netflix: The [...]

Get insights from software experts.

Almost finished…

But we need to confirm your email address first.

To complete the subscription process, please click the link in the email we’ve just sent you.

Sunscrapers Sp. z o.o.

ul. Pokorna 2/947

Warsaw 00-199


Add us to your address book

Thanks for subscribing!

Your email address already exists in our database.

Every month, you’ll get a portion of insights about tech trends, best practices in building software, and managing tech teams. You’ll hear from us soon.

Scroll to bottom

Hi there, we use cookies to provide you with an amazing experience on our site. If you continue without changing the settings, we'll assume that you're happy to receive all cookies on the Sunscrapers website. You can change your cookie settings at any time.

Learn more

Learn how to create a REST API for Django projects !

Build a functional REST API with the Django REST Framework

Download ebook No, thank you
Rest API eBook