Why do column oriented databases prefer Delete-Insert over Update?

You may have seen that some databases prefer Delete and Insert rather than Update operation. Let's find out why.

Let me make this clear beforehand - UPDATE is not a cheap operation in most of the computer systems. Hence, finding a way around UPDATE operation gives a significant amount of performance boost in computing systems.

Let's just start out with an example of the following table -

IDAgeDepartment
125Computer
221Computer
326Maths
423Computer
520Biology
623Computer
720Biology
823Computer

Why Columnar DBs

Lets try to think of a way to optimize the data stored here. As you can see, the values of columns are quite often repeated. for example, the column "Department" will have the repeated value of "Computer" way too many times.

Column Oriented database solve exactly this problem. They optimize columns rather than rows. Meaning that instead of a row, a column will stored in a continuous chunk of (secondary)memory. This gives us an advantage while running analytical queries for example, OLAP (Online Analytical Processing) software usually needs analytical data such as "How many people are there in the Computer Department?" rather than what traditional software needs - "Tell me everything we know about User ID 4".

Simple Table

Here, each row takes 16 Bytes. And assuming we have a memory which has block size of 64 Bytes, this is how the data will be distributed across blocks in memory - Row Memory Table

Now imagine that we need calculate Average Age of all the Persons. The DB Engine will have to first load the first block in the primary memory, get the sum of all its values and then get the 2nd block and then get its value. Which means it will take 2 I/O operations in total.

Now lets store the data in columnar fashion. Row Memory Table

Here, if we again want to computer average age, we will have to do only one I/O operation.

Ease Of Compression

Due to the columnar structure, we have a lot of similar values inside a continuous memory block. Looking at the "Department" column, we can say that there will only be 5-6 distinct values in this entire column. Due to this trait, it gets massively easy to compress this data while storing. While compressing this data, some constraints are introduced -

  1. The compression has to be lossless.
  2. Since Columnar DBs are READ-optimized, it should be significantly faster to de-compress the data. Faster that compressing if possible.
  1. The length of each compressed word has to be uniform. So that we can directly get to a value using the formula startOf(A) + i * width(A). More on this later.

There is an algorithm that satisfies all these constraints, its called Run Length Encoding.

Run Length Encoding (RLE)

Run Length Encoding is a lossless compression algorithm with a brute-force simplicity.

The idea is to replace consecutive occurrences of a given symbol with only one copy of the symbol, plus a count of how many times that symbol occurs—hence, the name run length.

To explain the algorithm in a simple way, lets take a string - AAAABBBCDEEEE When encoded using RLE, the output will be - 4A3BCD4E. You can see that it just stores how many times a letter occurred in front of it.

In columnar databases, while storing data on the disk, it is preferred to store with one of the columns sorted in order, such that read operations become faster using indexing. The RLE technique here is useful because, a column usually has a lot of repeated values. Lets see our example only - Sort and Encode

This saves space, but also lets us reach to each row using offset.

Why Delete-Insert rather than Update?

Lets suppose that an UPDATE operation is to be done on the 3rd row, updating the field from "Computer" to "Physics" 1 - Sort and Encode

This totally messed up the compression 2.

Now look at the Delete-Insert operation on the 3rd row -

Sort and Encode

Here, the compression remains intact and the new record is actually stored in a Write-optimized store.

Write-Optimized Store

In a database that is needed for OLAP, a WRITE/UPDATE operation is needed less often than a READ operation. Which is why, many Columnar databases prefer to keep rows that have been written to in a separate, uncompressed table. That table is optimized for WRITEs. Periodically this WRITE-optimized store is flushed and the values are actually materialized in the READ-optimized store. When it flushes, it will have to take the entire WRITE-optimized store and update the respective values in READ-optimized store, and then again re-arrange the value in the READ-optimized store to speed up READ operations. Since this operation is complex and expensive, the DB engine avoids doing it on runtime and keeps the boat afloat.

This marks the end of this very short introductory article. I urge you to take a look at the Further Reading section to get deeper into the concepts. I will be trying my best to expand upon the concepts that I have left untouched in this article.

Further Reading

  1. Héman, Sándor & Zukowski, Marcin & Nes, Niels & Sidirourgos, Lefteris & Boncz, Peter. (2010). Positional update handling in column stores. IEEE Computer Graphics and Applications - CGA. 543-554. 10.1145/1807167.1807227.

  2. Wikipedia - Run Length Encoding

  3. Daniel Abadi; Peter Boncz; Stavros Harizopoulos; Stratos Idreaos; Samuel Madden, The Design and Implementation of Modern Column-Oriented Database Systems , now, 2013, doi: 10.1561/1900000024.

Footnotes

  1. Homage to Geoffrey Hinton, a Computer Scientist, who won Nobel Prize in Physics, for his work in Machine Learning & AI on 9th October, 2024.

  2. I have intentionally left out Soft Deletes, where a separate "deletedAt" column is maintained to track what was deleted. Future articles will cover this.

🗨️ You can just talk to this blog!

Interact with content in a whole new way using custom prompts.

Prefer other platforms? Copy the prompt →
Copy Prompt