Working correctly with database indexes is always the key to application performance and reliability. Especially when the amount of data grows rapidly, well designed database architecture really pays off. However, when we’re working with database indexes, it can be easy to miss something, or fall into the trap of a tricky outlying case. One of these is when performing unique indexing on multiple columns, where at least one of them is null.
Consider the following scenario: You have a database table named `products` which works with a custom implementation of soft delete mechanism, so it includes columns: `name` and `deleted_at`. Deleting the product from the application doesn’t remove it permanently from the database, but instead marks it as deleted by populating the `deleted_at` column with a timestamp.
The system is designed to disallow creating product duplicates (by `products.name`), unless it was already existing before and has been deleted (so there may be many deleted products with the same `name`, but different `deleted_at` timestamp).
On first glance, you might think to secure this case at the database level you could create a unique index on the combination of two columns:
CREATE UNIQUE INDEX products_name_deleted_at_idx
ON products (name, deleted_at);
Or, using the ActiveRecord method in Rails DB migration:
add_index :products, [:name, :deleted_at], unique: true
However, this approach simply won’t work. The reason is that the PostgreSQL database doesn’t consider two NULL values as identical. So, for the index above these two cases are different and unique:
1: name = 'Sample name', deleted_at = NULL
2: name = 'Sample name', deleted_at = NULL
In practice that means that our index will work fine only for deleted products, but it will allow creating multiple products with the same `name` and empty `deleted_at` value - in effect, having the same products listed simultaneously, which we want to avoid.
There is one proper way to handle this situation and it requires using a partial unique index in PostgreSQL. The idea is basically to add another index with specified `WHERE` clause. In our case that will be:
CREATE UNIQUE INDEX products_name_deleted_at_null_idx
ON products (name)
WHERE deleted_at IS NULL;
This index will cover all the cases where the previous one (`products_name_deleted_at_idx`) didn’t work. Depending on the application assumptions it may even make sense to delete the first index and leave just the partial one.
It is worth to mention that Rails allows creating partial unique indexes the easy way, using standard ActiveRecord helpers:
add_index :products, :name, unique: true, where: "deleted_at is null"
Combining the commands will make sure that we have covered all cases.
We love getting into the nitty gritty of SQL coding and making sure we’ve outlined all cases correctly. If you are looking for assistance with your databases, back-end, or even DevOps, make sure to let the team at iRonin know - we’d be happy to help out!