When ORDER BY and NULL work together

·

3 min read

When we would like to sort the result-set in ascending or descending, we can use ORDER BY in SQL query. In Rails, we can use order which is derived from ActiveRecord::QueryMethods. For example,

Article.all.order(published_at: :desc)

Let's focus on the published_at attribute now.

What will happen if one of the article records' published_at value is nil and we are going to sort the result-set in descending?

$ article1 = Article.create(title: 'A1', published_at: nil)
$ article2 = Article.create(title: 'A2', published_at: Time.zone.now)
$ article3 = Article.create(title: 'A3', published_at: Time.zone.now - 2.days)

According to the query below, we could find out the result in order was article1, article2, article3. article1 whose published_at is nil was placed at the head of the result.

# Database: PostgreSQL
$ Article.where(id: [1,2,3]).order(published_at: :desc)
  Article Load (1.9ms)  SELECT "articles".* FROM "articles" WHERE "articles"."id" IN ($1, $2, $3) ORDER BY "articles"."published_at" DESC LIMIT $4  [["id", 1], ["id", 2], ["id", 3], ["LIMIT", 11]]

 => #<ActiveRecord::Relation [
 #<Article id: 1, title: "A1", content: "", status: "publish", created_at: "2021-02-16 13:40:58", updated_at: "2021-02-28 03:35:34", published_at: nil>, 
 #<Article id: 2, title: "A2", content: "", status: "publish", created_at: "2021-02-16 14:08:34", updated_at: "2021-09-01 16:19:30", published_at: "2021-09-01 16:19:27">, 
 #<Article id: 3, title: "A3", content: "", status: "publish", created_at: "2021-02-16 14:13:15", updated_at: "2021-09-01 16:19:48", published_at: "2021-08-30 16:19:45">
 ]>

I'm curious why NULL values come first when using ORDER BY.

If you apply the ORDER BY clause to a column with NULLs, the NULL values will be placed either first or last in the result set. The output depends on the database type. (Source)

In my project I use PostgreSQL, by default, PostgreSQL considers NULL values larger than any non-NULL value.

That's why the record whose published_at == nil came first when using ORDER BY in descending order.


How can I do if I would like to "Order by Column1 if Column1 is not null, otherwise order by Column2"?

For instance, when published_at == nil , I hope to replace published_at with created_at to be ordered.

PostgreSQL provides a wonderful function - COALESCE. It will return the first non-NULL argument. Let's jump into PostgreSQL to take a look at how it executes:

$ psql postgres
$ postgres=# \c myblog_dev
$ myblog_dev=# SELECT COALESCE (1, 2);
 coalesce
----------
        1
(1 row)
#=> `1` is the first non-NULL argument 

$ myblog_dev=# SELECT COALESCE (null, 2, 1);
 coalesce
----------
        2
(1 row)
#=> `2` is the first non-NULL argument

$ myblog_dev=# SELECT COALESCE (null, null, null);
 coalesce
----------

(1 row)
#=> There is no any non-NULL argument.

The SQL statement:

SELECT * FROM articles ORDER BY COALESCE(published_at, created_at) DESC;

COALESCE can be used in ActiveRecord Query Interface, too:

Article.where(id: [1,2,3]).order('COALESCE(published_at, created_At) DESC')
#=> Article Load (1.0ms)  SELECT "articles".* FROM "articles" WHERE "articles"."id" IN ($1, $2, $3) ORDER BY COALESCE(published_at, created_At) DESC LIMIT $4  [["id", 1], ["id", 2], ["id", 3], ["LIMIT", 11]]
 => #<ActiveRecord::Relation [
 #<Article id: 2, title: "A2", content: "", status: "publish", created_at: "2021-02-16 14:08:34", updated_at: "2021-09-01 16:19:30", published_at: "2021-09-01 16:19:27">,
 #<Article id: 3, title: "A3", content: "", status: "publish", created_at: "2021-02-16 14:13:15", updated_at: "2021-09-01 16:19:48", published_at: "2021-08-30 16:19:45">, 
 #<Article id: 1, title: "A1", content: "", status: "publish", created_at: "2021-02-16 13:40:58", updated_at: "2021-02-28 03:35:34", published_at: nil>]>

Whether we use COALESCE in SQL statement or by ActiveRecord Query Interface above, we can get the sorting result as we expected!

  • article2 (published_at: 2021-09-01 16:19:27) came first

  • article3(published_at: 2021-08-30 16:19:45)

  • article1(created_at: 2021-02-16 14:08:34) used created_at to order by


Reference