Today I Learned: Understanding PostgreSQL Explain Query

My public personal notes to understand the PostgreSQL Explain Query Result.

Iman Tumorang
Easyread

--

Photo by benjamin lehman on Unsplash

Hi there!

*I was dumbfounded when I watched a video about DB optimization. I didn’t understand how to interpret the Postgres Explain query on that video till I tried to do my research and learn through the internet😢

I’m Iman Tumorang, a Software Engineer who always keeps forgetting things as I get older 😆. I know it’s normal for humans to get easy to forget things. That’s why I put this blog as my notes, so I can refer to them once I forget about Postgres Explain Query. Though I’m still young 😆, the more projects I work on, the more business domains I need to understand, and the more I will forget the older things easily. I have a limited mind budget to hold any important stuff.

The way my mind works now is like a machine: I give input (problem), and it will produce an output (solution). But…, similar to a machine/system, it may have limited hardware and can not store all the data inside the memory. That’s why we have to do data archival regularly. People call this Hot, Warm, and Cold storage.

So, this blog post is one of my cold storage for this topic. Whatever I just learned today, I can look it back and use it in the future for references.

Explain vs Explain Analyze Query in Postgres

In Postgres, we can identify how well our query performs. It provides a syntax to help us to identify our question.

>$ EXPLAIN SELECT * FROM transaction

|QUERY PLAN |
|--------------------------------------------------------------------------------|
|Limit (cost=0.00..0.44 rows=10 width=263) |
| -> Seq Scan on transaction (cost=0.00..38131350.49 rows=864697949 width=263)|
>$ EXPLAIN ANALYZE SELECT * FROM transaction

|QUERY PLAN |
|---------------------------------------------------------------------------------------------------------------------------|
|Limit (cost=0.00..0.44 rows=10 width=263) (actual time=0.011..0.013 rows=10 loops=1) |
| -> Seq Scan on transaction (cost=0.00..38131350.49 rows=864697949 width=263) (actual time=0.011..0.011 rows=10 loops=1)|
|Planning Time: 0.073 ms |
|Execution Time: 0.032 ms |

The differences between these two keywords:

  • EXPLAIN: will show the query plan and its execution costs.
  • EXPLAIN ANALYZE: collect the actual statistics for the query. It runs the actual SQL query. So, we must be careful when using this for DELETE / DROP query.

Interpret the Explain Analyze Query Result in Postgres

Since Explain Analyze gives more actual statistics, people usually use this for a more accurate result to analyze their runtime query. This part will explain how to interpret the analysis result from Postgres — based on what I understood (please correct me if I’m wrong in the comment section).

Okay, I don’t want to make whole examples since these are my notes. I’ll give examples of Explain query results and interpret them accordingly.

Here’s an example of my ERD.

Example of the ERD

The interactive version can be seen here: https://dbdiagram.io/d/itversity_retail_db-653147a2ffbf5169f00a8d28

Usecase #1: Get Revenue Query

EXPLAIN
SELECT o.*,
round(sum(oi.order_item_subtotal)::numeric, 2) as revenue
FROM orders as o
JOIN order_items as oi
ON o.order_id = oi.order_item_order_id
WHERE o.order_id = 2
GROUP BY o.order_id, o.order_date, o.order_customer_id, o.order_status;

Let’s say I have a query like I wrote above. The goal is to aggregate of the total revenue and group it by order_id, order_date, customer_id, and the status of the orders.

Expected query result:

|order_id|order_date             |order_customer_id|order_status   |revenue|
|--------|-----------------------|-----------------|---------------|-------|
|2 |2013-07-25 00:00:00.000|256 |PENDING_PAYMENT|579.98 |

The EXPLAIN query result will be something like this:

|QUERY PLAN                                                                             |
|---------------------------------------------------------------------------------------|
|GroupAggregate (cost=0.29..3427.86 rows=1 width=58) |
| Group Key: o.order_id |
| -> Nested Loop (cost=0.29..3427.82 rows=4 width=34) |
| -> Index Scan using orders_pkey on orders o (cost=0.29..8.31 rows=1 width=26)|
| Index Cond: (order_id = 2) |
| -> Seq Scan on order_items oi (cost=0.00..3419.47 rows=4 width=12) |
| Filter: (order_item_order_id = 2) | |

In EXPLAIN query result, there are three components: root, branches, and leaves. Based on the result above,

  • The low level is leaves (e.g., Index Scan, Seq Scan )
  • Nested Loop” => is the branch. It has two leaves.
  • GroupAggregate” is the root

Explanation:

|GroupAggregate  (cost=0.71..16.88 rows=1 width=58)                                                                   |
| Group Key: o.order_id |
|
  • First, it will check from the root (GroupAggregate), and check if it has branches/leaves. If yes, it will go deeper down to branches/leaves.

— — — — — — — — — — — — — —

|  ->  Nested Loop  (cost=0.29..3427.82 rows=4 width=34)                                |
| -> Index Scan using orders_pkey on orders o (cost=0.29..8.31 rows=1 width=26)|
| Index Cond: (order_id = 2) |
| -> Seq Scan on order_items oi (cost=0.00..3419.47 rows=4 width=12) |
| Filter: (order_item_order_id = 2)
  • On the branches of the root, it will go to Nested Loop. Since it’s Nested Loop join, it will first search from the table orders using Index Scan, (first leaf), and for each row of orders, it will search order_items in the inner loop using Seq Scan (second leaf). That’s why it says Nested loop. More examples of Nested loops can be read in this article.
  • So, it will execute the deepest leaves/branches first. In this case, it will run the two leaves sequentially; first, it will execute the Index Scan and then Seq Scan word subsequently.
  • And this Nested Loop results will be aggregated using the group by query.
Flow chart for the explain query execution.

Link for the Mermaid

Usecase #2: Count the total orders by the given the customer ID

EXPLAIN ANALYZE SELECT count(*)
FROM orders AS o
JOIN order_items as oi
ON o.order_id = oi.order_item_order_id
WHERE o.order_customer_id = 5;

Another query example, as we can see above, this query will count the total orders for the given customer_id. The EXPLAIN ANALYZE result will be something like this below.

|QUERY PLAN                                                                                                                                                         |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|Aggregate (cost=53.71..53.72 rows=1 width=8) (actual time=2.434..2.435 rows=1 loops=1) |
| -> Nested Loop (cost=4.76..53.67 rows=15 width=0) (actual time=1.872..2.428 rows=7 loops=1) |
| -> Bitmap Heap Scan on orders o (cost=4.34..26.49 rows=6 width=4) (actual time=1.034..1.315 rows=4 loops=1) |
| Recheck Cond: (order_customer_id = 5) |
| Heap Blocks: exact=4 |
| -> Bitmap Index Scan on orders_order_customer_id_idx (cost=0.00..4.34 rows=6 width=0) (actual time=0.696..0.696 rows=4 loops=1) |
| Index Cond: (order_customer_id = 5) |
| -> Index Only Scan using order_items_order_item_order_id_idx on order_items oi (cost=0.42..4.49 rows=4 width=4) (actual time=0.275..0.276 rows=2 loops=4)|
| Index Cond: (order_item_order_id = o.order_id) |
| Heap Fetches: 0 |
|Planning Time: 5.706 ms |
|Execution Time: 2.565 ms |

Since we’re using EXPLAIN ANALYZE the result is more granular. It also provides the actual statistics for the execution.

Explanation:


|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|Aggregate (cost=53.71..53.72 rows=1 width=8) (actual time=2.434..2.435 rows=1 loops=1) |
|
  • So, here we have a root called Aggregate, it will aggregate all the results once the inner branches/leaves are executed. Since the inner branches/leaves still need to be executed, it will go deeper and execute the branches/leaves.

— — — — — — — — — — — — — —

|  ->  Nested Loop  (cost=4.76..53.67 rows=15 width=0) (actual time=1.872..2.428 rows=7 loops=1)                                                                    |
| -> Bitmap Heap Scan on orders o (cost=4.34..26.49 rows=6 width=4) (actual time=1.034..1.315 rows=4 loops=1) |
| Recheck Cond: (order_customer_id = 5) |
| Heap Blocks: exact=4 |
| -> Bitmap Index Scan on orders_order_customer_id_idx (cost=0.00..4.34 rows=6 width=0) (actual time=0.696..0.696 rows=4 loops=1) |
| Index Cond: (order_customer_id = 5) |
| -> Index Only Scan using order_items_order_item_order_id_idx on order_items oi (cost=0.42..4.49 rows=4 width=4) (actual time=0.275..0.276 rows=2 loops=4)|
| Index Cond: (order_item_order_id = o.order_id) |
| Heap Fetches: 0 |
  • On a deeper level from the root, we found Nested Loop a branch from the root with two leaves. Same explanation with the previous example, since it’s using Nested Loop it will loop to the first branches/leaves first (Bitmap Heap Scan), on orders table. For each row of the outer loop, it will do an inner loop to check the matching data towards order_items table using Index Only Scan.

— — — — — — — — — — — — — —

|        ->  Bitmap Heap Scan on orders o  (cost=4.34..26.49 rows=6 width=4) (actual time=1.034..1.315 rows=4 loops=1)                                              |
| Recheck Cond: (order_customer_id = 5) |
| Heap Blocks: exact=4 |
| -> Bitmap Index Scan on orders_order_customer_id_idx (cost=0.00..4.34 rows=6 width=0) (actual time=0.696..0.696 rows=4 loops=1) |
| Index Cond: (order_customer_id = 5) |
  • BUT, since before the outer loop started, in Bitmap Heap Scan it still has another leaf (Bitmap Index Scan), it will prioritize the first branches first before other branches (Index Only Scan).
  • So it goes to Bitmap Index Scan first, means it will scan the index first and form a bitmap of which pages need to be pulled data from that will happen in the upper level (Bitmap Heap Scan)
  • Once the Bitmap Index Scan done, on the upper level, Bitmap Heap Scan it means instead of looking for the entire disk, it will only look through the bitmap disk. The result is mapped in memory based on the result of the Bitmap Index Scan. References from StackOverflow

— — — — — — — — — — — — — —

|        ->  Index Only Scan using order_items_order_item_order_id_idx on order_items oi  (cost=0.42..4.49 rows=4 width=4) (actual time=0.275..0.276 rows=2 loops=4)|
| Index Cond: (order_item_order_id = o.order_id) |
| Heap Fetches: 0 |
  • Now, since the Bitmap Heap Scan is done, it will loop for each row from the result (since on the top, it’s using Nested Loop) and has another inner loop to check to order_items table using the Index Only Scan branches.
  • Once all is mapped, the next step is to aggregate (count) the total rows for the given conditions.

Usecase #3: Get Revenue Per Category

Next, we have a more complex query example below. The goal is to display the result to get revenue per item category:

Without further ado, let’s investigate the query:

EXPLAIN ANALYZE SELECT c.*,
coalesce(round(sum(oi.order_item_subtotal)::numeric, 2), 0) as category_revenue
FROM categories as c
LEFT JOIN products as p
ON c.category_id = p.product_category_id
LEFT JOIN order_items as oi
ON p.product_id = oi.order_item_product_id
LEFT JOIN orders as o
ON o.order_id = oi.order_item_order_id
AND o.order_status IN ('COMPLETE', 'CLOSED')
AND to_char(o.order_date::date, 'yyyy-MM') = '2014-01'
GROUP BY c.category_id, c.category_department_id, c.category_name
ORDER BY c.category_id ASC;

Result:

|QUERY PLAN                                                                                                                                   |
|---------------------------------------------------------------------------------------------------------------------------------------------|
|Sort (cost=4851.60..4851.74 rows=58 width=52) (actual time=84.960..84.963 rows=58 loops=1) |
| Sort Key: c.category_id |
| Sort Method: quicksort Memory: 29kB |
| -> HashAggregate (cost=4849.03..4849.90 rows=58 width=52) (actual time=84.913..84.948 rows=58 loops=1) |
| Group Key: c.category_id |
| Batches: 1 Memory Usage: 32kB |
| -> Hash Right Join (cost=60.57..3988.04 rows=172198 width=28) (actual time=0.926..65.128 rows=173423 loops=1) |
| Hash Cond: (p.product_category_id = c.category_id) |
| -> Hash Right Join (cost=58.26..3500.69 rows=172198 width=12) (actual time=0.881..39.583 rows=173443 loops=1) |
| Hash Cond: (oi.order_item_product_id = p.product_id) |
| -> Seq Scan on order_items oi (cost=0.00..2988.98 rows=172198 width=16) (actual time=0.005..10.549 rows=172198 loops=1)|
| -> Hash (cost=41.45..41.45 rows=1345 width=8) (actual time=0.868..0.868 rows=1345 loops=1) |
| Buckets: 2048 Batches: 1 Memory Usage: 69kB |
| -> Seq Scan on products p (cost=0.00..41.45 rows=1345 width=8) (actual time=0.005..0.483 rows=1345 loops=1) |
| -> Hash (cost=1.58..1.58 rows=58 width=20) (actual time=0.039..0.039 rows=58 loops=1) |
| Buckets: 1024 Batches: 1 Memory Usage: 11kB |
| -> Seq Scan on categories c (cost=0.00..1.58 rows=58 width=20) (actual time=0.009..0.019 rows=58 loops=1) |
|Planning Time: 0.430 ms |
|Execution Time: 85.027 ms |
|

Explanation:

  • Like all previous examples, try to look at the deepest one first, starting from the root. If it has branches/leaves, look deeper.
|Sort  (cost=4851.60..4851.74 rows=58 width=52) (actual time=84.960..84.963 rows=58 loops=1)                                                  |
| Sort Key: c.category_id |
| Sort Method: quicksort Memory: 29kB
  • This root has one leaf: HashAggregate, so we will look into that block first.

— — — — — — — — — — — — — —

|  ->  HashAggregate  (cost=4849.03..4849.90 rows=58 width=52) (actual time=84.913..84.948 rows=58 loops=1)                                   |
| Group Key: c.category_id |
| Batches: 1 Memory Usage: 32kB |
  • This block also turns out to have one branch: Hash Right Join. Look deeper again.

— — — — — — — — — — — — — —

|        ->  Hash Right Join  (cost=60.57..3988.04 rows=172198 width=28) (actual time=0.926..65.128 rows=173423 loops=1)                      |
| Hash Cond: (p.product_category_id = c.category_id) |
| -> Hash Right Join (cost=58.26..3500.69 rows=172198 width=12) (actual time=0.881..39.583 rows=173443 loops=1) |
| Hash Cond: (oi.order_item_product_id = p.product_id) |
| -> Seq Scan on order_items oi (cost=0.00..2988.98 rows=172198 width=16) (actual time=0.005..10.549 rows=172198 loops=1)|
| -> Hash (cost=41.45..41.45 rows=1345 width=8) (actual time=0.868..0.868 rows=1345 loops=1) |
| Buckets: 2048 Batches: 1 Memory Usage: 69kB |
| -> Seq Scan on products p (cost=0.00..41.45 rows=1345 width=8) (actual time=0.005..0.483 rows=1345 loops=1) |
| -> Hash (cost=1.58..1.58 rows=58 width=20) (actual time=0.039..0.039 rows=58 loops=1) |
| Buckets: 1024 Batches: 1 Memory Usage: 11kB |
| -> Seq Scan on categories c (cost=0.00..1.58 rows=58 width=20) (actual time=0.009..0.019 rows=58 loops=1) |
  • On this part Hash Right Join: it will join between 2 branches/leaves results using the Hash Join algorithm. And because of that, we will need to look at the branches/leaves first.

Reference for “fun fact” factbox and join strategies

— — — — — — — — — — — — — —

|              ->  Hash Right Join  (cost=58.26..3500.69 rows=172198 width=12) (actual time=0.881..39.583 rows=173443 loops=1)                |
| Hash Cond: (oi.order_item_product_id = p.product_id) |
| -> Seq Scan on order_items oi (cost=0.00..2988.98 rows=172198 width=16) (actual time=0.005..10.549 rows=172198 loops=1)|
| -> Hash (cost=41.45..41.45 rows=1345 width=8) (actual time=0.868..0.868 rows=1345 loops=1) |
| Buckets: 2048 Batches: 1 Memory Usage: 69kB |
| -> Seq Scan on products p (cost=0.00..41.45 rows=1345 width=8) (actual time=0.005..0.483 rows=1345 loops=1) |
|
  • Next, we will find another branch : Hash Right Join (cost=58.26..3500.69. This branch also has two more leaves. And this branch also has 2 children. One of them is a leaf: Seq Scan on order_items oi and the other one is a branch: Hash (cost=41.45..41.45 . Since this branch has another branch, then we will need to wait for the children's branch result to accumulate so that it will be executed.
|                    ->  Hash  (cost=41.45..41.45 rows=1345 width=8) (actual time=0.868..0.868 rows=1345 loops=1)                             |
| Buckets: 2048 Batches: 1 Memory Usage: 69kB |
| -> Seq Scan on products p (cost=0.00..41.45 rows=1345 width=8) (actual time=0.005..0.483 rows=1345 loops=1) |
  • In this part, it will run the leaf query, Seq Scan on products p. It will build a hash table based on the table products using sequential scanning. Once it is completed, it will bring the result to the top level.
  • And it will repeated until all plans are executed. The final execution is on the root level: Sort (cost=4851.60..4851.74 rows=58 width=52)

Summary

I think that’s just a few examples of interpreting the EXPLAIN ANALYZE from Postgres. There are three components: root, branches, and leaves. How to identify them?

  • Root: it’s the first node. The starting point.
  • Branches: usually have branches inside the node. If using join, usually identified with Nested Loop, Hash Join, or Merge Join. It means it will combine the branches/leaves result first before it’s executed.
  • Leaves: usually standalone. It doesn’t have any other leaves/branch nodes. Can be identified with common keywords like, Seq Scan , Index Scan , etc.

To read the query EXPLAIN result, we can start from the root and check if it has any branches/leaves. If it has, continue to dive deeper until it only leaves left. It might be different from how others do it. What I wrote here is the way I read the EXPLAIN result. At least, for me, it works. Let me know in the comment if you have a better idea to do this.

--

--

Software Architect @ Xendit | Independent Software Architect @ SoftwareArchitect.ID | Reach me at https://imantumorang.com for fast response :)