Column Performance, BRIN Win, Unique and Null, Parallel Distinct | Scaling Postgres 224
Join Over 1,000 Engineers & Get New Episodes Weekly!
In this episode of Scaling Postgres, we discuss how too many columns in a table can affect performance, at what point BRIN indexes win over btree, and Postgres 15 supporting unique nulls and parallel distinct queries.
Content Discussed
- Column order in PostgreSQL does matter
- Postgres Indexing: When Does BRIN Win?
- Postgres 15 improves UNIQUE and NULL
- Introducing PostgreSQL 15: Working with DISTINCT
- 5mins of Postgres E26: New in Postgres 15 - NULL handling in UNIQUE constraints & Parallel DISTINCT
- PostgreSQL 14 Internals
- Pg-agent – a Postgres exporter for Prometheus focusing on query performance statistics
- State of PostgreSQL: How to Contribute to PostgreSQL and the Community
- My Notes on GitLab’s Postgres Schema Design
- Patroni and pgBackRest combined
- Charly Batista
- Rubber Duck Dev Show Episode 50 | Brittany Martin From The Ruby On Rails Podcast
YouTube Video
Podcast Audio
Transcript
All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "COLUMN ORDER IN POSTGRESQL DOES MATTER". This is from cybertec-postgresql.com and it's talking about column order. But more specifically, I think it's talking about when you have too many columns, it can really have an impact on performance in certain cases. Now, the order comes in when you have a lot of columns on the table and you're looking for something specific in one of the later columns. But we'll get into it in this post. So basically, they had some clients that had some very broad tables. So he simulated one. He's using GENERATE SERIES and the \gexec command to create a really wide table, essentially 1,500 columns. So that's a lot of columns. And then he inserted a million rows into this table for doing these tests. Now, the size of the table was about 4GB, so not incredibly large.
Now, first, he just wanted to get a count of all the rows from this particular table and it returned in about 400 milliseconds. So there's no index. It's just taking account of the rows that exist there. It's doing it in a parallel sequential scan here. Now, when you choose the first column and do the same thing, you have something similar. It's a little bit slower because it does need to check for NULLs for the data in this column. So as opposed to being 416 milliseconds, this one was 432 milliseconds, but still not bad. But look what happens if you select the hundredth column where you're doing a count on that. Now suddenly it's twice as slow. It's 857 milliseconds. So I was pretty surprised it gets that drastic at only 100 columns. Then when it goes out to the thousandth column, it takes over eight and a half seconds to return a count from those rows.
So that's crazy. That's 20 times slower than before. Now, he says why this is happening is that, basically, he has to pull back a row and then use the data types of that table to figure out where the data is located within that row to be able to return it. Now, this can be relatively easy to do if there's a bunch of integers because they're essentially just 4 bytes. But with text-based ones like variable character data types, the computation gets a little bit long. And having to do that for every row that you're counting really adds up. So basically, as he says here, having that many columns in a table quote "...does create some non-trivial overhead". So just be careful about making your tables too wide because you may run into performance issues. But definitely check out this blog post if you want to learn more.
Next piece of content- "Postgres Indexing: When Does BRIN Win?". This is from crunchydata.com and he's talking about BRIN indexes which are basically a block range index. As the name implies, it indexes ranges of values. Unlike a B-tree index, the typical index that indexes every value, a BRIN index does not do that, it does ranges of values. Now as a consequence, it's going to be a lot smaller, this type of index. One of the disadvantages you can imagine is that it's harder to pull out a distinct value, whereas with a B-tree it's quite easy to pull out a distinct value, whereas with a BRIN you have to pull back the range and then find the individual value you're looking for. Now the other consequence is that it does really well with sequential data, but not so great with random data. So he has some examples here.
If you have a log table with a timestamp and you put a BRIN index on the timestamp where you're pulling back ranges of data, that could be a good use case, different types of sensor measurements, or GPS tracking points, et cetera. Now he goes into describing a little bit about the BRIN index and how it's structured. But basically, there's a value range that the index holds and tells you where in the pages to find those values. So again, the more that your values are grouped together physically in the heap, the more efficient a BRIN index will be because you don't want it to have to go to multiple pages across the heap to find a particular range because then the index becomes very ineffective. Now, he created a table here where he had some random data and some sequential data and a value and he created four different indexes— two B-tree and two BRIN and for each type— he did the random column and the sequential column on the table.
Now again, because the BRIN index is only ranges and not distinct values, you can tell how much smaller the BRIN is compared to the B-tree. Essentially it's 1000 times smaller in the example here. So the BRIN random size is 24 KB whereas the B-tree random size is 21 megabytes. So it's 1,000 times smaller, which is phenomenal. But let's look at some of the performances. So I think this was a million row table and he's pulling back different numbers of rows from that table. So 100 rows, 1,000, 10,000, 10,0000 rows. So how many rows you're pulling back from the table. Now the B-tree, whether it's random or sequential, the timings are pretty much similar but the sequential is more performant by maybe a twofold difference in some of the cases.
But it's pretty similar, but no more of a difference than that. So B-tree, whether random or sequential, the sequential is a little bit faster but that's pretty much the result. Now, the random BRIN index is just terrible. Its performance is worse than any of the B-tree and it's consistently worse because it's essentially just like doing a sequential scan, it doesn't help at all. Now the BRIN sequential scan, as noted before, because of its indexing ranges, can't pull out distinct values as fast. So actually the B-tree is about 20 times faster when pulling back 100 rows versus the BRIN index. At 1000 rows, the B-tree is only about twice to four times as fast and it's at the 10,000 rows you're trying to return mark, where the BRIN is pretty much in the pack and matching what the B-tree does. And then it starts performing better with row counts above 10,000, like the 100,000 it beat both B-tree examples.
And then keep in mind this is with an index that's 1,000 times smaller than a B-tree. So given the size, these are great results. And you can tweak the number of pages in a given range in the BRIN index. By default, it's 128, but he did test down to four different performance levels. So maybe by tweaking it, you could get a doubling at most, maybe a tripling in some performance for low row count levels. But you can check out this table for yourself. So basically, in conclusion, BRIN indexes are great when you have an insert-only pattern and the data is being added to the heap in a sequential manner. So think of a timestamp and they're great for very large tables where you want to pull back large amounts of rows. If you want to pull back a single row or ten rows out of a million a billion rows, the B-tree is going to be much more efficient at doing that. But if you're pulling back like a year's worth of data, or a month or a week out of a table with millions billions of rows, then a BRIN could give a lot of advantages. But if we want to learn more, definitely check out this blog post.
The next piece of content- "POSTGRES 15 IMPROVES UNIQUE AND NULL". This is from rustprooflabs.com and normally when you create a UNIQUE index or a UNIQUE constraint across columns, you can insert multiple NULLs. In this example that he created, he has a UNIQUE constraint on val2. As you can see, you can insert as many NULL values as you want to into val2 and it's fine. Because a NULL by definition means an unknown value, so you don't know if it's unique or not. So that's acceptable. But what's coming in Postgres 15 is that you can now use this additional clause called NULLS NOT DISTINCT. So what that means is to consider NULL to be the same value, which of course means you can only insert it once. So in this example, using that NULLS NOT DISTINCT, you can insert one NULL value of course, but when you try to insert a second one, it's going to result in a duplicate key violation. So now, depending upon your use case, you can choose to consider NULLs to be the same or to be different. If you want to learn more, check out this blog post.
Next piece of content- "Introducing PostgreSQL 15: Working with DISTINCT". This is from percona.com and coming in the next version of Postgres is the ability for DISTINCT to operate in parallel. So in this example, they created three different types of tables: a one-column table, a five-column table and a ten-column table. He inserted a fair number of rows into each one and then did a DISTINCT from all the rows in each of these tables. And they have actually been tested in every version since Postgres 9.6. As you can see, Postgres 15's performance is better than any prior version and at the ten-column level, it's more than double the performance of even Postgres 14. And when you actually look at the Explain plan, you can see that it's actually doing the DISTINCT in parallel. Further, you can also adjust the max_parallel_workers_per_gather to get even greater performance. So if you want to learn more about this feature, definitely check out this blog post.
The next piece of content- "5mins of Postgres E26: New in Postgres 15- NULL handling in UNIQUE constraints & Parallel DISTINCT". This is from pganalyze.com and Lukas covers the previous two blog posts that I covered here Scaling Postgres. So if you want his perspective on it, you can definitely check out this piece of content.
The next piece of content- "PostgreSQL 14 Internals". This is from postgrespro.com and this is an announcement of a book called PostgreSQL 14 Internals that's been written. However, it's in Russian but it is being translated into English and part one of the book is complete already and it's freely available here as a PDF. So if you're interested in learning more specifically about the internals of Postgres, definitely encourage you to go ahead and check out this PDF.
The next piece of content- "Pg-agent - a Postgres exporter for Prometheus focusing on query performance statistics". This is from coroot.com and they're talking about a Prometheus which does metric exporters to monitor different metrics of your system. Now, there exists a well-known open source one called the postgres_exporter that collects all sorts of statistics about Postgres to be able to query or to present to users, maybe using Grafana. But they had some issues with it because they wanted to ask specific performance questions such as CPU-intensive queries affecting other queries, or a server has reached max_connections because most of the connections are stuck in idle and transaction. An I/O intensive query is affecting other queries, or a query is blocking other queries by holding an exclusive lock.
Now, they assessed that the number of changes required to get all of this information from the Postgres exporter would be, as they say, quote "...too massive...". So what they did is they decided to build their own exporter called pg-agent. This blog post talks about that and also goes into some of the queries that they developed to be able to focus on these performance pain points. In the example here, they're showing a pg_top_query_time_per_second for different queries, pg_top_query_io_time_per_second, a report of client connections in some of their state, and also looking at different locks, and they give you a quick start on how you can get started using this new Prometheus exporter. So if you want to learn more about this new open-source tool, definitely check out this blog post.
The next piece of content- "State of PostgreSQL: How to Contribute to PostgreSQL and the Community". This is from timescale.com. They're going back to the State of Postgres survey and they're revealing questions with regard to contributing to Postgres. It looks at the number of people that have contributed to Postgres and in what way, along with some particular quotes with regard to it. And then it goes into how you could potentially get started contributing to Postgres. So if you're interested in that, you can check out this blog post.
The next piece of content- "Patroni and pgBakcRest combined". This is from pgstef.github,io. This post talks about whether you're using Patroni to create a cluster of Postgres machines, how you can use pgBackRest to essentially backup that cluster. So if you want to learn more about how to do that, definitely check out this blog post.
The next piece of content. The PostgreSQL person of the week is Charly Batista. If you're interested in learning more about Charly and his contributions to Postgres, definitely check out this blog post.
The last piece of content. We did have another episode of The Rubber Duck Dev Show this past Wednesday evening. This one was actually an interview with Brittany Martin, who is the host of the Ruby on Rails podcast. So we talked a little bit about podcasting, a little bit about project management and some other topics. So if you're interested in a developer interview format, we welcome you to check out our show.