BRIN Indexes, Partition Enhancements, Tips, Standbys | Scaling Postgres 73
Join Over 1,000 Engineers & Get New Episodes Weekly!
In this episode of Scaling Postgres, we discuss BRIN indexes, partition enhancements, general tips and standby scaling.
Content Discussed
- BRIN Index for PostgreSQL: Don’t Forget the Benefits
- Partitioning enhancements in PostgreSQL 12
- Postgres tips for the average and power user
- Standby in production: scaling application in the second largest classified site in the world
- The mysterious “backend_flush_after” configuration setting
- Suggesting Single-Column Primary Keys (almost) Automatically
- Checking PostgreSQL Version in Scripts
- Per-User Encryption with Postgres
- pgBackRest S3 configuration
YouTube Video
Podcast Audio
Transcript
All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "BRIN Index for PostgreSQL: Don't Forget the Benefits". This is from percona.com. Basically, they're talking about a BRIN index, which is a Block Range INdex. So basically it stores ranges of values. So this type of index is optimal for things that are going to be stored incrementally. So for example, you're storing data in a table by date or by a primary key that's incrementing. That's what is best used for this type of index. It says basically that BRIN usage will return all the tuples in all the pages in the particular range that's chosen. Basically, the index is lossy, so it doesn't have an exact representation, it stores ranges and then it can pull up that range and then find specific data that you're looking for. So sometimes it does take a little bit more to do a search, but the benefit of it is the size is incredibly small and that can help it with caching.
So they basically created a table, inserted about 8 million rows into the table and then they did a query on the table with no index and it finished in about 1.8 seconds. They created a B-tree index. Now that query instead of 1.5 seconds takes 2.6 milliseconds. So super fast with the B-tree. And the size of that B-tree is 171 megabytes. So now they take a look at the BRIN index. So they use the USING BRIN when creating the index. And that same index was a B-tree, 171 megabytes. As a BRIN index, it is 64 KB. So vastly smaller, that may be 0.1% of the size, so vastly smaller than the index. So that has a higher probability of being cached, definitely. Now the query finishes in 87 milliseconds. So again, the original query was finished, with no index, was 1.8 seconds, so 1,800 milliseconds, whereas this one finishes in 87 milliseconds.
So not bad, a pretty good improvement. However, the B-tree still beat it with 2.6 milliseconds. So this example here, it's about 40 times slower. So you get huge space savings at the cost of not the necessarily best performance, although it probably depends on the query. Now they also go into a section talking about storage and maintenance of it and that you can specify pages per range. So this is basically how many pages are going to be in a particular range. So basically you can make it less lossy and presumably you can improve performance a little bit at the cost of taking up more space. So this is thinking of it, I kind of think of it like a JPEG image. You can have a higher-quality image that's a lot larger or a lower-quality image that's a lot smaller. So I think the BRIN is kind of that same concept. So if you're interested in potentially using a BRIN index, definitely a blog post to check out.
The next post is "Partitioning enhancements in PostgreSQL 12". This is from the 2ndquadrant.com blog and they basically take an example from their previous post on generated columns in PostgreSQL 12. They're doing it like a data warehouse example and using partitioning. So they create a media calendar, breaking out all sorts of different date times as integers in a single table. Then they have a time dimension where they look at the time of day, the hour of day, the minute of day, and the second of day, and then they're going to do a cartesian product of joining these two together and placing the results into a partition table. So the intermediate table is an hours to days table and they partition by range day. Then they put all of these values into these partitions as defined here. Basically, they're going to be generating 6.3 billion rows.
Now, getting into the first enhancement, it's basically that quote "...you may now have a foreign key relationship where the partition table is the child". So here they created a sale table and referenced it as a foreign key as one of the child tables. So this is a definite benefit for integrity purposes. And then they also mentioned some new introspection tools, whereas there's a pg_partition_tree, pg_partition_ancestors, and pg_partition_root. So you can do a \d+ and they even have an s+ to talk to display all the partitions within pgsql. But they also have functions that you can use that show you the partition tree. So for example, the hours today are the parent. It defines if it's a leaf or not and what level it's at. So you could have essentially multiple partition levels and it references what the parent relid is for it.
There's also the pg_partition_root to identify what the root table is as well as pg_partition_ancestors. The other thing mentioned is that altering a table to attach a partition now does not require an EXCLUSIVE LOCK. So that's great. Now the detached partition still does, but at least the attached partition does not require an EXCLUSIVE LOCK anymore. So that's definitely a benefit. They list three other benefits here. One, the COPY command has reduced a bit of overhead allowing for faster loading. Two, the table space specification for a parent is now inherited by the child. And three, pg_catalog.pg_indexes now shows indexes on partition children. So definitely some great enhancements for partitioning coming in version twelve.
The next post is "Postgres tips for the average and power user". This is from citusdata.com and the first one they mentioned is actually one I haven't heard of or used before in psql, you can actually tag your queries and use Ctrl+R to do a search. So you can just use Ctrl+R. You don't need to necessarily tag them, but you can just type part of a query and it will pull up the different queries that you've run historically and you can just hit Enter and run that command. So that's pretty interesting. But he's talking about using a tag with a comment here to be able to find a particular query you're looking for. And then he talks about a better psql output. So there is the \x option that he talks about, something called a PGDG that you can install that he likes better and gives you some configuration you can use to your .psqlrc file in order to enable it.
The next is getting a stack trace for an error, so he goes through the process of being able to do that. Talking about inserting fake data. Again, if you've watched previous episodes of Scaling Postgres there's been many blog posts about generate_series, so they're talking about it here. They mentioned a better pg_stat_activity. So give some suggestions on how to better read the active statements going on in PostgreSQL as well as a process to manage multiple versions of Postgres that you may be running using pgenv. So some tools that you could use in your Postgres environment and another blog post to check out.
The next post is "Standby in production: scaling application in the second largest classified site in the world". This is on Avito Tech on medium.com and they have a very large Postgres installation, greater than 20 terabytes on over 100 nodes. This is Avito, which is the biggest Russian classified site and the second-largest classified site in the world. So this post is very, very long and pretty comprehensive and it goes into basically all about standbys and talking about how they can be used from a high availability and a scaling perspective. He talks about physical replication, which is you're streaming the WAL files essentially and talks briefly about logical replication, but primarily it's about physical replication and using standbys to be able to horizontally scale reads.
And he goes a little bit into the history and then some specific things that they've run into and how they've been able to resolve them. Some of the main things he covers are how to avoid stale reads from a replica, some techniques they use compared to some other people, and how they have implemented some caching. They discovered some issues with regard to deadlocks they were having on a replica, discussing issues related to excessive WAL files and how they handled that. They talk about standby pools. So this is a really comprehensive blog post that talks about a number of different items that they have encountered as they're scaling Postgres. So if you've run into problems with scaling as it relates to standbys, this is definitely a blog post to check out because he does cover a lot of different information that's a little bit too long for me to go into comprehensively in this episode, but definitely a blog post I encourage you to check out.
The next post is "THE MYSTERIOUS 'BACKEND_FLUSH_AFTER' CONFIGURATION SETTINGS". This is from cybertec-postgresql.com. There is this setting that I haven't heard about or looked into, but they heard Andres Freund who basically said, quote, "...if your workload is bigger than shared buffers, you should enable 'backend_flush_after' parameter for improved throughput and also jitter". So they said, well, let's take a look and try it. So basically it communicates something to the OS about how to handle some fsyncing. That's my interpretation in a nutshell. So he decided to go ahead and test this. He used a 4 CPU, 8 gig system, and pgbench. And with it set at its default, it finished in 0.637 milliseconds.
When he set that to 512 KB, it wasn't in significant change, less than 1%. But then increasing it to megabytes, he got close to a 4-5% difference in terms of better performance. So this is another one of those configuration settings I would say probably depends on your load and what you're doing. So this is one where if you want to try to squeeze a little bit more performance out of it, take a base measure of your system, alter this parameter, and see if you get a difference in the performance of your system and your workload to see if it could be beneficial. So if you're interested in getting a little bit more performance, definitely a blog post to check out.
The next post is "Suggesting Single-Column Primary Keys (almost) Automatically". Now, this is a continuation of a blog post previously from fluca1978.github.io. Now, he talked about having to generate primary keys for a great number of tables in a database with a lot of different tables that didn't have any. He's come up with an automated method to identify which is probably the best primary key, again by looking at the system tables. So he has all his code right here and he's consulting the pg_stats table in order to determine what are some good candidates. He goes through his methodology here and then runs it to suggest certain primary keys for different tables. So if you have a particular use case for this, definitely a blog post to check out.
Next post is also from fluca1978.github.io titled "Checking PostgreSQL Version in Scripts". So basically he had a psql script and he wanted to be able to check the version as it relates to partitions because sometimes he was using something that's version 10 that he wanted to use this particular type of syntax. And then version 11, a slightly different one that had a default partition. So he came up with this method to determine which Postgres version is running. So that he can conditionally choose it during the query execution. So if you have a use case for this, definitely a blog post to check out.
The next post is "Per-User Encryption with Postgres". This is from The Startup on medium.com. This is a pretty simple post to start off in regards to Postgres, but he's talking about basic encryption that you can do using the PGP_SYM_ENCRYPT and PGP_SYM_DECRYPT to encrypt and decrypt data within the database. He talks about how you can basically define an environmental variable on your application server and then be able to encrypt and decrypt data without having the keys on the database. He takes it a step further and talks about Sandwich Encryption, where, for example, maybe your users have their own key and the server has its own key, and basically, you double encrypt it with both of those. So that way even you can't decrypt the data. Only the user with their known decryption key can do it. So if you have a need to do that, definitely a blog post to check out.
The last post is "pgBackRest S3 configuration". This is from pgstef.github.io. pgBackRest is a separate backup tool for PostgreSQL and you can configure it to use S3 to store and retrieve data, and this blog post shows you how to do that. Now, it doesn't use S3, but he actually uses a tool called MinIO, which is a server that's compatible with Amazon S3's object storage. But nonetheless, he goes through and shows you how to configure it to use pgBackRest and does a backup with pgBackRest and shows how to configure it and get it running. So if you're interested, definitely a blog post to check out.