
BRIN Pitfalls, pg_stat_statement Troubleshooting, Natural Join, Geography Type | Scaling Postgres 126


In this episode of Scaling Postgres, we discuss BRIN pitfalls, using pg_stat_statements for troubleshooting, natural joins to compare tables, and the geography type.

Transcript

All right. I hope you, your family, and coworkers continue to do well. Our first piece of content is "Avoiding the Pitfalls of BRIN Indexes in Postgres". This is from crunchydata.com, and BRIN is a block range index. Now in this post, they talk about blocks and pages, but they're pretty much interchangeable: the block is the default storage unit of Postgres, and a block range index indexes ranges of blocks. So it doesn't point you at the specific row, but at a range of blocks to look for the data in. That's what usually makes a BRIN index so much smaller. By default, it stores ranges of 128 blocks. Now to go over the pitfalls, they first create a test table that uses generate_series to create a series of timestamps, basically one row per second. They use the pageinspect extension so you can take a look at what the index actually looks like. So for block number 0, it's going to be this range here, and then at block number 128, this is the block range that is stored.
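As a rough sketch of that kind of setup (the table and index names here are my own, not necessarily the ones from the post), you could build a similar test table and BRIN index like this:

-- Hypothetical test table: one timestamp row per second, roughly a million rows
CREATE TABLE brin_test AS
SELECT ts
FROM generate_series('2020-01-01'::timestamptz,
                     '2020-01-12'::timestamptz,
                     '1 second'::interval) AS ts;

-- BRIN index with the default pages_per_range of 128
CREATE INDEX brin_test_ts_idx ON brin_test USING brin (ts);

-- The pageinspect extension can then show the stored block ranges, e.g.:
-- CREATE EXTENSION pageinspect;
-- SELECT * FROM brin_page_items(get_raw_page('brin_test_ts_idx', 2), 'brin_test_ts_idx');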

They give a little graphical representation of what's contained within each page, or each block. So what that means is you have multiple blocks, and that block range is 0 to 128. So this is essentially the first row of the index. When you go to search for a particular value, it has to look through this block range to find the exact value you're after. So the index gets you close, but not to the exact row the way a B-tree index would. That's why a B-tree index is faster for pulling out single values compared to a BRIN index. Then they looked at how many pages their particular table was using and at the count of rows, which was 1 million. Basically, there are about 20,000 records for each BRIN record, so you can see why BRIN is so small. Then they have this really great chart showing the number of records, the relative table size, the B-tree size, and then the BRIN size. So you can see the BRIN is tenfold smaller at a small number of records.
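If you want to reproduce that size comparison yourself, a quick check against the sketch table above (the B-tree index name is again just an assumption) could look like this:

-- Add a B-tree on the same column and compare on-disk sizes
CREATE INDEX brin_test_ts_btree ON brin_test USING btree (ts);

SELECT pg_size_pretty(pg_relation_size('brin_test'))          AS table_size,
       pg_size_pretty(pg_relation_size('brin_test_ts_btree')) AS btree_size,
       pg_size_pretty(pg_relation_size('brin_test_ts_idx'))   AS brin_size;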

But at the high end it looks closer to 1,000 times smaller, 2 MB versus 21 GB, or maybe 10,000 times smaller at the largest record counts. So BRIN's efficiency goes up the more records you have. Then they show how fast it is to retrieve a record at these different record counts. As you can see, the B-tree index is relatively fast at pulling out a single record, whereas the BRIN index seems to be about ten times as slow for that single record. But when you're looking at ranges, the performance gets much closer. So much so that when you're looking at a month of data in a billion-row table, the BRIN index is actually a little bit faster at retrieval than the B-tree index. They have some conclusions here. First, BRIN's space savings over B-tree are significant. Second, B-tree generally outperforms BRIN for single-record queries. Third, the query planner didn't even consider the BRIN index until there were at least 100,000 records in the table.

And finally, BRIN performs comparatively better when looking for more records. So the question is: when should you use BRIN? They have a very good statement here that says to use BRIN when you have a large, insert-only, in-order table with large select results. So basically it works well when you have a lot of records in the table (because, again, it's operating on block ranges), when the table is insert-only and in order (because order is important, as we'll see in a second; if you have blocks covering all sorts of ranges, BRIN loses its efficiency), and when you're pulling out a lot of records. If you have a huge table but you're just pulling out one record at a time, just stick with the B-tree index. But if you're rolling up data like they show here, by day, month, and year, then BRIN starts to make more sense; a sketch of that kind of rollup query follows below. Now, in terms of what to look out for, it's when you have things out of order, because then the block ranges don't line up neatly and your BRIN index loses efficiency.
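To make that concrete, here's a sketch of the kind of rollup-style range query that plays to BRIN's strengths, again against the hypothetical test table from above:

-- A wide range scan and rollup: BRIN only has to identify the block ranges
-- that can contain January, rather than walking a per-row index
SELECT date_trunc('day', ts) AS day, count(*)
FROM brin_test
WHERE ts >= '2020-01-01' AND ts < '2020-02-01'
GROUP BY 1
ORDER BY 1;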

They have a great illustrative guide showing this out-of-order problem. So basically this is the record layout for the pages, and at the top here it shows you the range of records included within each page. This is the range that the BRIN index will be looking at. Now, when you have a DELETE, you're going to remove this record from here, and once that space is vacuumed, it gets cleaned up, so that record is no longer there. But then if you have a new INSERT, the space can be reused, and say this new record gets inserted in this page here, where 4 used to be. So now, if you look, the new range is 1 to 19. So if it needs to look up the value 10, for example, it has to look in two different ranges now. That's a lot more data to search through than the one block range it had to search before. That's why it's important to use a BRIN index with contiguous data values, like a sequence or an ever-increasing timestamp. Then UPDATEs also have an issue. So in terms of an update, let's say you updated 12 to 4.

Well, the old row for 12 gets deleted, because that's how an update works in Postgres, and a new record is inserted in this free page, which impacts the block range. So now, instead of looking in the range of 17 to 18, it's looking from 4 to 18. Now, if you tried to select 12, well, 12 falls within the range of all three of these blocks, so it has to pull all three blocks and search through them. So then they did a test doing 20% DELETEs and 20% INSERTs, and you can see how the performance degrades. Looking at 100 million records, retrieving one record goes from 10 milliseconds to 3 to 8 seconds. And again, whether you pull an hour, a day, or a month, it doesn't really matter. Even looking at 10 million records, one record goes from 8 milliseconds to 680 milliseconds because of all the additional ranges it has to search through. So definitely, in terms of the pitfalls of BRIN, you want to keep the data contiguous and not have many UPDATEs and DELETEs, if any at all. BRIN works really well for append-only tables. So if you're interested in using BRIN, I highly suggest you check out this post. It was really great in terms of understanding how BRIN works and some of its pitfalls.
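If you suspect a BRIN index has degraded like this, one way to see the extra work (my own diagnostic sketch, not something from the post) is to run the lookup under EXPLAIN and check how lossy the bitmap heap scan has become:

-- "Rows Removed by Index Recheck" and "Heap Blocks: lossy" show how many
-- extra blocks the widened BRIN ranges force Postgres to read and filter
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM brin_test
WHERE ts = '2020-01-05 12:00:00';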

The next post is "A QUICK PG_STAT_STATEMENTS TROUBLESHOOTING HACK". This is from cybertec-postgresql.com. They have a scenario where they need some kind of monitoring system, but they don't have one implemented yet. What they do have is pg_stat_statements. So basically, they create a snapshot table that captures the contents of the pg_stat_statements view over time, storing each capture with a timestamp column. So you can see here they created a stat_statement_snapshots table with a timestamp column, followed by all the remaining columns from pg_stat_statements. Then, very quick and dirty, they just use \watch to repeatedly execute an INSERT that copies the current data from pg_stat_statements into the snapshot table. Now, of course, you can implement this in different ways, but this is the quick and dirty version; they call it a troubleshooting hack to get some statistics over time on how queries are performing.
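A minimal version of that idea could look like the following in psql (the snapshot column name and the 60-second interval are my assumptions, and the pg_stat_statements extension has to be installed):

-- Copy the structure of pg_stat_statements plus a timestamp column, but no rows yet
CREATE TABLE stat_statement_snapshots AS
SELECT now() AS created_at, *
FROM pg_stat_statements
WHERE false;

-- Take a snapshot, then let psql repeat the same INSERT every 60 seconds
INSERT INTO stat_statement_snapshots
SELECT now(), * FROM pg_stat_statements;
\watch 60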

Then they give you some queries you can use to analyze the data. For example: which top three queries were burning the most CPU time between times t1 and t2, and which queries had the biggest absolute execution time compared to the hour before? Now, he says you can also improve this setup in a couple of ways. The first is to create the snapshot table as an unlogged table. What that means is it's not going to write to the WAL and it won't be sent to the replicas, but of course it's no longer crash-safe, so you could lose the data at any time. It does give better performance, though, and doesn't impact the system as much as a logged table would. The second thing, to make it a little more efficient, is to maybe not include every query from pg_stat_statements in every snapshot. The third option is to skip re-inserting a query's row into the snapshot table if its statistics haven't changed since the last snapshot. So again, this is a pretty interesting hack for monitoring the queries executing in your Postgres system. If you're interested in that, check out this post.
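As an illustration of the kind of analysis this enables (this is my sketch, not the exact query from the post; on PostgreSQL 13+ the column is total_exec_time rather than total_time, and a fuller version would also join on userid and dbid):

-- Top three queries by execution time between two snapshots t1 and t2
-- (replace the timestamps with ones actually present in your snapshot table)
SELECT s2.queryid,
       left(s2.query, 60) AS query,
       s2.total_time - coalesce(s1.total_time, 0) AS time_spent
FROM stat_statement_snapshots s2
LEFT JOIN stat_statement_snapshots s1
  ON s1.queryid = s2.queryid
 AND s1.created_at = '2020-08-14 12:00:00'  -- t1
WHERE s2.created_at = '2020-08-14 13:00:00' -- t2
ORDER BY time_spent DESC
LIMIT 3;

And for the unlogged-table improvement he mentions, creating the snapshot table with CREATE UNLOGGED TABLE ... AS instead works the same way.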

The next post is "Use NATURAL FULL JOIN to compare two tables in SQL". This is from blog.jooq.org. He had seen a post that used UNION and EXCEPT to compare two tables and show the differences, but he felt it was more efficient to use a NATURAL FULL JOIN. This compares all the rows between the two tables and finds the differences. So here's his implementation, and it basically shows you the differences as well as which table the differing values came from. Now, he did say this won't work if you have NULLs in your tables, but he gives another implementation for data with NULL values, basically using the DISTINCT predicate. So if you want to use this technique to compare two different tables, definitely check out this post.
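Here's a rough sketch of the technique with two hypothetical tables a and b that share the same column list (and, as noted, no NULLs in the data):

-- Each side gets a marker column with a unique name, so the NATURAL FULL JOIN
-- pairs rows on every shared data column; rows identical in both tables get both
-- markers filled in, so only the differences survive the WHERE clause
SELECT *
FROM (SELECT 'a' AS in_a, a.* FROM a) a
NATURAL FULL JOIN
     (SELECT 'b' AS in_b, b.* FROM b) b
WHERE in_a IS NULL OR in_b IS NULL;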

The next post is "PostGIS and the Geography Type". This is from crunchydata.com, and it explains the geography type available within PostGIS for calculating with spherical coordinates. He talks about the importance of this because normally we think of flat planes when looking at a map, like Cartesian coordinates, an X and a Y. But really we need to think spherically, because the Earth is a sphere, and while on a flat map it looks like, 'Oh, look, New York sits right between Los Angeles and Paris, France', that's not how it lies on the Earth.

Actually, the shortest path is this route here, because along the sphere that is the shortest distance. So it's important in many areas to use this geography type to deal with spherical coordinates. Now, he does say that if your working area is geographically small enough to fit in a projected system, say a state, a region, or a country, then using a projected system and the geometry type for all your calculations will result in simpler SQL and faster queries. But if you have global data, you'll find your system is simpler to code and reason about if you use the geography type. So if you're interested in PostGIS and learning more about the geography type, definitely check out this post.
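As a small illustration of why that matters (the coordinates here are approximate and just for the example), casting points to geography makes the distance calculation spherical and returns meters:

-- Great-circle distance between New York and Paris using the geography type
-- (longitude/latitude, SRID 4326); the result is in meters
SELECT ST_Distance(
         'SRID=4326;POINT(-74.006 40.713)'::geography,  -- New York
         'SRID=4326;POINT(2.352 48.857)'::geography      -- Paris
       ) AS distance_in_meters;

The same call with ::geometry instead would do a flat Cartesian calculation in coordinate units (degrees), which is the kind of misleading result the post is warning about for global data.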

The next piece of content is "Webinar: Being Committed - A Review of Transaction Control Statements 3/3 [Follow Up]". This is from 2ndquadrant.com, and it's the third in a series of webinars about transactions in Postgres. They cover transactions, locking, why it's important to avoid collisions, and how you can improve transaction flow by reducing contention. It talks about the blind spot of in-flight commits: if you issue a COMMIT and the system crashes, you don't know whether that COMMIT actually went through. They show a way around that, 'recovering after failing at the worst time', as they phrase it here, and then 'COMMIT at most once', which covers recovering after failing at the worst time and in the worst way. So if you're interested in that, definitely check out this webinar. You can just click the link here to register for it.

The next post is "BEYOND JSONB: A GENERALIZED, UNSTRUCTURED DATA TYPE FOR POSTGRES". This is from ongres.com. This is talking about JSONB and how the types supported within JSONB are number, boolean, string, null, array, and object. But they bring up the point of how you handle things such as binary data, because there's no really great way to do it: you're going to run into issues if you try to store it as a string. How do you handle Internet address types, point types, et cetera? They're arguing that perhaps Postgres should offer a superset of JSON functionality, and they mention other implementations that have added features on top of the JSON standard to support more complex data types. So really, this is just a discussion and an opinion piece. If you're interested in learning more, definitely check out this post.
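To make the binary-data point concrete, the usual workaround today (my sketch, not something from the post) is to encode the bytes as text before putting them into JSONB, which costs space and loses the original type information:

-- Binary data has to be smuggled into JSONB as text, e.g. base64-encoded;
-- JSONB itself has no way to say "this value is bytea"
SELECT jsonb_build_object(
         'filename', 'avatar.png',
         'payload', encode('\x89504e470d0a'::bytea, 'base64')
       ) AS doc;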

The next piece of content is "Continuous PostgreSQL Backups using WAL-G". This is from supabase.io. So if you're interested in using WAL-G for doing your backup and restore procedures, definitely check out this post.

The next piece of content is "NSS on Windows for PostgreSQL development". This is from 2ndquadrant.com. So NSS, as they say here, is an alternative TLS library to OpenSSL, so if you want to use that with Postgres on Windows, you can definitely check out this blog post.

The last piece of content, the PostgreSQL person of the week is Tatsuo Ishii. So if you're interested in learning more about Tatsuo and his contributions to Postgres, definitely check out this blog post.
