
Faster Index Creation | Scaling Postgres 326


In this episode of Scaling Postgres, we talk about speeding up index creation, extensions to track wait events, a row pattern recognition feature and savepoints.


Transcript

I was doing some searching this week, looking for information on how to potentially slow down index creation, because there's a particular phase of index creation that really saturates the disk, and I was wondering if there's any way to minimize that. But in all my searches, I could only find people talking about how to speed up index creation, because that's what most people want to do. Well, our first blog post talks about exactly that.

 

But I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is “Speeding up index creation in PostgreSQL”. This is from cybertec-postgresql.com, and he wanted to test out different ways of speeding up index creation. So he created a test table with four columns: an integer id in sequential order, a random integer, a random floating point number (a double precision), and a random numeric, which I think is even larger than the floating point number. And using generate_series, he created a billion rows, so the table ended up being about 56GB in size. He did a VACUUM ANALYZE on it. Then he said, okay, let's create our first index. It's just going to be on the sequential integer, the id column, and that was created in 4 minutes and 51 seconds.
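The post has his exact statements, but a rough sketch of that kind of test setup (table and column names here are my own, not his) would look something like this:

    CREATE TABLE t_demo (
        id      serial,            -- sequential integer
        r_int   int,               -- random integer
        r_float double precision,  -- random floating point number
        r_num   numeric            -- random numeric
    );

    INSERT INTO t_demo (r_int, r_float, r_num)
    SELECT (random() * 1000000)::int,
           random() * 1000000,
           (random() * 1000000)::numeric
    FROM generate_series(1, 1000000000);

    VACUUM ANALYZE t_demo;

    CREATE INDEX ON t_demo (id);   -- the first, sequential-integer index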

 

Now, while an index is being built, there are progress views in most recent versions of Postgres. So, for example, you can track the progress of the CREATE INDEX command using this system view here, and you can watch its work as it goes through the phases. So here you see it's building the index by scanning the table. Then it moves into the sorting live tuples phase, and it shows you its progress with the number of blocks, and then it moves into loading tuples in tree, and you see the progress of the tuples down here.
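If you want to follow along yourself, the view in question is pg_stat_progress_create_index, and a simple query like this (columns per the Postgres docs) shows the phase and the block and tuple counters while a CREATE INDEX is running:

    SELECT pid,
           phase,
           blocks_total, blocks_done,
           tuples_total, tuples_done
    FROM pg_stat_progress_create_index;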

 

Now, as a side note, this is the exact phase, loading tuples into the tree, that I was trying to see if I could slow down, because it totally saturated the disks being used writing all of this data, and the disk queue started expanding. The only saving grace is that it's a very, very brief period, probably because it's using all the write throughput it can get to write the index out. But you can try this for yourself: the next time you create an index, track its progress, and you'll see that those disk writes go off the charts once you hit this phase of the index creation.


But getting back to the blog post, the next step was he created an index on the random integer. And as opposed to taking almost 5 minutes, this took 6 minutes and 41 seconds, so about two minutes longer. So basically that sorting tuples phase took a lot longer when it was a random integer compared to a sequential id. The next thing he looked at was the double precision, or the floating point number, and that took 7 minutes and 21 seconds. So the data was a little bit larger and in random order, so it took longer than even the integer. And then the numeric took 13 minutes and 19 seconds, so a really long time. So he said the thing to basically keep in mind is that you can't just add more memory, more CPUs or even faster disks and expect the index creation to be done faster, because you also have to consider what data type it is and how random that data is on the physical disk itself.

 

So he tried bumping up max_wal_size to basically extend checkpoints, increased the number of parallel workers allowed during index operations, increased maintenance_work_mem to give more memory to the index creation, and then bumped up shared_buffers as well. Now, after that reconfiguration, the numeric index creation went from over 13 minutes, like I said, to almost eight minutes, which, as he says, is not even double the speed. So it did help, but not in a hugely significant way. And this image here shows another example of how different things happen during different phases. For example, like I mentioned, loading tuples in tree is really heavy on the disk writes, whereas during the sorting operations, where he took this image, you can see the CPU cores are all at 99% doing the sorting. So different phases of the index creation use different resources on the system. So these have been some ways you can speed up your index creation, but if you want to learn more, check out this blog post.
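His exact values aren't reproduced here, but the kind of changes described map to settings like these (the numbers are purely illustrative, not recommendations):

    -- session-level, for a one-off index build; values are examples only
    SET maintenance_work_mem = '4GB';
    SET max_parallel_maintenance_workers = 8;

    -- server-wide settings he also raised (shared_buffers needs a restart)
    ALTER SYSTEM SET max_wal_size = '50GB';
    ALTER SYSTEM SET shared_buffers = '16GB';
    SELECT pg_reload_conf();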

 

Next piece of content: “Wait a minute - PostgreSQL extension pg_wait_sampling”. This is from andyatkinson.com. So when a process that basically manages a Postgres connection is doing work, running some type of query, a lot of times it may encounter a wait event: it could encounter locking or be waiting on IO. Some sort of wait event is happening. Now, you can use pg_stat_activity, like he mentions here, to do a query to check out what those wait event types and wait events are, to see what is causing a query to slow down or hesitate. Because it's pg_stat_activity, you get real-time information about what a query is encountering while it's running. And you can join it, as he mentions here, to the pg_locks table to get additional insight into what is potentially blocking it. So you can understand why a particular query is slow.
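As a rough sketch of that kind of live check (not his exact query), something along these lines shows current wait events and any associated locks:

    SELECT a.pid,
           a.wait_event_type,
           a.wait_event,
           a.state,
           a.query,
           l.locktype,
           l.mode,
           l.granted
    FROM pg_stat_activity a
    LEFT JOIN pg_locks l ON l.pid = a.pid
    WHERE a.wait_event IS NOT NULL;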


Now, the disadvantage he mentions is that this is live, immediate data; it doesn't give you a historical perspective. So this really doesn't help you much if you're trying to analyze why a one millisecond query is now running in ten milliseconds and what wait events are holding it up, because chances are, something that fast you're not going to be able to catch in pg_stat_activity or pg_locks. So that's where the extension he talks about, pg_wait_sampling, comes in. So it is an extension. He shows how to install it on macOS here, and it does have a real-time view called pg_wait_sampling_profile, where you get the process id, the event type, the event, the query id, and the count. But it also has pg_wait_sampling_history, which introduces a timestamp as well, so you can see precisely when these wait events are occurring.
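As a sketch, once the extension is set up, the two views he mentions can be queried like this:

    -- requires shared_preload_libraries = 'pg_wait_sampling' and a server restart
    CREATE EXTENSION IF NOT EXISTS pg_wait_sampling;

    -- aggregated counts per wait event
    SELECT pid, event_type, event, queryid, count
    FROM pg_wait_sampling_profile
    ORDER BY count DESC;

    -- sampled history, with a timestamp per sample
    SELECT ts, pid, event_type, event, queryid
    FROM pg_wait_sampling_history
    ORDER BY ts DESC
    LIMIT 100;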


So I kind of think of it a little bit like pg_stat_statements, which collects statistics on the historical running of statements; well, this gives you insight into the historical wait events that are happening for your queries. And he does show where this extension is available on different cloud products: Google Cloud supports it, Tembo has support for it, while Microsoft Azure and AWS RDS do not. However, AWS's Performance Insights is essentially a wait event analysis; it's the main thing they showcase in the console for what's happening on your system. So they don't necessarily need to support this extension, I suppose. But he has a couple of resources here, so if you think this would help in your Postgres analysis, you can check out this blog post.

 

Next piece of content, also related to wait events, is “pg_statviz 0.7 released, with new features, PG 17 support”. This is from vyruss.org, and this is a minimalist extension, he says, for viewing wait events, and it basically collapses down many, many data points to present a 100-point plot. And it looks a little something like this. So you can see total wait events, and it looks like the next highest one is waiting for a client read. So if you want a quick visualization of wait events, maybe you'd be interested in checking out this extension as well.


Next piece of content: “Row pattern recognition feature for PostgreSQL”. This is from pgsqlpgpool.blogspot.com. And here he's working on a new feature that is in the SQL standard, and that is row pattern recognition. So at first I heard this and I was thinking, you're talking about a WHERE clause? I mean, that helps you recognize patterns in rows. But really, this is a window function feature. So it helps you look over windows of data and patterns of the data changing, to identify which rows follow particular patterns. And I'll go through some of the examples here so it will be more clear.


And in this example, he's tracking the stock price of a company. So you have a company, a date field, and then the stock price, which goes up and down over the different days. Now, in terms of the row pattern recognition syntax, you have a DEFINE keyword, and he has three variable names: START, which is set to true; then UP, which is an expression that is true if the price is greater than the previous price; and DOWN, where the expression is true if the price is less than the previous price. And then you have a PATTERN keyword where you specify what pattern you're searching for. So he's doing START, then looking for UP, and then looking for DOWN.
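To make the walkthrough below easier to follow, here is a rough sketch of what that window-clause syntax looks like, pieced together from the definitions just described; since this is an in-progress patch, the exact clauses may differ from what eventually lands, and the table and column names here are illustrative:

    SELECT company, tdate, price,
           count(*) OVER w AS rows_in_match
    FROM stock
    WINDOW w AS (
        PARTITION BY company
        ORDER BY tdate
        ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
        PATTERN (START UP+ DOWN+)
        DEFINE
            START AS TRUE,
            UP    AS price > PREV(price),
            DOWN  AS price < PREV(price)
    );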


Now he's got the whole syntax here, but let's take a look at him actually running it. So what this means is we have a pattern. It starts off true on the first row. Then in the next row, if the price goes up, it still matches. So the price goes from 100 to 200; it's still matching. And then the price goes from 200 to 150. Well, UP is no longer true; it's no longer going up. However, it now satisfies the DOWN expression, in that it went down. Then it goes from 150 to 140, and it still satisfies the going-down expression. But once it goes from 140 to 150, it essentially goes up again, and that doesn't satisfy the DOWN expression at the end, so basically the match is done. So this was the first match: it starts, it goes up, and then it goes down for two rows, essentially, and then the pattern concludes. So you can see the count of four rows that are part of this pattern. And when you start at 150 here and go down to 90, that doesn't follow the pattern; it's not first moving up. So this row is not included; it is not a pattern match. Whereas it does match here, where you go from 90 to 110: it went up, and then it went up to 130, which still satisfies the UP condition. Then it goes down to 120, so it's now at the DOWN part of the pattern. But once it goes up again, the pattern match ends, and you now have a new sequence of four rows.


Now, I've done my best to explain this, but if it's confusing, I definitely encourage you to check out this blog post, because I had to look at it a couple of times to understand exactly what was going on. But he also said you can alter the pattern, so you could define UP as not just the current price being greater than the previous price, but the previous price plus $50. So in this condition, it means the price difference going up has to be more than $50. And the first pattern still matches, going from 100 to 200, because that's definitely more than $50. But the pattern that previously matched going from 90 to 110? Well, that is much less than $50, so that pattern no longer applies. So he talks a little bit about the current implementation and where the patch is. So again, this is something that is actively being worked on for inclusion in Postgres, and if you want to learn more, you can check out this blog post.


Next piece of content: “You make a good point! - PostgreSQL savepoints”. This is from andyatkinson.com, and savepoints are essentially subtransactions. So you have transactions that you can begin, do some work in, and then at the end commit them or roll them back, where nothing happens, essentially, for that transaction. You can also save your work at intermediate points within that transaction, and those are subtransactions. The command you use to do it is SAVEPOINT, so you create a savepoint at a particular point. And then, as he says here, you can roll back to that point if you need to during the transaction, or you can choose to release a savepoint that you've created. So as a very simple example, here you begin a transaction, you insert some information into a table, then you create your savepoint; he's calling it savepoint a. Then he inserts some more data, and the count from the table is two records. But then if he does a ROLLBACK TO SAVEPOINT a, essentially it's as if the second insert never happened. And then once you commit, you still only have one vehicle that has been inserted, because you essentially rolled back that insert. And he has another example here where you create a savepoint and then release it as well.
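In rough outline, the flow described above looks like this (the vehicles table and the savepoint name are just placeholders, not his exact example):

    CREATE TABLE vehicles (name text);

    BEGIN;
    INSERT INTO vehicles (name) VALUES ('truck');
    SAVEPOINT a;
    INSERT INTO vehicles (name) VALUES ('car');
    SELECT count(*) FROM vehicles;   -- 2 rows visible inside the transaction
    ROLLBACK TO SAVEPOINT a;         -- undoes only the second insert
    -- RELEASE SAVEPOINT a;          -- the alternative: discard the savepoint, keep the work
    COMMIT;                          -- only 'truck' persists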


And he talks a little bit about reusing savepoints, and some errors as well. I will also mention to be cautious here, because at scale, subtransactions can cause issues. So much so that there was a blog post created called “PostgreSQL Subtransactions Considered Harmful”. This is from Postgres.AI, back in 2021, and it's actually by Nikolay. So definitely, if you're at scale, something to be cautious of. But check out this blog post if you want to learn more.


Next piece of content: “Magic Tricks for Postgres psql: Settings, Presets, Echo, and Saved Queries”. This is from crunchydata.com, and this is just a review of different things you can do in psql. They show an example of the expanded display, which is \x. They looked at altering the line style of the output using the \pset linestyle command, turning on run times with \timing, setting a default null display using the \pset null command, and setting up your history file and how many commands to save in your history using the HISTFILE and HISTSIZE variables.
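For reference, the settings just listed look roughly like this when typed at the psql prompt (the values are examples, not their exact ones):

    -- toggle expanded display
    \x
    -- change the output line style
    \pset linestyle unicode
    -- show per-query run times
    \timing on
    -- make NULLs visible
    \pset null '(null)'
    -- where to keep history and how many commands to retain
    \set HISTFILE ~/.psql_history
    \set HISTSIZE 2000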


You can also echo psql commands to see what the backslash commands actually do. Like \dt+, what queries is it running under the hood? Well, if you have ECHO_HIDDEN on, you can see the actual query that that \dt command is executing, or you can set it to echo all queries just so you can see what command is being run, which, as they mention here, is a great thing to do when you're running commands from a file. They talk about setting up your default experience using the psqlrc file, different ways to customize your psql prompt, and adding some aliases for commands in your psqlrc file as well. So check out this blog post if you want to learn more.
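As a sketch, a ~/.psqlrc with a few of the customizations mentioned might contain (the alias name and query are made up for illustration):

    -- show the SQL behind backslash commands like \dt+
    \set ECHO_HIDDEN on
    -- customize the prompt: user@database
    \set PROMPT1 '%n@%/%R%# '
    -- a saved-query "alias": run it later by typing  :active_connections
    \set active_connections 'SELECT pid, usename, state FROM pg_stat_activity;'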


Next piece of content: there was another episode of postgres.fm last week. This one was on “Out of disk”. So what do you do when your Postgres database server runs out of disk? And Michael and Nikolay talked about three different areas where this could happen: maybe your data and indexes have exceeded the disk, maybe your WAL has exceeded the disk, meaning there's too much write-ahead log, or maybe just the logging has exceeded the disk, meaning what Postgres logs to disk. If your data and indexes are exceeding the disk, basically the easiest thing to do is probably expand the disk if you can; a lot of cloud providers have that capability built in. But if you don't have a ready means to do that, then usually just find something to delete. So they were mentioning how they've had some clients that say, oh well, we don't even need this table. So maybe you have some temporary tables hanging around that were used for testing purposes, or maybe you have indexes that aren't being used or scanned at all. Those could be quickly deleted to relieve the situation temporarily, and then you can examine more in depth what to do. A more intermediate- to longer-term thing you can do is reindex your indexes. For particular clients, I have saved 10, 20, 30% of disk space just by reindexing, because the indexes had become so bloated. And it's super easy to do a reindex operation; it's not really anything that users will notice.
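As a sketch of those two quick wins, finding unused indexes and reindexing, something along these lines (the index name is a placeholder, and note that REINDEX CONCURRENTLY briefly needs extra space for the new copy of the index before the old one is dropped):

    -- indexes that have never (or rarely) been scanned, largest first
    SELECT schemaname, relname, indexrelname, idx_scan,
           pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
    FROM pg_stat_user_indexes
    ORDER BY idx_scan ASC, pg_relation_size(indexrelid) DESC;

    -- rebuild a bloated index without blocking writes
    REINDEX INDEX CONCURRENTLY my_bloated_index;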


You could also look into compacting tables, so there are a few compaction methods that were mentioned; pg_repack is the main one. But another great means of minimizing disk usage and eliminating bloat is doing a logical replication transition to another server, or a logical replication upgrade if you want to upgrade at the same time, because that essentially gets rid of all the bloat in the tables and the indexes.


The other area they talked about is your WAL files growing excessively and using up all of the disk. And Michael said the main cause he's seen for this is basically an orphaned replication slot that's still around, so something's no longer consuming it and the WAL just continuously builds up. So whatever you do, don't delete what's in the pg_wal directory. You want to find out why it's growing and then address that. So orphaned replication slots are one reason. It could also be that you've set your max_wal_size excessively high. Or another reason is that for some reason the WAL is not being archived. So is there something wrong with the archive command, such that the files just stay there and don't get archived? And then the last area they covered was basically excessive logging. So not the write-ahead log, but just the logs of what's going on with the Postgres system itself. Now, old log files can easily be deleted; they're not going to bring the system down. Just make sure it's not the write-ahead log. But if you want to learn more, you can listen to the episode here or watch the YouTube video down here.
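A quick way to spot the orphaned-slot situation they describe is to check pg_replication_slots and see how much WAL each slot is retaining, for example:

    SELECT slot_name,
           active,
           pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
    FROM pg_replication_slots
    ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;

    -- only if you're sure the slot is truly abandoned:
    -- SELECT pg_drop_replication_slot('some_slot_name');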


Next piece of content: “Which cloud providers support auto_explain?”. This is from pgmustard.com, and basically what auto_explain does is, if you have a particular query that's slow, it will log the explain plan of that query, so you can look at it and see what the reasons are. And they're showing which providers support it, as well as how many of its parameters, since not every provider supports all the parameters available in auto_explain. So there are actually 13 providers here, and it looks like more than half of those support 12 to 13 of the parameters available. Then they go into much more detail about which provider supports which particular parameter. So if you want to learn more, check out this blog post.
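For reference, on a self-managed server, turning auto_explain on for a session looks roughly like this (parameter values are examples; to enable it server-wide you would add it to shared_preload_libraries instead):

    -- per-session, as a superuser
    LOAD 'auto_explain';
    SET auto_explain.log_min_duration = '500ms';  -- log plans for anything slower than this
    SET auto_explain.log_analyze = on;            -- include actual rows and timings (adds overhead)
    SET auto_explain.log_buffers = on;            -- include buffer usage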

 

Next piece of content: “How and when to use btree_gist”. This is from neon.tech. They're looking at an example of using btree_gist, which is a combination of a B-tree index and a GiST index. A GiST index is used a lot for spatial data and some text-searching purposes, but in this example they're using a crime database published by the UK police forces. So they take the CSV data, they load it into a table, and then they show a crosstab of the different crimes and the different statistics throughout the years. But they also wanted to make it spatial, so you can see what particular area each crime took place in. So they installed the PostGIS extension and added a geometry column to be able to track this. But when they were running the query, it was relatively slow; it took 7 seconds to return. Creating a GiST index on the location (the geometry) caused it to run in 30 milliseconds, so that's much, much better. But he says, hey, we can go even further using a btree_gist index. So he created the btree_gist extension, created a new index on the location and the month, and as a result it is almost four times faster using this btree_gist index. So if you want to learn more, check out this blog post.
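The general shape of what they did, as a sketch with made-up table and column names:

    CREATE EXTENSION IF NOT EXISTS postgis;
    CREATE EXTENSION IF NOT EXISTS btree_gist;

    -- a plain GiST index on just the geometry column
    CREATE INDEX crimes_location_gist ON crimes USING gist (location);

    -- with btree_gist, a scalar column like month can live in the same GiST index
    CREATE INDEX crimes_location_month_gist ON crimes USING gist (location, month);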

 

The last piece of content is actually a YouTube channel that showcases all the videos from PGConf.dev 2024, so feel free to check it out if you want to see all the video content from PGConf.dev.

 
