
max_locks_per_transaction, Logical Replication, Backup & Recovery, Pgbouncer | Scaling Postgres 95


In this episode of Scaling Postgres, we discuss max_locks_per_transaction, logical replication, managing backup & recovery and pgbouncer connection pooling.


Transcript

All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "POSTGRESQL: YOU MIGHT NEED TO INCREASE MAX_LOCKS_PER_TRANSACTION". This is from cybertec-postgresql.com. Basically, he's indicating you may see an "out of shared memory" error with a hint that you might need to increase max_locks_per_transaction. He shows where you can actually see this by generating 20,000 CREATE TABLE statements to create that many tables. He used the generic generate_series function to produce the statements and create all of those tables. Eventually, it fails with "ERROR: out of shared memory. HINT: You might need to increase max_locks_per_transaction". Because this is all occurring within one transaction, it can't create that many locks.
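As a rough sketch of what he's doing (the table names here are illustrative), you can generate the CREATE TABLE statements with generate_series and execute each result row with psql's \gexec, all inside one transaction:

```sql
BEGIN;
-- Generate 20,000 CREATE TABLE statements; \gexec runs each result
-- row as a statement, all within the open transaction.
SELECT format('CREATE TABLE t_%s (id int)', g)
FROM generate_series(1, 20000) AS g \gexec
-- Eventually this fails with:
--   ERROR:  out of shared memory
--   HINT:  You might need to increase max_locks_per_transaction.
```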

By default, if you do a SHOW max_locks_per_transaction, it's 64. So you may need to increase that if you want to have more locks per transaction. Now, I've seen this myself sometimes when doing a pg_dump against a database with an incredibly high number of tables. This may have only impacted earlier versions, but I actually had to increase max_locks_per_transaction to be able to handle a pg_dump with a large number of tables in the database. And he says something interesting of note here: "The number of locks we can keep in shared memory is max_connections x max_locks_per_transaction. Keep in mind that row level locks are not relevant here". So, for example, doing a SELECT ... FOR UPDATE on a table with a lot of rows is fine, because those row locks are stored on disk and not in RAM. So something to keep in mind.
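For instance, you can check and raise the setting like so (the new value below is just illustrative); note that max_locks_per_transaction only takes effect after a server restart:

```sql
SHOW max_locks_per_transaction;   -- 64 by default
-- Total lock-table capacity is roughly
-- max_connections * max_locks_per_transaction, shared by all sessions.
ALTER SYSTEM SET max_locks_per_transaction = 256;  -- illustrative value
-- restart PostgreSQL for this to take effect
```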

Now, he also mentions that if you want to see what kind of locks are in action, you can use the pg_locks view. So he discusses how you can query it to find out what locks are present. Another place you may see this "out of shared memory" error is with partitioning. He used code to generate 1,000 partitions, and then simply querying the parent table is going to hit all 1,000 partitions. You'll see in pg_locks that it generates over 1,000 ACCESS SHARE locks while pulling back all of that data. So this is something else that could hit the max_locks_per_transaction limit. So, this great blog post talks about max_locks_per_transaction as well as the "out of shared memory" error that you may see in your installation. So definitely a blog post to check out.
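As a quick illustration of both points (a sketch; the column names are from the standard pg_locks view), here is one way to inspect the relation-level locks:

```sql
-- Inspect heavyweight locks currently held or awaited.
SELECT relation::regclass AS object, mode, granted, pid
FROM pg_locks
WHERE locktype = 'relation'
ORDER BY object;
-- With ~1,000 partitions under a parent table, a query against the
-- parent will show 1,000+ AccessShareLock rows, one per partition.
```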

The next post is "RECOVERY.CONF IS GONE IN POSTGRESQL V12". This is from cybertec-postgresql.com and basically, this follows on from a number of other posts we've seen noting that version 12 has gotten rid of recovery.conf. You now make those configuration changes in the postgresql.conf file. So instead of a recovery.conf file, you now have two possible signal files: a recovery.signal file that tells Postgres to enter normal archive recovery, or a standby.signal file that tells Postgres to enter standby mode. He goes over some different considerations here: basically, your backups don't really need to change, but your restoration processes do, as does how you set up replicas, because both are impacted by these changes.
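A minimal sketch of what a standby setup looks like under the new scheme (hostnames and paths here are illustrative):

```
# postgresql.conf (or postgresql.auto.conf) -- formerly recovery.conf:
primary_conninfo = 'host=primary-host user=replicator'  # illustrative
restore_command  = 'cp /archive/%f %p'                  # illustrative

# Then create an empty signal file in the data directory:
#   touch $PGDATA/standby.signal    (standby mode)
#   touch $PGDATA/recovery.signal   (archive recovery)
```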

He also advocates the use of the postgresql.auto.conf file because that's what's set by ALTER SYSTEM commands and is generally what pg_basebackup writes to now. He says if you do use third-party backup software, here are the particular versions of pgBackRest, pg_probackup, and Barman that you should be using to support PostgreSQL version 12 and its new recovery process. So, yet another post to review if you are making the transition to PostgreSQL 12, covering what changes you need to make with regard to recovery and your replicas.
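For example, ALTER SYSTEM writes into postgresql.auto.conf, which is also where pg_basebackup's -R option now appends the recovery settings (the connection string is illustrative):

```sql
ALTER SYSTEM SET primary_conninfo = 'host=primary-host user=replicator';
-- The value lands in postgresql.auto.conf; on version 12 this
-- particular setting still requires a restart to take effect.
```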

The next post is actually a YouTube channel where they've posted updated presentations from pgDay Paris. So this is the pgDay Paris YouTube channel and there are about eight presentations they posted from 2019. So if you're interested in some video content, definitely a link to check out. Another piece of video content is "Logical Replication in PostgreSQL". This is from the EnterpriseDB YouTube channel and they're basically talking about what logical replication is, how to set it up, how to use it, and all of those sorts of things. Now, this webinar is about 54 minutes in length and it's around the twelve-minute mark that it really gets into the meat of the presentation. But if you're interested in logical replication, this is definitely a presentation to check out.
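If you just want the basic shape of it, a minimal logical replication setup looks something like this (publication, subscription, and connection details are illustrative):

```sql
-- On the publisher (wal_level must be set to 'logical'):
CREATE PUBLICATION my_pub FOR TABLE orders;

-- On the subscriber (the table definition must already exist there):
CREATE SUBSCRIPTION my_sub
  CONNECTION 'host=publisher-host dbname=app user=replicator'
  PUBLICATION my_pub;
```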

The third piece of video content is actually a webinar: "You forgot to put the WHERE in DELETE? [Follow Up]". This is from 2ndquadrant.com. It covers the types of backups you can take from a PostgreSQL server: basically logical backups, where you're backing up objects, or physical backups, where you're backing up the raw files. It covers how to do each, different use cases for backups, disaster scenarios, and the best ways to recover from them. So whether you didn't put the WHERE in a DELETE statement or the server hardware has crashed, what are some different ways to handle recovery scenarios and things to consider when taking backups? Again, this is about an hour in length, so if you're interested in this content, definitely check it out. You can click the link here to register and you'll get immediate access to the webinar.
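As a quick reference for the two flavors of backup they describe (paths and database names are illustrative):

```
pg_dump -Fc -f /backups/app.dump app        # logical: SQL objects/data
pg_basebackup -D /backups/base -X stream    # physical: raw cluster files
```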

The next piece of content is "PostgreSQL Connection Pooling: Part 2 - PgBouncer". This is from medium.com. So this is the second post in a series about PgBouncer and its use as a connection pooler. It talks about how it works and how you can set up authentication. So you connect to PgBouncer as if it were a Postgres server, and it pools the connections, using fewer connections on the database side. It goes through the different configuration settings: how you can adjust the pool size, and settings such as max_client_conn, max_db_connections, and max_user_connections. It has a representation here of the idea: you keep more live client connections open while using fewer database connections. Now, it will only use fewer connections if you're using certain pooling modes.

So the most popular pooling mode is transaction mode, where each transaction runs on a separate PgBouncer server connection. There's no guarantee that consecutive transactions will run within the same session; you can only achieve that with session pooling mode. But session pooling doesn't let you use fewer connections on the database server because it's a one-to-one ratio: one session on PgBouncer, one session on the database. In transaction pooling, transactions from multiple client sessions can share the same server connection, so that enables you to use fewer connections. It's essentially a many-to-one mapping.
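A minimal pgbouncer.ini sketch of this setup (host, sizes, and paths are illustrative):

```ini
[databases]
app = host=db-host dbname=app

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction   ; many client transactions share few server conns
default_pool_size = 20    ; server connections per user/database pair
max_client_conn = 500     ; client connections can far exceed the pool
```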

So it lets you be more efficient, but then you can't rely on session-level settings and features in that mode. There is also a statement pooling mode, but generally most configurations use transaction pooling. They then go into a discussion about why you would choose PgBouncer over some other solutions, and what PgBouncer cannot do: basically high availability or failover, which you may be able to get from something like Pgpool. So if you're interested in learning more about PgBouncer, this is a blog post to check out.

The next one related to PgBouncer is "Can PgBouncer handle failover to a new machine?". This is from the enterprisedb.com blog, and they talk about PgBouncer and how they have it set up. EDB has a tool called Failover Manager that uses a virtual IP capability. By having a virtual IP address that can be flipped to a new machine, you can do failovers. So presumably, for some of the things PgBouncer cannot do on its own, you could use a tool such as the EDB Failover Manager to get a fairly seamless failover, and they discuss and show some of that here. So if you're interested in using PgBouncer, these are two posts you may want to check out.

The next post is "DBLog: A Generic Change-Data-Capture Framework". This is from the Netflix Technology Blog on medium.com. This is a new CDC, or change data capture, tool called DBLog. It basically monitors the transaction logs of different database systems; they mention MySQL, PostgreSQL, and MariaDB, so it does support PostgreSQL right now. It watches for changes to the data and then streams those changes elsewhere, say to a data mart or a data warehouse, or to kick off some sort of processing. The post goes in depth into this tool, how it works, and why they chose to design it versus using some other CDC tools that exist. So if you're using change data capture for certain use cases and you want to look at a new tool that could offer some different features, definitely check out this blog post.
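For context (DBLog's internals may differ; this is just how PostgreSQL generically exposes changes to CDC tools), you can poke at the logical decoding stream yourself with the built-in test_decoding plugin:

```sql
-- Requires wal_level = logical. Create a slot with the test plugin.
SELECT pg_create_logical_replication_slot('cdc_slot', 'test_decoding');
-- Make a change, then consume it from the slot.
INSERT INTO orders (id) VALUES (1);  -- assumes an existing table
SELECT lsn, data FROM pg_logical_slot_get_changes('cdc_slot', NULL, NULL);
```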

The last set of posts are all related to PostgreSQL table functions. Now, by table functions, they're referring to functions in PostgreSQL that return a table of data. This first post is an introduction to what they are, how you can use them, and the different ways they can be configured. A built-in example is generate_series: you can do a SELECT from this function as if it were a table. Now, this is from the YugabyteDB blog on medium.com, but all the content mentioned here is applicable to PostgreSQL as well.
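For example, generate_series behaves like a table in the FROM clause:

```sql
-- A built-in table function used like a table:
SELECT g AS n
FROM generate_series(1, 5) AS g;
```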

The second post in the series is about implementing PostgreSQL user-defined table functions in YugabyteDB. But again, you could do this in PostgreSQL, and it talks about the implementation and how you can set up different functions to do this.
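A minimal user-defined table function sketch (the table and function names are illustrative):

```sql
-- Returns rows like a table; callable in FROM.
CREATE FUNCTION recent_orders(since date)
RETURNS TABLE (id int, created date)
LANGUAGE sql AS $$
  SELECT id, created FROM orders WHERE created >= since
$$;

SELECT * FROM recent_orders('2020-01-01');
```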

The last post is "Four Compelling Use Cases for PostgreSQL Table Functions". The first case is when you want a parameterized view. The second case is pretty-printed ad hoc reports for administrators. The third case is a dynamic IN list. The last case is a compact syntax for bulk inserts. So, that's a set of three posts talking about table functions; if you're interested in that type of content, definitely some blog posts to check out.
