Replication Complications, One Word Performance, Pgbouncer Users, Serializable Isolation | Scaling Postgres 119
In this episode of Scaling Postgres, we discuss replication complications, adding one word to boost performance, handling pgbouncer users and serializable isolation testing.
Content Discussed
- PG Phriday: 10 Things Postgres Could Improve – Part 2
- How one word in PostgreSQL unlocked a 9x performance improvement
- Understanding user management in PgBouncer
- PostgreSQL 12.3
- Webinar: Understanding the PostgreSQL table page layout [Follow Up]
- SQL trickery: Hypothetical aggregates
- Wrapping Db2 with PostgreSQL
- Oracle to PostgreSQL: Basic Architecture
- Release notes for Citus 9.3, the extension that scales out Postgres horizontally
- Tomas Vondra
YouTube Video
Podcast Audio
Transcript
All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "PG Phriday: 10 Things Postgres Could Improve - Part 2". This is from 2ndquadrant.com. In part one, they talked about XID or transaction ID wraparound complications. In this one, they talk about replication complications. The first thing they cover is replication in general and how it's pretty much based on the WAL files. Before version 9, what you had was basically log shipping: you have the write-ahead log and you could transfer those files to another server and replay them, so essentially a delayed form of replication. Streaming replication didn't arrive until version 9, and it's still all based upon the WAL files, the write-ahead log that Postgres generates.
Now, in terms of complications, the first one they're talking about is synchronous replication. He says if you have a primary node, you create a table, then you stop the synchronous standby. You begin a transaction, insert a value into it, and then COMMIT it. It should hang because the synchronous standby is not available. But he says if you now cancel that transaction and do a select from the table, you will actually see the row there. So the transaction, as he says, is alive and well. Basically your synchronous standby missed this COMMIT, and the reason is that it's all based upon WAL files: the COMMIT record has already been written to the local WAL, but the server can't acknowledge it until the synchronous standby catches up. And he says Postgres also doesn't do any quorum in relation to this, so that won't help either.
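The scenario he describes can be sketched roughly like this in psql. This assumes synchronous_standby_names is configured on the primary and the standby has been shut down; the table and value are illustrative:

```sql
-- On the primary, with synchronous_standby_names configured
-- and the synchronous standby stopped:
CREATE TABLE sync_test (id int);

BEGIN;
INSERT INTO sync_test VALUES (1);
COMMIT;  -- hangs: the commit waits for the unavailable synchronous standby

-- After cancelling the waiting COMMIT (Ctrl+C in psql):
SELECT * FROM sync_test;  -- the row is visible; the commit happened locally
```

Cancelling only abandons the wait for the standby's acknowledgement, not the commit itself, because the commit record was already written to the local WAL.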
In his assessment, the only safe way to use synchronous replication is to deploy at least two replicas and then hope that only one goes down at a time. There is of course a risk of losing some data here, because if your synchronous standby goes down and then your primary goes down, you've lost this data. Basically, as a best practice, he says the easiest way to address some of these deficiencies is to always maintain at least one additional active synchronous stream. So if you want one synchronous standby, keep a second as redundancy; if you want two, add a third as redundancy. Now, for this redundant connection, he actually recommends using something like pg_receivewal, which just streams the WAL from the primary database, so you don't have to keep a full copy of the database on that node.
All it does is stream the WAL files, so you have a full set of the WAL files, and you can run it in synchronous mode without having a full copy of the database. Now, 2ndQuadrant does produce the product Barman for backup and recovery scenarios, and it does handle this type of solution, but you can go ahead and set this up for yourself if you'd like. So definitely a complication to be aware of. Now, further down they start going into logical replication. In the section "Drinking from the fire hose", he's basically saying that when you're doing physical streaming replication, you're doing a binary replication of what's on the primary to a replica. But if you want to, as he says, take a sip instead, and only replicate particular tables and data changes from a logical perspective, you can do that using logical replication.
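Setting up such a redundant WAL-only stream might look something like this; the host, user, and directory here are placeholders, and for the stream to count as synchronous, its application name must appear in synchronous_standby_names on the primary:

```shell
# Stream WAL from the primary into a local directory, acting as a
# synchronous client without keeping a full copy of the database.
pg_receivewal --host=primary.example.com --username=replicator \
              --directory=/var/lib/postgresql/wal_archive \
              --synchronous
```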
But when you're using logical replication, the problem you run into is that the LSN replay position is not kept up to date on the replicas. So if you have replication slots set up on your primary database and you need to fail over, that slot information does not carry over to the replicas. He says this has been the case since 9.4. So you basically have to recreate those slots and then get your logical replication up and running again. He also mentions that you can't actually rewind logical replication; you can only move forward, which could also cause some issues. Then another complication he mentions is that the origin Postgres instance you're replicating from is actually used to decode the WAL information via the local system catalog.
So in other words, it's decoding this into something like a table named foo.bar. One thing he says to do to try to avoid some of these problems is to keep a sufficiently large wal_keep_segments setting. Even though you're using replication slots, which are supposed to ensure you don't lose any WAL, it's important to still keep some segments around to be able to handle issues such as these. Now, ideally, to solve this he believes Postgres needs a mechanism for relaying replication slot information to standby nodes. Definitely true. Secondly, quote "Some way of adding extra WAL content such as object names that exist in a decoded form within the WAL file itself, or as a supplementary transaction pinned decode mapping".
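The WAL-retention cushion he suggests is just a postgresql.conf setting. The value below is an arbitrary illustration, and note that in Postgres 13 and later this setting was replaced by wal_keep_size:

```
# postgresql.conf (Postgres 12 and earlier)
# Keep this many extra 16 MB WAL segments on the primary as a cushion,
# even when replication slots are in use.
wal_keep_segments = 256
```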
So basically, the subscriber would be able to translate those names from the WAL itself, rather than having to do it on the origin or publisher. Now, 2ndQuadrant has a product called pglogical3, an extension that he says can be configured to regularly copy slot positions, but this is only available for commercial customers; the community version of Postgres does not do this. So this has been a review of things you can run into when you're using replication, both physical and logical, in PostgreSQL, and I definitely encourage you to check out this blog post.
The next piece of content is "How one word in PostgreSQL unlocked a 9x performance improvement". This is from James Long at jlongster.com, and he's describing an application he developed that consumes messages for syncing. When one user tried to sync 169,000 messages in one day, it basically caused a huge number of issues, so he looked into optimizing it. All of these messages get inserted into a Postgres table with a relatively simple structure, and he wants to avoid duplicates, so his insert uses ON CONFLICT DO NOTHING, and he was basically inserting one row at a time for however many messages there were. Now, the first thing he discovered is that you can do multi-row inserts, sending one INSERT statement that inserts multiple rows. This is definitely more efficient than doing one insert at a time, particularly if there's network latency.
But even without that, it is much faster to insert a large number of rows in one statement than one row at a time. So that's the first optimization he did. But the concern was, because he was doing ON CONFLICT DO NOTHING, how would he know what was inserted and what was not? This is where the one word comes in: he used RETURNING to return, essentially, his primary key, a timestamp. And what's great about this is that it only returns that timestamp on successful inserts, which is exactly what he wanted. With these two changes he basically got a ninefold performance improvement, which is pretty great. Now, he tested up to 40,000 messages in this post, but at 169,000 he was running into other areas of his application that prevented that from working successfully. But I imagine at this point you can just break down that work. So if you want to read an interesting story of how he worked through it and found optimizations for his application, definitely check out this blog post.
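Combined, the two optimizations look roughly like this; the table and column names are hypothetical, based only on his description that the primary key is a timestamp:

```sql
-- One statement inserts many rows; duplicates are silently skipped,
-- and RETURNING hands back the keys of only the rows actually inserted.
INSERT INTO messages (timestamp, content)
VALUES
  ('2020-05-01T00:00:01Z-0000-a1b2', '...'),
  ('2020-05-01T00:00:02Z-0000-c3d4', '...')
ON CONFLICT DO NOTHING
RETURNING timestamp;
```

A row that hits the conflict is skipped by DO NOTHING and therefore produces no RETURNING row, which is how he could tell which messages were actually new.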
The next piece of content is "Understanding user management in PgBouncer". This is from 2ndquadrant.com, and they're talking about how you set up users in PgBouncer. The typical way is a file called userlist.txt where you list out each user and their password. This not only authorizes users to connect to PgBouncer but also sends that information to the destination PostgreSQL instance to grant access to it. Internally the setting is called auth_file, but the file is conventionally named userlist.txt, though he says you can rename it. This has been the most common way to set things up. However, you can also use a query against a PostgreSQL instance to get that list of users.
Here, the default value of auth_query in PgBouncer gets the username and password from the pg_shadow catalog, and he describes how you could set this up to be more secure. He also talks about how you could use PAM to do it as well, or potentially LDAP. He says another scenario is that you can set up forced users, so that in your actual connection to the PostgreSQL database, separate from the users connecting to PgBouncer, you can hard-set a username and a password. He then reviews some considerations about which method to use. Generally, the auth_file is definitely the more traditional setup, but if you have a lot of users that could be a little onerous, and you may want to resort to the auth_query method, for example. So if you want to learn more about managing users in PgBouncer, check out this blog post.
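A minimal sketch of the two approaches in pgbouncer.ini; the paths are placeholders, and the commented-out auth_query line shows PgBouncer's documented default:

```ini
[pgbouncer]
; Traditional: a static file of "username" "password-or-hash" pairs.
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

; Alternative: look users up in the database itself instead of a file.
; This is PgBouncer's default auth_query:
;auth_query = SELECT usename, passwd FROM pg_shadow WHERE usename=$1
```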
The next post is "PostgreSQL 12.3". This is from jepsen.io, and they're doing an in-depth analysis of transaction isolation in PostgreSQL. It's quite a long post, but the summary is that when using serializable isolation under normal operation, transactions could occasionally exhibit G2-item, an anomaly involving a set of transactions that, roughly speaking, mutually fail to observe each other's writes. So basically there is a bug in PostgreSQL that they've discovered with regard to serializable isolation, and they mention that a patch for the bug is scheduled for a minor release on August 13. So good news. They also mention that repeatable read isolation is actually snapshot isolation, so it's a little different from some other databases' behavior, and they advocate updating the documentation to reflect this. So if you want to check out this in-depth analysis of PostgreSQL, definitely check out this blog post.
The next piece of content is "Webinar: Understanding the PostgreSQL table page layout [Follow Up]". This is a webinar on 2ndquadrant.com. You can get access by clicking through to the webinar, and it basically goes into the internals of Postgres: how databases are laid out in the file system, how each page is laid out, how data gets inserted, updated, or deleted, how rows that are too long are handled, and how the TOAST system works. So if you're looking to learn more about the internals and how Postgres lays out data within the file system, definitely check out this webinar.
The next piece of content is "SQL TRICKERY: HYPOTHETICAL AGGREGATES". This is from cybertec-postgresql.com. Basically they're asking: given a particular value, where would it go in a particular number series? They run a query that generates a series of numbers split into two groups, and then ask: if you had a number, say 3.5, where would it fall in the rank order? In one series of numbers, a 3.5 would fall after the 2, so it would be in the second position; in the other series it would fall after the 3, so it would actually be in the third position. This seems like a fairly unique use case, but it uses hypothetical-set aggregate functions to see where a value would hypothetically fall in a range. So if you're interested in that, check out this blog post.
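The idea can be sketched with Postgres's rank() WITHIN GROUP hypothetical-set aggregate; the number series here is made up for illustration:

```sql
-- Where would 3.5 rank within this series of numbers?
SELECT rank(3.5) WITHIN GROUP (ORDER BY x) AS hypothetical_rank
FROM (VALUES (2), (4), (6), (8)) AS t(x);
-- Only 2 precedes 3.5 in the ordering, so its hypothetical rank is 2.
```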
The next post also from cybertec-postgresql.com is "WRAPPING DB2 WITH POSTGRESQL". This is basically referring to the DB2 Foreign Data Wrapper, so they show how you get that set up to be able to query and get information from the DB2 database.
The next piece of content is "Oracle to PostgreSQL: Basic Architecture". This is from 2ndquadrant.com. For those of you migrating from Oracle to PostgreSQL, they walk through the terminology differences between the two systems: for each component you would typically talk about, in terms of services, tasks, and data, they show the Oracle term alongside the PostgreSQL term. So if you're going to be converting from Oracle to PostgreSQL, definitely a blog post to check out.
The next piece of content is "Release Notes for Citus 9.3, the extension that scales out Postgres horizontally". This is from citusdata.com. So basically this is a new release of the Citus extension that allows you to scale out your PostgreSQL installations. Seems like the biggest feature improvement for this version is full support for window functions. They also made some improvements to shard pruning, INSERT... SELECT with sequences, and then some support for reference tables on the Citus coordinator. I believe that's a node that coordinates the interactions of the cluster. So if you want to learn more about the improvements for the new version of Citus, definitely check out this blog post.
The last piece of content, the PostgreSQL person of the week is Tomas Vondra. If you want to learn more about Tomas and his contributions and work in PostgreSQL, definitely check out this blog post.