New Releases, Reconnect After Failure, Babelfish, Troubleshoot Disk & Memory | Scaling Postgres 152
In this episode of Scaling Postgres, we discuss new PostgreSQL releases, reconnecting after a connection failure, Babelfish SQL Server compatibility and troubleshooting disk and memory issues.
Content Discussed
- PostgreSQL 13.2, 12.6, 11.11, 10.16, 9.6.21, and 9.5.25 Released!
- Reconnecting your application after a Postgres failover
- Babelfish: the Elephant in the Room?
- Troubleshooting Performance Issues Due to Disk and RAM
- Starting with Pg – where is the config?
- Why is there database named “postgres”?
- Exploring SQL command changes in PG-14
- PostgreSQL TOAST Data Corruption (ERROR: unexpected chunk number)
- Deep PostgreSQL Thoughts: The Linux Assassin
- Kubernetes Pod Tolerations and Postgres Deployment Strategies
- Patroni Environment Setup: PostgreSQL High Availability for Windows
- “Catchment areas” with PostgreSQL and PostGIS
- check_pgbackrest 2.0 has been released
- Hubert Lubaczewski
YouTube Video
Podcast Audio
Transcript
All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "PostgreSQL 13.2, 12.6, 11.11, 10.16, 9.6.21, and 9.5.25 Released!". This is from postgresql.org, and these releases primarily address two security issues. One is that partition constraint violation errors can leak values of denied columns; so basically, values from columns someone shouldn't have permission to view can leak through in certain operations. The other is that a single-column SELECT privilege enables reading all columns. So again, a kind of column leakage.
This latter one looks like it only affects PostgreSQL version 13, and the other affects versions 11 through 13. But overall, a number of bug fixes and improvements have been made across all the releases. It's always good to look over these because some of them require additional work. For example, there was an issue fixed with GiST indexes, and they suggest you reindex those GiST indexes. There were also certain cases where CREATE INDEX CONCURRENTLY had issues with prepared transactions, so if your installation uses prepared transactions, you should reindex any indexes that were previously built concurrently. So go ahead and consult this post to look at all the different changes that have been made and resolved in Postgres.
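If you do need to rebuild an index after one of these fixes, a minimal sketch looks like this (the index and table names are hypothetical; REINDEX ... CONCURRENTLY requires PostgreSQL 12 or later):

```sql
-- Rebuild a single index without blocking writes against the table
REINDEX INDEX CONCURRENTLY my_gist_index;

-- Or rebuild every index on a table in one pass
REINDEX TABLE CONCURRENTLY my_table;
```

On versions before 12, the equivalent is to build a replacement with CREATE INDEX CONCURRENTLY and then drop the old index.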
The next piece of content is "Reconnecting your application after a Postgres failover". This is from citusdata.com and they're talking about how when you're setting up a Postgres high availability solution, one thing that you should also consider in addition to the server infrastructure is your client or your application side. Make sure that it can handle those disconnects. When a primary server fails over to a secondary server, you are going to get dropped connections, it doesn't transparently handle it off. So your application needs to handle that. It discusses some techniques for doing that and even gives you a way to do some testing. But in terms of some tools that you can use, one is if you're not using something like Pgpool or PgBouncer that has some techniques to handle transitions between servers, you could use the multi-host connection strings that are part of the libpq library.
Now, even if your application doesn't use libpq directly, it may rely on libraries that do. For example, they mention that the Postgres drivers for Python, Ruby, PHP, and even Node.js use libpq under their standard libraries. But there are others that use a more direct implementation, and you need to check whether they support these multi-host connection strings. Basically, they let you specify multiple hosts in the connection string, and the client will try each one in turn to make a connection. So if node 1 is the primary and it goes down, when the client tries to reconnect and can't reach node 1, it will try node 2. In addition, you can set target_session_attrs to specify that this particular connection should be read-write. So if a node it connects to only supports read-only connections, it will move on to the next node.
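As a minimal sketch (the host names here are hypothetical), such a connection string looks like this:

```bash
# libpq tries node1 first, then node2; target_session_attrs=read-write
# rejects any server that only allows read-only sessions (PostgreSQL 10+).
psql "postgresql://app@node1.example.com:5432,node2.example.com:5432/mydb?target_session_attrs=read-write"
```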
So this is a simple way to set up a client to find the PostgreSQL primary, say, after a failover. Now, of course, they did mention you can use PgBouncer and Pgpool, and there are a number of different ways to handle that. But then the post talks about a way to have your application prepared to handle these types of transition failures. Citus Data has developed pg_auto_failover, an open-source high-availability solution for automatically failing over your Postgres instance. They've packaged it up into a Docker container so you can run it very simply in your development environment and test how your application behaves during a failover. Now, of course, you wouldn't use this Docker container in production, but it gives you a basis to do some testing to make sure the client side of the application can handle these types of transitions. You can then choose to implement pg_auto_failover in production or use another solution. So if you're interested in learning more about this and the capabilities of pg_auto_failover, definitely check out this post.
The next piece of content is "Babelfish: the Elephant in the Room?". This is from postgresql.fund, and this is a discussion piece about the presence of Babelfish. So Babelfish is quote "...an endpoint to PostgreSQL that understands the SQL Server wire protocol Tabular Data Stream". So it can interpret things like TSQL commands and things for SQL Server. I love this quote that he says here: "What this means is that Babelfish will be able to 'impersonate' an SQL Server database. Applications may be able to run unchanged, believing that they are connecting to SQL Server, when they will actually be connecting to PostgreSQL Babelfish".
Now, from my understanding, AWS developed this for their Aurora database, which is based upon Postgres but is not Postgres, and they have pledged to open-source it. This post talks about that possibility. Basically, it's trying to rally the PostgreSQL community to help support this effort: to implement it as an extension to Postgres and to make changes to Postgres that more easily support this type of extension providing SQL Server compatibility. The author doesn't want to end up in a situation where they have to create a fork of Postgres to support it, or where, I suppose, they just use it for Aurora and potentially don't offer it for PostgreSQL. So I found this a very interesting post, and I think you should take time to review it to understand his perspective on why the PostgreSQL community should help Babelfish work better with PostgreSQL. So if you're interested, you can check out this post.
The next piece of content is "Troubleshooting Performance Issues Due to Disk and RAM". So this is from highgo.ca and it basically talks about a number of tools you can use to diagnose performance issues as they relate to say, disk or memory issues. So the first thing he mentions is the top command in order to check out CPU utilization, check out memory using the free command, and the disk space with the df command. But if you want to look more into disk performance, he mentions three other tools here: dstat, iotop, and sysstat, as well as a few other tools such as iostat, sar, dstat, and also reviewing the /proc/meminfo to get more detailed information from what Free gives you. So if you want to learn some more Linux tools to help you diagnose disk or memory issues as they relate to operating Postgres, you can definitely check out this post.
The next piece of content- "Starting with PG- where is the config?". This is from depesz.com and he's talking about how you've been introduced to a new Postgres database and you need to find out where its configuration is. By configuration, he basically means the postgresql.conf file, the pg_hba.conf file, and maybe the pg_ident.conf file. Now if you have access to a psql terminal, you can do SHOW config file and it will show you the file location of where that configuration file is at. Or you can do SHOW hba file or SHOW ident file and it gives you the directory location, he says. But what if you don't have access to a psql prompt?
But what if you don't have access to a psql prompt? Well, you can actually take a look at the processes running on the server. If you run a ps command, you can pull out the running Postgres process, and its arguments tell you where the data directory is located as well as the configuration file. So you can use that to see what its configuration is. The other thing to keep in mind is that there is also the postgresql.auto.conf file, typically located in the root of the data directory, and there may be some configuration in there as well. So if you want to learn a little bit more about where to find the different configuration files in Postgres, you can check out this post.
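A quick sketch of the ps approach (the paths shown are hypothetical and vary by distribution):

```bash
# Find the postmaster process; its arguments reveal the data directory (-D)
# and, on some distributions, an explicit config_file setting.
ps aux | grep '[p]ostgres'
# e.g. /usr/lib/postgresql/13/bin/postgres -D /var/lib/postgresql/13/main \
#        -c config_file=/etc/postgresql/13/main/postgresql.conf
```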
The next piece of content, also from depesz.com, is "Why is there database named 'postgres'?". Basically, when you bring up a cluster, it comes with three databases: template0, template1, and postgres. So template1 is the template from which all new databases are created, and you can make alterations to it that will then appear in the databases you create. Now, template0 exists in case anything happens to template1, so you can recreate template1 from template0.
But of course, what is this postgres database? Because if you look in it and do a \d, it has no objects in it. Basically, it's described as a way just to have some database to connect to in order to execute some commands, even to create a new database. So you can connect to the postgres database to run such commands. Sometimes, if you connect to template1, there are certain commands that don't work, and he does some different tests connecting as different users so you can see what the different behavior is. There are also some comments below the post that discuss it. So if you're interested in that, you can check out this blog post.
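As a concrete illustration of that use, here's a minimal sketch (the database name newdb is hypothetical):

```sql
-- From a session connected to the "postgres" database:
CREATE DATABASE newdb;  -- new databases are cloned from template1
DROP DATABASE newdb;    -- works from here, since no session can drop the
                        -- database it is currently connected to
```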
The next piece of content- "Exploring SQL command changes in PG-14". This is from highgo.ca, and they're listing different changes at the SQL level that have happened to Postgres. The first is that the AS keyword to give a new name for a column is now no longer necessary. So before you had to say "select loc as analyze", now you can just say "select loc analyze" and it will interpret that in Postgres 14 correctly. So you no longer need the AS keyword. It's optional, although there are some that still need to be explicitly set because they're reserved keywords. The next is adding support for leading/trailing trimming of byte array values. So that's a new feature. Allowing the current role in the GRANTED by command. Allowing or replace when you're working with triggers. So before you had to drop a trigger and then create a new one, well, this now allows create or replace a trigger, which is a great addition. Also support for reindexing concurrently on partition tables, which it did not do previously. So if you're interested in some of these SQL changes coming to Postgres 14, definitely check out this post.
The next piece of content is "PostgreSQL TOAST Data Corruption (ERROR: unexpected chunk number)". This is from fluca1978.github.io. Now he's describing an issue where data has been corrupted. Now I personally haven't seen this and he actually says down here that quote "...so far I've only experienced human-caused damages". So PostgreSQL itself while running hasn't experienced TOAST corruption, but apparently someone has gone into the data directory of Postgres and has altered something or some program has run and impacted the data directory. That's where he sees these corruption changes. Well, he's developed a function that can actually identify and find out where these are and he's placed it in a GitLab repository here. So he goes through the process of testing it out and actually corrupting a database. So definitely do this in a test system. But having done that, you can then use the function to be able to identify the data file that has the issue. So if you're interested in that, you can check out this blog post.
The next piece of content- "Deep PostgreSQL Thoughts: The Linux Assassin". This is from blog.crunchydata.com and he's basically talking about the out-of-memory killer. There are various different configurations that you do to try to avoid the out-of-memory killer on your Postgres system. One of the main ones is setting the overcommit memory to 2 and then adjusting your overcommit ratio as well to make sure you're using the optimum amount of memory. Failing that, there is a way to change the out-of-memory score adjustment, so that's also a possibility.
Then it goes into host-level out-of-memory killer mechanics as well as cgroup-level ones, so you can apply some of these out-of-memory parameters at the cgroup level. He also talks about why it's important to avoid all this: basically, you don't want to bring down your database or require a restart, because that has its own host of issues. He goes into a lot more detail on this. But the post then turns to Kubernetes and how Kubernetes actively sets vm.overcommit_memory to 1, so you can't turn overcommit off with the recommended setting of 2.
Kubernetes also has different cgroup out-of-memory behavior, and you can't really change the score adjustment. And whereas swap can help absorb some of this overcommitted memory, he says that Kubernetes turns swap off. So running Postgres in Kubernetes has some particular challenges; he goes into detail on these and gives you a few mitigation options. He also says they're actively working to minimize these issues in future versions. So if you're interested in this content, you can check out this post.
The next piece of content, also from blog.crunchydata.com, is "Kubernetes Pod Tolerations and Postgres Deployment Strategies". So if you want to learn a bit more about Kubernetes pod tolerations and the Postgres deployment strategies they enable with their upgraded PostgreSQL Operator, you can check out this blog post.
The next piece of content- "PATRONI ENVIRONMENT SETUP: POSTGRESQL HIGH AVAILABILITY FOR WINDOWS". This is from cybertec-postgresql.com. It discusses how you can use a graphical installer to set up Patroni on Windows. So if you're interested in that, you can check out this post.
The next post, also from cybertec-postgresql.com, is "'Catchment areas' with PostgreSQL and PostGIS". So if you want to learn more about computing these catchment areas, definitely check out this blog post.
The next piece of content is "check_pgBackRest 2.0 has been released." This is from pgstef.github.io. This is a monitor for the pgBackRest backup tool, so if you're interested in learning more about this, you can check out this post.
The last piece of content, the PostgreSQL person of the week is Hubert Lubaczewski. So if you're interested in learning more about Hubert and his contributions to Postgres, definitely check out this blog post.