Delayed Replication, Materialized View Permissons, Paranoid Postgres, Memory Overcommit | Scaling Postgres 173
Join Over 1,000 Engineers & Get New Episodes Weekly!
In this episode of Scaling Postgres, we discuss how to delay replication, working with materialized view permissions, paranoid configuration options and addressing memory overcommit.
Content Discussed
- Simulating temporal tables with dblink and replication delay
- Postgres Permissions and Materialized Views
- Paranoid SQL Execution on Postgres
- PostgreSQL, Memory and the Cloud
- PostgreSQL Partitioning in Django
- Announcing Credcheck extension to enforce username and password checks in PostgreSQL
- The Next Generation of Kubernetes Native Postgres
- Getting Started with PGO, Postgres Operator 5.0
- Disaster Recovery Strategies for PostgreSQL Deployments on Kubernetes (Part 2)
- A quick sanity testing of pgpool-II on ARM64
- Daniel Westermann
- Rubber Duck Dev Show Episode 2 | Software Development Technical Project Management
YouTube Video
Podcast Audio
Transcript
All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "SIMULATING TEMPORAL TABLES WITH DBLINK AND REPLICATION DELAY". This is from cybertec-postgresql.com. They're talking about simulating the ability to do time travel with tables, basically being able to go back in time and see what state a particular table was in. Now, of course, the easiest way to do this is to just work with insert-only or append-only tables. So you'd have to construct your data such that nothing is ever deleted or updated, but you just basically have inserts. You can go back at any point in time using the dates on the table. But apart from that, they did come up with an interesting technique to do this kind of time travel thing using replication delay as well as dblink. Now, the feature that applies a delay is called recovery_min_apply_delay.
What that does is you can set it up on a replica so that it doesn't follow the primary database server immediately, but it delays itself by however many hours you choose. It could be an hour, 2 hours, or 24 hours. So this is a great feature to recover data if you had a replica that was a couple of hours delayed. Because even if you have a primary database with multiple replicas, what happens if you accidentally drop a table or accidentally delete some data with a delete statement? Well, if you have a large database, it's going to take a while to do a full restore of that appointed time recovery up to that point. But if you had something that was a little bit delayed, perhaps you could go into that delayed copy and pull the data out you needed to reinsert it if you wanted to do that. So it's a very interesting feature for specific use cases, but here in this example, they're using it for this point.
Because that's how they're doing the time travel or the time delay because they're going to be contacting that replica. And how they're doing it is using dblink. Now, this is something that existed for one Postgres server to talk to another Postgres server before the advent of Foreign Data Wrappers. So normally you would use Foreign Data Wrappers, but because of what they set up here, apparently, it didn't really work. But they used the older fallback dblink to do it. But they do describe how they set it up and then the relative performance of it. But mostly, I wanted to highlight the replica apply_delay feature, as well as mention dblink still exists in addition to the newer Foreign Data Wrapper. So if you're interested in learning more about some of these, you can check out this blog post.
The next piece of content- "POSTGRES PERMISSIONS AND MATERIALIZED VIEWS". This is from rustprooflabs.com, and they're talking about when you have a materialized view and it needs to be refreshed. It needs to be refreshed by the owner or a superuser of the system. Sometimes, this can be a bit inconvenient because maybe the user that created it is not present and it needs to be refreshed and you don't have a superuser around to do it. Basically, you want general users to be able to refresh it. You do that by assigning permissions to a specific view and granting those permissions rights to other users. This blog post walks through all how to do it. So first he created a superuser role using the WITH LOGIN SUPERUSER, and he created a materialized view with it.
Then he created a general user role and grants SELECT on that materialized view to the user and now he can query it just fine, no problem. But when he tries to refresh the materialized view as that user, it fails because an error in it says must be the owner of the materialized view. So what he does here in this example is he reassigns ownership from a superuser to another user that he creates, or I should say another role that he creates. So he creates another role, the owner of the materialized view role, with no login means they can't log in, it's only a role. He alters the materialized view to assign ownership to that role. Then he grants that role to his user role. When he sets his role to be my_user and refreshes the materialized view, it works. If you have the need to set up certain users to refresh materialized views, maybe you want to check out this blog post.
The next piece of content- "Paranoid SQL Execution on Postgres". This is from ardentperf.com. He's talking about a few things you may want to set up on your Postgres database system to be able to protect it from things like rogue queries or things accessible locking it. I'm not going to go through the full list here, but I'll just point out some highlights. I think probably the most important ones are the second and third that he lists. So the most important one in my opinion is probably this lock_timeout. Because whenever you're doing any sort of DDL statement, you want to make sure that while it's waiting to acquire a lock, it's not going to back up other queries, at least for an excessive amount of time. I've seen lock_timeouts from 2 to 5 seconds. It depends on how active your database system is. You don't have to have this running all the time, but when you're ready to do DDL changes, it's usually a good practice to have that. The other is to have a statement_timeout.
Now, I probably wouldn't put this on the database as a whole or if you do it needs to be incredibly long because this could cancel, for example, backup jobs or vacuum jobs but applying it to the roles that are being used is a good practice or in particular sessions. Maybe you want to adjust the statement_timeout. So this is another very good one to keep your database connected. Now they also mentioned things like connect timeouts which could help, and a number of other things I'll let you review if you're interested in protecting your database. But in my opinion, it's basically watching out for those locks. That's the thing that's probably going to do you in more than anything else, which is why I think the lock timeout is so important. He does mention below here, watch what gets locked when you're doing DDL statements. So based on the version they've been making improvements so that you can change more things with DDL statements without creating long-running locks. But that's a thing you need to be aware of in order to protect your database; what are you changing and how is the database system being locked? But if you want to learn more, go ahead and check out this blog post.
The next piece of content- "PostgreSQL, Memory and the Cloud". This is from sosna.de and they're talking about memory overcommit and they're talking about the scenario where basically modern Linux systems give more than they have in terms of memory to different applications. So that puts them in a position where if an application actually does use all of that memory, it may get into a low memory state and then have to do certain actions to take care of it. So for example, they could panic or halt the system, they could freeze the requesting program, terminate the program, or use an out-of-memory killer to terminate specific processes. And this is the thing that you hear a lot with Postgres is that you don't want this running and terminating certain Postgres processes. Now they give some examples like if you have a desktop operating system, this makes sense.
If something requests too much memory, you're okay killing that particular application that may be running, or an application server. Maybe it's one process that has to be killed that's taking up too many resources. But with the database, it can cause some more problems. On Linux, what they recommend is setting the vm.overcommit_memory to 2. Now you may also want to adjust your overcommit_ratio as well. So this is definitely the recommendation but I have seen issues when there is no swap space. A lot of cloud vendors don't provide swap space on their disks. I have seen some problems when setting this overcommit_memory to 2 in the case where there's no swap file. So you do have to be a little bit cautious when setting this if you're operating in a cloud environment with no swap file. Now this post talks about the cloud environment and how some of these settings may not be made.
So for example, with overcommit set to 2, it just basically should not overcommit memory. So in this normal operation, things consume memory. When no memory is left, future memory allocation should fail and basically Postgres rolls back that one query, so it shouldn't bring down the whole system, whereas in some cloud environments, they may not have this setting set and an out-of-memory killer is triggered. Because some processes use shared memory, it means bringing the whole system down. So basically it results in a restart of the whole PostgreSQL system. So definitely not ideal either. So it's definitely something to keep in mind when you're configuring your Postgres systems. I would probably set the overcommit to two, but then adjust the overcommit ratio to make sure that you're getting good memory allocation. But if you want to learn more, go ahead and check out this post.
The next piece of content- "PostgreSQL Partitioning in Django". This is from pganalyze.com and they're talking about partitioning your tables in your database for use with Django in the top here. They did do a test that they described lower down below, but they were seeing performance improvements when data had been partitioned. Now, this is a greater improvement than I've seen in my implementations, but it does stand to reason you should receive some performance benefit. But I think the bigger significance is that the maintenance of tables is much easier when you're partitioning. So if you have a billion+ row table, it's much better to break that down into multiple partitions so that they will be much easier to vacuum. Then if you ever have to delete data, it's much easier to, say, delete older data if you need to by just dropping a table. Now, they say when you get to a million or more records, you may want to consider partitioning. I think you could push that out to potentially a billion.
I mean, it depends on how much data is in each row, but I think you could easily push it to the 500 million 1 billion record mark before you do partitioning. The rest of this post talks about doing list partitioning and range partitioning, hash partitioning, how it works, and how you get it set up. Basically, you create a parent table that has no data in it and then each of the partitions is what contains the data and it's partitioned based on how you have it set up. So here they just did it by a simple date range and they demonstrated the code to do that. They then used a faker library to create a bunch of data and that's how they did the tests from the graph that appears at the top. Then for those using Django, they suggest some additional libraries to help work better with Django and partition tables. So if you want to learn more about how to use partitions in Django. Definitely check out this blog post.
The next piece of content- "Announcing Credcheck extension to enforce username and password checks in PostgreSQL". This is from migops.com. Now, PostgreSQL has a check_password_hook and they have leveraged this to develop an extension they're calling Credcheck which allows you to specify complexity standards for the username and the password. So for example, a username such as mysimpleuser would be excluded, whereas this one of this complexity would be allowed, or something that just says secret would be disallowed for a password, whereas this would be considered a good password. They have all of these different standards for defining the complexity of the password in this extension.
Now, this new extension does have some prerequisites. You need to be using at least version PostgreSQL 10, and you need to have the development package library, such in the case of Ubuntu, it's the postgresql-server-dev-version and you do need to go ahead and compile it from source and they describe how to do that here. Once you've done that, you can actually run the CREATE EXTENSION command, add Credcheck to your shared preload libraries, and then restart your database system. Then you enforce those standards by updating the postgresql.conf file. Or you could use the ALTER SYSTEM command which updates the postgresql.auto.conf file. So this is a very interesting extension to help you give username and password standards for your database system. So if you're interested in that, check out this blog post.
The next piece of content- "The Next Generation of Kubernetes Native Postgres". This is from blog.crunchydata.com, and this post announces that they have released Postgres operator version 5. They said this new version really embraces a declarative approach as opposed to an imperative approach. Basically, you define what you want to have happen and it basically does it for you, as opposed to you having to explicitly set up all the different parameters.
Now, related to that, they have a second post that says "Getting Started with PGO, Postgres Operator 5.0". Here, they show you how easy it is to get up and running using Kubernetes and a Postgres cluster. So if you're interested in that, definitely check out these blog posts.
The next piece of content- "Disaster Recovery Strategies for PostgreSQL Deployments on Kubernetes (Part 2)". This is from b-peng.blogspot.com. Now, the previous post was talking about using the Crunchy PostgreSQL operator, but this new version talks about Zalando, so it's a Zalando Postgres operator. So if you're looking to learn to use Postgres and Kubernetes, maybe you want to check out this post as well.
The next piece of content- "A quick sanity testing of pgpool-II on ARM64''. This is from amitdkhan.bogspot.com. As more software is being developed for ARM, this post explores using Pgpool II on it. So if you're interested in his results, you can check out this blog post.
The next piece of content. The PostgreSQL person of the week is Daniel Westermann. So if you're interested in learning more about Daniel and his contributions to Postgres, definitely check out this blog post.
For the last piece of content, I will mention that we had our third episode of The Rubber Duck Dev Show last Wednesday and the topic was the people side of software project management. So I'll include the link down below if you want to check out that episode.