
PGCon Online, Performance Tips, Tracking Counts, Essential Monitoring | Scaling Postgres 115


In this episode of Scaling Postgres, we discuss PGCon going online, application performance tips, ways to track counts and essential areas to monitor.


Transcript

All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "Postgres 13 Beta 1 Released!". This is from postgresql.org. It essentially goes over the release notes, covering the new functionality being added, administration enhancements, security improvements, and other highlights. So if you're interested in trying Postgres 13 before it's released, go ahead and give Beta 1 a try.

The next piece of content is that PGCon, the PostgreSQL conference for users and developers, is going online this year. From what I've seen, there is no fee: they say there will be no charge for the conference and everything is free, thanks to the sponsors that have helped put it on. So if you have some free time, the talks take place online on the 27th and 28th of May, 2020, with a tutorial day on the 26th, so definitely check out this link to participate in the conference.

The next piece of content is "Postgres tips for optimizing Django & Python performance, from my PyCon workshop". This is on the citusdata.com blog. So essentially these are performance tips. Now, this is specific to Django and Python, but really the guidance applies to any application framework when I took a look at it. So they're talking about what to look for when you're looking at your logs in a development environment, monitoring pg_stat_statements to be able to track what statements are doing, and how to avoid loops like N+1 queries. Again, this is specific to Django, but you're going to want to do this in other application frameworks. Limiting the columns returned to avoid slow SQL queries and pagination tips. In other words, don't use offset and limit to do those types of queries, but specify the ordering using an order by. So if you want to get insight into all of this, this was posted at this YouTube link that you can view right here to get these performance tips. So if you're interested in that, go ahead and check out this post and video.

The next piece of content is "HOW TO COUNT HITS ON A WEBSITE IN POSTGRESQL". This is from cybertec-postgresql.com. Now, I thought this was talking about hits on a website, but actually, if you look at the details, he says, "Suppose you want to grant a user access to a certain piece of data only X times. How can one implement that safely?". So I was a little confused with that given the title. It looks as if what they're trying to achieve is that someone wants to only allow a certain number of hits to a website and if it exceeds it, basically it puts a limit on it, or if they know if they go over the limit. So it seems to be kind of an unusual use case. But this post goes over a lot of different techniques to kind of achieve this use case that I thought was interesting and you should check it out. 

Now, I'm not going into detail, but I'll highlight some of the things he goes through so you can get a sense of the different techniques. For example, he's using the btree_gist extension, because normally you can't combine B-tree-only data types with a GiST index, but this extension allows you to do it. He's using range types to define the limits on how many clicks per day are allowed per customer ID, so this customer limit table defines the limits. He's also logging all of the web traffic, so every hit is recorded in the log table regardless of the per-day click tracking, and then he has a log count table. That is essentially a summarization of what's in the log table by customer ID and by day, and it has a counter_limit column, which is what is allowed for that customer.
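As a rough sketch of why btree_gist and range types go together here: one common pattern is an exclusion constraint that combines an equality check on a plain integer column with an overlap check on a range column, which requires btree_gist. The table and column names below are illustrative and may not match the post's exact schema.

```sql
-- btree_gist lets B-tree-only types like integer participate in a GiST index or constraint
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- A per-customer limits table: no customer may have two overlapping validity periods
CREATE TABLE t_customer_limit (
    customer_id    int       NOT NULL,
    valid_period   daterange NOT NULL,
    clicks_per_day int       NOT NULL,
    EXCLUDE USING gist (customer_id WITH =, valid_period WITH &&)
);
```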

But then if somebody exceeds the limit, the counter_limit_real column records that. And again, he has a unique constraint on customer ID and date. He's also using a reduced FILLFACTOR so pages aren't completely filled, which gives updates a better chance of landing on the same page and performing better. Now, the meat of the action takes place in a function where he uses a CTE to insert a row into the t_log table, so every hit gets logged there, and then does an insert-or-update into the t_log_count table using ON CONFLICT. So a lot of different techniques are used for this use case, which I haven't really seen a parallel to elsewhere, and you may want to check some of them out and use them in your own application.
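Here's a simplified sketch of that log-then-upsert pattern, assuming hypothetical t_log and t_log_count tables rather than the post's exact schema: every hit is appended to the raw log inside a CTE, and the daily counter is then inserted or incremented via ON CONFLICT against the unique constraint.

```sql
CREATE TABLE t_log (
    customer_id int         NOT NULL,
    hit_time    timestamptz NOT NULL DEFAULT now()
);

-- Reduced fillfactor leaves free space on each page so the frequent counter
-- updates have a better chance of staying on the same page.
CREATE TABLE t_log_count (
    customer_id int  NOT NULL,
    day         date NOT NULL,
    hits        int  NOT NULL DEFAULT 1,
    UNIQUE (customer_id, day)
) WITH (fillfactor = 70);

-- Record one hit for customer 42: always log it, then upsert the daily counter.
WITH raw AS (
    INSERT INTO t_log (customer_id) VALUES (42)
    RETURNING customer_id
)
INSERT INTO t_log_count (customer_id, day)
SELECT customer_id, current_date FROM raw
ON CONFLICT (customer_id, day)
DO UPDATE SET hits = t_log_count.hits + 1;
```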

The next post is "SQL Window Functions Cheat Sheet". This is from learnsql.com and they have this convenient two-page sheet. I don't use window functions regularly and frequently I have to look up when I want to do one and having this cheat sheet available makes a lot of sense. In conjunction with this cheat sheet, I actually go into a blog post that covers all the different items listed in it. So if you want a helpful reference to be able to remember how to do certain window functions, maybe check out this post.

The next post is "Essential PostgreSQL Monitoring - Part 1". This is from pgdash.io. So they're going over in this part one essentially nine things that you should be monitoring. The first is the Transaction ID range or basically, what's your max txid. You don't want it to exceed 2 billion of course. They give you the code to check that. Second is monitoring the number of backends and you can find that just by doing an account on the pg_stat_activity table.  Third is tracking the number of inactive replication slots because if you have an inactive replication slot, then the WAL won't be archived so that's very important to deal with these orphan or inactive slots. He shows you the query. Just query pg_replication_slots where it's not active. Fourth is checking on backends that are waiting on locks. So again, you can see whether things are locked up by checking the pg_stat_activity table and looking where the wait event equals lock. 

Number five is backends idling in transactions. You want to avoid time being spent idle in transaction, and you can check that by looking at pg_stat_activity. Six is checking the replication lag for active connections, so you can see how far behind your replicas are running; that's in the pg_stat_replication view, which gives you the write, flush, and replay lag. Seven is tracking the replication lag for replication slots. The last two are checking the WAL file count: first the number of WAL files in the pg_wal directory itself, and then in the destination archive directory, and they give you the commands to check that for the different versions. So this is a really great list of things to monitor, all available simply by querying the Postgres system views. If you're interested in learning more about this, check out this blog post.
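These aren't necessarily the exact queries from the pgdash article, but here are hedged sketches of how you might check several of these items using the standard system views (the WAL file query assumes Postgres 10 or later).

```sql
-- 1. Transaction ID age: keep this comfortably below 2 billion
SELECT max(age(datfrozenxid)) FROM pg_database;

-- 2. Number of connected backends
SELECT count(*) FROM pg_stat_activity;

-- 3. Inactive (orphaned) replication slots, which cause WAL to be retained
SELECT slot_name FROM pg_replication_slots WHERE NOT active;

-- 4. Backends currently waiting on locks
SELECT count(*) FROM pg_stat_activity WHERE wait_event_type = 'Lock';

-- 5. Backends sitting idle inside an open transaction
SELECT count(*) FROM pg_stat_activity WHERE state = 'idle in transaction';

-- 6. Replication lag for connected standbys (run on the primary)
SELECT client_addr, write_lag, flush_lag, replay_lag FROM pg_stat_replication;

-- 7. WAL retained by each replication slot, in bytes
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots;

-- 8. Number of WAL files in the pg_wal directory
SELECT count(*) FROM pg_ls_waldir() WHERE name ~ '^[0-9A-F]{24}$';
```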

The next piece of content is "Missing Piece: Failover of the Logical Replication Slot". This is from percona.com and they're talking about a problem when you have a High Availability pair of Postgres instances. So you have your primary and you have a physical HOT standby. So you're doing streaming replication to the standby. Now if you have a logical replication enabled to another instance, the problem is what happens when the failover happens because there's no information about the logical replication slot that's on the primary here. If there's any delay at all with the replication, you could be missing some data once you activate the slot on this side here. So there's no state transfer of the state of the logical replication slot from the primary to what will be the failover. 

They're saying essentially that this is a problem if you want to maintain data integrity, and they cover different designs and discussions for solving it. Unfortunately, the issue has been discussed as early as 2016 in a blog post they mention here, and as they quote, "There's no generic way to solve this for everyone". So essentially the problem still exists, because it's been difficult to find a way to handle it to everyone's satisfaction. There's been a lot of discussion, but nothing has happened as of yet, though they do talk about some possibilities to address it. This blog post is important just to bring the issue to the forefront, particularly if you're using logical replication along with a high availability Postgres pair. But even if you're not trying to operate in a high availability mode, if you have a physical replica and are using logical replication, you're going to have to figure out the best way to deal with this. They talk a little bit about it, but it's definitely something to be aware of and to plan how you want to mitigate.

The next post- "Is it Time to Upgrade your PostgreSQL Database?" This is from enterprisedb.com and they go over why you should upgrade and the benefits of it. I actually liked seeing some of the information that they presented here, particularly looking at this graph where they ran pgbench performance for different versions from 9.5 up to the current 12.1. And you could see the greater transaction performance with an increasing number of clients across the different versions. Now, it's not only for performance reasons but there are a number of other features, of course, that have been added that could greatly make your lives easier if you upgrade to a more recent version. Then they also talk about what are the risks of not doing an upgrade, particularly if you're staying on an older version. And then they talked about some different ways that you can do the backup and the best, and which is best suited for what. Generally, I use the pg_upgrade for doing the databases to minimize downtime. But if you're thinking whether or not you should upgrade, definitely check out this blog post.

The next piece of content is "Phoney table columns in PostgreSQL". This is from highgo.ca. Now, I don't know if I'd call these phony table columns, but they're basically talking about system columns in a table that is essentially used to manage the database. They aren't visible. If you look here, they created a one-column table and the field name, they called a field. And then they queried the pg_attribute table and you can see the field. The attribute number is one and the Type ID is an integer. But then you can look at all the different columns that it adds to manage essentially this data stored within each row. So you can see the tableoid, cmax. xmax, cmin, xmin, and the ctid as well. 

Then they go through and discuss the purpose of each of these hidden, system-managed columns. Interestingly, they also came up with some use cases where you could actually use them in queries. For example, they use the ctid column to eliminate duplicate records, a scenario where this system column comes in handy. So if you're interested in learning more about these system-managed columns and how they could potentially be used, definitely check out this blog post.
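As a hedged sketch of the duplicate-elimination idea (not necessarily the exact query from the highgo post), you can number the rows within each group of duplicates and delete every row whose ctid belongs to a row beyond the first; which duplicate survives is arbitrary, since no ordering is specified.

```sql
-- Hypothetical table with duplicate values
CREATE TABLE dedup_demo (val int);
INSERT INTO dedup_demo VALUES (1), (1), (2), (2), (2), (3);

-- Delete all but one arbitrary physical row per value, matching rows by ctid
DELETE FROM dedup_demo
WHERE ctid IN (
    SELECT ctid
    FROM (SELECT ctid,
                 row_number() OVER (PARTITION BY val) AS rn
          FROM dedup_demo) numbered
    WHERE rn > 1
);

SELECT val FROM dedup_demo ORDER BY val;  -- leaves 1, 2, 3
```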

The next piece of content is that the PostgreSQL person of the week is Dimitri Fontaine. So if you want to learn more about his work and contributions to Postgres, definitely check out this blog post.

The last piece of content is "How to use Logistic Regression Machine Learning model with 2UDA - PostgreSQL and Orange (Part 5)". This is from 2ndquadrant.com. This is, of course, the fifth part of their series about machine learning. So if you're interested in learning more about that, check out this blog post.
