
Observer Effect, Partition Management, Tuple Freezing, Hung Transactions | Scaling Postgres 116


In this episode of Scaling Postgres, we discuss the observer effect with explain analyze, partition management, tuple freezing and sources of hung transactions.

Transcript

All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "EXPLAIN ANALYZE MAY BE LYING TO YOU". This is from ongres.com. They're talking about something that in physics is called the observer effect, meaning that the process of observing something may alter the very outcome you're trying to observe. Specifically, they're talking about EXPLAIN ANALYZE. They go over an example where they use generate_series to populate two different tables. They then EXPLAIN ANALYZE a sum over these tables and use psql's \timing or pg_stat_statements to time the plain queries and compare them to what EXPLAIN ANALYZE is reporting. After 20 runs, they notice significantly slower runtimes with EXPLAIN ANALYZE compared to just running the query without it.
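
To make the setup concrete, here is a minimal sketch of the kind of comparison the article runs; the table and column names are mine, not taken from the post:

-- Populate a test table, then compare plain timing against EXPLAIN ANALYZE
CREATE TABLE t1 AS SELECT generate_series(1, 10000000) AS i;

\timing on
SELECT sum(i) FROM t1;                  -- baseline runtime reported by psql
EXPLAIN ANALYZE SELECT sum(i) FROM t1;  -- same query, now instrumented per node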

As a reminder, EXPLAIN ANALYZE runs the query and shows you the execution plan for it. For this particular query, that's a 50% overhead for running it through EXPLAIN ANALYZE. Then, as a further example, he looked at a virtualized, non-Nitro EC2 instance, and that's important for a reason I'll come to in a second: there you see a huge 700% overhead for this query. A lot of this has to do with the clocks and how they are read; he's comparing the Xen virtualized clocks on those VMs to KVM. It comes down to clocks because when running EXPLAIN ANALYZE, Postgres checks the time, and sometimes it has to do this for every row it processes. That's because, as he mentions here, Postgres, like other OLTP databases, follows a query execution model named the Volcano model, basically a one-row-at-a-time model in which a row goes through each node of the query tree.

So any delay in getting the clock time is going to amplify the impact of EXPLAIN ANALYZE. And he says the problem with this, of course, is that you may think you've optimized a query when you really haven't, because the particular section you were focusing on is significantly impacted by this overhead. He makes this comment: "The overhead EXPLAIN ANALYZE introduces is not proportional to the real duration of the query plan, but rather proportional to the number of rows processed by the node". He says there's no great way to get around this, but he does have some proposals that may make it better in the future.

He says you really want to think of EXPLAIN ANALYZE as a query execution profiler, and like any profiler, it's going to impact the performance of what you're profiling. But looking toward the future, he would be interested in seeing some correction mechanism developed, where he says, quote: "If the clock time can be measured precisely and the number of times the clock is called is known- Postgres could certainly keep track of it- its contribution could be subtracted from the total measured time". While this wouldn't be exact, it would get a little bit closer. So this is just something to keep in mind when you're using EXPLAIN ANALYZE to profile your queries and figure out what's slow, because, as he says, EXPLAIN ANALYZE may be lying to you.
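
One related option worth knowing about, although it isn't a fix the article proposes: EXPLAIN's TIMING option lets you keep per-node row counts while skipping the per-row clock calls that cause most of this overhead, at the cost of losing per-node timings:

-- Row counts per node are still collected, but per-node timing (and its clock overhead) is skipped
EXPLAIN (ANALYZE, TIMING OFF) SELECT sum(i) FROM t1;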

The next post is "PARTITION MANAGEMENT - DO YOU REALLY NEED A TOOL FOR THAT?". This is from cybertec-postgresql.com. The post generally talks about usually when someone wants to partition, something looks for a tool that helps them do the partitioning, whereas really it's pretty easy to set up partitions. So there are tools that can do it, but just to do it in PostgreSQL, they just have a basic table and create some indexes. An interesting thing I hadn't seen anyone do here, is he actually created a separate schema for these subpartitions so it doesn't pollute your primary schema. So that's a pretty interesting proposal. In terms of managing the partitions and when he says managing partitions, basically say every month or every year, you need to create a new partition for the new data coming in, or maybe you want to purge old ones. 

That's a bit of ongoing management that needs to be done on the tables, and he shows an example of a CTE that can generate new partitions for you, plus one for dropping old partitions. Now personally, I don't use a tool either; I use my application framework's language, Ruby in my case, to make the SQL calls that do the table management, and I have it running in a cron job and it works just fine. I don't need a separate tool for it. So I definitely agree with him that you don't need a separate tool to do this; use whatever language you're comfortable with. Here he's using plain SQL; you could also use PL/pgSQL and put it behind functions. But I agree you definitely don't need a tool to do partitioning, although one may make some cases easier to manage. So if you're interested in learning more about doing this without a tool, check out this blog post.
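
For reference, the kind of statements such a monthly job ends up issuing look something like this; the article generates them with a CTE, while the names and dates here are just illustrative:

-- Add next month's partition and purge one that has aged out
CREATE TABLE IF NOT EXISTS subpartitions.measurement_2020_07
    PARTITION OF measurement
    FOR VALUES FROM ('2020-07-01') TO ('2020-08-01');

DROP TABLE IF EXISTS subpartitions.measurement_2019_07;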

The next post, also from cybertec-postgresql.com, is "SQL TRICKERY: CONFIGURING WINDOWING FUNCTIONS". This is a pretty basic post, but it goes into different techniques for using window functions and what each part of the window definition means. I always read through window function posts because I don't use them that frequently and like to keep my knowledge fresh. So if you want to learn more about window functions, definitely check out this post.
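
As a quick refresher on the pieces involved, here's an illustrative query of my own, not taken from the post:

SELECT region,
       sale_date,
       amount,
       sum(amount) OVER (
           PARTITION BY region                        -- restart the window per region
           ORDER BY sale_date                         -- ordering within each partition
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW   -- the frame: a 3-row moving window
       ) AS rolling_sum
FROM sales;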

The next piece of content is actually a webinar called "Webinar: Tuple Freezing & Transaction Wraparound through Pictures [Follow Up]". This is from 2ndquadrant.com. You just click the link, put in your contact information to register, and you can watch the webinar. It's about an hour in length and it specifically covers tuple freezing as it relates to vacuum and handling, or avoiding, transaction wraparound. They have a number of videos that were produced, as well as pictures that describe how this works in Postgres. So if you want more education about that particular part of Postgres, definitely check out this webinar.
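
One simple way to keep an eye on how close tables are getting to wraparound, not from the webinar but a standard query against the catalogs, is to check the age of each table's relfrozenxid:

SELECT relname,
       age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 10;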

The next piece of content is "Don't Leave Me Hanging: Another Type of Transaction to Monitor". This is from Richard Yen at richyen.com and he's talking about transactions that can get hung up are of course the idle in transaction states and there's actually an idle in transaction session timeout you can set to hopefully avoid too many of these. However, something you also need to be aware of is prepared transactions. So this is essentially a two-phase COMMIT. Now, I'll say first that most application developers should not be using this feature. 

It's a very specific feature for when you have multiple databases and want to do a cross-server commit of a piece of data, so it should be a pretty rare use case. But if you are using it, you do need to be aware of orphaned prepared transactions that can hang around; they can prevent vacuum from doing its work and eventually cause your transaction IDs to run out, basically a lot of issues. So if you use this feature or are considering it, definitely check out this blog post.
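
A query along these lines, my own rather than one from the post, will show prepared transactions that have been sitting around for a while:

SELECT gid, prepared, owner, database
FROM pg_prepared_xacts
WHERE prepared < now() - interval '1 hour'
ORDER BY prepared;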

The next post is "A multi-node, elastic, petabyte scale, time-series database on Postgres for free (and more ways we are investing in our community). This is from the timescale.com blog and basically, their multi-node TimescaleDB is being released for free. So you get all of those features being able to run TimescaleDB, a timescale extension for Postgres running across multiple nodes and it will be available for free. Of course, you can also get it on a cloud platform, of course, but it looks like the license ward. It would be an open-source variant of sorts. Now, this post is more of a high-level marketing perspective of communicating it and the benefits, clients that are using it, et cetera. But if you're interested in TimescaleDB, maybe you want to check out this particular post on them to see what they're working on.

The next post is "WAL, LSN and File Names". This is from Luca Ferrari at fluca1978.github.io. He's talking about in the WAL files, there are log sequence numbers that identify what's being recorded in the WAL and that if you know the LSN or the log sequence number, you can actually identify the file that it refers to. Now you can use the command BG current WAL LSN to get what the current sequence number is, as well as the file it refers to. But he goes over the process of explaining how you can break down this LSN to be able to identify the file. Like the first part is the timeline, the middle part refers to this first character, and then the last part refers to these two characters. Then this is the identifier within the file. So if you want to learn more about LSNs and how they work with WAL files, check out this blog post.

The next post, also from fluca1978.github.io, is "Inspecting Command Tags and Events in event triggers". Event triggers are something you can set up to be notified when certain objects are created, altered, or dropped in your database. He made an event trigger that basically logs output: for example, when you create a table, it prints out that the table was created, and it fires as well when you alter a table to add a column, create an index, or rename a table. So this could be potentially useful for auditing. If you're interested in pg_event_trigger_ddl_commands, definitely check out this blog post.
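
As a sketch of what such a trigger can look like, here is a minimal DDL-logging event trigger; the function and trigger names are mine, and the EXECUTE FUNCTION syntax assumes PostgreSQL 11 or later (older versions use EXECUTE PROCEDURE):

-- Function invoked by the event trigger; it walks the DDL commands of the current statement
CREATE OR REPLACE FUNCTION log_ddl_commands()
RETURNS event_trigger
LANGUAGE plpgsql
AS $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        RAISE NOTICE 'DDL executed: % on %', r.command_tag, r.object_identity;
    END LOOP;
END;
$$;

-- Fire the function at the end of every DDL command
CREATE EVENT TRIGGER log_ddl
    ON ddl_command_end
    EXECUTE FUNCTION log_ddl_commands();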

The next post is "Upgrading Postgres & Lessons Learned". This is from the Mode Engineering blog on medium.com. They're talking about how they wanted to upgrade from Postgres 9.4 to 10 or higher, ultimately 11.5, and the process they went to. Now originally they wanted to use Bucardo to do it because they wanted to try to avoid downtime, basically do a logical upgrade of sorts. They couldn't use logical replication because that's not available in 9.4, but they wanted to do it in the same kind of fashion. How Bucardo does it through triggers on different tables to pass over the data. 

But they had problems with locking scenarios in their application, and ultimately they had to abandon that approach. What they ended up doing was taking several hours of downtime to do the upgrade offered by AWS RDS, since their database runs on RDS. Now, what's interesting is that they mention they had talked to some Postgres consultants who didn't bring up pg_upgrade, even though with pg_upgrade you usually have on the order of minutes of downtime as opposed to several hours. Using pg_upgrade with hard links is the route I like to go when doing a Postgres upgrade to minimize the amount of downtime. But if you want to learn more about their experience and the issues they encountered, definitely check out this blog post.

The next piece of content is "Partitioning with PostgreSQL v11 (1/2)" and "(2/2)". This is a series of two blog posts on medium.com that talk about partitioning, what it is, how it works, the different types that are available in 11, and the second one talks about doing partitions of partitions. So a multi-tier partitioning scheme as well as attaching and detaching partitions. So if you want to learn more about partitioning, check out these two posts.

The next piece of content, the PostgreSQL person of the week is Paul Ramsey. So if you want to learn more about Paul and his contributions and work with Postgres, definitely check out this blog post.

The next post is "Spatial Constraints with PostGIS - Part 1". This is from crunchydata.com. They're talking about using constraints. These are, say, check constraints or not null constraints or unique constraints as they apply to PostGIS and the data that it stores. So if you want to learn more about that, check out this post.

The next article is "MongoDB Logical Decoding Plugin - First Community Release on GitHub". This is from highgo.ca. So if you're interested in using logical decoding with MongoDB, check out this post.

The last post is an announcement of a series of releases for a new connection pooler called pgagroal. It looks to be backed by Red Hat, since the announcements come from Red Hat, Inc., and its focus is on high performance. They're trying to make it a better-performing connection pooler than the ones you may have heard of, like Pgpool or PgBouncer. So if you're looking for more performance, you may want to check out this new connection pooler for Postgres.
