Columnar Storage, Docker Containers, Notify & Listen, Tuning RHEL | Scaling Postgres 156
In this episode of Scaling Postgres, we discuss Citus 10 columnar storage, running Postgres in Docker containers, using NOTIFY & LISTEN for real-time apps, and tuning RHEL.
Content Discussed
- Citus 10 brings columnar compression to Postgres
- Running Postgres in Docker – why and how?
- Postgres Notify for Real Time Dashboards
- Tuning Red Hat Enterprise Linux Family for PostgreSQL
- Best practices for Amazon RDS for PostgreSQL major upgrades and replicas
- Speeding up pgbench using COPY FREEZE
- PostgreSQL GitHub Actions – Continuous Integration
- Guillaume Lelarge
Transcript
All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "Citus 10 brings columnar compression to Postgres". This is from citusdata.com. They're talking about the newly released open-source Citus 10 and its support of columnar storage. Typically, Postgres uses the heap access method, but Citus has implemented a columnar access method using the Postgres 12 table access method API. To get started, the prerequisites are having Postgres version 12 or later and the open-source Citus version 10 or later, although I'm sure you could use the non-open-source version as well. They're using the psql client. So the first thing you need to do is add Citus to the shared libraries of Postgres, make sure it's enabled, and restart. Then you need to create the extension in any database where you want to use it.
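As a rough sketch of those setup steps, assuming a local install where you can restart the server yourself (editing shared_preload_libraries in postgresql.conf by hand works just as well):

```sql
-- Add Citus to the libraries Postgres preloads, then restart the server:
ALTER SYSTEM SET shared_preload_libraries = 'citus';

-- After the restart, create the extension in each database that will use it:
CREATE EXTENSION citus;

-- Confirm it's installed:
SELECT extname, extversion FROM pg_extension WHERE extname = 'citus';
```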
For their example, they created a simple row table using the standard heap storage and a simple columnar table using the columnar storage. Then they generated a series of 100,000 records in each and computed an average over each of them. So no indexes have been created so far. They point out that the table access method is listed as heap for the row table and as columnar for the columnar table. Now, they have a video that runs through this whole blog post, and I encourage you to check it out as well. You can just click here and it brings you to a YouTube video that essentially reiterates the blog post. In terms of the benefits of columnar storage, number one, it reduces storage requirements because the data is compressed. It takes much less space, and I think we'll see that in an example here.
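Here's a minimal sketch of that kind of comparison; the table and column names are illustrative rather than the blog post's exact ones:

```sql
-- A regular heap (row) table and a columnar table:
CREATE TABLE simple_row (i INT8);
CREATE TABLE simple_columnar (i INT8) USING columnar;

-- Load 100,000 rows into each and run the same aggregate:
INSERT INTO simple_row      SELECT generate_series(1, 100000);
INSERT INTO simple_columnar SELECT generate_series(1, 100000);
SELECT avg(i) FROM simple_row;
SELECT avg(i) FROM simple_columnar;

-- Confirm which access method each table uses:
SELECT c.relname, am.amname
FROM pg_class c
JOIN pg_am am ON am.oid = c.relam
WHERE c.relname IN ('simple_row', 'simple_columnar');
```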
In addition, because it's compressed, there's less I/O needed to actually scan the table, so not as many blocks need to be read to get at what's in the table. The next benefit is projection pushdown, meaning that queries can skip over the columns they don't need. So if you have a table of, say, 20 columns and you only need to return or look through two of them, it will only focus on those columns. That's where columnar storage has huge benefits: you access only the columns that are of interest. The fourth benefit is chunk group filtering. It's a way to skip over groups of data by knowing where they are, and they say it can skip past a lot of the data quickly without even decompressing it. So I gather it stores references to the ranges of values that exist within each chunk so that it knows which chunks of data to access or skip. They then did an example using row-based storage and columnar storage to look at compression and the ability to skip over unneeded data.
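To make those last two benefits concrete, here's a hypothetical sketch of the sort of wide-table test they describe (this is not their exact schema), where a query touches only a few of the columns and filters on a range:

```sql
-- A wide columnar table where queries usually touch only a few columns:
CREATE TABLE perf_columnar (
    id          INT8,
    ts          TIMESTAMPTZ,
    customer_id INT8,
    vendor_id   INT8,
    amount      INT8,
    c6 INT8, c7 INT8, c8 INT8, c9 INT8, c10 INT8
) USING columnar;

-- Projection pushdown: only the referenced columns are read from disk.
-- Chunk group filtering: the WHERE clause lets whole chunks be skipped
-- based on the min/max values stored for each chunk group.
EXPLAIN (ANALYZE, BUFFERS)
SELECT customer_id, sum(amount)
FROM perf_columnar
WHERE ts >= now() - interval '7 days'
GROUP BY customer_id;
```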
Now, I'm not sure how much of a use case this will be, because eventually you would want to query against it, and would you really want to store wide rows in a column-based table? I'm not sure how much of a benefit that is. I can definitely see the compression, but for a lot of the analytical work I've looked at, table width has been kept minimal for exactly that reason, so I'm not sure how advantageous this is in the real world. But looking at all of the integers they stored, you can see that the compression ratio is 8, meaning the columnar storage takes up 8x less room than the row-based storage. That's how good the compression is. Now, when actually querying the data, they're only using three out of the ten columns, so again, you're going to get a big difference in performance out of this. I'm still not sure how applicable this is to the real world, because why would you put all that data into a table you're hardly querying just because it has to be in columnar storage?
So, again, I'm not sure about the use case for this, but you can see with this example that they're getting a 25x speed-up because, again, you're only accessing the columns that you need. They also show the number of buffers read: it's over 5 million for the row storage, whereas it's only 27,000 for the columnar storage. So that's a big reason for this speed-up. Again, remember, there are no indexes involved with this. Now let's get into the limitations. Number one, as of right now, there's no UPDATE or DELETE support, so it's best for append-only tables or append-only solutions. They said that even if they eventually implement UPDATE and DELETE, the performance is not going to be great because it's column-based storage. Right now there's also no index support, logical replication, or logical decoding support. There are a few more limitations they don't mention here, but they're in the README.
Now, to work around not being able to UPDATE and DELETE, they said there is one solution you can use: hybrid columnar and row table storage using range partitioning. In the example they used, they created an events table and partitioned it by a date range. For the older partitions, you can use column-based storage; for the newer partitions, you can use row-based storage so that you can do UPDATEs and DELETEs if you want. Then, essentially, you can rewrite the row-based storage into column-based storage once you've moved on to a new partition. So they're saying this kind of gives you the best of both worlds, and they're using a special function that they developed called alter_table_set_access_method, included in the Citus extension, to be able to do this.
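As a rough sketch of that hybrid approach, with table and partition names of my own rather than theirs:

```sql
-- A time-partitioned events table:
CREATE TABLE events (
    event_id   BIGINT,
    event_time TIMESTAMPTZ NOT NULL,
    payload    JSONB
) PARTITION BY RANGE (event_time);

-- The current partition stays on regular row (heap) storage so it can
-- still take UPDATEs and DELETEs:
CREATE TABLE events_2021_03 PARTITION OF events
    FOR VALUES FROM ('2021-03-01') TO ('2021-04-01');

-- An older partition that is no longer being written to:
CREATE TABLE events_2021_02 PARTITION OF events
    FOR VALUES FROM ('2021-02-01') TO ('2021-03-01');

-- Once a partition is closed out, rewrite it into columnar storage using
-- the function shipped with the Citus extension:
SELECT alter_table_set_access_method('events_2021_02', 'columnar');
```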
They give some query examples and the size improvements you could get by doing this. They also mention that this is based upon a previous extension called cstore_fdw, but that one had a lot of disadvantages compared to the current implementation. The old cstore_fdw extension did not have transactional integrity: for example, it did not support ROLLBACK, it did not support write-ahead logging, and you couldn't do physical replication with it. With the Citus columnar storage, you can do physical streaming replication. The newer one also supports pg_upgrade, which is great news, probably because it's built on the table access method API in Postgres 12. So this is a great addition, and if you think your workload could benefit from column-based storage, maybe you want to check out this extension to see what it can do for you.
The next piece of content is "RUNNING POSTGRES IN DOCKER- WHY AND HOW?". This is from cybertec-postgresql.com. I think this post is based upon the other one that said resistance to containers is futile when they were saying that Postgres is essentially going to be running on containers a lot. This I think has a balanced view of it. They talk about how Docker and containers, in general, were basically built as immutable ways to transport code, but essentially with a database, it is mutable. You have data constantly changing in a database. So is it kind of a square peg in a round hole using containers with databases? Because containers are supposed to be stateless and databases are all about the state. Now there are ways to get around this with containers essentially setting up separate volumes so that the state is stored outside of the container. So one of the first questions the post asks is whether I should use Postgres with Docker or containers in general, and particularly for production workloads.
Their advice is that if you do, you should live fully on a container framework such as Kubernetes or OpenShift. Secondly, know that you're going to have to depend on third-party software projects not affiliated with the PostgreSQL Global Development Group. You may also have to maintain your own Docker images to include extensions you need, or other scripts to handle upgrading to major versions and things like that. So it's possible, but there will be work involved, and I imagine that over time this will continue to improve. But as of right now, it's still early days and you're probably going to have to do a bit more work for it.
This post takes a measured approach that is similar to my own thinking of 'Yes, you can do it, but you need to be aware of the ramifications of doing it'. One area where they do advocate for using Postgres with containers is testing. They say, quote, it's "A testers' dream". For example, he says he personally has every version of Postgres as a container he can just spin up to do work in, which is great. Then the post rounds out by discussing how to actually use Docker with Postgres and how he uses it. He gives the different commands you can use, takes a look inside the images and Docker containers he uses, and shows how to set up volumes and some different configurations. So overall, the post says that yes, you can use containers with Postgres, but quote, "...you had better be aware of the pitfalls".
The first point is basically that containers are meant for immutable state, so you need a way to handle that. Secondly, containers won't give you any automatic, magical high availability capabilities; for that, you have to look to container frameworks such as Kubernetes. Third, they say life will be relatively easy only when you go all in on some container management framework like Kubernetes, plus some operator software such as the Zalando or Crunchy Postgres operators. So basically, this is another post talking about using Postgres with containers and the issues you need to be aware of. I thought it was a very balanced post and I agree with all of their points, so I definitely encourage you to check it out.
The next piece of content is "Postgres Notify for Real Time Dashboards". This is from arctype.com and they're talking about using the Postgres features, NOTIFY and LISTEN. So the use case that they were designing was being able to develop a real-time dashboard to keep track of production of a manufacturing operation. So they had a backend where employees were inputting the work that they did that was inserted when a particular step was complete into Postgres that would fire a trigger that eventually notified a node.js and WebSockets front end to retrieve more data and present it. So they showed how they actually built this.
They give the base schema they developed for tracking the production changes, and then they created a function. All that function did was perform pg_notify with an event type of order_progress_event and some sort of payload. They went with a simple payload; they didn't want to actually transfer the data as part of it, although you could do that. They just wanted the clients to receive the notification and then retrieve the new data from the database. So basically, they were using it as an event notification system, and then you place this function in a trigger on a particular table, so that before a row gets inserted, it fires this procedure.
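Here's a minimal sketch of that pattern; the function, trigger, and table names here are illustrative, not the article's exact code:

```sql
-- Trigger function that just announces "something changed" on a channel:
CREATE OR REPLACE FUNCTION notify_order_progress()
RETURNS trigger AS $$
BEGIN
    -- Keep the payload small; clients re-query for the actual data.
    PERFORM pg_notify('order_progress_event', 'row_inserted');
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Fire it whenever a production step is recorded:
CREATE TRIGGER order_progress_notify
    BEFORE INSERT ON production_steps
    FOR EACH ROW
    EXECUTE FUNCTION notify_order_progress();
```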
Then they went on to the LISTEN side. As I mentioned, the client receives the notification but then runs its own query against the database for the data it needs, and they developed a view for that purpose. So whenever the client receives a notification that something has been updated, it just queries this view. This part is essentially JavaScript: at the point the client connects, it runs a query to LISTEN for this event type in Postgres, and when a notification occurs, it calls a callback named eventCallback, which queries the view and presents the data to the client. Then they have a little picture of how it presents that data. So this is a very simple, great example of using Postgres LISTEN and NOTIFY for real-time notifications. If you're interested in that, check out this blog post.
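The listening side can be exercised straight from psql as well, independent of the JavaScript client; a small sketch, with the view name being my own illustrative placeholder:

```sql
-- Subscribe to the channel; psql reports received notifications
-- between the commands you run:
LISTEN order_progress_event;

-- In another session, an insert into the trigger's table (or a manual
-- NOTIFY) will wake up the listener:
NOTIFY order_progress_event, 'row_inserted';

-- On notification, the dashboard client re-queries its reporting view:
SELECT * FROM order_progress_view;
```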
The next piece of content- "Tuning Red Hat Enterprise Linux Family for PostgreSQL". This is from enterprisedb.com. They're talking about configuring Linux or Red Hat Enterprise Linux for the Postgres use case. In this example, they're using the tuned package to do it, and that enables you to set different things like sysctl, some CPU metrics, as well as the VM. They show some of the configuration changes that they've made specifically for running on Postgres. They explain all the different parameters and how they've determined that those were the ones to set. Then they went into optimizing the file system in terms of doing things like setting no access time and then they closed out talking about Huge Pages and how to determine the best configuration for Huge Pages, how to enable it, and how to make sure that Postgres is using it. So if you're interested in that, you can check out this blog post.
The next piece of content-" Best practices for Amazon RDS for PostgreSQL major upgrades and replicas". This is from aws.amazon.com. They're talking about the things to take into account when you're upgrading your Postgres RDS databases. Basically, you can upgrade essentially in parallel, although it does the primaries in serial first and then does the replicas in parallel. But you can also choose to only upgrade a primary and then create a new replica. With RDS, there are some trade-offs for that. If you choose to do a concurrent upgrade, the upgrade will take longer. So we have longer downtime using this method. In this method, the downtime is shorter to do the upgrade, but depending on the size of your database to create a new replica, you could be running without a replica for quite a while as it takes time to actually rebuild it.
So there's a trade-off you have to consider: the full downtime of the application using the database versus continuing operations without an existing replica. They go through the different processes for how you can do this, as well as give advice on testing scenarios you can use to make sure the upgrade works before you actually do the implementation. There's also some good advice here that you could apply to Postgres upgrades in general. Now, it works a little differently on RDS because, as far as I know, they don't use standard PostgreSQL physical streaming replication for their replicas.
It's more of a disk-based replication, to my knowledge. So you can do independent upgrades like this with a replica, but if you're upgrading Postgres outside of RDS and you're using physical streaming replication, there is a different method you can use to upgrade the replicas. When I've done upgrades for clients, we use pg_upgrade to upgrade the masters and then a specific technique that's included in the Postgres documentation to upgrade the streaming replicas as well. But definitely check out this blog post if you want to learn more.
The next piece of content is "Speeding up pgbench using COPY FREEZE". This is from pgsqlpgpool.blogspot.com. They're talking about when you're setting up pgbench and you are loading up data for the first time, the area that takes the most time to load the data at the start of the pgbench run is actually the vacuum operation. They talk about why vacuum is so slow. Well, vacuum takes a while because it does a number of different things, such as update the hint bits on each tuple, update the visibility map, update the freeze map, and basically do things to make sure that the table is optimized for doing queries as fast as possible.
But there is an alternative, and that is COPY FREEZE. Currently it only does the first of those things that vacuum does, not the others, but according to this post, COPY FREEZE will be enhanced in Postgres 14 to do all of them. So the post talks about updating pgbench to use COPY FREEZE instead of relying on vacuum, and what they noticed is a dramatic improvement in the load time. They expect this pgbench enhancement will actually land in Postgres 15. So if you use pgbench, there are definitely some improvements to look forward to.
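The key requirement for COPY FREEZE is that the table was created or truncated in the same transaction as the load. A minimal sketch, with an illustrative file path:

```sql
BEGIN;
-- FREEZE only works if the table is created (or truncated) in this same
-- transaction, so no other session can ever see the unfrozen rows.
TRUNCATE pgbench_accounts;
COPY pgbench_accounts FROM '/tmp/accounts.csv' WITH (FORMAT csv, FREEZE true);
COMMIT;
```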
The next piece of content is "POSTGRESQL GITHUB ACTIONS- CONTINUOUS INTEGRATION". This is from cybertec-postgresql.com, and they're talking about using GitHub Actions in relation to Postgres. So if you have an interest in learning how to use GitHub Actions, particularly with working with Postgres, definitely check out this blog post.
The last piece of content, the PostgreSQL person of the week is Guillaume Lelarge. So if you're interested in learning more about Guillaume and his contributions to Postgres, definitely check out this blog post.