Inserts, Nearest Neighbor, Freezing, Comment | Scaling Postgres 34
Join Over 1,000 Engineers & Get New Episodes Weekly!
In this episode of Scaling Postgres, we review articles covering all about inserts, how to handle nearest neighbor, freezing confirmation and Comment.
Content Discussed
- Are your foreign keys indexed?
- PostgreSQL Tips: Documenting the Database
- PostgreSQL 11 RC1 Released!
- PostgreSQL: The Versatile INSERT
- Won’t You Be My Neighbor? Quickly Finding Who is Nearby
- Managing Freezing in PostgreSQL
- Freezing Your Tuples Off, Part 1
- Freezing Your Tuples Off, Part 2
- PostgreSQL Documentation: Preventing Transaction ID Wraparound Failures
- PG Phriday: Studying Stored Procedures in Postgres 11
- Dynamic Monitoring of PostgreSQL Instances Using pg_top
- PostgreSQL Extensions for an Enterprise-Grade System
- Postgres Open SV 2018 YouTube Channel
YouTube Video
Podcast Audio
Transcript
[00:00:00] In this episode of Scaling Postgres, we talk all about inserts nearest neighborfreezing and comments. I'm Kristen Jameson, and this is scaling postgres episode 34.
[00:00:20] Alright, our first piece of content is actually a notice that postgres QL eleven releasecandidate one is released. So not too many changes since beta four, mostly bug fixes. Sobasically our release date is approaching quickly. And again, as mentioned in last week'sepisode, october 18 appears to be the release date for PostgreSQL eleven.
[00:00:43] The next piece of content is PostgreSQL the versatile insert. And this is from PG IO.And basically they cover different ways you can do insert other than your typical doing single rowinserts. So they talk about copying in bulk and in this case they're talking about where you'recreating a table from an existing table and maybe appending some more data to it thatessentially inserts a lot of data into this new table. And then they also mentioned you can do aninsert select where you're inserting into a particular table and selecting from one or more othertables in your database. The next area they talk about is Up Certs. And basically this is using theon conflict clause. So if you're doing an insert you can actually say on a conflict of a particularvalue, you can tell it to do nothing, which is not really an upset, it's not updating anything, but ona conflict you can do nothing, or on a conflict you can do an update which is an upset. And theygo into more complex use cases of where Up Certs can be used. The next area they talk aboutis returning generated values. So you can do an insert into a table and actually using thereturning clause, return a given piece of data like a key. So I see this frequently used if you havea sequence, maybe you want to return the primary key or that that sequence generated for yourtable when you do an insert. And then a little more sophisticated use case is moving rows withCTE clauses. So you can create a common table expression that does a delete from one table,returning the values from that row and then inserting the data into the new table. So this blogpost goes over some of the different ways you could insert data other than your typical insertinga single row of data.
[00:02:32] The next post is Won't you be my neighbor? Quickly finding who is nearby. And this isfrom the Crunchydata.com blog and basically this is referring to the kneeest neighborcalculation.
[00:02:45] So first thing they did is they built a visits table where someone is visiting a coffeeshop, they have a visitor UUID a timestamp that they visited at and a geocode column to recordthe longitude and latitude point. And you can calculate distances between these two points usingthis distanced operator. And the smaller the number, the closer the two points are to each other.So if you wanted to find the three closest individuals, you would select from the visits table andput a particular time range and order it from a particular location to what's geocoded in thedatabase and limit it to three so it'll pull the three nearest ones in this time frame that are nearbyin this coffee shop. Now, they mentioned they do have, I believe, a Beatri index in place thatthey discussed in a previous article, but even with that previous index that it's in place, noindexes were used according to the explain plan for doing this particular query. So basically forfinding nearest neighbors using an index, you should use a just index. So they applied a Gistindex over the geocode column of their table and when they run it not using the time restrictionbut only the point, it does use the index and returns relatively quickly in less than onemillisecond. However, when they added that time range back in, it still used that just index, butwe don't have anything indexed on the visited that to be able to use within this query. But whatthey mentioned is that there is a Btree Gist extension that you can add to your database. So thatallows you to create a multicolumn index on visitedat and geocode. And now with this query, itexecutes very far under one millisecond. So if you have this particular type of problem, maybeyou want to check out the Btree Gist extension for use in your application or database.
[00:04:45] The next article is managing freezing in PostgreSQL. So this blog post talks abouthow PostgreSQL basically manages its MultiVersion concurrency control. And basically as thedata --base is actively working, it's constantly moving a transactional ID with each statement thatgets executed. And that transaction ID is able to see 2 billion active transactions back and 2billion transactions ahead of it. Now, before a transaction hits being 2 billion old, it needs to befrozen. And frozen means there's no active transaction that needs to see it anymore. Andbasically Vacuum or auto vacuum handles that process of freezing. So this goes over thisconcept in detail and tells you about some different parameters you can adjust on how to alterthe freezing behavior. So it talks about vacuum freeze, min age, vacuum freeze, table age, andAuto vacuum Max freeze age, as well as some multitransaction configuration options as well. Soif you're not familiar with this, this is definitely a good piece of content to check out to help youreally configure your database well. But I found three other pieces of content that would behelpful for understanding this post as well. The first is freezing your tuples off. Part One and thisis from the database soup blog. Now this is from 2012, but it's still relevant. And actually I foundit easier to read this blog post first in order to help me understand the blog post that just cameout and then there's also part two, Freezing Your Tuples Off. Part two again from the DatabaseSoup blog and he covers some additional configuration options. And then of course, if you wantto go to the source, there's the PostgreSQL documentation and specifically the area calledPreventing Transaction ID wraparound Failures. So it goes over vacuum and auto vacuum andthe process of freezing and how that works. So these are a set of content that can help youbetter understand how freezing works in PostgreSQL and how you might be able to alter itsconfiguration to suit your use case better.
[00:06:57] The next post is are your foreign keys indexed? And this is from theCybertechgresql.com blog. So basically typically you're going to want to index your foreign keysbecause you're typically going to have queries that are going to be searching by those foreignkeys. But another significant reason to have indexes on your foreign keys is when you deleterows or update key columns in your target table, PostgreSQL has to ensure that that foreign keyconstraint is still satisfied that you're not going to end up with orphaned records. So for example,you don't want to delete an order when you have a bunch of order items that are still connectedto it. So definitely a blog post to check out to help you better index your foreign keys in yourdatabase.
[00:07:40] The next post is PostgreSQL tips documenting the database. So this post is basicallytalking about a command called Comment that I actually really wasn't that familiar with. So fordifferent database objects in your database, it allows you to set a comment by doing commenton the database object, giving it its name, and then say Is, and then put in a comment ordescription. And then when you go to describe a particular object, it will show information in thedescription column here. Now, I'm not sure I'm necessarily going to be using in some of mydatabase systems because some of this information is stored elsewhere. But probably once youhave very large number of users using the database and you wanted to use this as the source oftruth for describing something, definitely a feature to check out and a blog post that goes throughhow to use it.
[00:08:31] The next post is PG. Friday studying stored procedures in postgres eleven. And thisis from the second quadrant.com blog. And basically this goes over of course the storedprocedures in PostgreSQL, but this is kind of a light hearted take on things. So someone postedon Twitter this way, where you could waste transaction IDs using procedures basically loop overand just keep using up transaction IDs, which could cause a world of hurt if you read themanaging freezing in PostgreSQL blog post, and then it goes into, oh, well, you know, you coulddo this more efficiently by just assigning this to a variable. And now you can waste transactionIDs four times faster. And then also each subheading is actually from a song lyric I felt so goodlike anything was possible. I hit cruise control and rubbed my eyes. So bonus points if you knowthe artist and the song that this is referring to. Now he goes in and tries different things likecalling PG sleep when running this procedure and looking at date ti --mes and the effect when running it through the procedure. And all of this was kind of to beable to explore how procedures are going to be a little bit different than functions. So if you'relooking forward to the procedures that are coming in PostgreSQL Eleven, definitely a blog postto check out.
[00:09:49] The next post is dynamic monitoring of PostgreSQL instances using PG top. And thisis from the several nines.com blog. So Top is a way to look at the top processes that are goingon in the system. Look at CPU usage, memory usage. While Pgtop is dedicated for PostgreSQLand the output looks like this, you get a load average the number of PostgreSQL processes thatare running sleeping, CPU states, memory usage, database activity, IO disk, and the differentprocesses of PostgreSQL. So if you're not familiar with Pgtop and want to potentially use it,definitely a blog post to check out the next post is PostgreSQL Extensions for an EnterpriseGrade System. Now, what I liked about this is that it basically lists some very popularPostgreSQL extensions and I'm using a number of these. But for your use case, I would say thisis a good post to check out to say what else is available, what else are people using, or whatextensions are they adding to their database system for what use cases. So I would highlysuggest just checking out this blog post to look at some of the extensions that they aresuggesting.
[00:11:02] And the last piece of content is actually a YouTube channel. It's a follow up tosomething that was mentioned last week where they had actually posted five sessions. Well,they now have posted another five sessions this week. Now again. This is The Postgres. OpenSB 2018 YouTube Channel. So definitely a lot of different content to check out.
[00:11:23] That does it. For this episode of Scaling Postgres, you can get links to all the contentmentioned in the show notes. Be sure to head over to Scalingpostgres.com where you can signup to receive weekly notifications of each episode. Or you could subscribe via YouTube oritunes. Thanks Sam. --