Squeeze Your System, One Million Connections, Indexing LIKE, pgvector HNSW | Scaling Postgres 278
In this episode of Scaling Postgres, we discuss how to squeeze the most out of your database, achieving one million connections to Postgres, how to use indexes with LIKE and pgvector HNSW index performance.
Content Discussed
- Squeeze the hell out of the system you have
- Hacker News - Squeeze your system discussion
- Supavisor: Scaling Postgres to 1 Million Connections
- Indexing “LIKE” in PostgreSQL and Oracle
- An early look at HNSW performance with pgvector
- Sharding
- Securing your PostgreSQL DB with Roles & Privileges
- Fun with PostgreSQL Puzzles: Recursive Functions with Animations
- How We Reduced Downtime by 100x: Supplementing pgBackrest With EBS Snapshots
- How an AWS Aurora feature cut our DB costs by 90%
- Episode 6 highlights of Path To Citus Con podcast w/Chelsea Dole & Floor Drees
- Balancing Act: Achieving High Availability with HAProxy for PostgreSQL
- New Module Initialization vs Bootstrap in PostgreSQL
- pdot: Exploring Databases Visually, Part II
- Getting started with MobilityDB
- PlanetScale vs. Neon: the Continued Saga between MySQL and PostgreSQL
- Supabase is now HIPAA and SOC2 Type 2 compliant
YouTube Video
Podcast Audio
Transcript
All right. I hope you, your friends, family, and coworkers continue to do well. Well, I'm planning to make some changes to the Scaling Postgres episodes as they come out. So I'm thinking about potentially doing interviews with people in the community, or people who have written some of this content, to have a more in-depth discussion, as well as reducing the number of posts that I cover each week, maybe to no more than 10. I mean, I'll still post all the links for the content I found, but have a smaller set to focus on.
But I would really like to get your feedback. So if you have suggestions on how I can improve this show, please put it in the comments or reply to the email that gets sent out with this Scaling Postgres email list. I'd really like to get your feedback. But for this week the first piece of content is "Squeeze the hell out of the system you have". This is from blog.danslimmon.com. And he's talking about a situation that I've heard more than once where their database was struggling to keep up with the load.
And the CPU was riding between 60 and 80%, and at least once it spiked to 100%, causing a brief outage. So not good. And they'd been kicking the can down the road, just increasing and increasing the database size, and they're at that point where they ask 'What do we do next?'. So they considered looking into sharding and doing microservices, so splitting up the monolith into multiple services. But as the title of the post indicates, they decided to squeeze the hell out of their system. Because the reality is that these two options increase complexity by a lot.
And he advocates here, quote "Always squeeze first". So they say they put two engineers focused on performance enhancements for about three months. They said quote "There was no silver bullet". But it looks like they just went a piece at a time, adjusting queries or statements to tune them as well as tuning a lot of Postgres settings. And two more engineers redirected queries to a replica DB, so the primary wasn't burdened with all the read traffic. And this reduced the maximum CPU usage from 90% to 30%. So I love the message that this sends. Basically, a lot of times increasing the size of the database server will give you less than a 100% increase in performance.
Maybe with a configuration setting change, maybe you're going to double or triple some performance. But changing how you use it, what indexes are there, how you're querying it, how you're using it in terms of the schema, what inserts and updates you're doing, and how much you've optimized it can make a 10,000x difference. Because multiple times I've found indexes that improved a query's performance 10,000-fold. And talking about this specifically, I've had client engagements where the CPU was hitting 80-90%. And just by going through the top queries of the database and optimizing them, we got it down to maybe an average of 25%.
So usually there's always a better way to optimize your system. And I don't normally look at Hacker News, but this was posted on Hacker News and I found two comments particularly interesting. One person reports essentially the same thing, where they were at 2,000 requests per second. And apparently, I guess most of the developers and the PM thought 'Okay, that's all we can do'. And he started working on it. He got it to over 3,000 RPS, then 4,000 RPS, then 10,000 RPS, then 40,000 RPS, and then, after changing a whole bunch of stuff, he says he got it to 2 million and, a month later, 40 million requests per second.
So this is anecdotal, but it's definitely possible to drastically increase your performance by changing how you're using the database, and not just getting bigger hardware or partitioning or sharding or things of that nature. The other comment I found interesting is quote "Loads of over-engineering decisions would be avoided if devs understood how to read EXPLAIN/ANALYZE...". Now I don't necessarily know about that, but I do know learning how to read EXPLAIN ANALYZE helps you get these thousandfold improvements in performance. But definitely an interesting blog post I suggest you check out.
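If you haven't spent much time with EXPLAIN ANALYZE, here's a minimal sketch of that before-and-after workflow. The orders table, the query, and the timings in the comments are all made up, purely to show the shape of what you look for in the plan output.

```sql
-- Hypothetical query on an unindexed column: the planner scans the whole table
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;
-- Seq Scan on orders ... (actual time=0.02..1500.00 rows=12 loops=1)

-- Add an index on the filtered column (CONCURRENTLY avoids blocking writes)
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);

-- The same query can now use an index scan and finish in a fraction of a millisecond
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;
-- Index Scan using idx_orders_customer_id on orders ... (actual time=0.03..0.08 rows=12 loops=1)
```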
Next piece of content- "Supavisor: Scaling Postgres to 1 Million Connections". This is from supabase.com. And this is another connection pooler like PgBouncer and the also new PgCat. But this as they say here is a quote "... scalable cloud-native Postgres connection pooler. It's been developed with multi-tenancy in mind, handling millions of connections without significant overhead or latency". So this is a test where they basically put 1 million connections through Supavisor and connected it to 400 backend database connections.
So at first, they started with 250,000 connections, and the database performance they could get was up to 20,000 queries per second. And this shows the connections increasing, and it scales in pretty much a linear fashion. Then they increased it to 500,000 connections, again going through the same 400 connections to the database. And again at 20,000 queries per second, it just increased the number of connections. And this also seemed to scale in a linear fashion. And they also showed the load being spread out across the different cores, and as you can see, all of them are being utilized. And of course, one of the main disadvantages of PgBouncer is that it's single-threaded.
So if you wanted to have a dedicated instance for PgBouncer with this many CPU cores, basically you have to have this many PgBouncer services running. It takes more engineering to get it running across multiple cores, as they show here. And then they up it to a million connections, and how they do that is they just add another Supavisor node and send 500,000 connections through the Supavisor node that's maintaining the database connections to the Postgres server. So this node handles 500,000 connections of its own but also relays another 500,000. That just happens to be how it's architected.
And they say the setup successfully handled over a million simultaneous connections to the Supavisor instances, which then had 400 connections to the database, and it achieved 20,000 queries per second, with each connection executing the select query once every 50 seconds. And what they said here is that "In a multi-tenant setup (or when using Read-Replicas), the load is more evenly spaced..." because each Supavisor node connects to a given database, but other Supavisor nodes can relay connections to it if they receive the connection from the client. So that's an interesting design perspective that I think is different from PgCat.
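By the way, the numbers line up: one million connections each running one query every 50 seconds works out to 1,000,000 / 50 = 20,000 queries per second, all funneled through just 400 actual backend connections to Postgres.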
Now there was an impact on query duration comparing Supavisor vs. PgBouncer. And they show here that for their tests, queries with Supavisor had an additional two millisecond delay. So that's something to take into account. This has a lot of advantages but this was one disadvantage that they did mention. But I love seeing these new connection poolers. Particularly ones that allow you to use all the cores of the system. So if you're interested in that, check out this blog post.
Next piece of content- "INDEXING 'LIKE' IN POSTGRESQL AND ORACLE". This is from cybertec-postgresql.com. And normally if you're going to query a text column and you have a b-tree index on it, using LIKE for a query is not going to use that index. So, how do you do that? And they actually take it a step further in terms of showing what different coalitions can do. So actually, you use the C collation, which is not something I've done, for prefix searches because those are the only kinds you can do. You can't really search inside of an index the way a b-tree index is laid out.
It's essentially like a book index, where you can look up the first letter and then the second letter, the third letter, et cetera. But with a prefix search, it actually uses the index. However, the C collation has a disadvantage when you try to order the actual data that they have in here: it orders it in unexpected ways. But then they took a look at ICU collations, and those actually ordered things better, but they're not going to use an index when you do a LIKE search. So to actually use an index for a prefix LIKE search, you need to add the text_pattern_ops (or varchar_pattern_ops) operator class so that an index can be used. Because what this does is it actually uses character-wise comparison operators, so it matches character by character to find the correct entry in the index.
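As a rough sketch of what that looks like in practice (the table and column here are just hypothetical, not from the post):

```sql
-- A plain b-tree index in a linguistic collation generally can't serve LIKE searches
CREATE TABLE people (name text);

-- With text_pattern_ops, the index uses character-wise comparison operators,
-- so anchored prefix patterns can use it
CREATE INDEX idx_people_name_pattern ON people (name text_pattern_ops);

-- An anchored prefix search like this can use the index...
EXPLAIN SELECT * FROM people WHERE name LIKE 'Smi%';

-- ...but a pattern with a leading wildcard still cannot
EXPLAIN SELECT * FROM people WHERE name LIKE '%mith';
```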
Now they compare it to Oracle. And even though Oracle has some advantages, it has some big disadvantages that they list down here, and I'll let you read the post to understand. What he's basically saying is that with the Oracle options, you either get a "...correct result for ORDER_BY and a wrong result for LIKE..." using one collation, or you get "...a wrong result for ORDER_BY and a correct result for LIKE..." using a different collation. And he wasn't able to find a way in Oracle to get index support for a query that gives you both, like you can in Postgres. But if you want to learn more, check out this blog post.
Next piece of content- "AN EARLY LOOK AT HNSW PERFORMANCE WITH PGVECTOR". This is from jkatz05.com. They're talking about pgvector, which is an extension that allows you to store vector data in Postgres. It is used for a lot of artificial intelligence or machine learning systems and storing this vector data allows you to basically do nearest-neighbor searches to find comparisons between different things or find associations between things. The current version of the pgvector includes the ability to index using ivfflat. However, in the 0.5 release that's coming up, they've added support for HNSW as well.
HNSW stands for Hierarchical Navigable Small Worlds. And the benefit of this is that, in exchange for a longer build time and potentially larger indexes, you still get high performance, maybe not as high as ivfflat, but much more accurate recall, which he says quote "where 'recall' is the percentage of relevant results returned". So if you want more relevant results, you want a high recall. Now, he goes into a lot of depth about the test that he did. And he emphasized that this is one point in time, from one commit of pgvector and one version of all the different software used to put this together.
So take these results with a grain of salt, but they give an indication of how this new index could be beneficial. So he is using an approximate nearest neighbor search benchmark for Python. And he's run that, and it shows results similar to this. So the y-axis is queries per second, so how fast it can do the query, and recall is along the bottom, so how relevant the search results were. And you can see pgvector HNSW, the dark blue, seems very accurate, or has a super high recall. It may not be as fast as pgvector's existing ivfflat index, but it's a little hard to discern which point matches up with which point for each index.
I'm assuming these are different steps of the benchmark, but I'm not quite sure what each point means. But HNSW gives you very relevant results in the search, and it's still pretty quick compared to the previous version in red here. And he used different data sets in this test. Further results show basically the same thing: greater speed for ivfflat, but recall that is not as high. So if you're using pgvector for doing AI-type activities, maybe you want to check out the new version of pgvector coming out to see if you want to try this new HNSW implementation.
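If you want to experiment once 0.5 is released, index creation looks roughly like this. This is my own sketch based on the pgvector README; the table, the vector dimension, and the parameter values are placeholders, and the HNSW build parameters shown are just the documented defaults.

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));

-- The existing ivfflat index type (the lists value is a placeholder)
CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100);

-- The new HNSW index type in pgvector 0.5: slower to build, better recall
CREATE INDEX ON items USING hnsw (embedding vector_l2_ops) WITH (m = 16, ef_construction = 64);

-- Nearest-neighbor search by L2 distance, which either index can serve
SELECT id FROM items ORDER BY embedding <-> '[1, 2, 3]' LIMIT 10;
```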
Next piece of content. There was another episode of Postgres FM last week. This one was on "Sharding". And they discuss quote "what it means, why and when it's needed, and the available options right now". So you can click to listen to the episode here or watch the YouTube video down here.
Next piece of content- "Securing your PostgreSQL DB with Roles and Privileges". This is from rlopzc.com.
(And there's a rainstorm in the background. Sorry if you're hearing that, but I can't really stop the weather.)
So this post starts at a very basic level and builds up to explain, to someone who's new to using roles in Postgres, how to set up a role-based system. Because previously they were using the postgres user, changing the password, and that's what everyone had access to. So essentially, no role infrastructure was set up. But here they explain what roles are, what privileges are, and how they decided to architect things to reach their SOC2 compliance. So if you want to look at how they set up their system to achieve their SOC2 compliance for permissions in Postgres, definitely check out this blog post.
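As a very rough sketch of the kind of role setup a post like this builds toward (the role, database, and schema names here are made up, not theirs):

```sql
-- A group role that can only read application data
CREATE ROLE app_readonly NOLOGIN;
GRANT CONNECT ON DATABASE app_db TO app_readonly;
GRANT USAGE ON SCHEMA public TO app_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_readonly;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO app_readonly;

-- A login role for an individual, inheriting the read-only grants
CREATE ROLE alice LOGIN PASSWORD 'change-me' IN ROLE app_readonly;

-- A separate role for the application itself, with write access but no DDL
CREATE ROLE app_writer NOLOGIN;
GRANT CONNECT ON DATABASE app_db TO app_writer;
GRANT USAGE ON SCHEMA public TO app_writer;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_writer;
```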
Next piece of content- "Fun with PostgreSQL Puzzles: Recursive Functions with Animations". This is from crunchydata.com. So this is actually day 19 of the 2022 Admin of Code Challenge. If you want to learn how to do that, check out this blog post.
Next piece of content- "How We Reduced Downtime by 100x: Supplementing pgBackrest With EBS Snapshots". This is front timescale.com. And here they're talking about using EBS snapshots along with pgBackrest to do some of their backups and potentially restores. So they were talking about 100x so what they were doing is that in the event of an upgrade, they will do a full pgBackrest backup after that upgrade so that moving forward, they could use pgBackrest to do the restore.
But then they had a lot of downtime because apparently, they didn't bring the system up until that full backup was done. So what they started doing instead is taking an EBS snapshot. And that happens very quickly. So they did the EBS snapshot, brought the system up, and then they started the backup that would be completed later. So they're still using pgBackrest and they said they don't plan to get rid of it but they're just adding snapshots in certain areas to help give them additional benefits.
And I've actually worked with a client who has chosen to use EBS snapshots as their main backup solution. Now, you can't do point-in-time recovery that way, or at least it would require some additional work. But they've been happy using the snapshot solution. So if you want to learn more about how and why they've done this, you can check out this blog post.
Next piece of content- "How an AWS Aurora feature cut our DB costs by 90%". This is from graphite.dev. And they're talking about how they have a very high Aurora bill, not necessarily because of the amount of data that they store but because of their super high IO. So they have a lot of IO happening in their application. But Aurora made some changes that actually adjusted how IO-intensive applications were costing and they were offering up to 40% cost savings. But for this particular application, their cost savings were 90%. So if you use Aurora and it's mostly IO-heavy usage, maybe you want to check this out as well.
Next piece of content- "Episode 6 highlights of Path To Citus Con podcast w/Chelsea Dole & Floor Drees''. This is from citusdata.com. So this is a blog post for the episode that they did talking about quote "You're probably already using Postgres: What you need to know”. So this is mostly for developers. So if you're interested in that, you can check out their podcast.
And the last piece of content- "Balancing Act: Achieving High Availability with HAProxy for PostgreSQL". This is from highgo.ca. And this blog post does what the title explains: it shows you how to set up HAProxy with a primary and two replicas. So if you're interested in that, you can check out this blog post.