This was a fun read!
It reminds me of a very similar post I put out in 2018 https://by.ben.church/Get-notified-of-user-signups-and-plan-...
But I think Peter did a much better job going through the mechanics and providing a more modernized example.
For those who are curious, there are pitfalls (though all can be worked around):
1. If your DB goes down, you may lose messages.
2. If you have multiple backends behind a load balancer, each one receives the notification, so you may process the same event more than once.
3. There is a limit to the payload size you can send through these triggers.
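The usual workaround for pitfall 3 is to notify with just a small identifier and have the listener re-fetch the full row. Here's a minimal sketch of that idea in Python (the thread's code is Elixir/SQL, but the logic is the same); the function names and the `fetch_row` callback are illustrative, not from any of the libraries mentioned:

```python
import json

# Postgres rejects notify payloads of 8000 bytes or more by default.
PG_NOTIFY_MAX_PAYLOAD = 7999

def build_payload(table: str, row_id: int) -> str:
    """Encode just enough for the listener to look the row up itself."""
    payload = json.dumps({"table": table, "id": row_id})
    if len(payload.encode("utf-8")) > PG_NOTIFY_MAX_PAYLOAD:
        raise ValueError("payload too large for pg_notify")
    return payload

def handle_notification(payload: str, fetch_row) -> dict:
    """Listener side: decode the identifier and re-fetch the authoritative row."""
    msg = json.loads(payload)
    return fetch_row(msg["table"], msg["id"])
```

Since the notification only carries a pointer, the row you fetch is always the current state, which also sidesteps stale-payload races.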
But for those who want to try this approach, I have a library here that wraps everything Peter laid out: https://github.com/bnchrch/postgrex_pubsub
Also, if you want something even better, I recommend WalEx https://github.com/cpursley/walex which is based on the WAL and doesn't have the same limitations.
> Postgres offers quick and simple Notifications that can help you react to changes in your database without much overhead. They are particularly interesting if you can’t use Phoenix’s PubSub, for example, if another non-Elixir application also makes changes to your database.
> PERFORM pg_notify('appointments_canceled_changed', payload);
> Be aware that this listener can easily become a bottleneck if you have lots of messages. If you can’t handle the messages quickly enough, the message queue will fill up and crash your application. If you’re worried about this case, you could create one listener per channel or use a PartitionSupervisor to start more handlers and spread out the work.
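The "spread out the work" advice in that last quoted paragraph can be sketched language-agnostically: hash each channel to one of N worker queues so a slow handler on one channel can't back up the others. This Python sketch is only an illustration of the partitioning idea (the article's actual suggestion is Elixir's PartitionSupervisor); all names here are made up:

```python
import hashlib
import queue

NUM_WORKERS = 4
# One queue per worker; each worker would drain only its own queue.
queues = [queue.Queue() for _ in range(NUM_WORKERS)]

def dispatch(channel: str, payload: str) -> int:
    """Route a notification to a worker queue keyed by channel name.

    Hashing the channel keeps per-channel ordering while spreading
    distinct channels across workers. Returns the chosen worker index.
    """
    idx = int(hashlib.md5(channel.encode()).hexdigest(), 16) % NUM_WORKERS
    queues[idx].put((channel, payload))
    return idx
```

The trade-off: messages for one hot channel still land on a single worker, so this helps with many moderately busy channels, not one overwhelming one.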
Why not insert into an events table instead of using pg_notify? That way the events are recorded in the database itself and can be processed by any component. The processing state can be saved in the table, so even if a component dies it can resume (and can even fan out the actual processing to workers). Further, you have a record of all events, along with the flexibility of querying event information with SQL, and with partitioning you get a clean way to manage performance plus the ability to easily archive processed events.
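The events-table pattern described above can be sketched end to end. This uses Python's built-in sqlite3 so it runs standalone; with Postgres the schema and cursor logic would be the same idea, and the `processed_at` column is what makes a crashed consumer resumable. All names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        topic TEXT NOT NULL,
        payload TEXT NOT NULL,
        processed_at TEXT  -- NULL until a worker has handled the event
    );
""")

def emit(topic: str, payload: str) -> None:
    """Producers just insert; no listener needs to be up at emit time."""
    conn.execute("INSERT INTO events (topic, payload) VALUES (?, ?)",
                 (topic, payload))

def process_pending(handler) -> int:
    """Handle unprocessed events in order; safe to re-run after a crash."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM events "
        "WHERE processed_at IS NULL ORDER BY id"
    ).fetchall()
    for event_id, topic, payload in rows:
        handler(topic, payload)
        conn.execute("UPDATE events SET processed_at = datetime('now') "
                     "WHERE id = ?", (event_id,))
    return len(rows)

emit("appointments_canceled", '{"id": 42}')
seen = []
process_pending(lambda topic, payload: seen.append((topic, payload)))
```

Unlike NOTIFY, nothing is lost if no consumer is connected, and a second `process_pending` run picks up only what the first one missed.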
If you want to listen to database changes, check out Debezium. Instead of triggers, it takes advantage of the more recent CDC functionality that most SQL servers have implemented today. The difference is that triggers run on every transaction, while CDC reads from a redo log file. This makes it possible to transfer database changes with minimal performance impact.
Listening to your database — is there a way to create an audio stream so you can actually "listen to" reads (a low hum), writes (tinkling wind chimes), and failed transactions (BANG! CRASH! BOOM!)? That would be a useful ambient sound for a DBA's office, wouldn't it?
I thought this was going to be about using audio to literally listen to the database along the lines of "What different sorting algorithms sound like" https://www.youtube.com/watch?v=t8g-iYGHpEA
I actually have this exact problem right now with SQL Server on AWS RDS. Unless I want to pay for standard+ editions in my dev/stage/qa/etc. environments, I can't use the baked in CDC features. And because of the minimum instance sizes for Standard+ edition, it costs ~1700 bucks per month per database. This is fine for production, because I need features like High Availability, but paying a significant premium over web/express in those environments seems like lighting money on fire.
We're already tracking changes for the purposes of time travel queries and other auditing purposes using Temporal Tables (SQL:2011 feature). I'm thinking a cron job triggering a lambda every minute should be sufficient to read from the history tables and publish out change data events over a bus.
Anyone see any problems with this approach?
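For what it's worth, the cron-plus-lambda plan above boils down to watermark polling: each run reads history rows newer than the last persisted watermark, publishes them, then advances the watermark so a missed or crashed run resumes cleanly. A minimal Python sketch of that loop, with all names (and the in-memory stand-ins for the history table and bus) purely hypothetical:

```python
def poll_history(fetch_since, publish, load_watermark, save_watermark):
    """One polling run: publish history rows newer than the saved watermark."""
    last = load_watermark()
    rows = fetch_since(last)  # e.g. WHERE sys_start > @last ORDER BY sys_start
    for row in rows:
        publish(row)
        last = row["sys_start"]  # advance only past what was published
    save_watermark(last)
    return len(rows)

# In-memory stand-ins for the history table, the event bus, and the
# persisted watermark, just to exercise the loop.
state = {"watermark": 0}
history = [{"id": 1, "sys_start": 10}, {"id": 2, "sys_start": 20}]
published = []

n = poll_history(
    fetch_since=lambda w: [r for r in history if r["sys_start"] > w],
    publish=published.append,
    load_watermark=lambda: state["watermark"],
    save_watermark=lambda w: state.update(watermark=w),
)
```

One problem to watch for with this general approach: a strict `>` comparison can skip rows that share the watermark's exact timestamp if they commit after a run has already read it, so you may want `>=` plus deduplication, or a watermark on a monotonic column.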
Debezium is a cool project, though, fair warning, it does come with a fair amount of ongoing maintenance. And you should prolly already be using Kafka and be comfortable with the JVM generally.
FWIW, we (estuary.dev) make an open-source and fully managed tool for building CDC pipelines out of Postgres, reading from the WAL.
Pretty cool! Not the same thing, but for Flyway + Java I have a simple GitHub Action that, on push to develop, checks the diff for any *.sql files and mails them to the DA team, as they don't have access to our repo.
I was hoping for some sort of audio stream to listen to my database changes.
High notes for inserts, low rumbles for reads or something. That could be pretty interesting actually.
This is all fun and games, but how do you catch up after a disconnect? Why choose this over logical replication?
Boo! Cows don't have "hooves instead of feet," and they don't lack toes either!
Great article, thank you! Please submit it to https://elixirstatus.com/ as well!
I've been using Elixir for the past 5-6 years for my startup. We use pg_notify extensively to broadcast changes between running nodes (basically, use Phoenix.PubSub locally in our apps, with a GenServer to subscribe+re-broadcast using pg_notify).
This has been a really elegant and low-complexity way to get distributed pubsub without the complexity of running a distributed erlang cluster (which seems a lil bit painful in a K8S+Continuous Deploy world)
There -are- some big downsides to be aware of though.
1. You can't use PgBouncer w/ LISTEN/NOTIFY. This has been really painful because of the high memory overhead of a pgsql connection + elixir keeping a pool of open pgsql connections. The tried and true method of scaling here is to just use PgBouncer. We've kicked the can on this by vastly over-provisioning our pg instance, but this has cost $10s of thousands on the cloud. Of course, it's solvable (dedicated non-pgbouncer connection pool just for LISTEN/NOTIFY, for example), but painful to unwind.
2. The payload has a fixed size limit (8KB, IIRC). This has bitten us a few times!
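Beyond notifying with just an id and re-fetching, another way people work around that payload cap is to split oversized messages into sequence-numbered chunks and reassemble them on the listener. A Python sketch of that idea, assuming nothing about the commenter's actual Elixir setup (the chunk size and envelope format here are invented):

```python
import json
import uuid

CHUNK = 4000  # comfortably below the ~8 KB pg_notify limit, pre-envelope

def split_payload(data: str) -> list:
    """Wrap each chunk in an envelope carrying a message id and position."""
    msg_id = uuid.uuid4().hex
    parts = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)] or [""]
    return [json.dumps({"id": msg_id, "seq": i, "total": len(parts), "body": p})
            for i, p in enumerate(parts)]

def reassemble(buffers: dict, raw: str):
    """Feed each notification in; returns the full payload once complete."""
    msg = json.loads(raw)
    buf = buffers.setdefault(msg["id"], {})
    buf[msg["seq"]] = msg["body"]
    if len(buf) == msg["total"]:
        del buffers[msg["id"]]
        return "".join(buf[i] for i in range(msg["total"]))
    return None
```

Note this assumes chunks of one message all arrive on the same listener connection (Postgres delivers each notification to every listener, so that holds per connection); JSON escaping also inflates the body, which is why `CHUNK` sits well under the limit.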
Even though I really like pg_notify, I think that if I were starting over, I'd probably just use Redis Pub/Sub to accomplish the same thing. Tad bit more complex if you're not already running Redis, but without the downsides. (Of course, w/ Redis, you don't get the elegance of firing a notification via a pg trigger)