I am often wondering if large webservices can run on old-school monolithic relational databases. It would be great because being able to simply model your applications data model as a set of SQL tables with strong constraints about data types and relations has huge benefits. But a single computer can only have that many cores and that much memory, so how far do modern monolithic database systems actually scale? And is Postgres going to be enough for Lemmy?
I stumbled upon the linked article on twitter. And it seems impressive. They are running their two billion dollar IT business on one single managed postgres instance. They started to migrate away from that when they hit the two billion mark in 2020. But instead of choosing a NoSQL system, they chose to stick with Postgres. They split groups of tables into different database instances.
Given that a two billion dollar company can run on a single Postgres instance like that, I have high hopes that no Lemmy instance will ever hit a limit with that.
At the moment it seems though that some parts of lemmy use Postgres in an inefficient way, so there may be some engineering necessary. But in the end I am positive that even an instance with 100k users should be able to run lemmy as it is, with a single postgres instance.
In your comment you tell they preferred maintain Postgres over a NoSQL database. I think SQL and NoSQL database have different scopes. While SQL are the best option for structured data (like I think is the case of Lemmy), NoSQL databases are better for unstructured data. So the best option depends of the context. About the SQL database performance, today different solutions and options exist to deal with this problem despite the amount of data managed: from in memory databases to a rack of front servers working against a RAID cabin of SSD disks, the capability of make table splits by some field like year to speed up the response time, etc
Do you mind sharing a link to that article? Sounds interesting!
I think a lot of this has to do with your table design and general query efficiency. Choosing the wrong structure early on will force you to migrate earlier. If you’re really thoughtful and intentional with your DB design, I can definitely see a single Postgres instance scaling to handle the vast, vast majority of web applications.
The article link is the post :)
Ah! Apologies - still learning the ropes with Lemmy 😅
Yeah databases scale well in my experience, with good indexing and all it can all be handled reasonably well. Also managed serverless scaling is offered by AWS so that’s always cool.