I expect lemmy.ml to be hugged to death, but that’s just a DoS.
More interesting would be how the ActivityPub network reacts under Reddit-like loads and behaviour, and as I understand it, things shouldn't be too bad?
Is lemmy scalable?
Lemmy and Lemmy-ui are pretty much stateless and thus horizontally scalable. Pictrs and Postgres take a bit more work.
I’m not a dev though. Just my own experience hosting my own instance.
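To make the "stateless, so horizontally scalable" point concrete, here is a toy sketch. It is not Lemmy's code: `SharedStore` and `AppNode` are made-up names, with `SharedStore` standing in for Postgres. The idea is only that when a node keeps no data of its own, any node can serve any request.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStore:
    """Hypothetical stand-in for Postgres: the only place state lives."""
    posts: list[str] = field(default_factory=list)


@dataclass
class AppNode:
    """Hypothetical stateless app node: it holds a reference to the store, no data."""
    name: str
    store: SharedStore

    def create_post(self, title: str) -> int:
        self.store.posts.append(title)
        return len(self.store.posts) - 1  # list index doubles as the post id

    def read_post(self, post_id: int) -> str:
        return self.store.posts[post_id]


store = SharedStore()
node_a = AppNode("lemmy-1", store)
node_b = AppNode("lemmy-2", store)

post_id = node_a.create_post("Is lemmy scalable?")
# A different node can serve the read, because nothing lives on node_a itself.
print(node_b.read_post(post_id))
```

Adding capacity then just means adding more `AppNode`s. The shared store is the part that takes more work, which matches the Postgres and pict-rs caveat above.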
Recent versions of pict-rs support object storage. I haven’t tried, and that doesn’t inherently mean that pict-rs scales horizontally, but I’m hopeful that multiple instances could connect to the same blob storage safely? Then one could use minio or just S3 or whatever which do scale horizontally (or cloudily).
Also, I read somewhere that Lemmy uses an older version of pict-rs and that some modest effort is needed to upgrade and get access to the blob storage feature.
I’ve tested running multiple instances of pict-rs using distributed storage (Moosefs) for the uploaded file directory. I ran into several weird issues. Mainly that images would not load, or end up broken. But then refresh the page and hit a different pict-rs instance and it worked for some images, but broke for others. So now I run a single pict-rs instance, still on distributed storage, but everything works. This is on the 3x branch. I believe the 4x branch is still in rc stage.
Well that’s a bummer.
A Lemmy install is made up of at least 4 discrete processes. I have to think that putting lemmy, lemmy-ui, pict-rs (backed by object storage to provide scalable I/O and let the pict-rs host focus on CPU), and PG (optionally with additional read replicas) on separate boxes would result in a hardware platform that outscales the current codebase for a year or two while they clean up perf issues that crop up with large user/post/comment/community counts.
That’s basically what I’m doing. A singleton instance each for Pict-rs and Postgres. Multiple compute nodes for Lemmy and Lemmy-ui, accessed through load balancers. All of the data is placed on distributed storage so I can quickly fail over Pict-rs or Postgres if needed. It’s enough for now, though I don’t think it could handle Reddit levels of traffic.
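A rough sketch of that layout, purely illustrative: the node names (`lemmy-1`, `pg-a`, and so on) and both classes are invented here, and a real setup would use an actual load balancer rather than Python. It shows round-robin over interchangeable compute nodes, plus an active/standby pair for the singleton services.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Rotates requests across interchangeable (stateless) nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self._ring = cycle(self.nodes)

    def mark_down(self, node):
        self.healthy.discard(node)

    def pick(self):
        # Try each node at most once per call, skipping unhealthy ones.
        for _ in range(len(self.nodes)):
            node = next(self._ring)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy nodes")


class Singleton:
    """A service with one active copy and a standby to fail over to."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = standby

    def fail_over(self):
        # Swap roles; the data itself is assumed to live on shared storage.
        self.active, self.standby = self.standby, self.active
        return self.active


lemmy = RoundRobinBalancer(["lemmy-1", "lemmy-2", "lemmy-3"])
postgres = Singleton(active="pg-a", standby="pg-b")

first_three = [lemmy.pick() for _ in range(3)]
lemmy.mark_down("lemmy-2")
next_four = [lemmy.pick() for _ in range(4)]
print(first_three, next_four)
```

Losing a compute node only shrinks the rotation, while losing a singleton needs an explicit fail-over, which is why the singletons are the likely ceiling at Reddit-scale traffic.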