@algernon

algernon@lemmy.ml · 5 months ago

It’s not. It just doesn’t get enough hits for that 86k to matter. Fun fact: most AI crawlers hit /robots.txt first, they get served a bee movie script, fail to interpret it, and leave, without crawling further. If I’d let them crawl the entire site, that’d result in about two megabytes of traffic. By serving a 86kb file that doesn’t pass as robots.txt and has no links, I actually save bandwidth. Not on a single request, but by preventing a hundred others.

algernon@lemmy.ml · 5 months ago

I don’t think serving 86 kilobytes to AI crawlers will make any difference in my bandwidth use :)

algernon@lemmy.ml · 5 months ago

That would result in those fediverse servers theoretically requesting 333333 * 114MB = ~38Gigabyte/s.

On the other hand, if the site linked would not serve garbage, and would fit like 1Mb like a normal site, then this would be only ~325mb/s, and while that’s still high, it’s not the end of the world. If it’s a site that actually puts effort into being optimized, and a request fits in ~300kb (still a lot, in my book, for what is essentially a preview, with only tiny parts of the actual content loaded), then we’re looking at 95mb/s.

If said site puts effort into making their previews reasonable, and serve ~30kb, then that’s 9mb/s. It’s 3190 in the Year of Our Lady Discord. A potato can serve that.

algernon@lemmy.ml · 5 months ago

I only serve bloat to AI crawlers.

map $http_user_agent $badagent {
  default     0;
  # list of AI crawler user agents in "~crawler 1" format
}

if ($badagent) {
   rewrite ^ /gpt;
}

location /gpt {
  proxy_pass https://courses.cs.washington.edu/courses/cse163/20wi/files/lectures/L04/bee-movie.txt;
}

…is a wonderful thing to put in my nginx config. (you can try curl -Is -H "User-Agent: GPTBot" https://chronicles.mad-scientist.club/robots.txt | grep content-length: to see it in action ;))

algernon@lemmy.ml · 5 months ago

…and here I am, running a blog that if it gets 15k hits a second, it won’t even bat an eye, and I could run it on a potato. Probably because I don’t serve hundreds of megabytes of garbage to visitors. (The preview image is also controllable iirc, so just, like, set it to something reasonably sized.)

algernon@lemmy.ml · 5 months ago

There’s plenty, but I do not wish to hijack this thread, so… have a look at the Forgejo 7.0 release notes, the PRs it links to along notable features (and a boatload of bugfixes, many of which aren’t in Gitea). Then compare when (and if) similar features or fixes were implemented in Gitea.

The major difference (apart from governance, and on a technical level) between Gitea and Forgejo is that Forgejo cherry picks from Gitea weekly (being a hard fork doesn’t mean all ties are severed, it means that development happens independently). Gitea does not cherry pick from Forgejo. They could, the license permits it, and it even permits sublicensing, so it’s not an obstacle for Gitea Cloud or Gitea EE, either. They just don’t.

algernon@lemmy.ml · 5 months ago

Better NixOS integration, less resources used, similar levels of containment. The containers I planned to use don’t provide any additional safety than the system services. In many cases, I could harden the system services more. Like, if a container has a /bin/bash in it, it’s hard to remove that, while I can pretty easily prevent my systemd service from accessing it.

Like, systemd.services.<name>.confinement is pretty darn strong. If enabled, NixOS will set up a tmpfs-based chroot with just the required runtime store paths for the service. Good luck doing something similar in a container!

algernon@lemmy.ml · 5 months ago

I was in similar shoes (my server is running Debian, as it has been for the past two decades), and am going to rebuild it on something else. I chose NixOS, which I recently switched to on my desktop, because it lets me configure the entire system declaratively, even the containers. The major advantage of a declarative configuration is that it will never be out of date.

My main reason for switching is that I’ve been running the server for a good few years, initially maintained via ansible, but that quickly turned into a hellish bash-in-yaml soup that never quite worked right. So I just made changes directly. And then I forgot why I made a change, or had the same thing copy & pasted all over the place. Today, it’s a colossal mess. With NixOS, I can’t make such a mess, because the entire system is declared in one single place, my configuration.

Like you, I also planned to use containers for most everything, but… I eventually decided not to. There’s basically two things that I will run in a container: Wallabag (because it’s not so well integrated into NixOS at the moment), and my Mastodon instance (which runs glitch-soc, which is considerably easier to deploy via the official containers). The rest will run natively. I’ll be hardening them via systemd’s built-in stuff, which will give me comparable isolation without the overhead of containers. Running things natively helps a lot with declarative configuration too, a nice bonus.

For reference, you can find my (work in progress) server configuration here. It might feel a bit overwhelming at first, because it’s written in a literate programming style using org mode & org roam. I found this structure to work great for me, because my configuration is thoroughly documented, both the whys and hows and whats.

algernon@lemmy.ml · 6 months ago

Ah! My bad.

mumbles something about big corps choosing way too generic names for their stuff

algernon@lemmy.ml · 6 months ago

Threads does not interact with the Fediverse in its current form. It’s a horn blasting into the fediverse at best. It’s not participating in the fediverse, it’s shouting into it. As such, it’s correct to not report on how thredsizens participate in the fediverse - they do not, not at this time.

algernon@lemmy.ml · 6 months ago

I don’t use social media to stay connected with family. I lift up the phone, go visit, or if we need to communicate online, I have an XMPP server for the family with end to end encryption. Can share pictures, text, and can even do video calls if need be, send files, and so on.

Don’t see the need to involve any kind of social media.

algernon@lemmy.ml · 6 months ago

There’s a very easy solution that lets you rest easy that your instance is how you want it to be: don’t do open registration. Vet the people you invite, and job done. If you want to be even safer, don’t post publicly - followers only. If you require follower approval, you can do some basic checks to see that whoever sends a follow request is someone you’re okay interacting with. This works on the microblogging side of the Fediverse quite well, today.

What I’m trying to say is that with registrations requiring admin approval gets you 99% of the way there, without needing anything more complex than that.

algernon@lemmy.ml · 6 months ago

…and you think 14-17 year olds won’t circumvent this in mere seconds? Like, they’d just sign up at an instance that doesn’t implement these labels, or doesn’t care about them, or use their parents accounts, or ask them, or an older friend to sign them up, and so on. Even if age verification would be widespread and legally mandated, I highly doubt any sufficiently determined 14-17 year old would have any trouble getting past it.

algernon@lemmy.ml · 8 months ago

Nevertheless, as Bluesky grows, there are likely to be multiple professionally-run indexers for various purposes. For example, a company that performs sentiment analysis on social media activity about brands could easily create a whole-network index that provides insights to their clients.

(source)

Is that supposed to be a selling point? Because I’d like to stay far, far away from that, thank you very much.

algernon@lemmy.ml · 8 months ago

A story like that, eh? Well, as it turns out, the entire configuration of my operating system is a story. Or rather, many stories.

algernon@lemmy.ml · 8 months ago

I would suggest trying with Wayland + GNOME, using GNOME’s default WM, rather than sway. If that works, and sway doesn’t, then the issue is somewhere between GNOME and Sway. That should help narrow things down.

algernon@lemmy.ml · 8 months ago

And how would that improve anything? Like I said, any general purpose engine is a no-go for me, because they index things I have no desire to ever see in my search results. Kagi is no exception.

Been there, tried it, didn’t find it noticably better than the other general purpose search engines.

algernon@lemmy.ml · 8 months ago

I found that no general purpose search engine will ever serve my needs. Their goal is to index the entire internet (or a very large subset of it), and sadly, a very large part of the internet is garbage I have no desire to see. So I simply stopped using search engines. I have a carefully curated, topical list of links from where I can look up information from, RSS feeds, and those pretty much cover all what I used search for.

Lately, I have been experimenting with YaCy, and fed it my list of links to index. Effectively, I now have a personal search engine. If I come across anything interesting via my RSS feeds, or via the Fediverse, I plug it into YaCy, and now its part of my search library. There’s no junk, no ads, no AI, no spam, and the search result quality is stellar. The downside is, of course, that I have to self-host YaCy, and maintain a good quality index. It takes a lot of effort to start, but once there’s a good index, it works great. So far, I found the effort/benefit ratio to be very much worth it.

I still have a SearxNG instance (which also searches my YaCy instance too, with higher weight than other sources) to fall back to if I need to, but I didn’t need to do that in the past two months, and only two times in the past six.

algernon@lemmy.ml · 8 months ago

Very bad, because the usability of such a scheme would be a nightmare. If you have to unzip the files every time you need a password, that’d be a huge burden. Not to mention that unzipping it all would leave the files there, unprotected, until you delete them again (if you remember deleting them in the first place). If you do leave the plaintext files around, and only encrypt & zip for backing up, that’s worse than just using the plaintext files in the backup too, because it gives you a false sense of security. You want to minimize the amount of time passwords are in the clear.

Just use a password manager like Bitwarden. Simpler, more practical, more secure.

algernon@lemmy.ml · 8 months ago

Works out of the box here, NixOS 23.11 / GNOME 45 / X11, so I suspect this might be either a Wayland or Sway problem.

When I switch to dark mode in GNOME, both Firefox’s UI, and the sites that support dark mode, switch.