• 1 Post
  • 162 Comments
Joined 8 months ago
Cake day: June 6th, 2025


  • This issue is largely manifesting through AI scraping right now. Many scrapers also intentionally ignore robots.txt. At the moment, LLM scrapers are basically just bad actors on the internet. US courts have also ruled in favor of a number of AI companies that were sued, so it’s unlikely anything will change. Effectively, if you don’t like the status quo, stuff like this is one of your few options.

    And that’s without even getting into whether we actually want these companies to improve their models before the problems of energy consumption and potential displacement of human workers are resolved.





    1. Over-focus on the most popular artists. There is a long tail of music which only gets preserved when a single person cares enough to share it. And such files are often poorly seeded.
    • We primarily used Spotify’s “popularity” metric to prioritize tracks. View the top 10,000 most popular songs in this HTML file (13.8MB gzipped).
    • For popularity>0, we got close to all tracks on the platform. The quality is the original OGG Vorbis at 160kbit/s. Metadata was added without reencoding the audio (and an archive of diff files is available to reconstruct the original files from Spotify, as well as a metadata file with original hashes and checksums).
    • For popularity=0, we got files representing about half the number of listens (either original or a copy with the same ISRC). The audio is reencoded to OGG Opus at 75kbit/s — sounding the same to most people, but noticeable to an expert.

    Perhaps I’m reading this wrong, but is this not a little backwards? Since unpopular music is poorly preserved, shouldn’t the focus be on getting the least popular music first?










  • Have you ever looked at the original JS implementation? It looks nothing like what JS is today. Saying the bones were spat out in a couple weeks is like saying Linux was developed in a few months.

    And yet working groups have spent literal decades trying to make JS less shitty. The fundamental basics of JS can’t be changed in backwards-incompatible ways without breaking a huge number of websites. The Linux comparison is just wrong because Linux has broken backwards compatibility to fix problems. A better comparison would be Linux’s policy to never break userspace. Backwards-incompatible changes to JS would break a bajillion websites, much like breaking userspace would break a bajillion programs.

    TS transpiles to JS, and any JS is valid TS. Take any TS, remove the types (and some syntactic sugar) and you have JS. I feel like if you like TS but not JS, you just don’t like loosely typed languages. That’s just a preference. It doesn’t make a language bad.

    JS is valid TS. TS is not valid JS. This is the fundamental point. TS essentially fixes issues that JS cannot fix without breaking the world.
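
    As a rough sketch of what that looks like in practice (the function names here are made up for illustration, and `any` is used to stand in for plain JS behavior):

```typescript
// Loosely typed, the way plain JS behaves: `any` opts out of type checking.
function addLoose(a: any, b: any): any {
  return a + b;
}

// Typed version: the compiler only accepts numbers here.
function addStrict(a: number, b: number): number {
  return a + b;
}

// The number is silently coerced and the result is the string "12".
const loose = addLoose(1, "2");

// Plain numeric addition.
const strict = addStrict(1, 2);

// addStrict(1, "2"); // rejected at compile time by TS; plain JS would just run it
```

    The loose version is still perfectly legal TS, which is the "JS is valid TS" half. The annotations on the strict version are the part that has no meaning in plain JS.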

    Loose typing is fine if the language’s type system isn’t insane. I prefer static typing, but as long as the type system is coherent, it’s not an issue.

    TBH IMO the only reason JS became popular is that it was the language web browsers provided, so if you wanted your site to do anything complex, you needed to use JS. This eventually led to the JS VMs being very fast, so Node was created, and now JS is everywhere since you can learn one language for web and server.


  • I’m pretty sure most people do not like JS’s loosey-goosey, who-knows-what-ur-gonna-get type system, which is why TS is so popular. Not really surprising since the bones of the language were basically spat out in a couple weeks. TS is a custom type system on top of JS, meaning it’s not just JS’s type system expressed through strict typing. They added a bunch of useful features like discriminated unions and so on to make using TS more pleasant than raw JS.

    TS is actually usable (although NPM and the environment built around it still suck). It’s inherited a bunch of weird shit from JS, but the type system generally makes it bearable.
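
    For anyone who hasn’t run into discriminated unions, here is a minimal sketch. The `Shape` type and `area` function are hypothetical, just picked for illustration:

```typescript
// Each variant carries a literal `kind` tag that the compiler can narrow on.
type Shape =
  | { kind: "circle"; radius: number }
  | { kind: "rect"; width: number; height: number };

function area(shape: Shape): number {
  switch (shape.kind) {
    case "circle":
      // Narrowed to the circle variant, so `radius` is known to exist.
      return Math.PI * shape.radius ** 2;
    case "rect":
      // Narrowed to the rect variant, so `width` and `height` are known to exist.
      return shape.width * shape.height;
    default: {
      // If a new variant is added and not handled above, this assignment
      // stops compiling, which flags the missing case.
      const unreachable: never = shape;
      return unreachable;
    }
  }
}
```

    Raw JS can follow the same tagged-object pattern, but nothing checks that every `kind` is handled or that you only touch fields belonging to that variant.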