Google is cannibalizing the web to feed AI

Sahwa@reddthat.com · 1 month ago

BrightCandle@lemmy.world · 1 month ago

The death of Stack Overflow is one of those events where a site has been completely killed by AI, and yet its content is completely necessary for AI to know how to solve programming problems. Its death will mark the end of AI's ability to learn how to solve programming issues. It's cannibalizing itself in the process: as it destroys its sources, it destroys its own ability to learn.

artyom@piefed.social · 1 month ago

It’s not just that, it’s shitting where it eats. People are using it to fill the internet with disinformation, then it trains itself on its own disinformation and breeds even worse disinformation. This is why AI can never be smarter than it was in 2021.

On top of that, due to the indiscriminate DDoSing of the entire internet by AI bots, websites have been blocking any web crawlers that are not Google's, which just contributes to Google's monopoly.

Drun@lemmy.world · 1 month ago

“This is why AI can never be smarter than it was in 2021.”

You’re completely wrong. First of all, datasets are getting bigger and objectively better. Secondly, the technologies and methods of model training have become dramatically more sophisticated. Yes, AI content can create an echo chamber for itself, but it’s a solvable issue.

Anyone who disagrees with this fact should do a reality check and learn a little more about the current state of AI. In 2021 we didn’t even have reasoning in models.

artyom@piefed.social · 1 month ago

I see someone’s been suckling the OpenAI teat.

Drun@lemmy.world · edit-2 1 month ago

What does it have to do with OpenAI? That’s not what I’m talking about.

Nonetheless, it’s kind of fun to see so many downvotes on my comment. Some people are so blinded by their hate (well deserved, by the way) that they will deny something that obvious.

artyom@piefed.social · 1 month ago

You are the only one being blinded by AI marketing and hype.

Drun@lemmy.world · 1 month ago

Ok, lol.

chunes@lemmy.world · 1 month ago

Model collapse isn’t a thing anymore. https://arxiv.org/html/2510.16657v1

Grandwolf319@sh.itjust.works · 1 month ago

“Our key finding is that by injecting information through an external synthetic data verifier, whether a human or a better model, synthetic retraining will not cause model collapse.”

Yeah, if you have a source of truth, then your model is basically getting trained on that.

It’s like already having the answer.

chunes@lemmy.world · 1 month ago

The point is that it only needs to comprise a very small part of the model.

Grandwolf319@sh.itjust.works · 1 month ago

My point was that having a verifier means you’re not really training a model on another model’s data. It’s basically as if you were getting new raw data from a non-AI source.

CmdrShepard49@sh.itjust.works · 1 month ago

“Our key finding is that by injecting information through an external synthetic data verifier, whether a human or a better model, synthetic retraining will not cause model collapse.”

Lol, so to make a great model, they just need to have an even better one available first or a human who can verify every single thing it ingests.

Hmm, call me skeptical of this claim.

corsicanguppy@lemmy.ca · 1 month ago

This assumes everything on the external side is valid. If one slop cluster feeds off another - a slopveyor? - then there is nothing external for the validation hall-monitor to compare against. They’re trusting another model’s output as if it were gospel.

artyom@piefed.social · 1 month ago

LOL OK

Zarxrax@lemmy.world · 1 month ago

I’m pretty sure AI is objectively smarter today than it was 5 years ago.

SpaceNoodle@lemmy.world · 1 month ago

Since LLMs literally can’t learn, no. They’re just increasingly tweaked to seem even more convincing.

Sharkticon@lemmy.zip · 1 month ago

How can something with no intelligence be smarter?

TeamAssimilation@infosec.pub · 1 month ago

It is evident that it has intelligence: it outputs intelligent responses that are usually adequate to its input, even if the input is badly phrased. What it doesn’t have is sentience, conscience, and a learning loop.

Sharkticon@lemmy.zip · 1 month ago

Evident to whom exactly?

TeamAssimilation@infosec.pub · 1 month ago

To anyone who interacts with it? Would you deny that a program automates mental labor in the same way that a sawmill automates manual labor? Isn’t that some degree of intelligence?

Now we have very imperfect LLMs that nevertheless can be instructed not with program code, but with actual natural language, and they react accordingly. Isn’t that also intelligence? Computers that understood natural language were the realm of science fiction just five years ago.

I get that people hate LLMs, both because of how idiots use them and because of how corporations push them everywhere; but not recognizing the intelligence in those programs is naive at best.

Sharkticon@lemmy.zip · 1 month ago

No, none of that is intelligence. By any definition.

I have deep concerns as to the competence of anyone who interacts with these LLMs and hallucinates that they are intelligent.

oce 🐆@jlai.lu · edit-2 1 month ago

There’s better integration with all sorts of other sources of truth beyond the LLM training, which makes it seem smarter.

MrSmith@lemmy.world · 1 month ago

Objectively + smarter, huh.

urandom@lemmy.world · 1 month ago

It is not alive and cannot really think, so I doubt it’s smarter. It likely contains a bit more knowledge and a better-interconnected network for it.

Sylvartas@lemmy.dbzer0.com · 1 month ago

Whether said knowledge is of better quality than 5 years ago is up for debate though.

bthest@lemmy.world · edit-2 1 month ago

Actually, it just appears smarter because people are objectively dumber than they were 5 years ago. “AI” is actually stagnant.

Strider@lemmy.world · 1 month ago

And yet, they have not created the AI that could do without it. Any day now, promise!

MajorasTerribleFate@lemmy.zip · 1 month ago

There would be wisdom in AI companies fighting to NOT have their products in use on websites that produce the best handle content for them to eat.