People get mad when you call LLMs "spicy autocomplete" but my investigations into recreating and implementing small versions of this tech make me think that nick name is very accurate.

futurebird@sauropods.win

The wind up with the bamboozling jargon (you can feel these dudes hoping they put in enough tricky sounding word and concepts to make you just give up) was perfect.

"token prediction"
"vectors"
"gradient descent" (OMG)

The problem is math jargon is my briar patch and tossing me in there is a big mistake.

u0421793@toot.pikopublish.ing

@futurebird@sauropods.win I honestly think (unpopular opinion here) that most of the cost of LLM-based AI thus far is in ‘training’. Not training as in running the phenomenal amount of harvested stolen text and image input through tokenisation processes and reward giving through weight assignment and vector assessment, using more GPUs than exist on Earth, but rather, lots and lots and lots of money paying humans to fake it all and build in patches – patch after patch on top of patch of corrective behaviour, encoded themselves as vector weights. The training had nothing much to do with running it all through GPUs, I believe that probably took an embarassing but totally affordable amount of time and energy. I believe (with no visible means of factual reference to cite) that most of the expenditure of these capital-burning companies was ‘training’ by paying humans and then encoding their resulting guidance. Paying workers.

wakame@tech.lgbt

@bri7 @futurebird

[Not arguging that these models are 'thinking', even if it might sound like that.]

I think the "explain how you arrived at that conclusion" that was all the rage is very interesting for two reasons:

The modell is generating more text. It's not like it is showing you a walk through its model and the random numbers it pulled. So it is basically generating an explanation that is plausible given what the said before.
I think this is often also a behavior with humans. My opinion about a topic might be a gut feeling, but when questioned I start thinking about it, trying to find arguments. Often ones I didn't already have when I stated my opinion.

The first thing could make sense to ask if amodel are trained to change their position given new information. So they could "correct" a bad roll of the dice.
Of course, a user might think that the model "really thought about this", which is obviously not the case.

u0421793@toot.pikopublish.ing

@futurebird@sauropods.win I wasn’t making it up though - the way it works is by tokenising language (not into words but into fragments of words), then assigning the word-derived tokens to vectors (word2vec - it exists), then these vectors are winnowed into the likely winners by gradient descent to find the lowest error (and not get trapped by just falling downhill down the nearest valley) and so on.

technothrasher@universeodon.com

@futurebird @woe2you

"It's one of the reasons I like the fedi. People will say when you are wrong but they are nice about it... mostly"

I find this unfortunately being less and less so as more people discover it. Lately I've seen way more flaming and obnoxious argumentativeness than ever previously. Sigh.

futurebird@sauropods.win

@technothrasher @woe2you

That sucks. Was it someone you knew acting differently or new people showing up? I hardly ever see any real drama so I'm kind of curious in an shallow gossip driven way what was going down...

technothrasher@universeodon.com

@futurebird @woe2you I stick mostly to animal photography on my timeline, which still seems friendly and unaffected. But when looking at trending posts, so seeing things I wouldn't normally see, there tends to be more arguing. Unsurprisingly, it's usually political posts, which are always going to raise emotions, but people used to at least argue constructively. Now it seems to be a lot of yelling and swearing. Not everywhere, but more often.

hypolite@friendica.mrpetovan.com

@maybenot @neckspike @futurebird General Markov 🫡

liiwi@mastodon.social

@futurebird Someone recently used term "Augmenting Intelligence" and I thought it describes much better.

futurebird@sauropods.win

@liiwi

It kind of implies something intelligent rather than probabilistic is going on though.

If I have a hat filled with quotations of wisdom and I pull one out and read it now and then some of the time it will align with what is going on and seem very perceptive.

If I have three hats with such quotes and they are labeled "good" "bad" and "cryptic" and I pick one based on the mood people might think I'm a genius.

raganwald@social.bau-ha.us

@futurebird Very, very similar to magicians and unethical grifters performing cold reads.

@liiwi

djsumdog@djsumdog.com

I call them "Weighted Random Word [or Code] Machines." I have a friend who said he wasn't going to continue the conversation if I was using "slurs." I called him a Cogger Lover.

raymaccarthy@mastodon.ie

@liiwi @futurebird
It (LLM/Generative AI) doesn’t augment intelligence. If anything it conditions people to think less!

liiwi@mastodon.social

@futurebird Godd point, there is also the question that can there be intelligence without identity?

liiwi@mastodon.social

@raymaccarthy @futurebird The context was that it augments user, like a tool.

futurebird@sauropods.win

@liiwi @raymaccarthy

Got it!

raymaccarthy@mastodon.ie

@liiwi @futurebird
It's about the most useless computer tool I've ever seen.
It wastes user's time.

ellie@ellieayla.net

@futurebird maybe "sloppy autocomplete" would be better?

djsumdog@djsumdog.com

The early models did involve hundreds of people who were given multiple generations and clicked on the one that was the least idiotic for eight hours a day, but I think a lot of the newer ones just violate OpenAI/Anthropic's user agreements and use the existing models for reinforced feedback learning. (DeepSeek likely did this). They're likely also using the feedback users give GPT/Claude during use.

So yes, it's going to always have the bias of whatever the rules were printed out for those original employees as well as their own personal biases.

But it's not too over simplistic an explanation. These are next word (token) prediction machines. Each thing that's generated requires the entire context of text to be passed through the machine for each new word. The models themselves are also deterministic, it's just that the generator doesn't always pick the most likely next token. It might randomly select the token at 96% instead of 98% to introduce some variability.

phosphenes@mastodon.social

@liiwi @futurebird

I have always thought there can be intelligence without identity.

A big part of intelligence seems to be about answering the question 'what happens next?' every moment of its existence. Answering this question covers everything from a dropping a ball to Relativity.

NOT saying this is all of intelligence, just one of its major tasks. This part doesn't need to have a 'me' in the model.