If you replace a junior with #LLM and make the senior review output, the reviewer is now scanning for rare but catastrophic errors scattered across a much larger output surface due to LLM "productivity."

xrisk@social.treehouse.systems

@malstrom @pseudonym that’s an interesting claim. I don’t know enough about LLM research to make a judgement. I do know that LLMs trained on synthetic (other LLM-generated) data tend to perform worse, but have we reached the limits of what LLMs are capable of? In my limited understanding, if an LLM can “learn” fundamental programming “concepts” (the same way they can “learn” concepts across human languages — I could be wrong in my understanding here), they should (might?) be able to transfer/apply those concepts to not-before-seen domains (maybe with a bit of “reasoning” prodded in).

moutmout@framapiaf.org

@pseudonym This.

I do a lot of "computer science labs", where students learn to write code, and they wave me down when they have questions. When their code doesn't do what they expect, it's often easy to figure out what went wrong because you can spot a bit of code that looks funky. And usually, the problem is in those few lines.

LLM code is meant to look like good code, so you don't get these little shortcuts.

toldtheworld@mastodon.social

@pseudonym I have posed this conundrum before, and the answer I received is that there is also an opportunity cost to not moving faster, and the risk of a catastrophic bug may not outweigh the risk of being overtaken by competitors, especially since that was already happening before LLMs anyway.

Also, it *seems* models are improving at detecting these bugs, so they are being used to review changes, which, for the reasons you point out, they might be better at than people.

wronglang@bayes.club

@xrisk @malstrom @pseudonym just for clarity, LLMs don't learn concepts

wronglang@bayes.club

@moink @pseudonym one of the benefits of people *having* a mental model

nor4@chaos.social

@hopeless @pseudonym you are suggesting that you can just layer more shit onto the shit and after enough layers of shit it becomes not shit.

dtwx@mastodon.social

@pseudonym also, when the senior retires, who replaces them?

max@mas.lab4.app

@pseudonym This, 100%. The Glass Cage by Nicholas Carr dives into this in depth with examples from aviation, and how full automation of flight makes it harder for pilots to recover from a disaster situation.

deborahh@cosocial.ca

@pseudonym @mayintoronto … and: there will be no juniors to grow into seniors.

nuintari@mastodon.bsd.cafe

@pseudonym We are using AI in exactly the worst ways possible.

Caveat: I am a never AI-er, due to the ethical issues surrounding how training data is gathered, the severe ecological and economic impacts, and the fact that deepfakes are objectively making the world a shittier place.

But pretend for a second that none of those are a problem anymore. We are still using AI wrong. You don't have it produce a mountain of code and have a human review it. You still use humans to produce the code, and have AI help other humans review it. AI isn't terribly good at writing code, but it has been shown to be effective at finding a few classes of bugs humans are typically very bad at finding.

But that won't allow you to fire people and replace them with monkeys on typewriters, so it'll never happen.