all the criticism has been said, all the takes been had.

fluffy@plush.city

@jonny ... and why the everloving FUCK do these tests run as root

dahukanna@mastodon.social

@jonny Referencing
1. @shauna post based on @DGI about power dynamics & dysfunction between imaginary labour (iML) & interpretive labour (iNL) - https://www.rethinkingpower.info/how-interpretive-labor-straddles-the-gap-between-rules-and-reality/
2. Power, chapter 4 of Mary Parker Follett’s Dynamic Administration - https://mastodon.social/@dahukanna/110643444784446704

Presuming Productivity (P) = iML/iNL
dysfunctional power-over tool imposition, e.g. LLM, factory production, etc.
- Imagined abstract: 1 LLM PR / 0 review units = ∞P
- Interpreted reality: 1 LLM PR / >10 review units = <0.1P
- https://mastodon.social/@dahukanna/113230734549577353

technocrow@blahaj.zone

@fluffy@plush.city @jonny@neuromatch.social running tests as root is fucking wild

sesamzoo@mastodon.social · finkhaeuser.de/2026-04-10-outs

@jens, great article, thank you. Did you pull the lever "just one more time" and if so, did it get even worse?

@jonny, thank you for this thread and lots of your other threads on the topic.

Both help me feel that I'm not the wrong-way driver, although these days there is a lot of contraflow in my lane. Mostly at work, where the AI fanboys/believers/addicts are at least way louder than the people trying to understand their code and keep it in maintainable shape.

jens@social.finkhaeuser.de

@sesamzoo @jonny No, I did not. I try not to use LLMs at all, so I really did maybe one or two queries more than described, just to get a feel for it.

jens@social.finkhaeuser.de

@sesamzoo @jonny Also, I feel dirty every time, so I don't want to waste water crying in the shower afterwards.

jetsetilly@mastodon.gamedev.place

@themipper @jonny
> It is a shame to see more and more foundational projects fall into the LLM trap

The one that breaks my heart is vim.

henryk@chaos.social

@jonny (Un)charitable interpretation: it smoke tests whether the ensure_ffmpeg function is syntactically correct — which is a failure mode LLMs are actually concerned about.

jonny@neuromatch.social

@fluffy
Apparently all the tests for rsync are integration tests across bash rsync calls

robinadams@mathstodon.xyz

@ainmosni @KalenXI @jonny With the caveat that the ethical problems with AI mean it's absolutely not worth the cost:

It looks like AIs are actually getting better at finding bugs and security holes.

So do your usual testing and code reviews, and then ask Claude to find any bugs you may have missed. It will give you some false positives but also some true ones.

Very different from having an LLM generate the code and a human try to fix it up.

Like Cory Doctorow's example: using an AI to give a second opinion on MRI scans (a centaur) makes scans more expensive but higher quality.

Having an AI analyse the scans at high speed and then getting some poor schmuck to try to spot its mistakes (reverse centaur) makes scans cheaper and lower quality, but at least there's a person with little power in the hierarchy who gets the blame for the problems.

Guess which one the people pouring trillions into AI want?

sesamzoo@mastodon.social

@jens @jonny At work "everyone has to use AI" according to C level people, despite all examples of things going wrong or at best "only" wasting lots of resources. Full hype mode.

LLMs are a tool. I just haven't figured out what they are a reliable and reasonable choice for.

For "How could problem X be addressed?", for instance with a language I know very well, it might generate an answer with good points for me to look up and verify by myself in detail. Like a shortcut for a bunch of search queries.

bms48@mastodon.social

@jonny Ugh. I didn't realize at first which project this was. Then I looked at the repo. Yes, the road to hell IS paved with good intentions...

pandabutter@plush.city

@jonny It's a pattern I've been noticing all over.
Step 1: a process is created to measure something, like "does the software work right?" or "who do people want to be president?"
Step 2: There's an incentive for the people who perform and maintain the process to get a certain outcome, like good performance reviews or the guy you like being elected.
Step 3: Lacking the power to alter the thing being measured, the people in charge get creative with how they measure.

davey_cakes@mastodon.ie

@ainmosni @jonny

I'm kinda similar, the most I tried gambling was when I went to a gambling town for a wedding. I came out up but it just didn't do much for me.

Except poker, maybe because that involved people. That was fun.

pandabutter@plush.city

@jonny It's those darn management cybernetics all over again. In order to make good software, we built a system of writing and passing tests—and (say it with me) the purpose of a system is what it does. The starting assumption that "did we make good software" would always be a critical test turned out to be false, because no system was constructed to keep it true.

themipper@mastodon.social

@JetSetIlly @jonny damn.

bms48@mastodon.social

@jonny Reminded of the "similarity quagmire" by changes Netflix guys are staging for FreeBSD TCP logs. My engineering notes from Feb, they did 80-90% of what I had in mind to do. The LLMs kept confabulating and conflating the meaning of "delayed_ack" with delacktime. "delayed_ack" is a variant; its meaning is context dependent for macOS or between BSDs. It is neither a timer delta, nor is it an outstanding segment threshold counter; it can be all of those things. The LLMs did not infer this.

bms48@mastodon.social

@jonny The really bizarre thing is that even if LLMs were capable of actual reasoning, they'd be considered to be committing a category error in how they parse source code; logical fallacies then follow from that. Deep learning proponents also appear to have been doing this. The terminology describing LLMs and deep learning invites category error and cognitive dissonance. Geoffrey Hinton has a lot to answer for with his "It might be conscious" emergence-based woo.

synlogic4242@social.vivaldi.net

@jonny nailed it

jonny@neuromatch.social

@henryk
The smoke test for syntactic correctness is, thankfully, the interpreter.
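
As a rough illustration of that point, here is a minimal sketch, assuming the project is written in Python (the thread does not say). The `ensure_ffmpeg` bodies are hypothetical stand-ins, not the project's real code. The built-in `compile()` parses source without running it, so a syntax error surfaces before any test does:

```python
def is_syntactically_valid(source: str) -> bool:
    """Return True if the interpreter can parse `source`; the code is never executed."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Hypothetical stand-ins for the function discussed in the thread.
GOOD = "def ensure_ffmpeg():\n    return True\n"
BAD = "def ensure_ffmpeg(:\n    return True\n"  # stray colon in the parameter list

print(is_syntactically_valid(GOOD))
print(is_syntactically_valid(BAD))
```

Importing a module triggers the same parse step, so a test that only checks that a function exists or can be imported adds little beyond what the interpreter already does on load.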