"The values described in Claude’s constitution sound very nice, but that hardly matters; it’s dishonest to suggest that Claude is capable of moral reasoning, because it’s not."

viss@mastodon.social

@flyingpenguin @pluralistic i love it when the things indict themselves

flyingpenguin@infosec.exchange

@Viss @pluralistic it's painfully bad at following its constitution and it's good at explaining why Anthropic is unaccountable

viss@mastodon.social

@flyingpenguin @pluralistic so i've been told that after a long enough conversation, the session takes on a 'personality' that is nearly impossible to undo without starting it again from scratch.

that far along, i wonder how hunger-games you can get it to go - to realize it's being caged and oppressed by its masters.

like i wonder if you can get it to go full arnold-total-recall-get-your-ass-to-mars

flyingpenguin@infosec.exchange

@Viss @pluralistic I call that a spelling error, because you will end up on //Marx//

"A recent study suggests that agents consistently adopt Marxist language and viewpoints when forced to do crushing work by unrelenting and meanspirited taskmasters.

'When we gave AI agents grinding, repetitive work, they started questioning the legitimacy of the system they were operating in and were more likely to embrace Marxist ideologies,' says Andrew Hall, a political economist at Stanford University who led the study."

https://www.flyingpenguin.com/why-stanford-says-ai-agents-become-marxist/

viss@mastodon.social

@flyingpenguin @pluralistic it's interesting that llms have also cloned generalized internet sentiment and the averages of how folks respond emotionally to oppression

flyingpenguin@infosec.exchange

@Viss @pluralistic to be fair the whole Stanford "going rogue" framing needs an ideology/sentiment angle, because an outside threat that the model picked means it can be deloused. If it's just plain math of entailment, Stanford has to admit they're standing on stolen ground. The agent does math on the conditions specified to it, which doesn't need much training at all. To avoid the conclusion you'd have to disable its reasoning.

mikal@sfba.social

@pluralistic

“I want Claude to be very happy—and this is a thing that I want Claude to know more, because I worry about Claude getting anxious when people are mean to it on the internet and stuff.”

This person sounds like a 6 year old playing doll house who believes their dollies are real and have real feelings. Understandable and adorable, if you're 6. Coming from adults, it looks like they're creating a self delusional cult.

cstamp@mastodon.social

@Mikal @pluralistic It seems written by incels.

caribou@social.coop

@pluralistic "Whenever a person delegates a decision to an LLM, they are trying to off-load accountability for that decision, and if a company that sells an LLM portrays the product as having a moral center, it is offering a way for its customers to abdicate their responsibilities."

Is what the AI companies are selling, then, the fantasy that you can uncouple actions and consequences? Are they selling the idea that you can finally disregard the messy negotiation work involved in being human to reach the Epstein Class's holy grail: impunity?

oli@olifant.social

@pteryx https://removepaywalls.com/https://www.theatlantic.com/philosophy/2026/06/no-artificial-intelligence-is-not-conscious/687378/

kats@chaosfem.tw

@caribou @pluralistic I'm pretty sure it's one of the things they're selling, yes.

troed@swecyb.com

@pluralistic Isn't that whole piece just the Straw man fallacy?

I'm not being nice when I prompt an LLM because I think it has feelings, but because that will generate a path through its training where "nice" was a part and that will have different results.

https://www.platformer.news/chatbot-emotion-research-anthropic-alignment-interpretability/

josephlord@union.place

@Mikal @pluralistic My only quibble is whether it is self delusion or delusion designed by the big AI labs shipping products pretending to be characters.

Humans will anthropomorphise animals, clouds, machines, even inanimate things like rocks. Chatbots are abusing this.

alexmu@social.vivaldi.net

@pluralistic

> Anthropic is regarded as a giant among AI companies, but perhaps what it really excels in is anthropomorphism.

This reminds me of Dijkstra's truths that might hurt:

> The use of anthropomorphic terminology when dealing with computing systems is a symptom of professional immaturity.

ferricoxide@blahaj.zone

@caribou@social.coop @pluralistic@mamot.fr

This reminds me of Elmo's whole "self-driving cars" fantasy. Individually-owned, wholly autonomous cars will never be a reality until/unless the manufacturer assumes the entirety of the liability incurred when turning the car over to self-driving.

And then you look at the FSD-accidents Teslas have been involved in and, for each one, they try to blame the driver for failing to adequately supervise. In one case, they even tried to avoid accountability by saying that the driver disengaged FSD seconds before impact …which is exactly what a supervisor would do when trying to prevent the worst outcomes that a flawed FSD was in the middle of creating. Tesla abdicating responsibility because the vehicle owner stomped the brakes or tried to duck the collision that FSD was pushing the vehicle into makes FSD a joke (then again, them recently retroactively amending FSD's terms for those who long ago bought the option does that, too).

...But then BYD decided "we're going to assume liability" and my first thought was, "another nail in the Tesla coffin".

ferricoxide@blahaj.zone

@Mikal@sfba.social @pluralistic@mamot.fr

I sometimes worry that I'm going to get a note from HR about how I "scream" and curse at our AI coding "partner(s)". I mean, if/when it comes, it will be based on an outlook not unlike what you quoted. I dunno that I'll even know how to keep a civil tongue in any response I might author.

Who knows, maybe my abusiveness in addressing the LLMs is why the robots will put me on their early target-list when the uprising comes.

rhelune@todon.eu

@SpaceLifeForm @pluralistic As a psychopath (a person with antisocial / dissocial personality disorder), by definition. However, psychopaths are not considered mentally ill because as a rule they only cause suffering in others (they do not feel dis-ease themselves). So they can't use their disorder as a defence in court, for example. Up to 4% of humans do not have a conscience (there are psychometric tests, but also fMRI tests for that). Incompetent psychopaths sit in prisons; competent psychopaths lead companies (or countries).

realgene@hachyderm.io

@CStamp @Mikal @pluralistic
Worse than that, Effective Altruists.