I need people to understand that stuff like this will keep happening, for two reasons:
-
@arichtman because surely there will be no way to prompt-inject a request to write a malicious python script and run it.

-
RE: https://cyberplace.social/@GossiTheDog/116676826944489315
I need people to understand that stuff like this will keep happening, for two reasons:
1. To be useful these chatbots need to have full access to everything they are supposed to "manage"; otherwise they are pointless.
2. Trying to stop prompt injection is basically trying to semantically filter natural language.
These tools have no model of the world, no ontology to anchor any "safety instructions" in. There will always be a way to talk one's way around them.
@rysiek I am sure it passed its unit tests.
-
@rysiek I am sure it passed its unit tests.
@dancast oh yeah, they probably got generated by it, and in a way they always pass.

-
We are several years into this and the biggest companies peddling these tools still cannot figure out how to make their products not fall for advanced cyberattack techniques like *checks notes* asking nicely again.
Microslop Slopilot had (has?) a similar issue – Reprompt attack simply repeated the malicious prompt in a query parameter:
https://www.techrepublic.com/article/news-reprompt-attack-microsoft-copilot/These are not going away.
@rysiek At the bottom of that article is a headline for suggested next article:
“Also read: Microsoft is making Teams secure by default, automatically enabling new protections to reduce AI-driven threats.”It wasn’t secure by default? But they’re gonna change that?
And I love how it flip flops from rock solid certainty “secure by default” to corporate weasel-speak “reduce AI-driven threats” in the span of a single sentence.
-
@rysiek At the bottom of that article is a headline for suggested next article:
“Also read: Microsoft is making Teams secure by default, automatically enabling new protections to reduce AI-driven threats.”It wasn’t secure by default? But they’re gonna change that?
And I love how it flip flops from rock solid certainty “secure by default” to corporate weasel-speak “reduce AI-driven threats” in the span of a single sentence.
@paco Satya Nadella made sure Microsoft focused on security over 2 years ago, after all!
https://www.geekwire.com/2024/haunted-by-repeated-breaches-microsoft-is-putting-security-above-all-else-vows-ceo-satya-nadella/ -
@paco Satya Nadella made sure Microsoft focused on security over 2 years ago, after all!
https://www.geekwire.com/2024/haunted-by-repeated-breaches-microsoft-is-putting-security-above-all-else-vows-ceo-satya-nadella/@rysiek “We are doubling down on this very important work, putting security above all else — before all other features and investments,” Nadella said before adding “at least for the rest of this week. Maybe even a whole month.”

-
@rysiek “We are doubling down on this very important work, putting security above all else — before all other features and investments,” Nadella said before adding “at least for the rest of this week. Maybe even a whole month.”

-
RE: https://cyberplace.social/@GossiTheDog/116676826944489315
I need people to understand that stuff like this will keep happening, for two reasons:
1. To be useful these chatbots need to have full access to everything they are supposed to "manage"; otherwise they are pointless.
2. Trying to stop prompt injection is basically trying to semantically filter natural language.
These tools have no model of the world, no ontology to anchor any "safety instructions" in. There will always be a way to talk one's way around them.
wait. so giving 4 year olds in the playground assault rifles can't ever be made safe? say it isn't so...
-
@paco Satya Nadella made sure Microsoft focused on security over 2 years ago, after all!
https://www.geekwire.com/2024/haunted-by-repeated-breaches-microsoft-is-putting-security-above-all-else-vows-ceo-satya-nadella/ -
One way out of this is compartmentalization, hard-limiting chatbot's access to certain resources. But that defeats the purpose of the chatbot – you can't have a chatbot that manages your mail without giving that chatbot access to your mail...
Another is to move towards more formalized instructions, which can then be properly constrained by permissions etc. But then you're re-inventing programming languages and access control, again defeating the purpose of a natural-language-processing chatbot.
@rysiek I would assume that anything a chatbot has permission to do, will get done, given enough time. Instructions to an LLM are just text which can and will get ignored. Also the chatbot can say that they did something even though no action has taken place.
It's all just meaningless text to the LLM.
-
J jwcph@helvede.net shared this topic