@malte @jzakotnik @gerrymcgovern Correct. The failure mode I observed here -- namely, "spews bullshit when confronted with a question it doesn't have good data for" -- is an absolutely terrible one when someone is trying to use an LLM to answer a factual question. It's *far* worse than simply saying "I don't know" or "I can't answer that". It's even worse when the LLMs deliver bullshit in a pleasantly confident tone, as they consistently do. And "answering factual questions" is what LLMs have been advertised as "good for" ever since ChatGPT (then running GPT-3.5) launched in late 2022.

dpnash@c.im
Doctors Horrified After Google's Healthcare AI Makes Up a Body Part That Does Not Exist in Humans

@gerrymcgovern One of the first prompts I gave ChatGPT back in 2022 came from my main hobby (amateur astronomy). I asked it to tell me something about the extrasolar planets orbiting the star [[fake star ID]].
[[fake star ID]] was a designation that anyone who knew how to use Wikipedia half-intelligently could verify was fake within a few minutes.
I wasn't even trying to be deceptive; I genuinely wanted to see how ChatGPT would handle a request for information that I knew couldn't be in its training data.
The torrents of bullshit it produced -- paragraph after paragraph of totally confabulated data about these nonexistent planets orbiting a nonexistent star -- told me everything I needed to know about ChatGPT and its buddies, and I've never been tempted to use them for anything serious since.