Free software people: A major goal of free software is for individuals to be able to cause software to behave in the way they want it to
LLMs: (enable that)
Free software people: Oh no not like that
-
When I write code I am turning a creative idea into a mechanical embodiment of that idea. I am not creating beauty. Every line of code I write is a copy of another line of code I've read somewhere before, lightly modified to meet my needs. My code is not intended to evoke emotion. It does not change how people think about the world. The idea→code pipeline in my head is not obviously distinguishable from the prompt→code process in an LLM.
@mjg59 this may be true for code I don't care about or need to deliver quickly, everything else definitely contains as much beauty as I am capable of
-
@mnl @david_chisnall @mjg59 @ignaloidas
even reading the first page.
Generally, this assessment of the overall book extends to each page, even if it contains pages with errors.
For llms, there is a probability that each query is resulting in garbage. In the book-analogy, it is as if each page is written by a different author, some experts, some crooks
Except no page is attributed, and guessing who wrote what page is up to the reader.
There is no model to be built around that failure mode.
2/2
@newhinton @david_chisnall @mjg59 @ignaloidas I’m not really following. Using an llm doesn’t erase my brain the minute I use it, nor is it a random number generator where you are forbidden to check the answers? These all hold for llms.
-
Personally I'm not going to literally copy code from a codebase under an incompatible license because that is what the law says, but have I read proprietary code and learned the underlying creative aspect and then written new code that embodies it? Yes! Anyone claiming otherwise is lying!
@mjg59 "i don't like programming and anyone who does is a liar" is a hill to die on, i guess
-
@david_chisnall @mjg59 I suspect CHERI would make running LLM-generated code more feasible, and probably less risky. I'm not saying this to be an annoying contrarian, but rather that stronger underlying models seem to make playing with garbage LLM code more viable. Terry Tao has been using them to generate quick and dirty proofs, cha bu duo.
It certainly can. As long as you are careful about the interfaces to the compartment, you can reason about the worst that can happen with the LLM-generated code. I see this as a special case of supply-chain attacks, which the CHERIoT compartmentalisation model was designed to protect against: assume this code works for your test vectors and might be actively malicious in other cases, what's the worst that can happen? LLMs just let you bring the supply-chain attacks in house.
-
Free software people: A major goal of free software is for individuals to be able to cause software to behave in the way they want it to
LLMs: (enable that)
Free software people: Oh no not like that
@mjg59 my 2 favourite single user LLM use cases are
for people who are physically immobile, to help them interact with others. Seeing how these tools can make them more able to engage with the world is heartening.
The other is my non-tech musician friend who made a simple web page that ensures he plays all his tunes regularly but in random rotation. It hooks into Google Sheets and he slopped it all up by himself.
-
@david_chisnall @mjg59 @ignaloidas just like humans! Or books!
@mnl @david_chisnall @mjg59 @ignaloidas you don't pick humans nor books, randomly.
-
@zacchiro I understood the ask I replied to was regarding ethical training. Mistral, as an EU company, has to abide by EU regulations that AI companies in the US, China etc don't have to.
@troed I see. I don't know either what @chris_evelyn had in mind, so I'll leave it to them. But for what it's worth, the EU AI Act applies equally to all companies having access to the EU market. Mistral is not special in that respect, unless the other players decide to leave the EU market (which is unlikely). @mjg59
-
@mnl @david_chisnall @mjg59 @ignaloidas you don't pick humans nor books, randomly.
@ced @david_chisnall @mjg59 @ignaloidas neither does an llm? We are perfectly able to deal with, say, search engine results, which are arguably more problematic than llms. For all intents and purposes, the books and resources I have at my disposal are also the product of random processes. I can still work with them to learn things.
-
When I write code I am turning a creative idea into a mechanical embodiment of that idea. I am not creating beauty. Every line of code I write is a copy of another line of code I've read somewhere before, lightly modified to meet my needs. My code is not intended to evoke emotion. It does not change how people think about the world. The idea→code pipeline in my head is not obviously distinguishable from the prompt→code process in an LLM.
This is such a bullshit, deprecating framing of what developers do. The fact that you also deprecate yourself doesn't make it any better.
Sure, the individual "line of code" may not be very unique. But the arrangement of many lines is. Your comparison is about equivalent to saying "hah, how can an author produce anything novel if he's just using the same old words from the English alphabet!"
-
Free software people: A major goal of free software is for individuals to be able to cause software to behave in the way they want it to
LLMs: (enable that)
Free software people: Oh no not like that
@mjg59@nondeterministic.computer If you want to use LLMs to make software do what you want, feel free to do it in a private fork. Private forks for yourself are fine. Private is private.
But it's also the freedom of the developer/maintainer of the software to not allow such changes upstream or to force such changes to be marked.
-
Free software people: A major goal of free software is for individuals to be able to cause software to behave in the way they want it to
LLMs: (enable that)
Free software people: Oh no not like that
@mjg59 I have some issues with using LLMs, but the only one specific to the free software world is about license tainting: I’m not sure if the code generated by an LLM is public domain.
-
@ced @david_chisnall @mjg59 @ignaloidas neither does an llm? We are perfectly able to deal with, say, search engine results, which are arguably more problematic than llms. For all intents and purposes, the books and resources I have at my disposal are also the product of random processes. I can still work with them to learn things.
@mnl @david_chisnall @mjg59 @ignaloidas well great for you. *I*'m not able to deal with random search results (especially now that they are often slop). And if your books were bought randomly, sure. Mine were selected because I trust the author, or because I know enough about the author bias to be able to correct it.
-
@mnl @david_chisnall @mjg59 @ignaloidas well great for you. *I*'m not able to deal with random search results (especially now that they are often slop). And if your books were bought randomly, sure. Mine were selected because I trust the author, or because I know enough about the author bias to be able to correct it.
@ced @david_chisnall @mjg59 @ignaloidas do you not use a search engine (genuinely curious, I love building search engines and making them work well)?
Do you think it’s impossible to assign varying degrees of trust to llm output?
-
Look, coders, we are not writers. There's no way to turn "increment this variable" into life changing prose. The creativity exists outside the code. It always has done and it always will do. Let it go.
@mjg59 I think this understanding of art stems from a misunderstanding of what art in itself is.
Of course writing code can be an artistic activity, and trying to argue against it just shows a deep misunderstanding of those who see it that way.
But art's goal isn't even to be life changing prose; most art's goal isn't that at all. Most "classical" art was even seen as "just a craft".
"beauty" can manifest in many ways, and self-expression through code is a thing.
-
@newhinton @david_chisnall @mjg59 @ignaloidas I’m not really following. Using an llm doesn’t erase my brain the minute I use it, nor is it a random number generator where you are forbidden to check the answers? These all hold for llms.
@mnl@hachyderm.io @newhinton@troet.cafe @david_chisnall@infosec.exchange @mjg59@nondeterministic.computer the difference is that you can gain trust that some author knows his stuff in a specific field and you no longer need to cross-check every single thing that they write.
With an LLM no such trust can be developed, because fundamentally it's just rolling dice out of a modeled distribution. The fact that the LLM was right about something 9 previous times has no influence on whether the next statement will be correct or wrong.
It's these trust relationships that allow us to work efficiently - cross checking everything every time is incredibly time consuming.
-
@mnl@hachyderm.io @newhinton@troet.cafe @david_chisnall@infosec.exchange @mjg59@nondeterministic.computer the difference is that you can gain trust that some author knows his stuff in a specific field and you no longer need to cross-check every single thing that they write.
With an LLM no such trust can be developed, because fundamentally it's just rolling dice out of a modeled distribution. The fact that the LLM was right about something 9 previous times has no influence on whether the next statement will be correct or wrong.
It's these trust relationships that allow us to work efficiently - cross checking everything every time is incredibly time consuming.
@ignaloidas @mjg59 @david_chisnall @newhinton that’s not how llms work though, it being right 9 times out of 10 very much has an influence on whether the 10th time will be correct. That’s literally how models are trained. There’s an entire research field out there that studies it.
-
Free software people: A major goal of free software is for individuals to be able to cause software to behave in the way they want it to
LLMs: (enable that)
Free software people: Oh no not like that
@mjg59 You will get backlash, but you are right.
Free software folks will have to decide whether what they really wanted was *everyone* to have the freedom to use and modify software, or only that subset of everyone who had the privilege of learning software development.
There has always been this elitist dividing line in the community between people who contribute code, and people who contribute all the other things FOSS needs to thrive. Now those people can contribute code too.
-
@ced @david_chisnall @mjg59 @ignaloidas do you not use a search engine (genuinely curious, I love building search engines and making them work well)?
Do you think it’s impossible to assign varying degrees of trust to llm output?
@mnl @david_chisnall @mjg59 @ignaloidas I do use search engines, but if I don't recognize the sites listed or can't easily get context about them, it's now nearly impossible to trust the results. It used to be possible (creating content was costly, so well written content was usually the mark of someone at least a bit invested in the subject, though in those cases I used to cross check several hits). It's not anymore.
LLMs: without knowing the source of the answer, how could it be
It's just plausible.
-
Free software people: A major goal of free software is for individuals to be able to cause software to behave in the way they want it to
LLMs: (enable that)
Free software people: Oh no not like that
@mjg59 I'm all about running very outdated software on slightly less outdated hardware. That's the good stuff bro.
-
@ignaloidas @mjg59 @david_chisnall @newhinton that’s not how llms work though, it being right 9 times out of 10 very much has an influence on whether the 10th time will be correct. That’s literally how models are trained. There’s an entire research field out there that studies it.
@mnl@hachyderm.io @mjg59@nondeterministic.computer @david_chisnall@infosec.exchange @newhinton@troet.cafe the training objective is not "be correct", so that's not what the models are trained on. They aren't trained on such an objective because there's no way to score it - if you had a system that could determine whether a statement was correct, then you could just use that. No, what the models are trained on are globs of existing text, targeting the continuations to be the same as the text. Notably, most (all?) LLM makers don't even care whether most of the text is "correct" (in any sense of the word), and "solve" it by training on some more carefully selected globs of text. And in the end, what the model itself outputs are probabilities of a specific token (not even a sentence or something) to be next. The text you get is all just dice rolls on those probabilities, again and again.
It is a text prediction machine. A very powerful one, but it's just a prediction. It just picks whatever is likely, with no regard for what is correct.
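For readers who want to see the "dice rolls on those probabilities" step concretely, here is a rough Python sketch. It is not any real model's code: the prompt, the three candidate tokens and their scores are made up for illustration, and `softmax` and `sample_token` are hypothetical helper names. It only shows the last step described above, turning per-token scores into probabilities and drawing the next token at random in proportion to them.

```python
import math
import random


def softmax(logits):
    """Turn raw per-token scores into probabilities that sum to 1."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - m) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def sample_token(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Made-up scores for the token following "The capital of France is".
# A real model would produce scores over its whole vocabulary.
logits = {" Paris": 5.0, " Lyon": 2.0, " London": 1.0}
probs = softmax(logits)

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_token(probs, rng) for _ in range(1000)]
counts = {tok: samples.count(tok) for tok in probs}

for tok, p in probs.items():
    print(f"{tok!r}: probability {p:.3f}, drawn {counts[tok]} times")
```

In this toy setup the likeliest token is drawn most of the time, but the less likely ones still come up occasionally. The sketch says nothing about how the scores themselves are learned, which is the part the posts above disagree on.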