FARVEL BIG TECH

I am reading Anthropic's new "Constitution" for Claude.

29 Posts · 16 Posters · 87 Views

This thread has been deleted. Only users with topic management privileges can see it.
• mttaggart@infosec.exchange (#1)

I am reading Anthropic's new "Constitution" for Claude. It is lengthy, thoughtful, thorough...and delusional.

Throughout this document, Claude is addressed as an entity with decision-making ability, empathy, and true agency. This is Anthropic's framing, but it is a dangerous way to think about generative AI. Even if we accept that such a constitution would govern an eventual (putative, speculative, improbable) sentient AI, that's not what Claude is, and as such the document has little bearing on reality.

https://www.anthropic.com/constitution

• x41h@infosec.exchange (#2)

@mttaggart Lol

• jt_rebelo@ciberlandia.pt (#3)

@mttaggart edited or "authored" by Claude?

• mttaggart@infosec.exchange (#4)

@jt_rebelo In "Acknowledgments":

"Several Claude models provided feedback on drafts. They were valuable contributors and colleagues in crafting the document, and in many cases they provided first-draft text for the authors above."

• jt_rebelo@ciberlandia.pt (#5)

@mttaggart what was said about drug dealing? Seems like they're using their own stash.

• theorangetheme@en.osm.town (#6)

@mttaggart This was made by people who were both bullied too much and not enough.

• mttaggart@infosec.exchange (#7)

As I continue (it's a loooong document), I feel like I'm losing my mind. Like, what is the manifest result of such a policy? Ultimately, it's 4 things:

1. Curation of training data
2. Model fitting/optimization decisions
3. System prompt content
4. External safeguards

As long as Claude is a large language model...that's it. And as aspirational as this document may be about shaping some seraphic being of wisdom and grace, ultimately you're shaping model output. Discussing the model as an entity is either delusion on Anthropic's part, or intentional deception. I really don't know which is worse.

• aakl@infosec.exchange (#8)

@mttaggart All of these companies have been desperate to make us believe that their creations are sentient and able to react of their own accord, from refusing to shut down, to throwing a tantrum, to apologizing. It's snake oil, and the ruse is getting old.

• mttaggart@infosec.exchange (#9)

@AAKL Yeah, I'm really not sure that Anthropic doesn't actually believe they're building a digital god.

• aakl@infosec.exchange (#10)

@mttaggart Oh, I'm sure Anthropic believes its own delusions by now, especially if it actually believes it's raising a child.

• odr_k4tana@infosec.exchange (#11)

@mttaggart @AAKL they kind of have to. Otherwise they might discover that their direction is a dead end.

• mttaggart@infosec.exchange (#12)

I'm screenshotting the "hard constraints" (with alt text) for easy access.

What is "serious uplift"? The document doesn't define it, so how can the model adhere to this constraint? Also, why only mass casualties? We cool with, like, room-sized mustard gas grenades? Molotovs?

We know Claude has already created malicious code. Anthropic themselves have documented this usage, and I don't think it's stopping anytime soon.

Why is the kill restraint tied to "all or the vast majority"? We cool with Claude assisting with small-scale murder?

Who decides what "illegitimate" control is? The model? Can it be coerced otherwise?

Finally, CSAM. Note that generating pornographic images generally is not a hard constraint. Consequently, this line is as blurry, this slope as slippery, as they come.

This is not a serious document.

• mttaggart@infosec.exchange (#13)

And we now arrive at the "Claude's nature" section, in which Anthropic makes clear that they consider Claude a "novel entity." That it may have emotions, desires, intentions. There is a section on its "wellbeing and psychological stability."

This is pathological. This is delusional. This is dangerous.

• mttaggart@infosec.exchange (#14)

It is worth noting that two of the primary authors—Joe Carlsmith and Christopher Olah—have CVs that do not extend much beyond their employment with Anthropic.

For all the talk of ethics, near as I can tell Dr. Carlsmith is the only ethicist involved in the creation of this document. Is there any conflict of interest in the in-house ethicist driving the ethical framework for the product? I'm not certain, but I am certain that more voices (especially some more experienced ones) would have benefited this document.

But ultimately, having read this, I'm left much more afraid of Anthropic than I was before. Despite their reputation for producing one of the "safest" models, it is clear that their ethical thinking is extremely limited. What's more, they've convinced themselves they are building a new kind of life, and have taken it upon themselves to shape its (and our) future.

To be clear: Claude is nothing more than an LLM. Everything else exists in the fabric of meaning that humans weave above the realm of fact. But in this case, that is sufficient to cause factual harm to our world. The belief in this thing being what they purport is dangerous itself.

I again dearly wish we could put this technology back in the box, forget we ever experimented with this antithesis to human thought. Since we can't, I won't stop trying to thwart it.

• theorangetheme@en.osm.town (#15)

@mttaggart This is what happens when we let venture capitalists invent folk religions. The music isn't even any good...

• hotsoup@infosec.exchange (#16)

@mttaggart "We made it long to deter people from reading it" —them, probably

• mttaggart@infosec.exchange (#17)

@hotsoup I honestly believe they were high-fiving, thinking they'd crafted a seminal document in the history of our species.

• jt_rebelo@ciberlandia.pt (#18)

@mttaggart @hotsoup your analysis (and "the new entity" part) made me think about a GIF of the Madagascar penguins high-fiving themselves, on a loop. And my eyes rolled into the back of my head.

• hotsoup@infosec.exchange (#19)

@mttaggart my brain just keeps going back to Roche's biochemical pathway map (it's that big map of biochemical pathways). Essentially a map of all the chemical interactions in the human body and how they relate to each other (the ones that we know about). It's big. And it's complicated. Each component is relatively simple, but altogether it's a giant mess. Just like the human body.

And I know only a portion of it is related to cognition and emotion. We haven't created that. We haven't even come close. We haven't simulated it. We haven't made a simulacrum of it. And we shouldn't be trying. We can't even get humanity right.

• sb@metroholografix.ca (#20)

@mttaggart Thank you. You're not alone.

• gdupont@framapiaf.org (#21)

@mttaggart As with many things in this AI hype, this document looks like a PR stunt to catch attention.

                                          Powered by NodeBB Contributors
                                          Graciously hosted by data.coop