FARVEL BIG TECH

Is there a way to build #consent/legal terms into the #ActivityPub protocol or is it already there?

    benjaoming@social.data.coop wrote (#1):

    Is there a way to build #consent/legal terms into the #ActivityPub protocol or is it already there?

    Specifically, I want to make it clear that data may not be used for personal profiling or for training AI, and that doing so would be a violation.

      madsenandersc@social.vivaldi.net wrote (#2, replying to #1):

      @benjaoming

      I'm not an expert on this in any way, but I think building legal consent into a data transfer protocol might be difficult, especially if said protocol is encrypted.

      You basically could not verify what is being transferred without decrypting the data stream, which in essence means performing a man-in-the-middle attack.

      I'm pretty sure that the legal framework has to be built on top of the application, not the protocol.

        benjaoming@social.data.coop wrote (#3, replying to #2):

        @madsenandersc all clients on the Fediverse should hopefully use the ActivityPub protocol, and if a client happens to be a bot or AI scraper, I would expect it to read the field that tells it that LLM training and/or personal profiling is not allowed. Hopefully we can afford the enforcement work needed to tell whether an LLM suddenly contains data that was labeled as not allowed for this.

        Even better if data protection laws like GDPR provided this protection by default, though.
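
To make the idea of "the field" concrete, here is a minimal sketch of how a well-behaved client could read it. ActivityPub objects are JSON (ActivityStreams 2.0), so an extra property is technically easy to carry. Everything specific here is an assumption for illustration: the property name `usagePolicy`, the values `noLLMTraining` and `noProfiling`, and the helper functions are hypothetical and not part of ActivityPub or ActivityStreams.

```python
import json

# Hypothetical extension property; NOT defined by ActivityPub or ActivityStreams.
POLICY_KEY = "usagePolicy"

# A Note as a server might publish it, with the hypothetical property added.
note_json = """
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Note",
  "content": "Hello Fediverse",
  "usagePolicy": ["noLLMTraining", "noProfiling"]
}
"""


def denied_uses(activity_object: dict) -> set[str]:
    """Return the uses the author has declared as not allowed."""
    value = activity_object.get(POLICY_KEY, [])
    if isinstance(value, str):  # tolerate a single string instead of a list
        value = [value]
    return set(value)


def may_train_on(activity_object: dict) -> bool:
    """True unless the object carries the hypothetical noLLMTraining label."""
    return "noLLMTraining" not in denied_uses(activity_object)


note = json.loads(note_json)
print(may_train_on(note))
```

Note that an object without the property is treated as unlabeled, which is exactly the default the thread goes on to argue about.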

          madsenandersc@social.vivaldi.net wrote (#4, replying to #3):

          @benjaoming

          Again, it's a protocol, not an application. Think about it: It is not the protocol that exposes the content to the AI scraper - it is the application.

          Spam is not prevented by legal restrictions in the SMTP protocol either. It's a feature of the application instead, and it is usually activated by the user (or can at least be deactivated).

          I understand what you want, and I get the reasons for it, but restricting what protocols can carry, and for whom, is usually a bad idea in the long run.

          There is always a use case that you did not think of, and suddenly people start forking the protocol or releasing modified versions of it, with backwards compatibility going down the drain.

            benjaoming@social.data.coop wrote (#5, replying to #4):

            @madsenandersc I can also write a scraper that doesn't care about robots.txt. People would then discover that the scraper acts in ways we don't like and try to block it.

            I think it's the same if you think through the implications of ActivityPub having a distinct field for how the data is allowed to be used, whether that leads to legal action or to blocking certain things.
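
The robots.txt comparison can be sketched with Python's standard `urllib.robotparser`. The crawler name `ExampleAIBot` and the host `example.social` are made up for illustration. The sketch shows that the check is something the client chooses to run: nothing in the file stops a scraper that never calls it.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks one (hypothetical) crawler to stay away entirely.
robots_txt = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def polite_fetch_allowed(user_agent: str, url: str) -> bool:
    """What a cooperating client asks before it fetches a URL."""
    return parser.can_fetch(user_agent, url)


url = "https://example.social/users/alice"
print(polite_fetch_allowed("ExampleAIBot/1.0", url))
print(polite_fetch_allowed("SomeOtherClient/2.0", url))
```

A scraper that skips `polite_fetch_allowed` simply issues the request anyway, which is the point where discovery and blocking take over.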

              benjaoming@social.data.coop wrote (#6, replying to #5):

              @madsenandersc It seems like we're just missing something, because we don't even have a basic way to say "no, my post isn't allowed for LLM training".

              There's also the good old HTTP header field "Do Not Track" (DNT).
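
For reference, "Do Not Track" is a plain request header, `DNT`, where the value `1` expresses a preference not to be tracked. A minimal sketch of how a server could read it follows; the function name is an assumption, and the header is purely advisory, so the server decides what, if anything, to do with the result.

```python
def prefers_no_tracking(headers: dict[str, str]) -> bool:
    """True if the request carries `DNT: 1`.

    Header names are case-insensitive, so they are normalized first.
    """
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") == "1"


print(prefers_no_tracking({"User-Agent": "ExampleBrowser", "DNT": "1"}))
```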

                benjaoming@social.data.coop wrote (#7, replying to #6):

                @madsenandersc If we just conclude that AI scrapers are bad actors and won't respect anything, there isn't much to do... but if we at least put up a boundary, then they can't claim we didn't say so. Otherwise, it's just a free pass for scraping the Fediverse and using the data for whatever purpose...

                  madsenandersc@social.vivaldi.net wrote (#8, replying to #5):

                  @benjaoming

                  I don't follow the reasoning behind your example here.

                  If the field in the ActivityPub protocol cannot be enforced (like an entry in robots.txt), why bother? It will just be like the "Do Not Track" setting in your browser, which gives those who don't know any better a false sense of security. Or do I misunderstand your example?

                  I still firmly believe that this problem can only be solved reliably at the application level, not the protocol.

                    madsenandersc@social.vivaldi.net wrote (#9, replying to #7):

                    @benjaoming

                    I'll play devil's advocate here. If the protocol forbids scraping, who is to blame in a legal fight if scraping occurs? The application provider? Because the scraper never touched anything related to the protocol and never saw whether the flag was set on the post.

                    Is the flag to be set at all times? Can you prove that the content was transported by the ActivityPub protocol, and that the scraper was definitely aware of that? Because if you can't, legal action is not possible anyway.

                    This is not about what is right and what is desirable, it's about what is possible to prove in a courtroom and who you can blame for any misdeed. I am pretty sure that a good lawyer will be able to shift the blame to the application displaying the content to the scraper, or at least make the case that the scraper had no way of knowing that this particular content was off-limits.

                    Yes, you can implement an extension of the protocol with the field you talk about, but it will be of no use at all unless you get all possible Fediverse clients to agree to that field and implement precautions against scraping if the user has set the field. If the scraper is successful anyway, the Fediverse client will be the one to blame for not implementing the precautions well enough.

                      benjaoming@social.data.coop wrote (#10, replying to #8):

                      @madsenandersc I think there's a genuine chance of telling when a post that carried a "don't train LLMs with this post" field shows up in an LLM.

                      The rest of the enforcement work is about figuring out if the scraper could have seen that field and chose to ignore it.

                      Even if posts can be exposed through applications that filter out the field, it could be possible to prove that an LLM (or ad profiling) has been using data from the source, hence had access to the field.

                        benjaoming@social.data.coop wrote (#11, replying to #10):

                        @madsenandersc The other part of my reasoning is to say: If we don't declare anything, then we've lost from the beginning. We SHOULD find a way to declare consent for data sharing with respect to the ActivityPub protocol.

                        Otherwise we have NOTHING. Like literally nothing? Do you know of any kind of existing restriction?

                          benjaoming@social.data.coop wrote (#12, replying to #9):

                          @madsenandersc I think that it's both possible and likely that we would have ActivityPub applications relaying data without the intended consent field. Because of course they exist for both good and bad reasons.

                          But if you have not said or stated anything, it seems terribly unclear what YOUR intention/consent is. So I would basically like a way to say that MY post isn't intended for LLM training.

                          That's just the beginning.

                            benjaoming@social.data.coop wrote (#13, replying to #12):

                            @madsenandersc What happens after can be *a lot*. I'm sure that *a lot* of people would agree to let their applications echo this consent.

                            Through logging, you can determine whether an AI scraper visited your endpoints that exposed this field and ignored it. That's possible.

                            But it's also possible to advocate for litigation or legislation by pointing to a demonstrated interest.
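
A rough sketch of the logging idea, assuming the server writes access logs in the common "combined" format. The crawler name `ExampleAIBot`, the path prefixes and the sample log lines are all invented for illustration. The result only shows that a given client fetched an endpoint that served labeled posts, not what it did with the data afterwards.

```python
import re

# Combined Log Format:
# ip ident user [time] "METHOD path proto" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)

SUSPECT_AGENTS = ("ExampleAIBot",)          # hypothetical crawler name
LABELED_PREFIXES = ("/users/", "/notes/")   # hypothetical endpoints serving labeled posts

sample_log = [
    '203.0.113.7 - - [01/Jan/2025:12:00:00 +0000] '
    '"GET /users/alice/outbox HTTP/1.1" 200 5120 "-" "ExampleAIBot/1.0"',
    '198.51.100.4 - - [01/Jan/2025:12:00:05 +0000] '
    '"GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]


def labeled_fetches(lines):
    """Return (time, ip, path, agent) for successful fetches of labeled
    endpoints by a suspect user agent."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None or m["status"] != "200":
            continue
        if not m["path"].startswith(LABELED_PREFIXES):
            continue
        if any(name.lower() in m["agent"].lower() for name in SUSPECT_AGENTS):
            hits.append((m["time"], m["ip"], m["path"], m["agent"]))
    return hits


for hit in labeled_fetches(sample_log):
    print(hit)
```

A scraper that sends a generic browser user agent would not show up in this kind of match, so the sketch only catches crawlers that identify themselves.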

                              madsenandersc@social.vivaldi.net wrote (#14, replying to #11):

                              @benjaoming

                              Ah - I think I see where we fundamentally disagree about all of this. We definitely agree on the desired outcome, but not on the means to get there. 🙂

                              Basically I think it boils down to the fact that you want a technical solution where different actors share the responsibility and possible culpability, whereas I want a legal solution that doesn't rely on a technical foundation but on a legal framework that outlaws scraping content unless it is explicitly stated that the content is available for AI training.

                              That brings me back to the application versus protocol again. You can have an application where the content is the payment, and where you as a user allow AI scraping as a way to pay for the service. It will have to be clearly stated that this is the agreement between you and the service provider, like it is today with e.g. Facebook and others.

                              Think about it: You can either start playing whack-a-mole with flags added to protocols, or you can simply outlaw scraping unless explicitly allowed - protocols be damned.

                              Either way you will need a legal framework to handle this, and a protocol flag without the legal backing is at best worthless ("Do Not Track"), at worst a false sense of security.

                              I think the fight for a general ban and the fight for a per-protocol ban will be just about the same.
