So there's this guy who made a tool where someone punches in their bluesky or mastodon credentials to his website, and it auto-crawls their feeds and produces an LLM summary of everyone it finds posting there. He was asked what people should do if we don't want to be mulched as content for his summary feeds. He said we should block him. I replied, I can do that, but that only stops *you* from running the tool on me, how do I prevent *your other users* from running your tool on me? He blocked me.
…was this person not here for any of the various Fediverse meltdowns over scraping/searching/indexing? Or aware of any profile signals that people don't want to be indexed? Or even the least bit curious about robots.txt?
-
@mcc I think the issue is not so much the tool -- anyone can make it, so it was made -- but that any post on social media is inherently public among the people you choose to share with. Since you can't (and probably shouldn't) dictate how they process your posts, you must accept the loss of control. You can't stop a state actor from scraping everything you publish, for example. Your only recourse is to limit sharing.
-
@mcc yikes! the way all this is going makes me want to live in a cave!
-
@mcc This happens because the premise of the Internet is wrong.
The identifier is e.g. an email address, that can be created by the thousands by one person....
-
Anyway, the fact he's blocked me *partially* solves my problem, in that now he cannot LLM summarize me anymore, but the problem that possibly eventually a *second* person would use his tool remains unresolved.
Honestly, it's baffling that he added Mastodon support at all given that he's been here for years and thus saw some of the MANY YEARS of conflict and debate about the idea of people merely *archiving* or *indexing* Mastodon posts. And then he goes and uploads an auto-LLM-mulcher tool. IDK.
-
It is possible he interpreted the way I phrased my request as rude. I may have said something like "you are selling us as meat".
@mcc I had an exchange with someone a few back who used an LLM to "analyze" my posts and then tagged me in this "character profile" (I think to try and shame me into agreeing with him about something stupid?) and honestly it was really fucking weird. I wonder if it was the same guy, because he was an argumentative douche.
-
@mcc hey, so full disclosure, I do something similar, completely locally, without sending anything out to any provider, all inference is happening on my machine, and the results are saved to an HTML on my machine only. It loads the top 120 public posts on Mastodon, and filters them to 3 categories as a morning and evening recap of notable events. Just want to let you know that even without any credentials it's possible to see top public posts on Mastodon, which is how I first saw this post too.
@mcc there's a "quiet public" visibility option which should hide the post from unauthorised clients, maybe that should be the default, with a warning on the public visibility for posting
Edit: Read through the replies, and found out about the #nobot tag, I just added that to my thing to filter out too. I wasn't aware of that.
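For anyone else running a similar digest tool, the #nobot filter mentioned above might look roughly like this. It is a minimal sketch, assuming posts arrive as dicts shaped like Mastodon API Status objects with a nested `account`. The `note` (HTML bio) and `noindex` fields exist on Mastodon accounts, but the function names and the extra tags besides #nobot are hypothetical, picked for illustration.

```python
import re
from html import unescape

# "#nobot" comes from the thread; the other tags are assumed variants.
OPT_OUT_TAGS = {"nobot", "nobots", "noindex", "noai"}


def bio_hashtags(note_html: str) -> set[str]:
    """Return the lowercased hashtags found in an HTML bio."""
    # Turn paragraph and line breaks into spaces so words do not run together.
    text = re.sub(r"</p>|<br\s*/?>", " ", note_html or "")
    # Drop remaining tags without adding spaces: Mastodon renders a hashtag
    # as "#<span>tag</span>", which must collapse back to "#tag".
    text = unescape(re.sub(r"<[^>]+>", "", text))
    return {tag.lower() for tag in re.findall(r"#(\w+)", text)}


def has_opted_out(account: dict) -> bool:
    """True if the account signals it does not want automated processing."""
    if account.get("noindex") is True:
        return True
    return bool(bio_hashtags(account.get("note", "")) & OPT_OUT_TAGS)


def filter_posts(statuses: list[dict]) -> list[dict]:
    """Keep only posts whose author has not opted out."""
    return [s for s in statuses if not has_opted_out(s.get("account", {}))]
```

Run this before any post text reaches the model, so opted-out accounts are never summarized at all. It only respects signals people have chosen to set, so it is a floor, not a substitute for asking.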
-
@mcc consent? What's that?
-
@mcc I blocked the whole domain. That said, everything we write here is public. Anyone, any bot, spider, crawler, benign or malicious organization can read, index and do whatever they want with what we publish (the word itself contains the concept of making public). The only defense is to select our audience, knowing that there could always be someone going rogue even among our trusted peers.