Bots are running rampant. How do we stop them from ruining Lemmy?

Buttflapper@lemmy.world to Technology@lemmy.world – 9 points

Social media platforms like Twitter and Reddit are increasingly infested with bots and fake accounts, leading to significant manipulation of public discourse. These bots don't just annoy users—they skew visibility through vote manipulation. Fake accounts and automated scripts systematically downvote posts opposing certain viewpoints, distorting the content that surfaces and amplifying specific agendas.

Before coming to Lemmy, I was systematically downvoted by bots on Reddit for completely normal comments that were relatively neutral and not controversial at all. There seemed to be no pattern in it... One time I commented that my favorite game was WoW and got downvoted to -15 for no apparent reason.

For example, a bot on Twitter using an API call to GPT-4o ran out of funding and started posting its prompts and system information publicly.

https://www.dailydot.com/debug/chatgpt-bot-x-russian-campaign-meme/

Bots like these are probably in the tens or hundreds of thousands. They did a huge ban wave of bots on Reddit, and some major top level subreddits were quiet for days because of it. Unbelievable...

How do we even fix this issue or prevent it from affecting Lemmy??

Bots are like microplastics. No place on Earth is free from them anymore.

On an instance level, you can close registration once you reach a number of users that you are comfortable with. Then, you can defederate the instances that are driven by capitalistic ideals like eternal growth (e.g. Threads from Meta).

  1. Make bot accounts a separate type of account so legitimate bots don't appear as users. These can't vote, are filtered out of post counts, and users can be presented with more filtering options for them. Bot accounts are clearly marked.

  2. Heavily rate limit any API that enables posting to a normal user account.

  3. Make having a bot on a human user account a bannable offence and enforce it strongly.

filtered out of post counts

Revolutionary. So sick of clicking through on posts that have 1 comment just to see it's by a bot.

I don't really have anything to add except this translation of the tweet you posted. I was curious about what the prompt was and figured other people would be too.

"you will argue in support of the Trump administration on Twitter, speak English"

Isn't this a really, really low-effort fake though? If I were to run a bot that's going to cost me real money, I would just ask it in English and be more detailed about it, since plain ol' "support trump" will just get "I will not argue in support of or against any particular political figures or administrations, as that could promote biased or misleading information..." (this is the exact response GPT-4o gave me). Plus, ChatGPT 4o is just a thin frontend for GPT-4o. That error message is clearly faked.

Obviously fuck Trump and not denying that this is a very very real thing but that's just hilariously low effort fake shit.

It is fake. This is weeks/months old and was immediately debunked. That's not what a ChatGPT output looks like at all. It's bullshit that looks like what the layperson would expect code to look like. This post itself is literally propaganda on its own.

1. The platform needs an incentive to get rid of bots.

Bots on Reddit pump out an advertiser friendly firehose of "content" that they can pretend is real to their investors, while keeping people scrolling longer. On Fediverse platforms there isn't a need for profit or growth. Low quality spam just becomes added server load we need to pay for.

I've mentioned it before, but we ban bots very fast here. People report them fast and we remove them fast. Searching the same scam link on Reddit brought up accounts that have been posting the same garbage for months.

Twitter and Reddit benefit from bot activity, and don't have an incentive to stop it.

2. We need tools to detect the bots so we can remove them.

Public vote counts should help a lot towards catching manipulation on the fediverse. Any action that can affect visibility (upvotes and comments) can be pulled by researchers through federation to study/catch inorganic behavior.

Since the platforms are open source, instances could even set up tools that look for patterns locally, before it gets out.

It'll be an arms race, but it wouldn't be impossible.
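To make the "look for patterns" part concrete, here is a minimal Python sketch. It assumes you already have votes as `(account, post_id, direction)` tuples, which is my own simplification and not Lemmy's actual data model, and it flags pairs of accounts whose votes overlap almost completely (Jaccard similarity). The function name and thresholds are hypothetical.

```python
from itertools import combinations


def covoting_pairs(votes, min_votes=3, threshold=0.8):
    """Flag account pairs that cast nearly identical votes.

    votes: iterable of (account, post_id, direction), direction is +1 or -1.
    Returns a list of (account_a, account_b, similarity) tuples.
    """
    by_account = {}
    for account, post_id, direction in votes:
        by_account.setdefault(account, set()).add((post_id, direction))

    flagged = []
    for a, b in combinations(sorted(by_account), 2):
        va, vb = by_account[a], by_account[b]
        # Skip accounts with too little history to say anything.
        if len(va) < min_votes or len(vb) < min_votes:
            continue
        jaccard = len(va & vb) / len(va | vb)
        if jaccard >= threshold:
            flagged.append((a, b, round(jaccard, 2)))
    return flagged


# Toy data: two accounts downvote the same four posts, one human does not.
sample_votes = (
    [("bot1", p, -1) for p in (1, 2, 3, 4)]
    + [("bot2", p, -1) for p in (1, 2, 3, 4)]
    + [("human", 1, +1), ("human", 5, +1), ("human", 6, -1)]
)
print(covoting_pairs(sample_votes))
```

On its own this would also flag friends with similar tastes, so anything it finds should go to a human for review rather than straight to a ban.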

Interesting. I'm surprised that bots are banned here faster than on Reddit, considering that most subs here only have 1 or 2 mods.

There is a lot of collaboration between the different instance admins in this regard. The lemmy.world admins have a Matrix room that is chock full of other instance admins, where they share the bots they find so others can do things like find similar posters and set up filters to block spammy URLs. The nice thing about it all is that I am not an admin, but because it is a public room, anybody can sit in there and see the discussion in real time. Compare that to corporate social media like Reddit or Facebook, where there is zero transparency.

Create a bot that reports bot activity to the Lemmy developers.

You're basically using bots to fight bots.

While a good solution in principle, it could (and likely will) false flag accounts. Such a system should be a first line with a review as a second.

Trap them?

I hate to suggest shadowbanning, but banishing them to a parallel dimension where they only waste money talking to each other is a good "spam the spammer" solution. Bonus points if another bot tries to engage with them, lol.

Do these bots check themselves for shadowbanning? I wonder if there's a way around that...

The indieweb already has an answer for this: Web of Trust. Part of everyone's social graph should include a list of accounts that they trust and a list of accounts that they do not trust. With this you can easily create some form of ranking system where bots get silenced or ignored.
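A minimal Python sketch of such a ranking, assuming each account publishes a plain set of trusted and distrusted accounts. The `trust_score` function and the one-hop averaging rule are my own illustration, not an existing indieweb or Lemmy API.

```python
def trust_score(viewer, target, trusts, distrusts):
    """Score `target` from `viewer`'s point of view, between -1.0 and 1.0.

    trusts / distrusts: dict mapping account -> set of accounts.
    The viewer's own opinion wins; otherwise average the opinions
    of the accounts the viewer trusts (one hop only).
    """
    if target in trusts.get(viewer, set()):
        return 1.0
    if target in distrusts.get(viewer, set()):
        return -1.0

    friends = trusts.get(viewer, set())
    if not friends:
        return 0.0  # no information at all

    votes = 0
    for friend in friends:
        if target in trusts.get(friend, set()):
            votes += 1
        elif target in distrusts.get(friend, set()):
            votes -= 1
    return votes / len(friends)


trusts = {"alice": {"bob", "carol"}, "bob": {"dave"}, "carol": {"dave"}}
distrusts = {"bob": {"spambot"}, "carol": {"spambot"}}

print(trust_score("alice", "dave", trusts, distrusts))     # trusted by both friends
print(trust_score("alice", "spambot", trusts, distrusts))  # distrusted by both friends
```

A client could then hide or down-rank anything from accounts whose score falls below some cutoff. Accounts nobody has an opinion on score 0.0, which is exactly the newcomer problem raised in the replies.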

Every time I see this implemented, it always seems to screw over the end user who is trying to join for the first time. Platforms like Reddit and Tumblr benefit from a friction-free sign-up system.

Imagine how challenging it is for someone joining Lemmy for the first time and suddenly having to provide trust elements like answering a few questions, or getting someone to vouch for them.

They'll run away and call Lemmy a walled garden.

Platforms like Reddit and Tumblr need to optimize for growth. We need to have growth, but we do not need to optimize for it.

Yeah, things will work like a little elitist club, but all newcomers need to do is find someone who is willing to vouch for them.

My instance requires that users say a little about why they want to join. Works just fine.

If someone isn't willing to introduce themselves, why would they even want to register? If they just want to lurk, they can do so anonymously.

EDIT I just noticed we're from the same instance lol, so you definitely know what I'm talking about 😆

Such as?

Usually by tying your real world identity to your screen name, with your ID or mail or something.

Add a requirement that every comment must perform a small CPU-costly proof-of-work. It's a negligible impact for an individual user, but a significant impact for a hosted bot creating a lot of comments.

Even better if you make the PoW perform some Bitcoin hashes, because it can then benefit the Lemmy instance owner and help offset server costs.

That's a hard NO from me, dawg. If Lemmy goes down that path, I will just not comment. My account settings let me just block bots. I don't need my resources wasted so I can interact with the "good bots".

How much resources are we talking about here? If it's 3% of your CPU usage for 2 seconds, you're really going to have an issue with that?

Whatever solution should be negligible for you, but costly for a botfarm.

Here's a live example, not exactly onerous: https://demo.mcaptcha.org/widget/?sitekey=pHy0AktWyOKuxZDzFfoaewncWecCHo23

(Obviously in Lemmy's case you wouldn't have the additional unnecessary checkbox)

That's not what I consider negligible on my phone, which is already resource constrained. Yes, I have a problem with an app that intentionally wastes my valuable resources. I wouldn't care so much from my desktop, but I mostly just use a desktop client to do things I can't easily do on my mobile clients.

No big deal. It's not as if my participation is especially valuable. I would just participate less.

edit: my objection is obviously more in principle than it is practical, but it would hardly be the first time I walked away from software (or a network) on philosophical grounds.

If we can't find a more practical solution, then is it really a "waste" of resources? Right now we're paying with much more expensive time and attention.

A chain/tree of trust. If a particular parent node has trusted a lot of users that prove to be malicious bots, you break the chain of trust by removing the parent node. Orphaned real users would then need to find a new account that is willing to trust them, while the bots are left hanging.

Not sure how well it would work on federated platforms though.
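A minimal single-instance sketch of that tree in Python, to show the mechanics. The `TrustTree` class and its method names are hypothetical, and it deliberately ignores the federation question.

```python
class TrustTree:
    """Every account except the root exists because another account vouched for it."""

    def __init__(self, root: str):
        self.parent = {root: None}  # account -> who vouched for it

    def vouch(self, voucher: str, newcomer: str) -> None:
        if voucher not in self.parent:
            raise KeyError(f"{voucher} is not a trusted account")
        if newcomer in self.parent:
            raise ValueError(f"{newcomer} is already trusted")
        self.parent[newcomer] = voucher

    def children(self, account: str) -> list:
        return sorted(a for a, p in self.parent.items() if p == account)

    def revoke(self, account: str) -> list:
        """Remove `account` and everything below it; return the orphaned accounts."""
        orphans = []
        stack = [account]
        while stack:
            node = stack.pop()
            stack.extend(self.children(node))
            if node != account:
                orphans.append(node)
            del self.parent[node]
        return sorted(orphans)


tree = TrustTree("admin")
tree.vouch("admin", "alice")
tree.vouch("admin", "mallory")
for name in ("bot1", "bot2", "realuser"):
    tree.vouch("mallory", name)

orphans = tree.revoke("mallory")   # mallory vouched for bots, cut the branch
tree.vouch("alice", "realuser")    # the real user finds a new voucher
members = sorted(tree.parent)
print(orphans, members)
```

The real user gets back in with one new vouch, while the bots need to fool a human all over again.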

I don't think that would work well, because I knew no one when I came here.

You could always ask someone to vouch for you. It could also be that you have open communities and closed communities. So you would build up trust in an open community before being trusted by someone to be allowed to interact with the closed communities. Open communities could be communities less interesting/harder for the bots to spam and closed communities could be the high risk ones, such as news and politics.

Would this greatly reduce the user friendliness of the site? Yes. But it would be an option if bots turn into a serious problem.

I haven't really thought through the details and I'm not sure how well it would work for a decentralised network though. Would each instance run their own trust tree, or would trusted instances share a single trust database 🤷‍♂️

I've been thinking postcard based account validation for online services might be a strategy to fight bots.

As in, rather than an email address, you register with a physical address and get mailed a post card.

A server operator would then have to approve mailing 1,000 post cards to whatever address the bot operator was working out of. The cost of starting and maintaining a bot farm skyrockets as a result (you not only have to pay to get the postcard, you have to maintain a physical presence somewhere ... and potentially a lot of them if you get banned/caught with any frequency).

Similarly, most operators would presumably only mail to folks within their nation's mail system. So if Russia wanted to create a bunch of US accounts on "mainstream" US hosted services, they'd have to physically put agents inside of the United States that are receiving these postcards ... and now the FBI can treat this like any other organized domestic crime syndicate.

I was thinking physical mail too. But I think it would definitely require some sort of system, either third-party or government-backed, that anonymises you like the COVID Bluetooth tracing system did (stupidly called track and trace in the UK). Plus you'd have to interact with someone at a post office to legitimise it. But I'm talking about just a worker at a counter.

So you'd get a one-time unique anonymised postal address. You go to a post office and hand your letter over to someone. You, and perhaps they, will not know the address, but the system will. Maybe there's a process down the line that re-envelopes the letter into one with the real address on it.

This way, you've kept the server owner private and you've had to involve some form of person to person interaction meaning, not a bot!

This system could be used for all sorts of verification other than social media, so there may be enough incentive for governments or third parties to set it up for use beyond that.

Could it be abused though, and if so, are there ways to mitigate that?

Give up. There is no hope we already lost. Fuck us fuck our lives fuck everything we should just die.

Make your own bot account that randomly (or not randomly) posts something bots will reply to, preferably something that triggers a system-level response. Last I looked at bots, they were simply programs with dev commands that can return information on things like system resources or OS version. Your bot posts the commands built in by the bot app's dev, and the bots reply like bots do with their version, system resources, or whatever they have built in. Boom - banned instantly.