Hello!
I’d like to start with this: To everyone who has ever flagged a spam post, thank you!
We launched the forum with the goal of providing the Ansible community with greater autonomy. And we’ve seen that the trajectory towards a self-supporting community has been “up and to the right”.
Moderation is vital to ensuring the long-term health and continued success of the forum. When you do flag a spam post, you make a contribution to the overall health of the community.
As I’m sure you’ve noticed, though, the number of spam posts has been on an upwards trajectory too. We don’t want to reach the point where it creates a burden each time you visit the forum. That might send us down a somewhat dangerous downwards curve.
So let’s talk about how we can address this issue.
We want to find a balance that keeps the barrier to entry low for genuine first-time posters. Needing to build up trust before being able to post would discourage users looking for help and quick answers. At the same time, we want to raise the barrier for the spam bots.
Up to now, we’ve been using the Discourse Akismet plugin to catch spam posts. It does a pretty good job of it. We’ve also adjusted the watched words list to automatically filter and block posts.
The spam posts keep on coming despite those mechanisms. So what else can we do?
Discourse is replacing Akismet with AI spam detection for some of their hosted customers. The AI Spam detection guide explains it all. Coupled with the discourse-automation plugin, this would make it possible to configure and use AI triage capabilities.
This seems, at first glance, to essentially offload the chore of flagging spam posts to an LLM. And, as anyone who uses Ansible should know, automating what is mundane to focus on what is meaningful is the happy place.
However, would this option deprive community members of a way to contribute?
Does the thought of AI Spam detection make your eyes roll? There are valid reasons not to use an LLM; the environmental impact of all that computing power is one. The fact that, in some cases, open-source work has been abused to train models could be another.
What if, instead, we restrict things to a single post until forum users reach a given level of trust, as @mariolenz has suggested?
@mariolenz also wondered if we could disallow multiple new accounts from the same IP address. We could try lowering the max_new_accounts_per_registration_ip setting in Discourse. Here is a screengrab of the current setting:
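For anyone curious what changing that setting would actually involve: on a standard Docker-based Discourse install, site settings can also be adjusted from the Rails console rather than the admin UI. This is just a sketch of that workflow, assuming the default `/var/discourse` install location; the value `1` shown here is only an illustration, not a decision.

```shell
# Sketch: lowering max_new_accounts_per_registration_ip from the server side.
# Assumes a standard Docker-based Discourse install in /var/discourse.
cd /var/discourse
./launcher enter app   # enter the running Discourse container
rails c                # open the Rails console

# Then, inside the console (illustrative value only):
#   SiteSetting.max_new_accounts_per_registration_ip = 1
```

In practice the same setting can be changed in the admin UI, which is probably the safer route; the console version is mostly useful for scripting or auditing.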
So… What do you think?
Cheers,
Don