Let’s talk about spam

Hello!

I’d like to start with this: To everyone who has ever flagged a spam post, thank you!

We launched the forum with the goal of providing the Ansible community with greater autonomy. And we’ve seen that the trajectory towards a self-supporting community has been “up and to the right”.

Moderation is vital to ensuring the long-term health and continued success of the forum. When you do flag a spam post, you make a contribution to the overall health of the community.

As I’m sure you’ve noticed, though, the number of spam posts has been on an upwards trajectory too. We don’t want to reach the point where it creates a burden each time you visit the forum. That might lead us to a somewhat dangerous downwards curve.

So let’s talk about how we can address this issue.

We want to find a balance that keeps the barrier to entry low for genuine first-time posters. Needing to build up trust before you can post would discourage users looking for help and quick answers. At the same time, we want to raise the barrier for spam bots.

Up to now, we’ve been using the Discourse Akismet plugin to catch spam posts. It does a pretty good job of it. We’ve also adjusted the watched words list to automatically filter and block posts.

The spam posts keep on coming despite those mechanisms. So what else can we do?

Discourse is replacing Akismet with AI spam detection for some of their hosted customers. The AI Spam detection guide explains it all. Coupled with the discourse-automation plugin, it would be possible to configure and use AI triage capabilities.

This seems, at first glance, to essentially offload the chore of flagging spam posts to an LLM. And, as anyone who uses Ansible should know, automating what is mundane to focus on what is meaningful is the happy place.

However, would this option deprive community members of a way to contribute?

Does the thought of AI spam detection make your eyes roll? There are valid reasons not to use an LLM; the environmental impact of all that computing power is one. The fact that, in some cases, open-source code has been abused to train models could be another.

What if, instead, we restrict things to a single post until forum users reach a given level of trust, as @mariolenz has suggested?

@mariolenz also wondered if we could disallow multiple new accounts from the same IP address. We could try lowering the max_new_accounts_per_registration_ip setting in Discourse. Here is a screengrab of the current setting:
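For illustration, that kind of change can also be scripted against the Discourse admin API rather than clicked through in the admin UI. This is only a sketch: the forum hostname, the API key environment variable, and the value `1` are placeholder assumptions, not the actual values in use here.

```shell
# Lower max_new_accounts_per_registration_ip via the Discourse admin API.
# Hostname, credentials, and the new value (1) are illustrative placeholders;
# the same setting can be changed in the admin UI under Settings.
curl -s -X PUT \
  "https://forum.example.com/admin/site_settings/max_new_accounts_per_registration_ip" \
  -H "Api-Key: $DISCOURSE_API_KEY" \
  -H "Api-Username: system" \
  -F "max_new_accounts_per_registration_ip=1"
```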

So… What do you think?

Cheers,
Don

4 Likes

Yes! :roll_eyes:

I think both of these suggestions are fine and I’d support them; however, I expect they won’t be enough. I’m an admin on multiple Discourse forums, and on the most popular one we ended up setting it so that all posts from new users (trust level 0) have to be approved by an admin/moderator. I think this could work here if, for example, some or all of the people in the trust level 3 group were made moderators. Then there would be enough moderators around that people wouldn’t have to wait days for their first post to be approved.

4 Likes

Seconded, for both of the reasons you already outlined (enviro impact + suspect training models)

You started to address my biggest concern here. Is there a viable number of folks at a sufficient trust level that we can keep the time between post and approval reasonably low?

That leads to questions like ‘what is reasonably low?’ and ‘what is a viable number of mods to meet that?’

I don’t know that I have answers to either of those. But I agree with the direction here.

2 Likes

I think there are. However, I don’t have permission to see who is TL3, but I would guess it is around 30-ish people?

Yes, I also share concerns around the use of LLMs.

4 Likes

I don’t like adding short “me too” posts (and the forum doesn’t allow it, by requiring a minimum number of characters), but here it is… My eyes are also rolling :wink:

Manual approval of first post (or first couple of posts) is probably what will work best, assuming there are enough folks who can approve them. I can never remember what the trust levels are exactly, so no idea whether TL3 is fine/sufficient/… :slight_smile:

4 Likes

Trust Level 1: 1,487 people
Trust Level 2: 151 people
Trust Level 3: 32 people

Well done @chris :slight_smile:

1 Like

Just an information point: when core handled the mailing lists, we had it restricted so that a first post would always require core approval. Even with 3-5 members, this was normally taken care of within the same day. At most, during the winter holidays, people had to wait a couple of days… during which they weren’t getting their question answered anyway.

Note that the community is larger now, but the number of people who have enough trust to moderate is also much larger.

2 Likes

I tend to agree with the general direction this is taking: first-time users need approval. That being said, the effectiveness here is going to depend not only on how many moderators are available, but on how many messages need to be moderated, a variable over which we’ll have no control. So, whilst adopting this procedure would definitely help in the short term, I think we should start thinking of strategies for when the volume of new messages becomes too big. Because given enough time, it will.

2 Likes

Note that I couldn’t even flag the spam I saw come in yesterday (it had been there for 3 hours when I tried). Probably because I’m not even TL1? Perhaps (Yubikey) 2FA enablement could be used as a TL1 indicator as well?

In the past, spammers were often better at using such security features than most other users; see, for example, the SPF/DKIM/DMARC requirements of large email providers, or the use of TLS certificates for phishing sites.

2 Likes

Blah. Well, I just got TL1 from that post :slight_smile: .

4 Likes

I would expect it would probably be fine to automatically grant moderator status to TL3 people (assuming that can be done; I haven’t checked). However, I don’t think it should be done automatically, as at some point someone might write an AI bot to game the Discourse trust levels. I’d suggest it would be best to have a manual process for agreeing who is made a moderator, and also a process for warning and then removing moderator status from anyone found to be abusing it.

2 Likes

My general feeling would be:

We have tried some things; they’re probably helping, but they’re not catching a particular type of spam which has seen a recent uptick. There’s been a reasonable suggestion that first posts get moderated, and that TL3s (currently around 30 of them) are automatically granted moderator status.

To me, this feels like a reasonable next attempt: try it, wait a month and see if it’s harmed participation of new users and re-evaluate with some real data rather than gut feelings.

One caveat: while granted moderator status, community TL3s should be explicitly told that they’re not under any kind of obligation to do this work. If they’re willing to do it, great, many thanks; if they don’t have the time, that’s OK.

4 Likes

@SteeringCommittee I’ve given you all “Moderator” privileges.

As @tremble points out above, there is zero expectation that you spend your time moderating spam.

5 Likes

As others have noticed, this means you’ll need to turn on 2FA.

1 Like

Oh, wow, I was just looking through the past review queue. I did not realize how bad the problem was. I’m thankful to everyone on the Community team spending time dealing with this (and other users flagging posts).

When we do have time, how can we help? I see reports pop in the review tab, but what is the procedure for handling spam? In case of these very obvious spam posts, is it just to delete the post and block/delete the user?

Also, has first-post-requires-moderation been enabled yet or is that still under discussion?

1 Like

That doesn’t seem to be the case yet. (I’m looking forward to this getting activated soon.)

1 Like

That’s exactly what I do.

2 Likes

So, for first-post-requires-moderation: no, AFAIK it hasn’t been enabled yet.

So the benefit is reduced spam.
The drawback is that first-time users’ posts need approval.

So I tried to find some stats here.
New contributors this past month: 70
Moderator requests (from flagged posts): 93

So, that does suggest it would be worthwhile to turn on approval for first-time users.
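As a rough sanity check on those figures (a back-of-envelope sketch, combining them with the TL3 head count of 32 reported earlier in the thread), the per-moderator load looks small:

```python
# Back-of-envelope moderation load from the numbers in this thread:
# 70 new contributors and 93 flagged-post moderator requests last month,
# spread across roughly 32 TL3 members who could share the approval work.
new_contributors = 70   # first posts that would newly require approval
flagged_requests = 93   # existing flag-driven review items
tl3_moderators = 32     # approximate TL3 head count from earlier in the thread

monthly_items = new_contributors + flagged_requests
per_moderator = monthly_items / tl3_moderators

print(monthly_items)            # 163 review items per month in total
print(round(per_moderator, 1))  # 5.1 items per moderator per month
```

Even if only a fraction of TL3s opt in, a handful of approvals per person per month seems well within the "no obligation" framing above.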

2 Likes