Moderating online forums is labor-intensive. Burnout is common. It’s almost a given that once a subreddit reaches a certain size, the quality will drop even with active moderation - the moderators simply cannot handle the scale.

Here are some ways LLMs can help out.

Enforcing Forum Rules

Whenever a user writes a comment, it is sent to an LLM along with some context and the forum rules. The LLM is tasked with scanning the comment for rule violations. Should any be found, it reports them back to the user and logs them to a database. The user is told what the LLM thinks the infractions are, and has the choice to amend his comment. Should he choose not to, the comment is still posted to the site.

This is soft enforcement. The LLM will not prevent the comment from going through.
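
As a rough sketch, the check itself could be a single LLM call per comment. This is only illustrative: the model name, prompt wording, and JSON response shape are assumptions, and any provider with a chat-completion API would do.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def check_comment(comment: str, context: str, rules: str) -> list[str]:
    """Return the LLM's list of suspected rule violations (empty if none)."""
    prompt = (
        "You are a forum moderation assistant. Given the forum rules, the thread "
        "context, and a new comment, list every rule the comment violates.\n\n"
        f"RULES:\n{rules}\n\nTHREAD CONTEXT:\n{context}\n\nNEW COMMENT:\n{comment}\n\n"
        'Reply with JSON: {"violations": ["short description of each violation"]}'
    )
    resp = client.chat.completions.create(
        model="gpt-4-turbo",  # assumption; any capable model works
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content).get("violations", [])
```

If the returned list is non-empty, the forum warns the user and writes the violations to the database; either way the comment is published.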

A human moderator has access to the violations database. If the comment indeed did violate the rules (per the human’s judgment), the moderator can check whether the commenter was warned about it (and how). It’s one thing if you’re not aware of the rules (most commenters aren’t) or misinterpreted them. It’s another when you’ve been notified and proceeded regardless.

Bonus: When the LLM warns the user, we can make the user enter an “appeal” explaining why his comment is not in violation before letting the comment through.

Additionally, when there are too many comments for the human moderator to feasibly examine individually, he can request all flagged comments from the DB and examine those first.
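
As a sketch, the violations log can be a single table. The schema and column names below are purely illustrative:

```python
import sqlite3

conn = sqlite3.connect("moderation.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS violations (
        id            INTEGER PRIMARY KEY,
        comment_id    TEXT NOT NULL,
        user_id       TEXT NOT NULL,
        rule          TEXT NOT NULL,     -- which rule the LLM thinks was broken
        warned_at     TEXT NOT NULL,     -- when the user was shown the warning
        appeal        TEXT,              -- optional appeal text from the user
        posted_anyway INTEGER NOT NULL   -- 1 if the user posted without amending
    )
""")

# The moderator's "show me the flagged comments first" query.
flagged = conn.execute(
    "SELECT comment_id, user_id, rule, appeal FROM violations "
    "WHERE posted_anyway = 1 ORDER BY warned_at DESC"
).fetchall()
```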

BTW, what context should we send to the LLM? My first attempt would be to send the ancestor comments all the way up to the root node. If you hit a context limit, you can stop at a certain ancestor.
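
A sketch of that “walk up toward the root, stop when the budget runs out” idea; the comment object’s fields and the character budget are assumptions:

```python
def build_context(comment, max_chars: int = 12_000) -> str:
    """Collect ancestor comments from the new comment up toward the root,
    stopping once a rough size budget is exceeded. `comment.parent` and
    `comment.text` are hypothetical fields of the forum's comment object."""
    ancestors = []
    node = comment.parent
    used = 0
    while node is not None and used < max_chars:
        ancestors.append(node.text)
        used += len(node.text)
        node = node.parent
    # Root-most ancestor first, immediate parent last.
    return "\n\n".join(reversed(ancestors))
```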

Real World Implementation

The above assumes the forum is responsible for integrating the LLM. But there’s a potential business idea here. A third-party company could manage all of this on behalf of several forums. The forum owner uploads the rules to the intermediary and allows a comment to go through only when the intermediary gives the OK. The intermediary handles the interaction with the user. The forum moderators have access to the moderation log.
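
The forum’s side of that hand-off might look roughly like this; the intermediary’s domain, endpoints, and payloads are entirely made up for illustration:

```python
import requests

INTERMEDIARY = "https://moderation-intermediary.example.com"  # hypothetical service
HEADERS = {"Authorization": "Bearer FORUM_API_KEY"}            # hypothetical auth

# One-time setup: the forum owner uploads the rules.
requests.post(f"{INTERMEDIARY}/v1/forums/my-forum/rules",
              json={"rules": open("rules.txt").read()}, headers=HEADERS)

# Per comment: hold the comment until the intermediary gives the OK.
resp = requests.post(f"{INTERMEDIARY}/v1/forums/my-forum/comments",
                     json={"comment_id": "c123", "text": "...", "context": "..."},
                     headers=HEADERS)
if resp.json().get("approved"):
    print("publish the comment")
# Warning the user and writing the moderation log happen on the intermediary's side.
```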

Who Pays For This?

LLMs aren’t cheap. Having every comment go through an LLM will be expensive. 99% of forums cannot afford this solution.

So we make the user pay. The latest GPT-4 Turbo is fairly cheap. For the vast majority of commenters, it will probably cost a few pennies per month.

I haven’t come up with a scheme where the user has his own account with OpenAI (or whichever LLM provider is used) and pays for it directly, yet the forum/intermediary can somehow still validate the interaction.

Simpler for everyone is that each commenter pays the intermediary, and the intermediary handles all transactions with the LLM provider. The intermediary can also charge the forum owners a nominal fee for storing their rules and the database of violations.

Of course, the intermediary could run their own fine-tuned LLM and charge for it.

You may balk at a scheme that requires commenters to pay. In my humble opinion, the quality of the dialogue would go up vastly if commenters had to pay. You’d lose most low-effort comments. My real worry is the opposite: that rates are now so cheap it won’t be enough of a deterrent.

The intermediary can experiment with variable rates. If your comments get flagged often enough, the amount you’ll be charged will increase exponentially (and go down with periods of good behavior).
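
One way to sketch such a rate schedule; the base rate, the doubling, and the 30-day forgiveness window are arbitrary numbers, not a recommendation:

```python
def per_comment_rate(flag_count: int, clean_streak_days: int,
                     base_cents: float = 2.0) -> float:
    """Price per comment in cents: doubles with each outstanding flag,
    and decays back toward the base rate with periods of good behavior."""
    # Every 30 clean days forgives one flag; each remaining flag doubles the rate.
    effective_flags = max(0, flag_count - clean_streak_days // 30)
    return base_cents * (2 ** effective_flags)

# e.g. 3 flags but 35 clean days -> effectively 2 flags -> 8 cents per comment
print(per_comment_rate(flag_count=3, clean_streak_days=35))
```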

There is an equity concern here: The pricing may end up excluding people from less wealthy countries.

Did He Read The Article?

An annoying trend on forums like Reddit and Hacker News is commenters who don’t read the submitted article and comment purely based on the title. These comments are often of little value: they may be unrelated to the article, or raise points the article already addresses.

I’ve seen submissions where it is obvious that easily half of the comments are by people who didn’t read the article.

How can we improve this? Semantic CAPTCHAs. [1]

When an article is submitted to the site, its contents are sent to the LLM with a request to generate 50-100 multiple-choice Q/As based on the content of the article. The submitter is presented with them and asked to select 5 that are good questions - ones where the LLM’s answer is also accurate.
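
Generating the question pool might look something like this; the model name, prompt wording, and JSON shape are assumptions:

```python
import json
from openai import OpenAI

client = OpenAI()

def generate_questions(article_text: str, n: int = 50) -> list[dict]:
    """Ask the LLM for multiple-choice questions about the article.
    Each item: {"question": str, "choices": [str, ...], "answer": <index>}."""
    prompt = (
        f"Read the article below and write {n} multiple-choice questions that can "
        "only be answered by someone who actually read it. Avoid obscure trivia, and "
        "avoid choices so silly they can be eliminated without reading the article.\n\n"
        f"ARTICLE:\n{article_text}\n\n"
        'Reply with JSON: {"questions": [{"question": "...", '
        '"choices": ["...", "...", "...", "..."], "answer": 0}]}'
    )
    resp = client.chat.completions.create(
        model="gpt-4-turbo",  # assumption
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)["questions"]
```

The submitter is then shown these and picks the 5 he considers fair and correctly answered.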

When someone leaves a comment, he is asked one of these 5 questions. If he picks the wrong answer, his comment does not go through. If he does answer correctly, he is asked to select a few more appropriate Q/As from the remaining pool of questions (i.e. excluding the original 5). Then his comment goes through.

Once a commenter has answered correctly, he is not harassed for future comments in this submission.

This process repeats for each new commenter until we have, say, 20 Q/As. After that, new commenters only need to answer the question correctly and will not be asked to select more questions.
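
The per-commenter gate, roughly. The data structures and the `ask_user` / `ask_user_to_pick` UI helpers are hypothetical placeholders:

```python
import random

POOL_CAP = 20  # stop growing the question pool at this size

def gate_comment(commenter_id: str, pool: list[dict], answered_ok: set[str],
                 candidates: list[dict]) -> bool:
    """Return True if the comment may be posted for this submission."""
    if commenter_id in answered_ok:
        return True  # already passed once; don't harass him again

    q = random.choice(pool)
    if ask_user(commenter_id, q) != q["answer"]:        # hypothetical UI call
        return False  # wrong answer: the comment does not go through

    answered_ok.add(commenter_id)
    if len(pool) < POOL_CAP and candidates:
        # Ask the commenter to vet a few more Q/As into the pool.
        pool.extend(ask_user_to_pick(commenter_id, candidates, k=3))  # hypothetical
    return True
```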

You need some scheme to ensure poor Q/As aren’t being chosen. A given question may need to be selected multiple times before the system decides it is a good question. If too many people get a particular question wrong, it can be dropped and replaced with another. These algorithms will need to be tweaked.
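
A rough sketch of that culling logic; the thresholds are arbitrary and, as noted above, would need tweaking:

```python
def review_question(stats: dict) -> str:
    """Decide what to do with a question based on its usage so far.
    `stats` example: {"times_picked": 4, "asked": 40, "wrong": 31}."""
    if stats["times_picked"] < 3:
        return "provisional"  # must be selected a few times before it counts as good
    if stats["asked"] >= 20 and stats["wrong"] / stats["asked"] > 0.7:
        return "drop"         # too many readers get it wrong: replace it from the pool
    return "keep"
```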

The prompt used for this will be tricky. You don’t want too many obscure questions, nor too many silly answer choices that let anyone guess the right answer simply by eliminating the nonsensical options.

Who Pays For This?

This is a lot cheaper than the moderation scenario. The LLM is invoked only once - when the article is submitted. In principle, the forum owners could pay for this.

Embarrassingly Not Properly Reading The Comment I Was Replying To

Ever written an adversarial response that was violently in agreement with the comment being responded to? Or missed a subtlety in that comment that already addressed your point? Or simply misread the comment and interpreted it as something quite different?

I’ve been there. It’s embarrassing.

I’d like a local browser plugin that would send my comment, along with a few of its ancestors, to the LLM with a prompt asking it to check the following:

  • Is my comment disagreeing with a comment that is in agreement with me?
  • Is my comment off topic?
  • Is my comment redundant? Was it already addressed by the parent?

I’d probably need to experiment with the prompt to get what I need.
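
A rough sketch of the plugin’s core call. Only the prompt and the LLM request are shown - the browser-extension plumbing is omitted, and the model name is an assumption:

```python
from openai import OpenAI

client = OpenAI()

CHECKS = """Before I post the DRAFT reply below, check the following:
1. Am I disagreeing with a comment that actually agrees with me?
2. Is my reply off topic for this thread?
3. Is my reply redundant, i.e. already addressed by the parent comment?
Answer each with yes/no and a one-line explanation."""

def sanity_check_reply(draft: str, ancestors: list[str]) -> str:
    """Send the draft reply plus a few ancestor comments to the LLM."""
    thread = "\n\n".join(f"ANCESTOR {i + 1}:\n{c}" for i, c in enumerate(ancestors))
    resp = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user",
                   "content": f"{CHECKS}\n\nTHREAD:\n{thread}\n\nDRAFT:\n{draft}"}],
    )
    return resp.choices[0].message.content
```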

Unlike the above two scenarios, this one is simply local to my machine. It would save a lot of embarrassment, and prevent poor quality comments from being posted.

[1] Yes, yes - there is irony in using the word “CAPTCHA”, which stands for “Completely Automated Public Turing test to tell Computers and Humans Apart”.