Opened 17 years ago

Last modified 16 years ago

#180 reopened Enhancement

spam filtering dislikes pirates

Reported by: Eric Myers
Owned by: davea
Priority: Minor
Milestone: Undetermined
Component: Web - Forums
Version:
Keywords: spam
Cc: romw, davea

Description

I tried to post a comment in boinc_dev forums and it was blocked by the spam filter, presumably because it had a link to Pirates@Home glossary.

(And I just wanted to try the ticket system.)

Change History (12)

comment:1 Changed 17 years ago by KSMarksPsych

I'll add it also dislikes Einstein (first identified by Jord). Here's a copy of an email I sent to David and Rytis with some testing.

I ran into Akismet this morning in this thread.

http://boinc.berkeley.edu/dev/forum_thread.php?id=1784

I had tried to reply to the post from Doug Worrall.  When I tried quoting it and adding my response Akismet blocked it.  However it did let me go back and edit the post.  I took out the quoted part and it went through fine.

There were no URLs in the quoted part or my post.  There wasn't even anything spam-ish in Doug's post (unless you count spelling mistakes but then I'd never get anything through).  What I find interesting is that even if I reply with just the word "test" to that post then Akismet doesn't like it.  But if I just quote the post and hit reply (without saying anything) it goes through just fine.

Testing out Jord's response: if you copy/paste it into a brand new test thread, it refuses to go through.

If you pick a thread at random (I used http://einstein.phys.uwm.edu/forum_thread.php?id=5686) from E@H and embed it, it refuses to go through.  If you just put it between url tags (without linking it to a word or phrase) it refuses to go through.  If you just copy/paste it without url tags it refuses to go through.  It won't even let E@H's front page URL go through.

Picking another project at random (I used Rosetta, this thread: http://boinc.bakerlab.org/rosetta/forum_thread.php?id=2989), it goes through fine linked to a word, between url tags and as plain text.

comment:2 Changed 17 years ago by Rytis

Cc: romw@… added

It seems that Akismet does not really work as well as we hoped. Rom, I guess we need to disable it in the /dev/ forums, since it's doing more harm than good.

The code will stay; maybe we will be able to turn it back on if Akismet improves over time.

comment:3 Changed 17 years ago by romw

Cc: davea@… added

Well, all we know for sure is that without training it is having problems.

I don't think it'll ever improve over time. Until we can train Akismet on what constitutes normal operations for BOINC-based forums, it'll never know.

Sooner or later we are going to have to pay the web-based forum spam protection tax. Either with spambayes or Akismet.

comment:4 Changed 17 years ago by KSMarksPsych

I'll add a bit more data.

I went through all of the BOINC projects I have bookmarked (their front page URLs as those are generally the attach URL) to see which Akismet likes and doesn't like.

  1. Einstein (it won't let the front page or a forum URL go through)
  2. Predictor (it won't let the front page or a forum URL go through)
  3. SAP (it won't let the front page go through, but if you remove the /index.php (which I think was the attach URL) it goes through and a forum URL will go through)
  4. Sztaki (it won't let the front page or a forum URL go through)
  5. QMC (it won't let the front page go through, but if you remove the /index.php (which is the attach URL) it goes through and a forum URL goes through)
  6. Pirates (front page URL and a forum URL go through, but a link to their Wiki won't go through)

Unlike others, I didn't have a problem getting anything to do with CPDN (the attach URL, the BOINC gateway URL, the beta URL or the phpBB URL) to go through.

For those of you with moderator privileges on boinc_dev, the thread I used for testing is here.

comment:5 Changed 17 years ago by Rytis

Status: changed from new to assigned

I will add a feature for mods/admins to mark their own post as ham (that's Akismet slang for "not spam"). Hopefully it will clear problems for simple users, too, once the mods mark the URLs as safe.

comment:6 Changed 17 years ago by MikeMarsUK

Akismet was briefly enabled on the CPDN forums, but disabled again (much more bother than it was worth).

For it to be a useful feature, I would suggest the following:

  • Posts should be flagged not blocked. People are less than happy if they've spent 20 minutes writing a post and it disappears. This is the reason I stopped commenting on Rom's blog which presumably uses the same system (several posts blocked in series, and none was undeleted).
  • Perhaps if EITHER credit > 100 or RAC > 1 then the system should assume the post is good regardless of what Akismet says (automatically mark it as 'ham' to help train Akismet, without involving mods / admins). Doesn't take account of stolen user accounts though.
  • The message should tell the user that their post needs to be reviewed but will appear on the website shortly (i.e., the message should imply that it's good, spammers won't care what the message says anyway, so why phrase it as if the user was a spammer?)
  • Moderators should be able to mark posts as OK so the training works ('ham')
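
The suggestions above can be sketched as a small decision function. This is only an illustration of the proposed policy, not BOINC's forum code: the names `Poster`, `Decision`, `decide` and `user_message` are hypothetical, and the thresholds are the ones floated in the list (credit > 100 or RAC > 1).

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PUBLISH = "publish"
    PUBLISH_AND_MARK_HAM = "publish_and_mark_ham"
    HOLD_FOR_REVIEW = "hold_for_review"


@dataclass
class Poster:
    # Hypothetical view of a forum account; field names are assumptions.
    total_credit: float
    recent_average_credit: float  # RAC


# Thresholds taken from the suggestion in this comment; a project would tune them.
MIN_CREDIT = 100
MIN_RAC = 1


def decide(poster: Poster, filter_says_spam: bool) -> Decision:
    """Flag rather than block: a post is never discarded outright."""
    if not filter_says_spam:
        return Decision.PUBLISH
    trusted = (
        poster.total_credit > MIN_CREDIT
        or poster.recent_average_credit > MIN_RAC
    )
    if trusted:
        # Publish anyway and queue the post as 'ham' feedback for the filter.
        return Decision.PUBLISH_AND_MARK_HAM
    # Keep the post, hide it, and let a moderator approve it.
    return Decision.HOLD_FOR_REVIEW


def user_message(decision: Decision) -> str:
    """Wording that does not treat the poster as a spammer."""
    if decision is Decision.HOLD_FOR_REVIEW:
        return "Your post has been saved and will appear after a quick review."
    return "Your post has been published."
```

As noted in the list, a credit or RAC check does nothing against stolen accounts, so a real implementation would probably still let moderators override the outcome.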

comment:7 in reply to:  5 Changed 17 years ago by Ageless

Replying to Rytis:

I will add a feature for mods/admins to mark their own post as ham (that's Akismet slang for "not spam").

Is this option ever coming?

comment:8 Changed 16 years ago by Ageless

Owner: changed from Rytis to davea
Priority: changed from Minor to Major
Status: changed from assigned to new

Reassigning to David.

comment:9 Changed 16 years ago by davea

Resolution: invalid
Status: changed from new to closed

This is not a bug in BOINC. Akismet is optional. Feel free to modify the relevant Wiki page.

comment:10 Changed 16 years ago by Didactylos

Priority: changed from Major to Minor
Resolution: invalid
Status: changed from closed to reopened
Type: changed from Defect to Enhancement

Reopening, and changing to enhancement. If it's not a bug, then at least it's something members want to see improved. This ticket has lots of relevant information, so I'm recycling it.

comment:11 Changed 16 years ago by Nicolas

There is no way to train Akismet. If Akismet says a post is spam, the project admin or mods have no way around it. If Akismet misses a spam post, a moderator can hide it, but can't mark it as spam to improve the filtering.

Those aren't Akismet flaws, but flaws in the way it's connected with BOINC.
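
A minimal sketch of what the missing feedback path might look like follows. It assumes Akismet's REST API exposes `submit-ham` and `submit-spam` endpoints next to `comment-check`, and that they accept form fields such as `blog`, `user_ip`, `user_agent`, `comment_author` and `comment_content`; check the current Akismet documentation before relying on any of this. The function name `build_feedback_request` and the `post` dictionary layout are hypothetical, and the code only builds the request, it does not send it.

```python
from urllib.parse import urlencode


def build_feedback_request(api_key, site_url, post, is_spam):
    """Return (url, body) for reporting a moderator's verdict to the filter.

    Assumption: the endpoint names and field names follow Akismet's
    public REST API as described in the lead-in; verify before use.
    """
    endpoint = "submit-spam" if is_spam else "submit-ham"
    url = f"https://{api_key}.rest.akismet.com/1.1/{endpoint}"
    fields = {
        "blog": site_url,                    # the forum's base URL
        "user_ip": post["user_ip"],
        "user_agent": post["user_agent"],
        "comment_author": post["author"],
        "comment_content": post["content"],
    }
    # Form-encoded body, ready to be sent as an HTTP POST.
    return url, urlencode(fields).encode("utf-8")
```

A moderator's "hide" action could call this with `is_spam=True`, and an "approve" action with `is_spam=False`, which would cover both cases described above.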

comment:12 Changed 16 years ago by Nicolas

Cc: romw davea added; romw@… davea@… removed