Gmail and spam: A problem, a suggestion

Friday, June 30th, 2006

Josue Salazar has a problem with Gmail’s spam filter.

After he switched from Mail.app to the web interface to read his Gmail, something odd began to happen:

I started to notice 90% of the email in my Gmail inbox was spam. I marked it as such, wondered what was going on, but in the end I just moved on.

Today, I realized there was something wrong…. I decided to take a quick look at the Spam folder. As expected, all the emails I’d marked as spam on my inbox were there, but to my surprise so were tons of emails from my contacts, and two job offers from days ago. What the hell?

Josue emailed asking if I would mention this onslaught of false positives “to see if someone else is having the same issues I am, or if it’s just me seeing things.” I don’t know the answer, since I don’t use Gmail as much as I probably should.

Dr Drang has a suggestion about how Gmail’s spam filtering could be improved. He is fairly happy: Gmail catches 85% of his spam. But he worries about the other 15%, most of which is not in English. That problem could be solved, he suggests, by

the ability to filter based on the character set used in the message. I cannot read anything written in Asian or Cyrillic characters and no one I know would send me such a message, so it must be spam. Back in my Linux days, I used the procmail filter given in the Bogofilter FAQ to eliminate Asian spam before my spam filter even saw it…. The Google folks are generally considered the smartest working on the web today; they should be able to whip up a character set filter in no time.
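To make the idea concrete, here is a rough sketch of what a character set filter might look like. This is not the procmail recipe from the Bogofilter FAQ, and it says nothing about how Gmail works internally; it is just a small Python illustration using the standard `email` module. The `UNREADABLE_CHARSETS` list and the function names `declared_charsets` and `looks_like_foreign_spam` are my own assumptions, chosen for the example.

```python
from email import message_from_string
from email.message import Message

# Hypothetical blocklist: character sets this recipient cannot read.
# The exact contents are an assumption; adjust to your own situation.
UNREADABLE_CHARSETS = {
    "big5", "gb2312", "gbk", "gb18030",                  # Chinese
    "euc-kr", "ks_c_5601-1987",                          # Korean
    "iso-2022-jp", "shift_jis", "euc-jp",                # Japanese
    "koi8-r", "koi8-u", "windows-1251", "iso-8859-5",    # Cyrillic
}


def declared_charsets(msg: Message) -> set:
    """Collect the charset declared in the Content-Type of every MIME part."""
    charsets = set()
    for part in msg.walk():
        # get_content_charset() returns the charset lowercased, or None.
        charset = part.get_content_charset()
        if charset:
            charsets.add(charset)
    return charsets


def looks_like_foreign_spam(raw_message: str) -> bool:
    """Return True if any part declares a charset on the blocklist."""
    msg = message_from_string(raw_message)
    return bool(declared_charsets(msg) & UNREADABLE_CHARSETS)
```

A sketch like this only looks at the charset each part declares, so it would miss a message sent as UTF-8 or one that hides its language in an encoded Subject header. It would also be a bad idea for anyone who does correspond in those languages, which is presumably why it would have to be an opt-in setting.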
