The distinguishing characteristic of these spams is that their subjects claim to be in the ISO-2022-JP charset but are actually in the Shift_JIS charset. They are also Base64-encoded. For example:
Subject: =?ISO-2022-JP?B?kWaQbJBsjciSsouzk6+NRInvgqmC54LMgqiSbYLngrmBQg==?=
The problem is that, as far as I can observe, the false claim makes Gmail interpret the subject as a random string, so its spam filter doesn't work as it should. Here's how Gmail appears to interpret the subject:
The false claim results from a sloppy understanding of how to compose a Japanese email. It's ironic that this sloppiness works in the spammers' favor against Gmail's spam filter.
I'd really like Gmail to cope with this soon. Let me point out that this spamming technique is not Japanese-specific; it can be used for other languages as well.
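The mismatch itself is easy to detect mechanically. Here is a minimal sketch in Python, using only the standard library, that decodes the Base64 payload from the example subject and tries both charsets. The helper name try_decode is my own, and I am not claiming that Gmail or any other filter works this way; it only shows that the bytes contradict their ISO-2022-JP label.

```python
import base64

# Base64 payload copied from the example Subject header.
PAYLOAD = "kWaQbJBsjciSsouzk6+NRInvgqmC54LMgqiSbYLngrmBQg=="
raw = base64.b64decode(PAYLOAD)


def try_decode(data: bytes, charset: str):
    """Hypothetical helper: return the decoded text, or None if the
    bytes are not valid in the given charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


# Charset the header claims vs. the charset the bytes appear to use.
as_declared = try_decode(raw, "iso2022_jp")
as_actual = try_decode(raw, "shift_jis")

# ISO-2022-JP is a 7-bit encoding, so any byte >= 0x80 already
# contradicts the declared charset.
has_8bit = any(b >= 0x80 for b in raw)

print("valid as ISO-2022-JP:", as_declared is not None)
print("valid as Shift_JIS:  ", as_actual is not None)
print("contains 8-bit bytes:", has_8bit)
```

The 8-bit check is the cheapest signal: it needs no knowledge of Shift_JIS at all, so the same test would flag a mislabeled subject in other languages too.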
Added on 2006-03-20:
When I published this post, I notified Google about it. I don't know how much that contributed, but Gmail's spam filter now seems able to cope with spam of this kind to some extent.