We’re Losing The Spam War
This may not be the most popular blog on the Interwebs, but it's one of the oldest. Thanks to that dubious distinction, I seem to be afflicted regularly with the leading edge of spammer technology.
It started with comment spam: someone posting links upon links upon links, usually of porn sites or ads for erectile dysfunction drugs or similarly snake-oily schemes. I'm not sure if there's a person on the planet who would actually see a post like this and remove money from their wallet somehow, but someone must, because I still get those.

Then we have the next step up in spammer intelligence - instead of links, just page after page of search terms. I'm fairly certain this is used to game Google to present useless results for totally unrelated searches, but I'm not really sure how it works. Still, it must, because I still get those too.

Still, these are fairly easy for blog programs to detect and suppress. Which is a good thing, because I get about 400 of these a day. Four. Hundred. A day. And that's on a slow day. Sometimes it goes up to four thousand. During that same time, since this is a relative backwater of the interwebs, I get maybe ...ten? legitimate actual comments.
So, the spammers decided to escalate a bit, and try to conceal their payloads as those ten legitimate comments.

Now blog programs can still make an effort to detect these. The spambots used to post these usually use the same fractured English fragments, and of course they still have to include a link to whatever useless crap they're trying to sell. So there's still some hope of combating this, right?
Gentlemen and ladies, I present to you.... smartspam.

This comment was left in response to the recent YouTube video I posted of Sarah Brightman singing about her disco love for starship troopers. The comment is actually on the topic of the post. As best I can guess, someone is actually paying people (probably through a pay-per-click scheme) to leave somewhat relevant comments with the spam payload.
The result: Bayesian spam prevention, the best hope so far of limiting spam, is broken. Trying to filter this spam based on the comment text now runs a significant risk of blocking legitimate commenters. (Which is already happening: there's one regular commenter whose comments are routinely blocked by the spam filter now.) You could try to block on the link payload... but the link changes constantly, and it doesn't use obvious keywords any more.
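To make the problem concrete, here is a minimal sketch of a word-based Bayesian filter. It is a toy, not the filter this blog or any real plugin uses: the class name `ToyBayesFilter`, the training sentences, and the test comments are all invented for illustration. It scores a comment by how much its words resemble past spam versus past legitimate comments, which is exactly why an on-topic comment sails through.

```python
import math
from collections import Counter


def tokenize(text):
    # Crude tokenizer: lowercase and split on whitespace.
    return text.lower().split()


class ToyBayesFilter:
    """Hypothetical naive Bayes comment filter (illustration only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def _likelihood(self, word, label, vocab_size):
        # Laplace smoothing so unseen words never give a zero probability.
        counts = self.word_counts[label]
        return (counts[word] + 1) / (sum(counts.values()) + vocab_size)

    def spam_probability(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        # Start from the prior odds, then add each word's log-likelihood ratio.
        log_odds = math.log(self.doc_counts["spam"] / self.doc_counts["ham"])
        for word in tokenize(text):
            log_odds += math.log(self._likelihood(word, "spam", len(vocab)))
            log_odds -= math.log(self._likelihood(word, "ham", len(vocab)))
        return 1.0 / (1.0 + math.exp(-log_odds))


# Invented training data: two old-style spam comments, two real ones.
spam_filter = ToyBayesFilter()
spam_filter.train("cheap cialis viagra pills buy now", "spam")
spam_filter.train("free porn cheap pills click here", "spam")
spam_filter.train("great video love the disco starship troopers song", "ham")
spam_filter.train("sarah brightman singing disco is amazing", "ham")

# Old-style spam scores high; an on-topic comment scores low,
# even if its author link (never seen by this filter) is the payload.
obvious = spam_filter.spam_probability("buy cheap cialis pills")
on_topic = spam_filter.spam_probability("love this disco starship troopers video")
print(obvious, on_topic)
```

Retraining the filter to catch the second kind of comment means teaching it that words like "disco" and "video" are spammy, and that is how legitimate commenters end up blocked.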
Hell, in the past week or so I've gotten comments that I can't even tell are spam or not.
We're losing the war. And eventually the Internet will be nothing but bots selling Cialis Jessica Alba plane ticket porn to each other.
This, then, is the bright new future of new media.



October 20th, 2007 - 09:06
So you can go to jail or be fined for buying drugs or stolen goods off someone.
How long will it be before you can be sent to jail or fined for buying something via a spam link?
October 20th, 2007 - 09:29
Don’t forget to let me inside when you lock up the door for good, Scott! You know my real email.
I had massive spam problems on a forum. One for my gaming guild, in fact. Captchas didn’t slow them down (admittedly my software used fairly lame captchas) and admin approval, sorting one new user from a hundred new spammers, was getting to be a chore. Before I switched to another means — which I won’t detail because if too many people start doing it, it’ll be useless — I used the URL field as a test. I said prominently in the signup form that if you entered any URL, you would not be approved. Then I could just look over the new users and approve only the ones who followed the instructions — i.e., didn’t act like bots. It nailed 100% of the bots, and only got one or two false positives from lamers who can’t follow instructions; it was just too labor-intensive for the long run.
I do find it ironic that you are using WordPress despite your stated dislike of link spam and Google results manipulation (something which, of course, harms all of us who want valid search results). Have you so quickly forgotten about PhotoMatt and his defense of how it’s not wrong for him to make a few bucks by polluting search results and degrading the utility of search for everyone else? And his blog full of glowing “approval” because he deleted any dissenting comments? I won’t use WordPress for that reason alone. Even for something free, sometimes the price can be too high.
If you’ve got a way to kill comments based on whether or not they entered a URL, the idea of hiding the URL field with CSS would be a good stopgap measure.
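A rough sketch of that URL-field test, assuming the signup or comment form arrives as a plain dictionary. The function name `looks_like_bot` and the field name `url` are hypothetical, not taken from any particular forum or blog software. The idea is that a human either never sees the field (if it is hidden with CSS) or follows the instruction to leave it blank, while a bot fills in every field it finds.

```python
def looks_like_bot(form):
    """Return True if the forbidden URL field was filled in.

    `form` is assumed to be a dict of submitted field names to values.
    """
    url_value = form.get("url") or ""
    return bool(url_value.strip())


# A bot that fills in everything is flagged; a human who leaves
# the field empty (or never sees it) is not.
bot_signup = {"name": "sola", "email": "x@example.com", "url": "http://spam.example/"}
human_signup = {"name": "Scott", "email": "scott@example.com", "url": ""}
print(looks_like_bot(bot_signup), looks_like_bot(human_signup))
```

As noted above, this only catches automated submissions. A paid human leaving on-topic comments would pass it as easily as a real reader.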
By the way, the real target of this spam isn’t exactly your users. They’re not where the money comes from. The goal is to get people to land on a page well-populated with Google’s (or any competitor’s) targeted ads, especially if they’re pay-per-impression ads. That’s why the search companies don’t make the effort they could to put a stop to it. They’re making money off this too, as their advertisers have to spend more on advertising to actually get it in front of the eyeballs they want to reach.
October 20th, 2007 - 13:03
its so easy getting rid of all spam.
1. dont allow links (name becomes link to website info)
2. do not allow html in posts
by doing this there will be no seo trix because we cant get backlinks, see how easy it is?
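For what it is worth, the two rules above could be sketched like this. This is a simplified illustration, not how any blog engine does it: the function name `sanitize_comment` and the URL pattern are assumptions, and the pattern only catches the most obvious link forms.

```python
import html
import re

# Deliberately simple pattern: anything starting with http://, https:// or www.
URL_PATTERN = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)


def sanitize_comment(body):
    """Return (accepted, safe_body) for a submitted comment body."""
    if URL_PATTERN.search(body):
        # Rule 1: no links allowed in the comment text.
        return False, ""
    # Rule 2: no HTML; escape it so tags display as plain text.
    return True, html.escape(body)


print(sanitize_comment("nice <b>post</b>"))
print(sanitize_comment("see http://spam.example/pills"))
```

Note that this does nothing about the link attached to the commenter's name, which is where the payload goes once links in the body are banned.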
October 20th, 2007 - 18:40
I think you’re a spammer, sola solarium.
October 21st, 2007 - 08:37
The blog software that this site uses is WordPress, which has had comments invisible to search engines for at least the last several versions (2.0, perhaps earlier). There’s no need to make a robots.txt alteration. Once other blogging software follows suit the link spam comments — the ones trying to increase web presence — should stop.
If you’re not concerned about getting a high search engine rating, use the WordPress option to make your entire blog invisible to Google, Technorati, etc. Let people find you by word of mouth, white lists, and so on. If you don’t show up on search engines, the spammers can’t find you and make you a target.
I think many commenters are missing the point. Akismet, et al, filter spam just fine. The problem is if you want to look through the comments marked as spam every day to see if there was a mistake. (It happens.) For a lot of bloggers, myself included, that’s too much work. It takes the fun out of blogging.
October 22nd, 2007 - 09:58
Peer moderation + optional logins is the way to combat this. Just follow the Slashdot model – they have ZERO spam issues.
October 23rd, 2007 - 09:55
Google needs to go to a non link based ranking ontology. I have no idea how the fuck you’d do such a thing, but they’re smart and work at google, they should be able to figure it out.
24% of Americans think Bush is doing a good job, after that 1% of Americans actually shopping via spam no longer surprises me.
October 23rd, 2007 - 10:11
Ultimately, the internet will end up talking to itself, about nothing. Aliens will find this in their search for intelligent life elsewhere in the universe, and decide there is none here, which will save our planet from colonization by bug-eyed beasties from outer space.
Either that or it’s all a commercial plot to close down the internet so a “totally secure, commercial environment” can be plonked down in its stead, and everyone can be required to pay $100/300/month just to have access, before they can pay money for anything else on it.
That’ll be right before western civilization comes to a complete halt and the cockroaches take over.
October 23rd, 2007 - 12:11
the krogpersonal and sola solarium’s above are spam commenters. Sola Solarium explained why he/she/it benefits from posting here due to the link back from a high page ranking site. But it’s just spam per se, with a more personal touch I guess. Both sites are in Swedish (I think).
I like the articles you post about these issues because it’s both alarming and interesting at the same time (to me anyway). Is this all the internet is destined to become, a cesspit of useless spam? And I’m firmly for net neutrality but some new systems/tools need to be made to combat this.
October 23rd, 2007 - 18:34
I guess you just need to start blocking any kind of links at all.
Of course, the time spent on such efforts is part of why I still don’t blog myself!
October 26th, 2007 - 01:47
Argh; did your spam post earn you a whole slurry of spam trackbacks too, Scott?