This may not be the most popular blog on the Interwebs, but it’s one of the oldest. Thanks to that dubious distinction, I seem to be afflicted regularly with the leading edge of spammer technology.
It started with comment spam: someone posting links upon links upon links, usually of porn sites or ads for erectile dysfunction drugs or similarly snake-oily schemes. I’m not sure if there’s a person on the planet who would actually see a post like this and remove money from their wallet somehow, but someone must, because I still get those.

Then we have the next step up in spammer intelligence – instead of links, just page after page of search terms. I’m fairly certain this is used to game Google to present useless results for totally unrelated searches, but I’m not really sure how it works. Still, it must, because I still get those too.

Still, these are fairly easy for blog programs to detect and suppress. Which is a good thing, because I get about 400 of these a day. Four. Hundred. A day. And that’s on a slow day. Sometimes it goes up to four thousand. During that same time, since this is a relative backwater of the interwebs, I get maybe …ten? legitimate actual comments.
So, the spammers decided to escalate a bit, and try to conceal their payloads as those ten legitimate comments.

Now blog programs can still make an effort to detect these. The spambots used to post these usually use the same fractured English fragments, and of course they still have to include a link to whatever useless crap they’re trying to sell. So there’s still some hope of combating this, right?
Gentlemen and ladies, I present to you…. smartspam.

This is in response to the recent YouTube video I posted of Sarah Brightman singing about her disco love for starship troopers. The comment is actually on the topic posted. As best I can guess, someone is actually paying people (probably through a pay-per-click scheme) to leave somewhat relevant comments with the spam payload.
The result: Bayesian spam prevention, the best hope so far of limiting spam, is broken. Trying to filter this spam via the comment payload now runs a significant risk of blocking legitimate commenters. (Which is already happening – there’s one regular commenter whose comments are regularly blocked by the spam filter now.) You could try to block on the link payload… but that changes, and doesn’t use obvious key words any more.
Hell, in the past week or so I’ve gotten comments that I can’t even tell are spam or not.
We’re losing the war. And eventually the Internet will be nothing but bots selling Cialis Jessica Alba plane ticket porn to each other.
This, then, is the bright new future of new media.



#1 by VPellen on October 18th, 2007
I refuse to believe that people give -that much of a shit- about Paris Hilton.
#2 by Dren on October 18th, 2007
So their goal is to just get their URL in front of us? I don’t get it. I’ve never even thought about touching anyone’s links on their comments here or anywhere.
I’m just not that bored.
#3 by nwithers on October 18th, 2007
When the first truly artificial intelligence arrives, it will be a spambot…..
God help us all…..
#4 by Jason Powers on October 18th, 2007
lum- you’ll have to turn on captchas for the comments to protect yourself. As for your email, if you have control over your own server (or even if it’s hosted with someone like ensim), you can install dspam on there and teach it to filter out the spam before it hits your throughput twice.
#5 by Spaz on October 18th, 2007
Eventually, we’ll all have to go back to walled fortresses.
Even the smallest blog that doesn’t want to turn into a cesspool of spam shitnibblets will have to be password-protected and invite-only. I know that sounds repugnant to everybody that had this utopian dream for the Internet and information sharing, but I don’t see another way.
#6 by Brask Mumei on October 18th, 2007
It seems the best temporary approach is just to axe the website field. That at least saves you the pain of determining if it is spam or not – the fluffy warm spam comments can be accepted as fact and make your day brighter rather than serving to warn of the oncoming apocalypse.
#7 by Boanerges on October 18th, 2007
The problem is that spammers have realized that they can circumvent any spam protections by hiring cheap human labor in Asia to post their spam for them (I know this because once in a while one gets through my forum captcha and email verification). So even captcha becomes useless after a while. And some bots can read captcha too.
The goal is traffic. They want you to go to their site to
A. Sell you something
B. Infect your computer so they can turn it into a zombie that sends spam
As far as Paris Hilton… it’s a holdover from when the top search term on the net was that video of her and her lover. I doubt spammers care that much about her either.
#8 by mandlar on October 18th, 2007
That is frickin’ ridiculous.
Brask makes a good point though, no website field and your spam suddenly becomes inspiration to write more.
Spam won’t stop until people get smarter. It only takes one in a million people to actually fall for the trap for it to be successful.
#9 by Heartless_ on October 18th, 2007
What’s even sadder is that spam is moving away from “please click me” to “here is an advertisement” right in front of your face. Blogger users are starting to see widespread comment spam that has nothing to do with links and really tries hard to sneak advertising into legitimate looking comments. I manually delete a dozen or so comments a day that are just amazingly off base, but contain absolutely no links. And if Lum is the backwater, I am just a cinder block someone threw in the pond.
#10 by HitNRun on October 18th, 2007
Our next layer of defense will be when the Internet generation starts becoming sexually impotent in enough numbers to exert some pressure on Cialis.
#11 by Diamonds on October 18th, 2007
Sorry but I really feel this post about spam is sensationalism.
In all aspects of any society, it is a given that there are enough people so that all avenues of revenue will be tried, all doors will be opened, and all means of gaining an advantage over the other individual will be taken.
Just assume that the other 4294967295 IP addresses are out to get you. Turn on your captcha, lock your front door, and cancel your credit card when you lose your wallet.
The internet is not a utopia.
(FYI: Captchas are defeatable, but they require more CPU and development time => more expensive to spam)
#12 by kalain on October 18th, 2007
Having trouble keeping your spam filter up? Try Cialis!
… Sorry, I had to.
#13 by Rasputin on October 18th, 2007
Nuke them from orbit.
It’s the only way to be sure.
#14 by Scott Jennings on October 18th, 2007
> Sorry but I really feel this post about spam is
> sensationalism.
Captchas have been broken, both through trivial image recognition and through simply paying someone 1/4 cent to type them in.
Another fun statistic: 90% of all email on the internet is spam. The basic, core internet service largely consists now of bots fighting each other over whether or not it’s the 10% of the email that makes it into your in-box, simply because it’s “sensationalism” that we should do something about email being subverted wholesale by sleaze peddlers.
It’s not sensationalism; it’s why comments here and elsewhere will shortly go invite-only, simply because all the tools you glibly recommend we use to “lock our doors” are broken.
#15 by Igniferroque on October 18th, 2007
As best I recall, Google bases some of their search algorithm on the number of links to your page. The more links you have, the higher up your listing appears. So the need isn’t for the person looking through the comments to click on the link. It just needs to be there for the Google bots to pick up.
#16 by Stupid on October 18th, 2007
I don’t allow comments on my blog for just this reason. IMO, a blog is not a discussion forum. My blog is a way for me to tell -all- of my friends about relevant issues and news about me without having to tell all of them individually. It is, by its very nature, a one-to-many communication. If a specific reader has a comment, they are free to email me personally. I’m much more likely to respond to that.
I just don’t feel that blogging should be a community-building activity.
If a blogger is looking to have feedback on an article, they should invest in a web-based bulletin board. Simply set it up so that the blogger/admin is the only account that can create new threads and you’ve just mimicked the typical blog+comment mechanic.
I’m not sure that would be any better WRT spam, but I think that’s my point.
#17 by Amber on October 18th, 2007
Stupid commented:
Comments on blogs are apparently for irony.
#18 by David Lynch on October 18th, 2007
I’ve had good results from just using akismet (wordpress’s spam-filter). The stuff that gets through is the weird maybe-spam stuff — the “This is great page” style comments on entries that don’t make sense, without any obvious links to products.
Invitation-only sounds too stifling to be plausible. I’d expect to see more in the way of smart manipulation of comment forms — fields named “url” that are hidden with CSS and cause the comment to be rejected if they’re filled in. That sort of thing.
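The hidden-field trick described above could be sketched roughly like this in Python. The field name `url`, the helper names, and the dictionary-shaped form data are all assumptions made for illustration; no particular blog engine is implied.

```python
# Hypothetical honeypot check. The comment form is assumed to include a
# field named "url" that is hidden from human visitors with CSS. Bots
# that fill in every field they find will populate it; real visitors
# never see it and leave it empty.

HONEYPOT_FIELD = "url"  # assumed name of the CSS-hidden field


def is_probably_bot(form_data: dict) -> bool:
    """Return True if the hidden honeypot field was filled in."""
    value = form_data.get(HONEYPOT_FIELD, "")
    return bool(value.strip())


def handle_comment(form_data: dict) -> str:
    """Reject the comment when the honeypot is tripped, accept otherwise."""
    if is_probably_bot(form_data):
        return "rejected"
    return "accepted"
```

A check like this costs legitimate commenters nothing, but it only catches bots that blindly fill every field; a human typing spam by hand would sail through.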
#19 by TPRJones on October 18th, 2007
“Captchas have been broken, both through trivial image recognition and through simply paying someone 1/4 cent to type them in.”
Yes, but this is exactly how you defeat spam. You don’t beat it by making unbreakable filters, because there is no such thing and never will be. You break it by raising the price. Even with broken captchas, the time and effort goes up a fraction. That fraction increases the cost and lowers the profitability by a small margin. Every time we can slice off a little more of that margin, we make it less worthwhile for spammers, which is the only way to kill spam.
The only other way to stop spam dead would be real honest-to-god smart AI filters. But considering that it’s the spambots that are more vigorous, if anyone is to accidentally develop the first spontaneous AI it’s more likely to be born from the spambots themselves.
Self-aware spambots. Now there’s a creepy thought for the day.
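The "raise the price" argument above comes down to simple arithmetic, which can be sketched as a toy model. The function names are made up for this illustration, and the response rate and revenue figures in the usage note are hypothetical inputs, not measurements; only the quarter-cent labor cost comes from the thread.

```python
# Toy model of the economics argument. Every number passed in is a
# hypothetical input, not a measurement.

CAPTCHA_LABOR_COST = 0.0025  # the quoted 1/4 cent per solved captcha, in dollars


def profit_per_post(response_rate: float,
                    revenue_per_response: float,
                    cost_per_post: float) -> float:
    """Expected profit from a single spam post under the toy model."""
    return response_rate * revenue_per_response - cost_per_post


def break_even_cost(response_rate: float,
                    revenue_per_response: float) -> float:
    """Per-post cost at which a spam post stops being profitable."""
    return response_rate * revenue_per_response
```

For instance, if one post in a million earned an assumed $50, `break_even_cost(1e-6, 50.0)` works out to roughly $0.00005, so under those assumed numbers a quarter-cent cost per post would push `profit_per_post` below zero. With a zero cost per post the same call stays positive, which is why the model is so sensitive to who is actually paying.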
#21 by Demonix on October 18th, 2007
On August 18th, 12:47AM, SpamBot became self-aware…
#22 by TPRJones on October 18th, 2007
“Invitation-only sounds too stifling to be plausible.”
Agreed. When you get down to it, though, it’s URLs that are the problem for comment spam. One simple solution that comes to mind is to automatically screen any comments with URLs (and comments that fill in the “website” box). Then when you unscreen, you also have the option to add the URL domain to a list of acceptable domains that aren’t screened in the future (thus quickly stopping the screening from your regular posters’ own URL linkbacks as well as youtube and the like). That would help, no?
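The screen-then-allowlist idea above might look something like the following sketch. The function names, the regular expression, and the in-memory set of approved hosts are assumptions for illustration; a real blog engine would keep the list in its own storage and hook this into its moderation queue.

```python
import re
from urllib.parse import urlparse

# Rough pattern for http(s) links in a comment body (an assumption;
# real-world link detection is messier than this).
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def _host(url: str):
    """Return the lowercase hostname of a URL, or None if it cannot be parsed."""
    try:
        return urlparse(url).hostname
    except ValueError:
        return None


def linked_hosts(body: str, website: str) -> list:
    """Collect hostnames from links in the body and from the website field."""
    candidates = [m.rstrip(".,;:!?)") for m in URL_PATTERN.findall(body)]
    site = website.strip()
    if site:
        # The website field is often typed without a scheme.
        candidates.append(site if "://" in site else "http://" + site)
    return [_host(c) for c in candidates]


def needs_screening(body: str, website: str, approved: set) -> bool:
    """Hold the comment if any linked host is unparseable or not yet approved."""
    return any(h is None or h not in approved
               for h in linked_hosts(body, website))


def approve_hosts(body: str, website: str, approved: set) -> None:
    """When a moderator unscreens a comment, remember its hosts for next time."""
    approved.update(h for h in linked_hosts(body, website) if h)
```

Comments with no links at all pass straight through, and a regular commenter's own site only needs to be approved once.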
#23 by Ethic on October 18th, 2007
Akismet + Bad Behavior works like a charm, for now…
#24 by Soulflame on October 18th, 2007
There’s an article on /. that talks about the email spam problem. According to the article, over 95% of all email is now spam, and the past few months, some sysadmins have seen 98%+ of all emails being spam.
It’s an unfortunate problem with no easy solutions, people being only human and all.
#25 by tannenburg on October 18th, 2007
In the old days, people had to be irritating jerks in person, selling Amway door-to-door…now, thanks to the miracle of computerization and the Internet, people can be irritating jerks through automation.
I, too, am amazed that there are enough people out there who actually buy the crap that spammers sell, enough to make this whole insane mess profitable…
#26 by Ideas on October 18th, 2007
“It seems the best temporary approach is just to axe the website field.”
I’d agree if doing that would stop someone from posting spam, but we know they’ll put it in the body of the comment if there isn’t a field for it. What I think might work better is to keep that field in there, but state in plain English that filling in that field will send your comment to the bitbucket.
#27 by Tim Keating on October 18th, 2007
Fuck captchas! They suck. I cannot tell you the number of times I’ve actually failed to prove via a captcha that I am an actual human being . . .
No, I think requiring OpenID is the right approach, & then having shared whitelists among communities of bloggers for known good people.
TK
#28 by Juan on October 18th, 2007
And yet despite all this my mailbox is still filled with 80% junk mail. Sometimes it doesn’t even make it through the slot because the mailman just took the wad and shoved it in, getting it stuck. So I go home, grab the wad out of the mail slot, walk in, throw 90% of it away, log in to my computer and empty all of my bulk mail folders (typical day: 3 new messages in inbox, 95 in bulk mail folder on Gmail, and 2 of my inbox messages are still spam)
I’m all for requiring logins for comments, though looking on a couple message boards even that doesn’t work all _that_ well.
#29 by L'Emmerdeur on October 18th, 2007
I get junk mail for the previous 5 or 6 residents of my apartment.
The driver of my shuttle bus likes to play the local CBS radio news affiliate, which from what I gather is 80% ads and 20% sponsor-supported content. With mp3 and CD players these days, why does anybody even own a radio?
Gmail does a great job of filtering spam, maybe one or two get through per month – out of thousands.
However, if spammers are forced to use real people to get through automated defenses, the volume of spam will be cut dramatically (maybe by 99%). They would have to hire half the world population and have them work 18 hour shifts to generate comment spam at even a fraction of the rate of a bot (and all those computers running at the same time will bring on the premature cold death of the universe).
#30 by tannenburg on October 18th, 2007
This all just fuels my conviction that 99% of all advertising – spam, junk mail, TV and radio ads – is wasted. Still, that 1% must be spending a metric butt-ton of money to keep the whole insane mess rolling along…
#31 by Diamonds on October 18th, 2007
> It’s not sensationalism; it’s why comments here and elsewhere will shortly go invite-only, simply because all the tools you glibly recommend we use to “lock our doors” are broken.
Not broken, but not complete. They do their part.
I recommend this experiment. Create a vBulletin forum with sample trash data. Link to it at various places but don’t bother having people post. Just fill it up with trash.
- Allow guest posting and see how much spam you get.
- Turn on registration only but allow anyone to register, and see how much spam you get.
- Turn on your CAPTCHA for registration, repeat.
No less than three months ago I did this scenario and after 1 week of being live (with a new URL) I got:
- Registration on, no CAPTCHA: 1 or 2 spam posts a day.
- Registration on, CAPTCHA required: 1 spam post a month.
Protocols inherent to the internet allow spam, much like the game mechanics of MMOs allow gold farmers. Equally comparable is that regulation and prohibition won’t combat either.
“Something will work to block spam until the spammers defeat it.”
Well duh. Just like in the encryption world, any particular method is only as good until it breaks! (example: AES encryption is only as good as long as there are only deterministic computers)
1) Create a counter -> counter gets countered.
2) Declare the internet to be broken!
3) ???
4) Profit!
#32 by Diamonds on October 18th, 2007
* Protocols inherent to the internet allow spam. The fault is the underlying principle that spamming generates more money than it costs. The same thing is comparable to the game mechanics of MMOs, which have gold farmers.
^^^ correct paragraph ^^^
I understand that you’re pissed about spam but dammit you could have done something to combat it in the time you made your post. Your solution (whatever it may be) probably would have combatted all spam for the rest of the blog.
I also assume that you responded to my comment because I said the dirty S (ensationalism) word. You don’t strike me as a sensationalist person, but really, reflect on your post and consider why you titled it “We’re Losing The Spam War,” aside from being snide (an aspect of your blog which I enjoy btw).
#33 by Scott Jennings on October 18th, 2007
> I understand that you’re pissed about spam but dammit you could have done something to combat it in the time you made your post. Your solution (whatever it may be) probably would have combatted all spam for the rest of the blog.
Actually, no, since the blog’s remotely hosted at wordpress.com now, I’m pretty much forced to choose between (a) forced logins which will eventually be broken anyway I’m sure, and (b) relying on the default spam filter which is currently breaking down completely.
#34 by D-0ne on October 18th, 2007
Pivot blocks spam. http://www.pivotlog.net/
Seriously.
#35 by hellfire on October 18th, 2007
Whitelists work right up until the point when someone pays enough money to get on them.
I host my wordpress blog on Dreamhost; BadBehavior/Akismet has kept all but 6 posts from landing in the 2 years since I blitzed Movable Type.
The ones that got through? They came from registered and CONFIRMED users. My only solution (which likely wouldn’t work at all for you) was to blitz the user list and go to double-moderated users only. I have to approve your user before you can comment and approve your first comment. After that, you can post at will.
It’s ridiculous and the result is that no new comments have been posted since as the folks that would have posted something just sent me an email instead.
#36 by hellfire on October 18th, 2007
Also: I don’t know how much theme flexibility you get with a hosted wp blog, but a few judicious theme edits would remove users mail/url without much trouble. You just need to pass a couple optional parms to the php call that drops user info – it’s documented pretty well in the WP.API codex.
#37 by krogpersonal on October 18th, 2007
I’m not sure why you published my comment on your massive blog as spam with BOTH my work email and my IP number... but it doesn’t feel right.
Did I do something wrong here?
#38 by Scott Jennings on October 18th, 2007
I guess that would prove my point that I can’t tell when something is or isn’t spam!
#39 by Phillip Longmire on October 18th, 2007
how many hits do you get a day?
#40 by Boanerges on October 18th, 2007
It’s important to note that email spam and comment/forum spam have two distinct goals. Email spam is like a Roman candle. You send out millions of emails and if it turns out only 0.0001% responded you send out millions more the next day. Who cares if everyone else deleted it? It’s cheap and easy.
Comment/post spam has a different goal. In this crazy place we call the Internet, most people use search engines to get around. Search engines use, as part of their algorithm, links. The more often they find a link to your site, the more likely you are to rank well when people do go looking for whatever it is you’re selling. Unlike email spam, where the goal is action by the end user, comment/post spam’s goal is mostly search engines because, somewhere, someone has set up a blog/forum/look-at-me page and isn’t savvy or concerned enough to delete that link to that less-than-legit site. They don’t want your clicks (but they’ll take them), they want the spiders to notice them. Get enough links that way and you too can win the biggest MMO of them all: The Internet.
#41 by Michael on October 19th, 2007
It’s not even about advertising or that one in a million idiot who ponies up. It’s the same thing as covering stop signs with opaque black paint. It’s simple vandalism of the oldest kind — “look what I can do to you and you CAN’T STOP ME.” The spammers don’t care a whit about getting responses, they amuse themselves getting anonymous control over you. Even before you opened your mailbox, they’ve already received their reward, their payoff, knowing they’ll waste half an hour of your time and force you to wall yourself in with filters and white lists. They entered your electronic house and stole the only thing you had of value — your privacy. I recently started working at a company that has the best security in the business and I’ve never posted/published my company-issued alias anywhere — yet I still get spam there.
#42 by Andrew Crystall on October 19th, 2007
TPRJones – Actually that’s not true. Because the spammers ain’t paying. The botnets they use mean that other people are.
#43 by sanyaweathers on October 19th, 2007
I used to have a real job many years ago where we sent junk mail to potential donors. (I did fundraising for a local theater by day, production by night.) As the lowest level member of the fundraising team, I got stuck with direct mail. Yes, even the people who inflict it hate it, and prefer cozy little auctions and snuggle time with known donors.
We bought or swapped mailing lists with local orchestras, art museums, etc. And if we couldn’t arrange a purchase or a swap, my job was to go snag a program and copy, BY HAND, the list of major donors. And cross reference with our own major donors since they tended to be the same people. And use the phone book to match addresses with the names of the handful that didn’t cross reference. I did have a very early computer at this job, so I could cross reference without too much pain, and at the end I could auto-generate the address labels.
All that effort was towards the holy grail of 2% returning my postage paid envelope with a check enclosed. The month I got 3% back I was taken to dinner and celebrated as a god.
All that work for 2%. If all I had to do was click a few buttons to send my bot on its way to get 1% return? It would have been a MASSIVE profit to my company. Shrimp nets are faster than targeted hook and line setups. Because of this, spam is not going to be stopped by any of the means bandied about here except invite-only filters. The things some of you suggest would have to be increased by many orders of magnitude to affect the margin I explained above.
#44 by Guido Jones on October 19th, 2007
Captchas are only successful until you’ve been targeted by the spammers. As others have mentioned, they’re defeatable by image readers, low-cost labor overseas, and one that I don’t think anybody mentioned – porn click-through schemes.
Plus, if you have a site large enough you have to provide a captcha with a version for the visually impaired, which is way easier to break than the image version.
The other thing you guys are missing – for most spammers there is almost 0 cost to doing these things. All the spam is coming from infected zombie systems.
#45 by Jeremy Williams on October 19th, 2007
Well, if the first AI existing is going to be a spambot… then the only way to battle it will be the creation of an AI spambot slayer!
And then, when they battle it out and eventually the anti-spam-bot AI wins and takes over the Matrix, at least we’ll be ad free?
#46 by wowpanda on October 19th, 2007
The best way I found is to moderate comments, so the comments won’t be seen by others till the blog owner allows it.
Another interesting thing I observed on one of my tiny-traffic forums: initially anyone could post. Soon I noticed spam for porn sites and some very strange links, so I set it to only allow posting from registered users. After a few days the spam appeared again, and the rate of spam gradually increased until it surpassed my patience for deleting it. I then set it to only activate a user after I approve it (yes, the spam bots pass the images etc. easily), and that did the trick. Even though I have to mass-approve the users every night (most of them with strange names, so they might be spammers), there is no spam on my site any more.
That only tells me one thing: if you time-delay user activation, the spammers won’t be able to get you. I think the reason is that the spammers apply for email addresses and are forced to give them up after a certain time (maybe after they get reported and the email account is banned).
#47 by =j on October 19th, 2007
From L’Emmerdeur:
why does anybody even own a radio?
For NPR and the local college radio station, of course.
As far as stopping the blog spam? Keeping the comment pages in your robots.txt should stop the more intelligent spammers. Though there is no promise of intelligence when it comes to bots. Why waste even $0.00001 on a comment that will NEVER get indexed? This and forbidding HTML links would go a long way to slowing spammers (IMHO).
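To see what a rule like that would tell a well-behaved crawler, Python's standard `urllib.robotparser` can evaluate a robots.txt against a URL. The `/comments/` path, the `blog.example` host, and the `ExampleBot` user agent below are assumptions for illustration; the actual comment URLs depend on the blog software.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks all crawlers to skip comment pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /comments/
"""


def crawler_may_index(robots_txt: str, url: str,
                      user_agent: str = "ExampleBot") -> bool:
    """Return True if a crawler honoring robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With the rules above, `crawler_may_index(ROBOTS_TXT, "http://blog.example/comments/123")` is refused while an ordinary post URL is allowed. This only removes the search-ranking payoff for crawlers that respect robots.txt; it does nothing to stop a bot from submitting the comment in the first place.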
#48 by sola solarium on October 19th, 2007
i want juice, linkjuice.. power from your Pagerank 6. Your blog almost spams my site with backlinks if i get into your comments. But i like your content because you are an expert on giving life and personality to your blog.
I read every post word by word and post a comment spot on just to get favours and to give life in return.
You may call that spam but i have respect for great content like everyone else here. But its a fact, your blog have PR6 and links from here powers my website – according to google – so easy it has to be seen. The PR system is made that way by google and its a reality for everyone interested in SEO.
Be proud of your content and kick spammerass because i would not ever pollute your content with Viagra links.
(smart spam give life to your blog in return so its not that bad)
#49 by sola solarium on October 19th, 2007
and by the way… what the hell happend to your template??
#50 by Aufero on October 19th, 2007
The dark side of the Turing test shows itself.
Apparently programs don’t need to be able to convincingly simulate the intelligence of humans – they just need to simulate the ways in which humans can be stupid.