Broken Toys
Random comments about games and tractors
We’re Losing The Spam War
This may not be the most popular blog on the Interwebs, but it’s one of the oldest. Thanks to that dubious distinction, I seem to be regularly afflicted with the leading edge of spammer technology.
It started with comment spam: someone posting links upon links upon links, usually of porn sites or ads for erectile dysfunction drugs or similarly snake-oily schemes. I’m not sure if there’s a person on the planet who would actually see a post like this and remove money from their wallet somehow, but someone must, because I still get those.

Then we have the next step up in spammer intelligence – instead of links, just page after page of search terms. I’m fairly certain this is used to game Google to present useless results for totally unrelated searches, but I’m not really sure how it works. Still, it must, because I still get those too.

Still, these are fairly easy for blog programs to detect and suppress. Which is a good thing, because I get about 400 of these a day. Four. Hundred. A day. And that’s on a slow day. Sometimes it goes up to four thousand. During that same time, since this is a relative backwater of the interwebs, I get maybe …ten? legitimate actual comments.
So, the spammers decided to escalate a bit, and try to conceal their payloads as those ten legitimate comments.

Now blog programs can still make an effort to detect these. The spambots used to post these usually use the same fractured English fragments, and of course they still have to include a link to whatever useless crap they’re trying to sell. So there’s still some hope of combating this, right?
Gentlemen and ladies, I present to you…. smartspam.

This is in response to the YouTube video I recently posted of Sarah Brightman singing about her disco love for starship troopers. The comment is actually on topic. As best I can guess, someone is actually paying people (probably through a pay-per-click scheme) to leave somewhat relevant comments with the spam payload.
The result: Bayesian spam prevention, the best hope so far of limiting spam, is broken. Trying to filter this spam via the comment payload now runs a significant risk of blocking legitimate commenters. (Which is already happening – there’s one regular commenter whose comments are regularly blocked by the spam filter now.) You could try to block on the link payload… but that changes, and doesn’t use obvious key words any more.
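To see why on-topic spam hurts a Bayesian filter, here is a minimal sketch of a word-count naive Bayes scorer in Python. It is not the filter this blog actually runs; the training comments, the function names `train` and `spam_log_odds`, and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter

def train(comments):
    """Count word occurrences per class from (text, is_spam) pairs."""
    counts = {True: Counter(), False: Counter()}
    for text, is_spam in comments:
        counts[is_spam].update(text.lower().split())
    return counts

def spam_log_odds(text, counts):
    """Sum per-word log-likelihood ratios with add-one smoothing.

    Positive means the words look like spam, negative means they look
    like legitimate comments. Class priors are ignored for simplicity.
    """
    vocab = set(counts[True]) | set(counts[False])
    spam_total = sum(counts[True].values()) + len(vocab)
    ham_total = sum(counts[False].values()) + len(vocab)
    score = 0.0
    for word in text.lower().split():
        p_spam = (counts[True][word] + 1) / spam_total
        p_ham = (counts[False][word] + 1) / ham_total
        score += math.log(p_spam / p_ham)
    return score

# Made-up training data, purely for illustration.
training = [
    ("cheap cialis buy now", True),
    ("buy cheap pills online now", True),
    ("great song love the disco video", False),
    ("that starship troopers video is great", False),
]
counts = train(training)

classic_spam = "buy cheap cialis"
smart_spam = "love the starship troopers video"
```

With this toy data the classic pitch scores positive (spam-like) while the on-topic comment scores negative, so the words alone no longer separate the two. Only the attached link would give the second one away.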
Hell, in the past week or so I’ve gotten comments that I can’t even tell are spam or not.
We’re losing the war. And eventually the Internet will be nothing but bots selling Cialis Jessica Alba plane ticket porn to each other.
This, then, is the bright new future of new media.

about 2 years ago
I refuse to believe that people give -that much of a shit- about Paris Hilton.
about 2 years ago
So their goal is to just get their URL in front of us? I don’t get it. I’ve never even thought about touching anyone’s links on their comments here or anywhere.
I’m just not that bored.
about 2 years ago
When the first truly artificial intelligence arrives, it will be a spambot…..
God help us all…..
about 2 years ago
lum- you’ll have to turn on captchas for the comments to protect yourself. As for your email, if you have control over your own server (or even if it’s hosted with someone like ensim), you can install dspam on there and teach it to filter out the spam before it hits your throughput twice.
about 2 years ago
Eventually, we’ll all have to go back to walled fortresses.
Even the smallest blog that doesn’t want to turn into a cesspool of spam shitnibblets will have to be password-protected and invite-only. I know that sounds repugnant to everybody that had this utopian dream for the Internet and information sharing, but I don’t see another way.
about 2 years ago
It seems the best temporary approach is just to axe the website field. That at least saves you the pain of determining if it is spam or not – the fluffy warm spam comments can be accepted as fact and make your day brighter rather than serving to warn of the oncoming apocalypse.
about 2 years ago
The problem is that spammers have realized that they can circumvent any spam protections by hiring cheap human labor in Asia to post their spam for them (I know this because once in a while one gets through my forum captcha and email verification). So even captcha becomes useless after a while. And some bots can read captcha too.
The goal is traffic. They want you to go to their site to
A. Sell you something
B. Infect your computer so they can turn it into a zombie that sends spam
As far as Paris Hilton… it’s a holdover from when the top search term on the net was that video of her and her lover. I doubt spammers care that much about her either.
about 2 years ago
That is frickin’ ridiculous.
Brask makes a good point though, no website field and your spam suddenly becomes inspiration to write more.
Spam won’t stop until people get smarter. It only takes one in a million people to actually fall for the trap for it to be successful.
about 2 years ago
What’s even sadder is that spam is moving away from “please click me” to “here is an advertisement” right in front of your face. Blogger users are starting to see widespread comment spam that has nothing to do with links and really tries hard to sneak advertising into legitimate-looking comments. I manually delete a dozen or so comments a day that are just amazingly off base, but contain absolutely no links. And if Lum is the backwater, I am just a cinder block someone threw in the pond.
about 2 years ago
Our next layer of defense will be when the Internet generation starts becoming sexually impotent in enough numbers to exert some pressure on Cialis.
about 2 years ago
Sorry but I really feel this post about spam is sensationalism.
In all aspects of any society, it is a given that there are enough people so that all avenues of revenue will be tried, all doors will be opened, and all means of gaining an advantage over the other individual will be taken.
Just assume that the other 4294967295 IP addresses are out to get you. Turn on your captcha, lock your front door, and cancel your credit card when you lose your wallet.
The internet is not a utopia.
(FYI: Captchas are defeatable, but they require more CPU and development time => more expensive to spam)
about 2 years ago
Having trouble keeping your spam filter up? Try Cialis!
… Sorry, I had to.
about 2 years ago
Nuke them from orbit.
It’s the only way to be sure.
about 2 years ago
> Sorry but I really feel this post about spam is
> sensationalism.
Captchas have been broken, both through trivial image recognition and through simply paying someone 1/4 cent to type them in.
Another fun statistic: 90% of all email on the internet is spam. The basic, core internet service largely consists now of bots fighting each other over whether or not it’s the 10% of the email that makes it into your in-box, simply because it’s “sensationalism” that we should do something about email being subverted wholesale by sleaze peddlers.
It’s not sensationalism; it’s why comments here and elsewhere will shortly go invite-only, simply because all the tools you glibly recommend we use to “lock our doors” are broken.
about 2 years ago
As best I recall, Google bases some of their search algorithm on the number of links to your page. The more links you have, the higher up your listing appears. So the need isn’t for the person looking through the comments to click on the link. It just needs to be there for the Google bots to pick up.
about 2 years ago
I don’t allow comments on my blog for just this reason. IMO, a blog is not a discussion forum. My blog is a way for me to tell -all- of my friends about relevant issues and news about me without having to tell all of them individually. It is, by its very nature, a one-to-many communication. If a specific reader has a comment, they are free to email me personally. I’m much more likely to respond to that.
I just don’t feel that blogging should be a community-building activity.
If a blogger is looking to have feedback on an article, they should invest in a web-based bulletin board. Simply set it up so that the blogger/admin is the only account that can create new threads and you’ve just mimicked the typical blog+comment mechanic.
I’m not sure that would be any better WRT spam, but I think that’s my point.
about 2 years ago
Stupid commented:
Comments on blogs are apparently for irony.
about 2 years ago
I’ve had good results from just using Akismet (WordPress’s spam filter). The stuff that gets through is the weird maybe-spam stuff — the “This is great page” style comments on entries that don’t make sense, without any obvious links to products.
Invitation-only sounds too stifling to be plausible. I’d expect to see more in the way of smart manipulation of comment forms — fields named “url” that are hidden with CSS and cause the comment to be rejected if they’re filled in. That sort of thing.
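The hidden-field trick can be sketched in a few lines of Python. This is only an illustration under assumptions: the form is taken to arrive as a plain dict, the function name `is_probably_bot` is made up, and the field is assumed to be hidden from human visitors with CSS so that only automated form-fillers populate it.

```python
def is_probably_bot(form_data, honeypot_field="url"):
    """Return True if the hidden honeypot field came back non-empty.

    Human visitors never see the field (it is hidden with CSS), so any
    value in it suggests an automated form-filler.
    """
    value = form_data.get(honeypot_field) or ""
    return bool(value.strip())

# Hypothetical submissions.
human = {"name": "TK", "comment": "Nice post.", "url": ""}
bot = {"name": "x", "comment": "This is great page", "url": "http://spam.example/"}
```

A comment for which `is_probably_bot` returns True would be rejected or sent to moderation. The check costs nothing for real readers, but it only works until bots learn to skip that field.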
about 2 years ago
“Captchas have been broken, both through trivial image recognition and through simply paying someone 1/4 cent to type them in.”
Yes, but this is exactly how you defeat spam. You don’t beat it by making unbreakable filters, because there is no such thing and never will be. You break it by raising the price. Even with broken captchas, the time and effort goes up a fraction. That fraction increases the cost and lowers the profitability by a small margin. Every time we can slice off a little more of that margin, we make it less worthwhile for spammers, which is the only way to kill spam.
The only other way to stop spam dead would be real honest-to-god smart AI filters. But considering that it’s the spambots that are more vigorous, if anyone is to accidentally develop the first spontaneous AI it’s more likely to be born from the spambots themselves.
Self-aware spambots. Now there’s a creepy thought for the day.
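The price argument can be put in rough numbers. This is a back-of-the-envelope sketch in Python, assuming the 1/4 cent per solved captcha quoted above and the one-in-a-million response rate suggested earlier in the thread. The function names are made up, and real spam economics (botnets, link value rather than direct sales) are messier than this.

```python
def break_even_revenue(cost_per_post, response_rate):
    """Revenue per response needed for one spam post to pay for itself."""
    return cost_per_post / response_rate

def profit_per_post(cost_per_post, response_rate, revenue_per_response):
    """Expected profit of a single spam post."""
    return response_rate * revenue_per_response - cost_per_post

CAPTCHA_COST = 0.0025        # dollars: 1/4 cent paid per solved captcha
RESPONSE_RATE = 1 / 1_000_000  # assumed: one taker per million posts

needed = break_even_revenue(CAPTCHA_COST, RESPONSE_RATE)
```

Under these assumptions `needed` comes out at roughly $2,500 per response, where a post that costs nothing is profitable at any price. That is the sense in which even a broken captcha slices into the margin.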
about 2 years ago
On August 18th, 12:47AM, SpamBot became self aware…
about 2 years ago
“Invitation-only sounds too stifling to be plausible.”
Agreed. When you get down to it, though, it’s URLs that are the problem for comment spam. One simple solution that comes to mind is to automatically screen any comments with URLs (and comments that fill in the “website” box). Then when you unscreen, you also have the option to add the URL domain to a list of acceptable domains that aren’t screened in the future (thus quickly stopping the screening from your regular posters’ own URL linkbacks as well as youtube and the like). That would help, no?
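The screen-then-whitelist idea could look something like the following Python sketch. Everything here is hypothetical: the function names, the starting list of approved domains, and the simple regular expression used to spot URLs are assumptions, not part of any blog software.

```python
import re
from urllib.parse import urlparse

URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)

def domains_in(text):
    """Return the hostnames of all http(s) URLs found in text."""
    hosts = set()
    for url in URL_PATTERN.findall(text):
        try:
            host = urlparse(url).hostname
        except ValueError:
            host = None
        # Keep the raw URL if no hostname can be parsed, so it still gets screened.
        hosts.add(host or url.lower())
    return hosts

def comment_domains(body, website=""):
    """Collect domains from the comment body and the optional website field."""
    website = (website or "").strip()
    if website and "://" not in website:
        website = "http://" + website  # bare "example.com" style entries
    return domains_in(body) | domains_in(website)

def needs_screening(body, website, approved):
    """True if the comment mentions any domain not yet approved."""
    return bool(comment_domains(body, website) - approved)

def unscreen(body, website, approved):
    """Owner approves a held comment and trusts its domains from now on."""
    approved.update(comment_domains(body, website))

# Hypothetical starting list of trusted domains.
approved_domains = {"www.youtube.com"}
```

Comments with no links pass straight through, and a regular's linkback is held only once: after the owner calls `unscreen` on it, later comments carrying the same domain are no longer screened.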
about 2 years ago
Akismet + Bad Behavior works like a charm, for now…
about 2 years ago
There’s an article on /. that talks about the email spam problem. According to the article, over 95% of all email is now spam, and the past few months, some sysadmins have seen 98%+ of all emails being spam.
It’s an unfortunate problem with no easy solutions, people being only human and all.
about 2 years ago
In the old days, people had to be irritating jerks in person, selling Amway door-to-door…now, thanks to the miracle of computerization and the Internet, people can be irritating jerks through automation.
I, too, am amazed that there are enough people out there who actually buy the crap that spammers sell, enough to make this whole insane mess profitable…
about 2 years ago
“It seems the best temporary approach is just to axe the website field.”
I’d agree if doing that would stop someone from posting spam, but we know they’ll put it in the body of the comment if there isn’t a field for it. What I think might work better is to keep that field in there, but state in plain English that filling in that field will send your comment to the bitbucket.
about 2 years ago
Fuck captchas! They suck. I cannot tell you the number of times I’ve actually failed to prove via a captcha that I am an actual human being . . .
No I think requiring OpenId is the right approach, & then having shared whitelists among communities of bloggers for known good people.
TK
about 2 years ago
And yet despite all this my mailbox is still filled with 80% junk mail. Sometimes it doesn’t even make it through the slot because the mailman just took the wad and shoved it in, getting it stuck. So I go home, grab the wad out of the mail slot, walk in, throw 90% of it away, log in to my computer and empty all of my bulk mail folders (typical day: 3 new messages in inbox, 95 in bulk mail folder on Gmail, and 2 of my inbox messages are still spam)
I’m all for requiring logins for comments, though looking on a couple message boards even that doesn’t work all _that_ well.
about 2 years ago
I get junk mail for the previous 5 or 6 residents of my apartment.
The driver of my shuttle bus likes to play the local CBS radio news affiliate, which from what I gather is 80% ads and 20% sponsor-supported content. With mp3 and CD players these days, why does anybody even own a radio?
Gmail does a great job of filtering spam, maybe one or two get through per month – out of thousands.
However, if spammers are forced to use real people to get through automated defenses, the volume of spam will be cut dramatically (maybe by 99%). They would have to hire half the world population and have them work 18 hour shifts to generate comment spam at even a fraction of the rate of a bot (and all those computers running at the same time will bring on the premature cold death of the universe).
about 2 years ago
This all just fuels my conviction that 99% of all advertising – spam, junk mail, TV and radio ads – is wasted. Still, that 1% must be spending a metric butt-ton of money to keep the whole insane mess rolling along…
about 2 years ago
> It’s not sensationalism; it’s why comments here
> and elsewhere will shortly go invite-only, simply
> because all the tools you glibly recommend we
> use to “lock our doors” are broken.
Not broken, but not complete. They do their part.
I recommend this experiment. Create a vBulletin forum with sample trash data. Link to it at various places but don’t bother having people post. Just fill it up with trash.
- Allow guest posting and see how much spam you get.
- Turn on registration only but allow anyone to register, and see how much spam you get.
- Turn on your CAPTCHA for registration, repeat.
No less than three months ago I did this scenario and after 1 week of being live (with a new URL) I got:
- Registration on, no CAPTCHA: 1 or 2 spam posts a day.
- Registration on, CAPTCHA required: 1 spam post a month.
Protocols inherent to the internet allow spam, much like the game mechanics of MMOs allow gold farmers. Equally comparable is that regulation and prohibition won’t combat either.
“Something will work to block spam until the spammers defeat it.”
Well duh. Just like in the encryption world, any particular method is only good until it breaks! (example: AES encryption is only good as long as there are only deterministic computers)
1) Create a counter -> counter gets countered.
2) Declare the internet to be broken!
3) ???
4) Profit!
about 2 years ago
* Protocols inherent to the internet allow spam. The fault is the underlying principle that spamming generates more money than it costs. The same thing is comparable to the game mechanics of MMOs, which have gold farmers.
^^^ correct paragraph ^^^
I understand that you’re pissed about spam but dammit you could have done something to combat it in the time you made your post. Your solution (whatever it may be) probably would have combatted all spam for the rest of the blog.
I also assume that you responded to my comment because I said the dirty S (ensationalism) word. You don’t strike me as a sensationalist person, but really, reflect on your post and consider why you titled it “We’re Losing The Spam War,” aside from being snide (an aspect of your blog which I enjoy btw).
about 2 years ago
> I understand that you’re pissed about spam but dammit you could have done something to combat it in the time you made your post. Your solution (whatever it may be) probably would have combatted all spam for the rest of the blog.
Actually, no, since the blog’s remotely hosted at wordpress.com now, I’m pretty much forced to choose between (a) forced logins which will eventually be broken anyway I’m sure, and (b) relying on the default spam filter which is currently breaking down completely.
about 2 years ago
Pivot blocks spam. http://www.pivotlog.net/
Seriously.
about 2 years ago
Whitelists work right up until the point when someone pays enough money to get on them.
I host my WordPress blog on Dreamhost, and BadBehavior/Akismet has kept all but 6 posts from landing in the 2 years since I blitzed Movable Type.
The ones that got through? They came from registered and CONFIRMED users. My only solution (which likely wouldn’t work at all for you) was to blitz the user list and go to double-moderated users only. I have to approve your user before you can comment and approve your first comment. After that, you can post at will.
It’s ridiculous and the result is that no new comments have been posted since as the folks that would have posted something just sent me an email instead.
about 2 years ago
Also: I don’t know how much theme flexibility you get with a hosted wp blog, but a few judicious theme edits would remove users mail/url without much trouble. You just need to pass a couple optional parms to the php call that drops user info – it’s documented pretty well in the WP.API codex.
about 2 years ago
Im not sure why you publish my comment on your massive blog as spam with BOTH my work email and my ip-number.. but it doesent feel right.
Did i do something wrong here?
about 2 years ago
I guess that would prove my point that I can’t tell when something is or isn’t spam!
about 2 years ago
how many hits do you get a day?
about 2 years ago
It’s important to note that email spam and comment/forum spam have two distinct goals. Email spam is like a roman candle. You send out millions of emails and if it turns out only 0.0001% responded you send out millions more the next day. Who cares if everyone else deleted it? It’s cheap and easy.
Comment/post spam has a different goal. In this crazy place we call the Internet, most people use search engines to get around. Search engines use, as part of their algorithm, links. The more often they find a link to your site, the more likely you are to rank well when people do go looking for whatever it is you’re selling. Unlike email spam, where the goal is action by the end user, comment/post spam’s goal is mostly search engines because, somewhere, someone has set up a blog/forum/look-at-me page and isn’t savvy or concerned enough to delete that link to that less-than-legit site. They don’t want your clicks (but they’ll take them), they want the spiders to notice them. Get enough links that way and you too can win the biggest MMO of them all: The Internet.
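To make the link argument concrete, here is a simplified PageRank-style iteration in Python. It is a textbook sketch, not the algorithm any search engine actually runs, and the four page names are invented for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. links maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / len(pages)
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# A tiny hypothetical web. "shop" is the spammer's site; nobody links to it yet.
web = {
    "blog": ["news"],
    "news": ["blog"],
    "forum": ["blog"],
    "shop": [],
}
before = pagerank(web)

# The same web after spam comments plant a link to "shop" on the blog and forum.
spammed = dict(web, blog=["news", "shop"], forum=["blog", "shop"])
after = pagerank(spammed)
```

In this toy web the shop's score rises once the blog and the forum carry its link, even though no reader ever clicked anything. That is the whole payoff the spammer is after.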
about 2 years ago
It’s not even about advertising or that one in a million idiot who ponies up. It’s the same thing as covering stop signs with opaque black paint. It’s simple vandalism of the oldest kind — “look what I can do to you and you CAN’T STOP ME.” The spammers don’t care a whit about getting responses, they amuse themselves getting anonymous control over you. Even before you opened your mailbox, they’ve already received their reward, their payoff, knowing they’ll waste half an hour of your time and force you to wall yourself in with filters and white lists. They entered your electronic house and stole the only thing you had of value — your privacy. I recently started working at a company that has the best security in the business and I’ve never posted/published my company-issued alias anywhere — yet I still get spam there.
about 2 years ago
TPRJones – Actually that’s not true. Because the spammers ain’t paying. The botnets they use mean that other people are.
about 2 years ago
I used to have a real job many years ago where we sent junk mail to potential donors. (I did fundraising for a local theater by day, production by night.) As the lowest level member of the fundraising team, I got stuck with direct mail. Yes, even the people who inflict it hate it, and prefer cozy little auctions and snuggle time with known donors.
We bought or swapped mailing lists with local orchestras, art museums, etc. And if we couldn’t arrange a purchase or a swap, my job was to go snag a program and copy, BY HAND, the list of major donors. And cross reference with our own major donors since they tended to be the same people. And use the phone book to match addresses with the names of the handful that didn’t cross reference. I did have a very early computer at this job, so I could cross reference without too much pain, and at the end I could auto-generate the address labels.
All that effort was towards the holy grail of 2% returning my postage paid envelope with a check enclosed. The month I got 3% back I was taken to dinner and celebrated as a god.
All that work for 2%. If all I had to do was click a few buttons to send my bot on its way to get 1% return? It would have been a MASSIVE profit to my company. Shrimp nets are faster than targeted hook and line setups. Because of this, spam is not going to be stopped by any of the means bandied about here except invite only filters. The things some of you suggest would have to be increased by many orders of magnitude to affect the margin I explained above.
about 2 years ago
Captchas are only successful until you’ve been targeted by the spammers. As others have mentioned, they’re defeatable by image readers, low-cost labor overseas and one that I don’t think anybody mentioned – porn click-through schemes.
Plus, if you have a site large enough you have to provide a captcha with a version for the visually impaired, which is way easier to break than the image version.
The other thing you guys are missing – for most spammers there is almost 0 cost to doing these things. All the spam is coming from infected zombie systems.
about 2 years ago
Well, if the first AI existing is going to be a spambot… then the only way to battle it will be the creation of an AI spambot slayer!
And then, when they battle it out and eventually the anti-spam-bot AI wins and takes over the Matrix, at least we’ll be ad-free?
about 2 years ago
The best way I found is to moderate comments, so the comments won’t be seen by others till the blog owner allows it.
Another interesting thing I observed on one of my tiny-traffic forums. Initially anyone could post. Soon I noticed spam for porn sites and some very strange links. So I set it to only allow posting from registered users. After a few days spam appeared again, and the rate at which it increased gradually surpassed my patience to delete it. I then set it to only activate a user after I approve it (yes, the spam bots pass the images etc. easily), and that did the trick. Even though I have to mass-approve the users every night (most of them with strange names, so they might be spammers), there is no spam on my site any more.
That only tells me one thing. If you time-delay user activation, the spammers won’t be able to get you. I think the reason is that the spammers apply for email addresses and are forced to give them up after a certain time (maybe after they get reported and the email account is banned).
about 2 years ago
From L’Emmerdeur:
why does anybody even own a radio?
For NPR and the local college radio station, of course.
As far as stopping the blog spam? Keeping the comment pages in your robots.txt should stop the more intelligent spammers. Though there is no promise of intelligence when it comes to bots. Why waste even $0.00001 on a comment that will NEVER get indexed? This and forbidding HTML links would go a long way to slowing spammers (IMHO).
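As a rough illustration of the robots.txt suggestion, the Python sketch below parses a sample rule set with the standard library's `urllib.robotparser` and checks which URLs a well-behaved crawler would fetch. The `/comments/` path, the `blog.example` host and the `ExampleBot` agent name are assumptions; real blog software may serve comments at quite different URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers away from comment pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /comments/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def crawlable(url, agent="ExampleBot"):
    """True if a crawler that honors robots.txt may fetch this URL."""
    return parser.can_fetch(agent, url)

comment_page = "http://blog.example/comments/123"
post_page = "http://blog.example/2007/08/some-post/"
```

Here `crawlable(comment_page)` is False and `crawlable(post_page)` is True. Note that robots.txt is only advisory: it keeps links out of the index of crawlers that respect it, but it does nothing to stop a bot from posting.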
about 2 years ago
i want juice, linkjuice.. power from your Pagerank 6. Your blog almost spams my site with backlinks if i get into your comments. But i like your content because you are an expert on giving life and personality to your blog.
I read every post word by word and post a comment spot on just to get favours and to give life in return.
You may call that spam but i have respect for great content like everyone else here. But its a fact, your blog have PR6 and links from here powers my website – according to google – so easy it has to be seen. The PR system is made that way by google and its a reality for everyone interested in SEO.
Be proud of your content and kick spammerass because i would not ever pollute your content with Viagra links.
(smart spam give life to your blog in return so its not that bad)
about 2 years ago
and by the way… what the hell happend to your template??
about 2 years ago
The dark side of the Turing test shows itself.
Apparently programs don’t need to be able to convincingly simulate the intelligence of humans – they just need to simulate the ways in which humans can be stupid.
about 2 years ago
So you can go to jail or be fined for buying drugs or stolen goods off someone.
How long will it be before you can be sent to jail or fined for buying something via a spam link?
about 2 years ago
Don’t forget to let me inside when you lock up the door for good, Scott! You know my real email.
I had massive spam problems on a forum. One for my gaming guild, in fact. Captchas didn’t slow them down (admittedly my software used fairly lame captchas) and admin approval, sorting one new user from a hundred new spammers, was getting to be a chore. Before I switched to another means — which I won’t detail because if too many people start doing it, it’ll be useless — I used the URL field as a test. I said prominently in the signup form that if you entered any URL, you would not be approved. Then I could just look over the new users and approve only the ones who followed the instructions — i.e., didn’t act like bots. It nailed 100% of the bots, and only got one or two false positives from lamers who can’t follow instructions; it was just too labor-intensive for the long run.
I do find it ironic that you are using WordPress despite your stated dislike of link spam and Google results manipulation (something which, of course, harms all of us who want valid search results). Have you so quickly forgotten about PhotoMatt and his defense of how it’s not wrong for him to make a few bucks by polluting search results and degrading the utility of search for everyone else? And his blog full of glowing “approval” because he deleted any dissenting comments? I won’t use WordPress for that reason alone. Even for something free, sometimes the price can be too high.
If you’ve got a way to kill comments based on whether or not they entered a URL, the idea of hiding the URL field with CSS would be a good stopgap measure.
By the way, the real target of this spam isn’t exactly your users. They’re not where the money comes from. The goal is to get people to land on a page well-populated with Google’s (or any competitor’s) targeted ads, especially if they’re pay-per-impression ads. That’s why the search companies don’t make the effort they could to put a stop to it. They’re making money off this too, as their advertisers have to spend more on advertising to actually get it in front of the eyeballs they want to reach.
about 2 years ago
its so easy getting rid of all spam.
1. dont allow links (name becomes link to website info)
2. do not allow html in posts
by doing this there will be no seo trix because we cant get backlinks, see how easy it is?
about 2 years ago
I think you’re a spammer, sola solarium.
about 2 years ago
The blog software that this site uses is WordPress, which has had comments invisible to search engines for at least the last several versions (2.0, perhaps earlier). There’s no need to make a robots.txt alteration. Once other blogging software follows suit the link spam comments — the ones trying to increase web presence — should stop.
If you’re not concerned about getting a high search engine rating, use the WordPress option to make your entire blog invisible to Google, Technorati, etc. Let people find you by word of mouth, white lists, and so on. If you don’t show up on search engines, the spammers can’t find you and make you a target.
I think many commenters are missing the point. Akismet, et al, filter spam just fine. The problem is if you want to look through the comments marked as spam every day to see if there was a mistake. (It happens.) For a lot of bloggers, myself included, that’s too much work. It takes the fun out of blogging.
about 2 years ago
Peer moderation + optional logins is the way to combat this. Just follow the Slashdot model – they have ZERO spam issues.
about 2 years ago
Google needs to go to a non link based ranking ontology. I have no idea how the fuck you’d do such a thing, but they’re smart and work at google, they should be able to figure it out.
24% of Americans think Bush is doing a good job, after that 1% of Americans actually shopping via spam no longer surprises me.
about 2 years ago
Ultimately, the internet will end up talking to itself, about nothing. Aliens will find this in their search for intelligent life elsewhere in the universe, and decide there is none here, which will save our planet from colonization by bug-eyed beasties from outer space.
Either that or it’s all a commercial plot to close down the internet so a “totally secure, commercial environment” can be plonked down in its stead, and everyone can be required to pay $100/300/month just to have access, before they can pay money for anything else on it.
That’ll be right before western civilization comes to a complete halt and the cockroaches take over.
about 2 years ago
The krogpersonal and sola solarium comments above are from spam commenters. Sola Solarium explained why he/she/it benefits from posting here due to the link back from a high page ranking site. But it’s just spam per se, with a more personal touch I guess. Both sites are in Swedish (I think).
I like the articles you post about these issues because it’s both alarming and interesting at the same time (to me anyway). Is this all the internet is destined to become, a cesspit of useless spam? And I’m firmly for net neutrality but some new systems/tools need to be made to combat this.
about 2 years ago
I guess you just need to start blocking any kind of links at all.
Of course, the time spent on such efforts is part of why I still don’t blog myself!
about 2 years ago
Argh; did your spam post earn you a whole slurry of spam trackbacks too, Scott?