This is a follow-up to yesterday’s post, in which I argued that obfuscating an email address does not work and is useless. Many commenters replied that they think obfuscation helps because the bots are not interested in obfuscated email addresses.
So let’s recap: we use obfuscation to prevent spam bots from harvesting email addresses. We therefore obfuscate in such a way that
- Humans are able to read the email address
- Computers are not able to read the email address
That sounds like a CAPTCHA. If you don’t know the dos and don’ts of CAPTCHAs, I recommend reading the information on captcha.net. One of the most important points is that you shouldn’t use a CAPTCHA that breaks as soon as everybody uses it, that is, the moment the bots start to support it.
Now let’s get back to obfuscated email addresses. I think we can agree that obfuscation is conceptually broken. Compare it with cryptography: even though there is no real use case for attacking MD5, nobody would use it to digitally sign important documents any more.
As soon as the harvesters start to search for obfuscated addresses, they will find them. If you obfuscate an email address on the web today and the harvesters start to de-obfuscate addresses in five years, they will find your address. Bad luck.
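To show how little effort that takes, here is a minimal sketch in Python. It assumes the common “name at host dot tld” style of obfuscation; the function name `deobfuscate` and both patterns are made up for illustration and are not taken from any real harvester.

```python
import re

# Hypothetical patterns for the "name at host dot tld" style (illustrative only).
_DOT = re.compile(r"\s+dot\s+", re.IGNORECASE)
_OBFUSCATED = re.compile(
    r"\b([\w.+-]+)\s+at\s+(\w+(?:\s+dot\s+\w+)+)",
    re.IGNORECASE,
)


def deobfuscate(text):
    """Return plain addresses recovered from 'name at host dot tld' patterns."""
    found = []
    for match in _OBFUSCATED.finditer(text):
        local, domain = match.group(1), match.group(2)
        # Turn every spelled-out " dot " in the domain part back into ".".
        found.append(local + "@" + _DOT.sub(".", domain))
    return found


if __name__ == "__main__":
    print(deobfuscate("write to foo at bar dot com today"))
```

A real harvester would need more variants (brackets, “(at)”, other languages), but each one is just another pattern, and a harvester can rerun them over old crawl data whenever it likes.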
So instead of using a broken CAPTCHA like obfuscation, we should use a secure CAPTCHA like the Mailhide service provided by reCAPTCHA. There are plugins for many programming languages, and it can be used, for example, to automatically replace all email addresses in a Mailman archive with a link to the CAPTCHA. It looks like this: jsm…@example.com
Solving a reCAPTCHA is usually much easier than solving a normal CAPTCHA because you get complete words, and it is probably much easier than working out some obscure obfuscation rule. It also helps to digitize books and newspapers, and in the end you get a link to click on.
I know that you will say “reCAPTCHA belongs to Google and Google is evil. I don’t want to give Google my email address”. If you think that, think again. Do you believe that the world’s biggest web harvester is unable to break the obfuscation you use? Have you never sent an email to a gmail/googlemail account? Don’t you use Jabber with Google Talk users? Don’t you have a Google account? Do you really think that Google doesn’t already know your email address? And if you really don’t trust reCAPTCHA, you could still use scr.im to get a tiny, CAPTCHA-protected URL. But I recommend using a well-tested CAPTCHA system.
To summarize: I agree that you should protect your email addresses on websites. But please do yourself a favor and do it properly. Obfuscation is broken, and it is only a matter of time until harvesters start to harvest obfuscated email addresses. There are services that provide a secure CAPTCHA to protect email addresses. Please use those. And no, this is not an advertising campaign for reCAPTCHA; it is just the best CAPTCHA service I know. If you know a better and more secure one that doesn’t belong to Google, please leave a note 🙂
Hi Martin!
I think your point is correct, but is it worth taking the time to change what we already have?
> Obfuscation is broken and it is only a matter of time till harvesters start to harvest the email addresses
It is, but how long is that? I made the point last time about low-hanging fruit, and I submit that it is enough for me that my fruit hangs higher. In my case, I fully realise that spammers could at any moment start executing JavaScript and evaluating the results. How long is it until they do? I like my solution, as the end user gets an un-obfuscated clickable ‘mailto’ link, whereas a (non-JS) bot doesn’t get anything that even remotely resembles an email address.
For anyone still using “foo at bar dot com” style obfuscation, I would say it is worth the time to change what you have for the following reason: /(\w+)\s+at\s+((\w+)(\s+dot\s+(\w+))+)/ (and that’s only a very simple one off the top of my head). Your fruit hangs low! 🙂
Steve
Obfuscation is not a CAPTCHA. Obfuscation/encryption and CAPTCHAs both belong to the family of challenge-response tests. I still don’t understand the point here, though. There are places where it’s perfectly fine to use obfuscation (which definitely does work) and where it would be quite out of place to use CAPTCHAs. On the other hand, I use reCAPTCHA frequently (on all comment forms, registrations, contact forms, etc.).
you just proved with your arguments that the captcha discussion is useless anyway, since google and other huge crawlers/service providers already hold information monopolies and there is *nothing* you can do about your mail address getting caught by them.
you can only hope that you cannot be held responsible for your mails by the state, which is cooperating with google in all countries. and hope they won’t do what they promise at the moment: share more information between different parties to improve crime detection.
bad luck, preventing porn ads in your mail account should be the least of your problems actually.
p.s. you can create a lot of free public e-mail accounts of course, but this won’t save you from their intensive scanning either. now set up your mail server and create accounts as you wish, but hey they know your domain and still will track you easily… maybe speak l33t? no, it is easy to do regexp transformations on strings today… so do it in tor or minion… now c’mon you paranoids, this is a political issue, which logically cannot be solved technically, but only politically. fight against surveillance or rest in quietness forever.
In Germany we have a problem here: the law requires you to disclose your contact information on your web site. Of course, the contact information should also be barrier-free…
The purpose of “obfuscation protection” is just to reduce the likelihood of email harvesting. Of course you can quickly write an attack script that grabs an email address. Otherwise you simply hand out the mail address to spammers.
Stop being so paranoid. If your mail is harvested, so what? You’ll get more spam in your spam folder, that’s all. I think the only way to stop spam is counterattacking with DDOS. Then the ISPs of those machines acting as zombies for massive nets are going to take note and do something.
@Sebastián Benítez: “I think the only way to stop spam is counterattacking with DDOS.”
DDOS is worse than SPAM. By definition, it eats lots of bandwidth. Besides which, it’s illegal.
I’m sorry, but your response regarding Google is nonsense. By solving the reCAPTCHA you provide Google with free labor (aka unpaid work). This labor directly contributes to Google’s value, e.g. as it makes their offering of scanned books better than that of any competitor. Google is extremely good at making people work for them for free. And this is the reason why you shouldn’t touch any of their products.
@Goo: if you don’t need digitized books: fine. I need them and I want them to be digitized, and if that can be done in an automated process, even better. And by the way, reCAPTCHA did not start as a Google service; that is a very recent development.