How To Crack Captchas
June 5th, 2007This page will teach you how to write a not-necessarily-very-good programme to beat some common captchas, but it will not provide any useful code to do so for you. It should give you an idea how to go about defeating captchas not listed here. But mostly, I hope it will be instructive for anyone who wants to write a less easily defeated captcha in the future, since apparently you’re all hopeless at it at the moment.
As everyone in the world knows by now, most websites and forums use “captchas” to try and stop computer programmes from posting fake comments containing adverts. “Captcha” stands for “Completely Automated Public Turing test to tell Computers and Humans Apart”. And as everyone in the world ought to have realised by now, they don’t work.
There exist a number of ways around them, the most cunning and most effective, although the most difficult to set up, is to build a pornographic website and get real humans to solve the captchas for you in exchange for naked pictures.
But mostly, they’re easy to get around because they’re shit. This, for example, is the default captcha that comes with the now obsolete phpbb2:
That is very easy to solve. (It should perhaps be pointed out at this stage that my job is in large part to extract shy information from images.) As with all the algorithms I’ll show you, this is the first and simplest one I could come up with, and it’s only the start. In all cases I will extract a binary mask of the letters for transferal to a more general OCR system. Also in all cases, I will use Matlab 6 to perform the analysis. Read the rest of this entry »
[More Help]
