What is CAPTCHA – the Hidden and the Fun Side of it

More Than Just an Annoying Sequence of Distorted Characters

Ever felt annoyed by having to decipher and type in a two-word phrase that is meant to prove you are not a computer program? CAPTCHA is used all over the web, so I'm pretty sure everyone has experienced it (and some even take the experience to a creatively artistic level).

Here at TechUseful, we are pampered with a nice version of “captcha”, which proves that the concept can be wrapped up in a very user-friendly form. But there are reasons we might sincerely start to wish for the other, annoying version, because – surprise, surprise – there's a noble cause behind it. Maybe you will share some of my enthusiasm …

A Massive Collaboration Project

It's about digitizing books!

Many websites use CAPTCHA to prevent bots from automatically submitting large numbers of requests or posting spam. Computer programs are not that good at reading distorted characters, so at least for now it works fine. The inventor, Luis von Ahn, felt proud at first, seeing how his internet security invention was adopted by Google and then spread widely all over the web.

But, as we could expect from such an inventive mind, he also took a look from the other side – the user's side. By his calculation, people type in about 200 million CAPTCHAs every day. Ten seconds of annoyance multiplied by 200 million comes to more than 500,000 hours of wasted time every day. So he started thinking hard about how to put all this time to good use.
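The arithmetic above is easy to check. The short sketch below uses only the two figures quoted in the text (200 million CAPTCHAs a day, 10 seconds each); the constant and function names are made up for illustration. The exact result is a little over 555,000 hours, so the 500,000 figure is a conservative round number.

```python
# Figures quoted in the article; the names are illustrative only.
CAPTCHAS_PER_DAY = 200_000_000
SECONDS_PER_CAPTCHA = 10
SECONDS_PER_HOUR = 3600


def wasted_hours_per_day(captchas=CAPTCHAS_PER_DAY, seconds_each=SECONDS_PER_CAPTCHA):
    """Total human time spent on CAPTCHAs in one day, in hours."""
    return captchas * seconds_each / SECONDS_PER_HOUR


if __name__ == "__main__":
    hours = wasted_hours_per_day()
    print(f"{hours:,.0f} hours per day")       # 555,556 hours per day
    print(f"{hours / 24 / 365:,.1f} years")    # the same time expressed in years
```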

The system, now called reCAPTCHA, has a slightly changed philosophy: one of the two words shown on the screen is a real scan from a text that is being digitized. Of course, there's no way for the system to know whether that entry is correct, so the other word, whose answer is already known, is there for verification purposes.
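The two-word idea can be sketched in a few lines. This is only an illustration of the principle described above, not the real reCAPTCHA code: the function names and the simple case-insensitive comparison are assumptions. The known word decides whether the visitor passes; the answer for the unknown word is kept as a "vote" only if the known word was typed correctly.

```python
def normalize(word: str) -> str:
    """Ignore case and surrounding whitespace when comparing answers (an assumption)."""
    return word.strip().lower()


def check_submission(typed_control: str, typed_unknown: str, control_answer: str):
    """Return (passed, vote).

    passed: True if the visitor typed the known (control) word correctly.
    vote:   the visitor's reading of the unknown scanned word, or None if
            the control word was wrong and the answer cannot be trusted.
    """
    passed = normalize(typed_control) == normalize(control_answer)
    vote = normalize(typed_unknown) if passed else None
    return passed, vote


if __name__ == "__main__":
    # "morning" is the known word; "overlooks" is the visitor's reading of the scan.
    print(check_submission("Morning", "overlooks", "morning"))
    print(check_submission("mourning", "overlooks", "morning"))
```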

A Project to Archive Human Knowledge

Little contributions add up to a giant project

Pages of books that were printed before computers even existed are first scanned. Optical Character Recognition (OCR) software is then used to convert the images to text. However, this process is likely to produce at least some mistakes, and the words that are not recognized with a sufficient degree of reliability need to be checked. This is where reCAPTCHA jumps in: the same ambiguous image is served to several different people, to make sure that the “human OCR” results agree with each other.
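The "agreement" step can be pictured as a simple vote. The sketch below is a guess at the general idea, not the actual reCAPTCHA algorithm: the function name and both thresholds (at least 3 answers, at least 75% of them identical) are assumptions chosen for the example.

```python
from collections import Counter


def consensus(votes, min_votes=3, min_share=0.75):
    """Return the agreed reading of a scanned word, or None if people disagree.

    votes:     answers typed by different visitors for the same image.
    min_votes: how many answers to collect before deciding (assumed value).
    min_share: fraction of answers that must match (assumed value).
    """
    cleaned = [v.strip().lower() for v in votes if v and v.strip()]
    if len(cleaned) < min_votes:
        return None  # not enough answers yet, keep serving the image
    word, count = Counter(cleaned).most_common(1)[0]
    return word if count / len(cleaned) >= min_share else None


if __name__ == "__main__":
    print(consensus(["upon", "Upon", "upon", "up0n"]))  # three of four agree
    print(consensus(["upon", "upan"]))                  # too few answers so far
```

A word that never reaches agreement would simply keep circulating, or be handed to a human proofreader; the text does not say which.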

Luis von Ahn: Massive-scale Online Collaboration

The inventor speaking and joking about CAPTCHA

An extremely entertaining explanation of what CAPTCHA does, of CaptchArt, and of his next ingenious idea, Duolingo: how to translate Wikipedia in less time than it took Jules Verne's hero to travel around the world.

Witty, informative, with funny instances of CAPTCHA word combinations and CaptchArt. 16 minutes of fun, creating a great sense of being part of a huge community that is doing something valuable together.