Jeff Yan and Ahmad Salah El Ahmad
Abstract. CAPTCHA is now almost a standard security technology. The most widely used CAPTCHAs rely on the sophisticated distortion of text images, rendering them unrecognisable to state-of-the-art pattern recognition techniques, and these text-based schemes have found widespread application on commercial websites. The state of the art of CAPTCHA design suggests that such text-based schemes should rely on segmentation resistance to provide a security guarantee, as individual character recognition after segmentation can be solved with a high success rate by standard methods such as neural networks. In this paper, we analyse the security of a text-based CAPTCHA designed by Microsoft and deployed for years at many of their online services including Hotmail, MSN and Windows Live. This scheme was designed to be segmentation-resistant, and it has been well studied and tuned by its designers over the years. However, our simple attack has achieved a segmentation success rate higher than 90% against this scheme. It took ~80 ms for our attack to completely segment a challenge on a desktop computer with a 1.86 GHz Intel Core 2 CPU and 2 GB RAM. As a result, we estimate that this Microsoft scheme can be broken with an overall (segmentation and then recognition) success rate of more than 60%. In contrast, its design goal was that "automatic scripts should not be more successful than 1 in 10,000" attempts (i.e. a success rate of 0.01%). For the first time, we show that a CAPTCHA that is carefully designed to be segmentation-resistant is vulnerable to novel but simple attacks. Our results show that it is not a trivial task to design a CAPTCHA scheme that is both usable and robust.
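As a rough illustration of how a segmentation rate can translate into an overall (segmentation and then recognition) rate, the sketch below multiplies the two stages together, assuming they succeed independently. The function name, the segmentation rate of 0.92 (chosen only to be consistent with "higher than 90%"), the per-character recognition rate of 0.95 and the challenge length of 8 characters are illustrative assumptions and are not stated on this page; only the 1-in-10,000 design goal is taken from the abstract.

```python
def overall_success_rate(segmentation_rate, per_char_recognition_rate, num_chars):
    """Estimate P(correct segmentation) * P(every character recognised).

    Assumes the two stages, and each character, succeed independently.
    """
    if not (0.0 <= segmentation_rate <= 1.0):
        raise ValueError("segmentation_rate must be in [0, 1]")
    if not (0.0 <= per_char_recognition_rate <= 1.0):
        raise ValueError("per_char_recognition_rate must be in [0, 1]")
    if num_chars < 1:
        raise ValueError("num_chars must be at least 1")
    return segmentation_rate * per_char_recognition_rate ** num_chars


# Design goal quoted in the abstract: no better than 1 success in 10,000 attempts.
DESIGN_GOAL = 1 / 10_000

# Hypothetical inputs, chosen for illustration only (not from this page).
ASSUMED_SEGMENTATION_RATE = 0.92
ASSUMED_PER_CHAR_RECOGNITION_RATE = 0.95
ASSUMED_NUM_CHARS = 8

estimate = overall_success_rate(
    ASSUMED_SEGMENTATION_RATE,
    ASSUMED_PER_CHAR_RECOGNITION_RATE,
    ASSUMED_NUM_CHARS,
)

if __name__ == "__main__":
    print(f"Estimated overall success rate: {estimate:.1%}")
    print(f"Design goal: {DESIGN_GOAL:.2%}")
    print(f"Ratio of estimate to design goal: {estimate / DESIGN_GOAL:,.0f}x")
```

Under these assumed inputs the estimate lands above 60%, which is several thousand times the stated design goal. Different assumed recognition rates or challenge lengths would change the figure, so this is a back-of-the-envelope check rather than the paper's own calculation.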
Draft research paper [PDF]
ACM CCS'08 version [PDF]
Frequently Asked Questions
Q. Who was responsible for this research?
A. This project is joint work by Jeff Yan and Ahmad Salah El Ahmad, both at the School
of Computing Science, Newcastle University, England.
Q. Are your programs or source code available?
A. Due to the sensitive nature of this research, we have not released programs or source code at this time.
Q. Have you notified Microsoft about these vulnerabilities? How did they respond?
A. We notified Microsoft of the weakness in their CAPTCHA in September 2007.
As requested by them, our paper was kept confidential until now (10 April 2008).
Some feedback from Microsoft on our work:
"...in an effort to show our appreciation for the hard work that you and your colleagues perform in helping us keep our online services products and customers safe, Microsoft has developed a New Security Researchers Acknowledgement Website for Microsoft Online Services.
The new website formally acknowledges Security Researchers that responsibly submit Security vulnerabilities found within Microsoft Online Services Products and applications. Because you responsibly submitted this case to us, we would like your permission to place your name, company, or alias on our new site."
Q. Who else was aware of your attack last year?
A. Luis von Ahn was briefed on our attacks and results when Jeff was visiting him at CMU in October 2007.
Q. Are there other CAPTCHAs vulnerable to your attack?
A. The CAPTCHA deployed at Yahoo until March 5, 2008 was
vulnerable to a variant of our attack.
Q. How is this different from other attacks, if any?
A. It was reported in "Automated crack for Windows Live captcha goes wild" (The Register, Feb 8, 2008) that
a surge of spam sent from Windows Live accounts had been observed, and that a security firm had analysed a bot
to understand what was behind this phenomenon. However, in this reported case,
the captcha decoding was not done by the bot but at a remote server. It is unclear whether
cheap human labor was behind the scenes, feeding captcha answers manually. Moreover,
even if the server did launch an automated attack, no technical details of that attack have been revealed to date.
Q. Have you broken other CAPTCHAs? Have you discovered other interesting attacks?
A. We have also broken the latest scheme, which Yahoo has
deployed at its global web sites since March 2008.
Our attacks on Yahoo CAPTCHAs are discussed in a recent manuscript entitled
"Is cheap labour behind the scene? - Low-cost automated attacks on Yahoo CAPTCHAs".
The manuscript has not been released yet (an abstract is available), but
a copy has already been sent to Yahoo.
Q. Why do you do all this?
A. We believe that CAPTCHAs will go through the same evolutionary development
as cryptography, digital watermarking and the like: an iterative process in which
successful attacks lead to the development of more robust systems.
Q. Who is funding this research?
A. This is part of our ongoing project, "Secure and usable CAPTCHAs".
Ahmad Salah El Ahmad is supported this year by a prestigious Overseas Research
Students (ORS) Award and by scholarships from both our school and our university.
We are looking for funding to support Ahmad's PhD study on CAPTCHA,
a young but important topic, in the coming years. Please contact us if you would like to offer support or help.
More papers from this project are in the pipeline.