Breaking the PayPal.com CAPTCHA

Posted by Kurt on May 12th, 2008

The PayPal.com CAPTCHA suffers from several weaknesses: fixed font face, fixed font size, no distortions, trivial background noise, and it’s easy to segment. In this experiment, a three-step algorithm has been developed to break the PayPal CAPTCHA. The image is preprocessed to remove noise using thresholding and a simple cleaning technique, and then segmented using vertical projections and candidate split positions. Four classification methods have been implemented: pixel counting, vertical projections, horizontal projections, and template correlations. The system was trained on a sample of 20 PayPal CAPTCHAs to create 36 training templates (one for each character: 0-9 and A-Z). A separate sample of 100 PayPal CAPTCHAs was used for testing. The classifiers achieved the following success rates: pixel counting 8%, vertical projections 97%, horizontal projections 100%, and template correlations 100%. Three of the trained classifiers outperform the 88% success rate of Pwntcha.

Example

Preprocess

  1. Original:
  2. Grey Scale:
  3. Thresholding:
  4. Further Cleaning:
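
The four preprocessing steps can be sketched roughly as follows. This is plain Python rather than the MATLAB from the paper, and it is only an illustration: the function names, the cutoff of 128, and the neighbour-count cleaning rule are all assumptions, since the exact threshold and cleaning technique are described in the paper and not here.

```python
def to_grayscale(rgb):
    """Convert rows of (r, g, b) tuples into rows of 0-255 grey levels."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def threshold(grey, cutoff=128):
    """Binarize: 1 marks ink (dark pixels), 0 marks background.

    The cutoff of 128 is an assumed value, not the one used in the paper.
    """
    return [[1 if value < cutoff else 0 for value in row] for row in grey]


def clean(binary, min_neighbours=2):
    """Remove ink pixels with fewer than `min_neighbours` ink pixels among
    their 8 neighbours. This is one simple cleaning rule, assumed here."""
    height, width = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for y in range(height):
        for x in range(width):
            if not binary[y][x]:
                continue
            # Count ink in the 3x3 window, then subtract the pixel itself.
            neighbours = sum(
                binary[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ) - 1
            if neighbours < min_neighbours:
                cleaned[y][x] = 0
    return cleaned
```

Chaining the three calls, `clean(threshold(to_grayscale(image)))`, turns a colour image into a binary one in which isolated specks of noise have been dropped.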

Segment

  1. Segmented:
  2. Padded:
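
A minimal sketch of the segment and pad steps, again in plain Python and under some assumptions: it splits only at completely empty columns, which is the simplest form of a vertical-projection split, and it centres each character in a fixed-width field. The candidate split positions used in the paper are not reproduced here, and the helper names are made up for illustration.

```python
def vertical_projection(binary):
    """Number of ink pixels in each column of a 0/1 image."""
    return [sum(column) for column in zip(*binary)]


def segment(binary):
    """Return (start, end) column spans, end exclusive, one per run of
    columns that contain ink. Empty columns act as the split positions."""
    projection = vertical_projection(binary)
    spans, start = [], None
    for x, count in enumerate(projection):
        if count and start is None:
            start = x                      # a character begins here
        elif not count and start is not None:
            spans.append((start, x))       # a character ended at x - 1
            start = None
    if start is not None:                  # character touching the right edge
        spans.append((start, len(projection)))
    return spans


def crop_and_pad(binary, span, width):
    """Cut one character out of the image and centre it horizontally in a
    field `width` columns wide (the centring is an assumed convention)."""
    start, end = span
    glyph_width = end - start
    if glyph_width > width:
        raise ValueError("character is wider than the padded field")
    left = (width - glyph_width) // 2
    right = width - glyph_width - left
    return [[0] * left + row[start:end] + [0] * right for row in binary]
```

Padding every character to the same size is what makes the later comparisons against templates straightforward, since both images then have identical dimensions.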

Classify

  • Pixel Counting: 8% Break Rate
  • Vertical Projections: 97% Break Rate
  • Horizontal Projections: 100% Break Rate
  • Template Correlations: 100% Break Rate
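
To give a feel for the four classifiers, here is a small Python sketch. It is not the MATLAB code from the paper: the nearest-template rule with squared distance and the Pearson-style correlation coefficient are assumptions chosen for illustration, as are the function names. Each character image and each template is a same-sized grid of 0/1 values.

```python
import math


def pixel_count(glyph):
    """Total number of ink pixels."""
    return sum(map(sum, glyph))


def vertical_projection(glyph):
    """Ink pixels per column."""
    return [sum(column) for column in zip(*glyph)]


def horizontal_projection(glyph):
    """Ink pixels per row."""
    return [sum(row) for row in glyph]


def correlation(a, b):
    """Correlation coefficient between two same-sized images."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                            * sum((y - mean_y) ** 2 for y in ys))
    return numerator / denominator if denominator else 0.0


def classify_by_projection(glyph, templates, feature):
    """Pick the template whose projection is closest (squared distance)."""
    target = feature(glyph)

    def distance(label):
        return sum((p - q) ** 2
                   for p, q in zip(target, feature(templates[label])))

    return min(templates, key=distance)


def classify_by_correlation(glyph, templates):
    """Pick the template that correlates most strongly with the glyph."""
    return max(templates, key=lambda label: correlation(glyph, templates[label]))
```

For example, with toy 3x3 templates `{"T": [[1,1,1],[0,1,0],[0,1,0]], "L": [[1,0,0],[1,0,0],[1,1,1]]}`, both characters contain five ink pixels, so a single pixel count cannot tell them apart, while their projections and correlations differ clearly. That may be part of why pixel counting fares so much worse than the other three methods, though the paper is the place to look for the actual analysis.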

Paper

The final paper, including MATLAB source code, sample runs, and results, can be downloaded here or from the RIT Digital Media Library.

Presentation

A copy of the slides used for a presentation of this experiment can be downloaded here.

Data

The 20 training and 100 testing PayPal CAPTCHA images are available to download here.

Source Code

Complete MATLAB code (281 lines, well commented) for preprocessing, segmenting, and classifying the images is available here.

YouTube Video

Note that this video wasn’t created by me. Skip forward to approximately the 1 minute mark.

Breaking the ASP Security Image Generator

Posted by Kurt on February 28th, 2008

For my independent study, I investigated optical character recognition techniques and their application to recognizing text-based Human Interactive Proofs (HIPs), the tests used to distinguish human users from machines on the internet. This study extends methods covered in my neural networks and machine learning, computer vision, and artificial intelligence courses. The report includes experimental results of breaking the ASP Security Image Generator (CAPTCHA) v2.0 with a 72% success rate. I don't currently plan to post the source code. However, my paper contains fairly detailed steps and can be downloaded here.

Joined the Document and Pattern Recognition Lab

Posted by Kurt on October 1st, 2007

For my Master’s thesis, I’ve decided to work in the Department of Computer Science’s new Document and Pattern Recognition Lab (DPRL), advised by Dr. Richard Zanibbi. My area of research will be Human Interactive Proofs / CAPTCHAs.

Automating Human Verification

Posted by Kurt on February 8th, 2007

This is one of my first papers on CAPTCHAs, written for the Privacy and Security course taught by Warren R. Carithers. The survey paper can be downloaded here.


Copyright © 2008 kloover.com. All rights reserved.
**This is my personal blog. The views expressed on these pages are mine alone and not those of my employer.**