Bypass CAPTCHA on Website Using Python 3 & pytesseract OCR

The Python 3 snippet below reads the text out of a simple text/image CAPTCHA on a website using OCR (Optical Character Recognition). For the OCR step, I’ve used the pytesseract library, a Python wrapper around the Tesseract engine.

pip install pytesseract
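
Note that pip only installs the Python wrapper. pytesseract calls the Tesseract binary, and code.py shells out to ImageMagick's convert, so both have to be installed separately. Here is a minimal sketch of a pre-flight check, assuming the executables are named tesseract and convert on your system (the helper name missing_tools is my own and not part of any library):

```python
import shutil


def missing_tools(names):
    """Return the names from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust if your install uses different ones.
    missing = missing_tools(["tesseract", "convert"])
    if missing:
        print("Missing external tools:", ", ".join(missing))
    else:
        print("All external tools found")
```

Running this before the OCR script gives a clearer message than the error you would otherwise get from a failed subprocess call.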

code.py

import argparse
from subprocess import check_output

import pytesseract
from PIL import Image


def resolve(path):
    """Resample the image at `path` and return the text Tesseract reads from it."""
    print("Resampling the Image")
    # Needs ImageMagick's `convert` on PATH. The output path equals the input
    # path, so the original file is overwritten with the resampled image.
    check_output(['convert', path, '-resample', '600', path])
    return pytesseract.image_to_string(Image.open(path))


if __name__ == "__main__":
    argparser = argparse.ArgumentParser()
    argparser.add_argument('path', help='Captcha file path')
    args = argparser.parse_args()
    print('Resolving Captcha')
    captcha_text = resolve(args.path)
    print('Extracted Text', captcha_text)
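
The string returned by image_to_string can carry whitespace, line breaks, or stray punctuation around the characters you actually want. Here is a minimal sketch of a clean-up step, assuming the CAPTCHA contains only ASCII letters and digits (the helper name clean_captcha_text is hypothetical and not part of pytesseract):

```python
import re


def clean_captcha_text(raw):
    """Keep only ASCII letters and digits from the OCR output."""
    # Assumption: the CAPTCHA alphabet is A-Z, a-z and 0-9; everything else is noise.
    return re.sub(r"[^A-Za-z0-9]", "", raw)


if __name__ == "__main__":
    print(clean_captcha_text(" aB3 x9\n\x0c"))
```

In code.py you could wrap the call as clean_captcha_text(resolve(args.path)) before printing or submitting the text. If your CAPTCHA uses other characters, widen the character class accordingly.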
