Text Detection and Extraction Using OpenCV and OCR - Code Masala Bytes - Helping Developers Solve Real World Problems

Introduction

In computer vision and automation, extracting text from images has become an important task in many applications. Whether you are digitizing documents, processing scanned images, or building applications that rely on text recognition, tools like OpenCV and Tesseract OCR can make the process efficient and effective. In this guide, you’ll learn how to detect and extract text from images in Python using OpenCV and Tesseract OCR. This step-by-step tutorial takes you from installation to a working script that writes the recognized text to a file.

What You’ll Learn

Installing Required Packages for text detection and extraction.
Preprocessing Images using OpenCV functions like grayscale conversion and thresholding.
Applying OCR (Optical Character Recognition) using Tesseract to extract text.
Saving the Detected Text to a text file for further processing.

Step 1: Required Installations

To begin, install the required libraries and packages. The commands below assume a Debian/Ubuntu system.

Bash

# Install OpenCV, the pytesseract wrapper, and the Tesseract engine
# (in a Jupyter or Colab notebook, prefix each command with "!")
pip install opencv-python
pip install pytesseract
sudo apt-get install tesseract-ocr

OpenCV is a popular library for computer vision tasks.
Python-Tesseract is a wrapper for Google’s Tesseract-OCR, a powerful OCR engine.

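Keep in mind that pytesseract is only a wrapper, so the Tesseract executable itself must be installed as well. The apt-get command above does this on Debian/Ubuntu; on other platforms, follow the Tesseract project’s installation instructions for your system.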
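Before moving on, it can help to confirm that the Tesseract executable is actually reachable. The following is a minimal sketch using only the standard library; the helper name find_tesseract is hypothetical and not part of pytesseract.

```python
import shutil


def find_tesseract(executable="tesseract"):
    """Return the full path of the Tesseract binary, or None if it is not on PATH."""
    # find_tesseract is a hypothetical helper, not a pytesseract function
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract was not found on PATH; install it before running OCR.")
    else:
        print(f"Tesseract found at: {path}")
```

If a path is printed, that is the value you would assign to pytesseract.pytesseract.tesseract_cmd later in this tutorial.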

Step 2: Preprocessing the Image

Once the libraries are installed, the first step is to preprocess the image to improve text recognition accuracy.

1. Read the Image:

Python

import cv2
img = cv2.imread("sample.jpg")

2. Convert to Grayscale:

Python

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Otsu thresholding works on a single-channel image, so the color image is first converted to grayscale. This also simplifies the rest of the detection process.
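For intuition, the grayscale value of each pixel is a weighted sum of its color channels, Y = 0.299 R + 0.587 G + 0.114 B. The sketch below applies that formula to a single 8-bit pixel in plain Python; bgr_pixel_to_gray is a hypothetical name, and OpenCV’s own implementation may differ slightly in rounding.

```python
def bgr_pixel_to_gray(b, g, r):
    """Approximate the gray level of one 8-bit pixel given in BGR order."""
    # Hypothetical helper: weights follow the standard luminance formula
    return round(0.299 * r + 0.587 * g + 0.114 * b)


if __name__ == "__main__":
    print(bgr_pixel_to_gray(0, 0, 255))   # pure red
    print(bgr_pixel_to_gray(0, 255, 0))   # pure green
    print(bgr_pixel_to_gray(255, 0, 0))   # pure blue
```

Green contributes the most to the result and blue the least, which is why text in saturated blue ink can end up darker in grayscale than it looks in color.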

3. Apply Otsu Thresholding:

Python

ret, thresh1 = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)

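Thresholding separates text from the background by producing a binary image. With cv2.THRESH_OTSU, OpenCV chooses the threshold automatically (the 0 passed as the threshold is ignored) and returns it as ret. cv2.THRESH_BINARY_INV inverts the result so that dark text on a light background becomes white on black, which is what the dilation and contour steps expect.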
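To see how an automatic threshold can be chosen, here is a plain-Python sketch of the idea behind Otsu’s method: try every candidate threshold and keep the one that maximizes the variance between the two resulting classes. This illustrates the principle only; otsu_threshold is a hypothetical function and not OpenCV’s implementation.

```python
def otsu_threshold(pixels):
    """Return the threshold (0-255) that maximizes between-class variance."""
    # Build a histogram of the 8-bit intensities
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * hist[i] for i in range(256))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of their intensities
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no pixels in the lower class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is in the lower class; nothing left to split
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


if __name__ == "__main__":
    # Synthetic data: dark "ink" pixels around 20, bright "paper" pixels around 200
    sample = [20] * 50 + [200] * 50
    print("Chosen threshold:", otsu_threshold(sample))
```

Any threshold between the two clusters separates them cleanly, so the chosen value lands at or above the dark cluster and below the bright one.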

4. Define a Structuring Element:

Python

rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (18, 18))

5. Dilate the Image:

Python

dilation = cv2.dilate(thresh1, rect_kernel, iterations = 1)

Dilation thickens the white regions so that neighboring characters merge into solid blocks of text, making each block easier to detect as a single contour.

Step 3: Detecting Contours and Extracting Text

Now let’s find the contours (outlines) of the text blocks and extract the text from each one using Tesseract.

1. Find Contours:

Python

contours, hierarchy = cv2.findContours(dilation, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

2. Create a Copy of the Image:

Python

im2 = img.copy()

3. Draw Rectangles Around Text Blocks and Save the Detected Text:

Python

import pytesseract

# Specify the location of the Tesseract executable (adjust for your system;
# this line can be omitted if tesseract is already on your PATH)
pytesseract.pytesseract.tesseract_cmd = '/bin/tesseract'

# Open the output file once; mode "w" clears any previous contents
with open("recognized.txt", "w") as out_file:
    # Loop through each contour to extract text
    for cnt in contours:
        x, y, w, h = cv2.boundingRect(cnt)
        # Draw the bounding box on the copy for visualization
        cv2.rectangle(im2, (x, y), (x + w, y + h), (0, 255, 0), 2)
        # Crop from the original image so the drawn box is not passed to OCR
        cropped = img[y:y + h, x:x + w]
        text = pytesseract.image_to_string(cropped)
        out_file.write(text + "\n")
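The contours are not necessarily returned in reading order, so the lines in recognized.txt may appear shuffled. One possible remedy, sketched below, is to sort the bounding boxes top-to-bottom and then left-to-right before running OCR. The function name sort_reading_order and the row_tolerance parameter are assumptions made for this example.

```python
def sort_reading_order(boxes, row_tolerance=10):
    """Sort (x, y, w, h) boxes top-to-bottom, then left-to-right within a row."""
    rows = []
    for box in sorted(boxes, key=lambda b: b[1]):
        # Join the current row if this box's top edge is close to the row's first box
        if rows and abs(box[1] - rows[-1][0][1]) <= row_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])
    # Within each row, order boxes by their left edge
    return [box for row in rows for box in sorted(row, key=lambda b: b[0])]


if __name__ == "__main__":
    boxes = [(200, 102, 50, 20), (10, 300, 80, 20), (10, 100, 50, 20)]
    for box in sort_reading_order(boxes):
        print(box)
```

In the loop above, you could build the list with boxes = [cv2.boundingRect(c) for c in contours] and iterate over sort_reading_order(boxes) instead of contours. The tolerance should be roughly smaller than one line height of your text.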

Step 4: Final Output and Results

When the process completes, you will have a text file, recognized.txt, containing all the text recognized in your image. This is especially useful for applications that involve scanned documents or automatic data extraction.

Additional Tips for Better Accuracy

Choose the Right Thresholding: Experiment with different thresholding methods, such as adaptive thresholding or Otsu’s binarization, to see which gives better results on your images.
Adjust Kernel Size: A larger kernel merges text into larger blocks such as lines or paragraphs, while a smaller kernel keeps individual words separate.
Image Quality: High-quality, high-contrast images yield better OCR accuracy. Consider enhancing the image quality before processing.

Conclusion

In this guide, you learned how to implement text detection and extraction in Python using OpenCV and Tesseract OCR. This foundation will help you build more advanced projects, such as document scanning applications, automatic data entry systems, or machine learning applications that use text data. Start experimenting with different images, and tune the parameters to your specific needs for the best results.
