Tags: python, ocr, tesseract, python-tesseract, game-automation

Improving OCR accuracy


I am trying to write code that reads a game's log and sends a message to Discord based on its content. So far the code works, but I am having trouble with the OCR: its poor accuracy sometimes causes a message to be sent more than once. I need the OCR to be as exact as possible to avoid duplicate messages. From my research, I found that I should apply binarization/thresholding to get black text on a white background before running OCR, but this only made the results worse. What strategies can I use to improve accuracy in this case? The log looks like this:
[Image: Log]

The code is the following:

import cv2
import numpy as np
import pytesseract
import time
import requests
import json
import pyautogui

# Discord webhook URL
webhook_url = "https://discord.com/api/webhooks/1111838770568888401/GJeNJ9YrMgq6_Mn3XMcX2JQlvmlZsmGhNLp1sNSRpu1RxRXEwkctIUe_iRanurRz1pS8"

# Phrases to monitor and their corresponding Discord messages
target_phrases = {
    "was killed": "<@&1184910793871994920> Day{message}",
    "Tamed": "<@&1184962484675809340> Day{message}",
}

# OCR configuration
tesseract_path = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
pytesseract.pytesseract.tesseract_cmd = tesseract_path

# Time interval between OCR checks (in seconds)
check_interval = 5

# Set to store previous occurrences of the target phrases
previous_occurrences = set()

# Define the screen region where the list is located
list_region = (750, 198, 400, 400)  # Replace with the coordinates of the list region

# Function to send a message to Discord webhook
def send_discord_message(message):
    data = {"content": message}
    headers = {"Content-Type": "application/json"}
    response = requests.post(webhook_url, data=json.dumps(data), headers=headers)
    if response.status_code != 204:
        print("Failed to send Discord message:", response.text)
    else:
        print("Discord message sent successfully")

# Main loop
while True:
    # Take a screenshot of the defined screen region
    screenshot = pyautogui.screenshot(region=list_region)
    screenshot = cv2.cvtColor(np.array(screenshot), cv2.COLOR_RGB2BGR)

    # Apply OCR to extract text from the screenshot
    extracted_text = pytesseract.image_to_string(screenshot)
    print("Extracted Text:\n", extracted_text)

    # Split the extracted text into messages based on the occurrence of "Day"
    messages = extracted_text.strip().split('Day')

    # Process each message individually
    for message in messages:
        # Check if any target phrase appears in the message
        for target_phrase, discord_message in target_phrases.items():
            if target_phrase.lower() in message.lower() and message.strip() not in previous_occurrences:
                # Send a message for the new occurrence
                print(f"Phrase '{target_phrase}' detected")
                send_discord_message(discord_message.format(message=message))

                # Update the set of previous occurrences
                previous_occurrences.add(message.strip())

    # Wait for the next OCR check
    time.sleep(check_interval)

Solution

  • The accuracy of the OCR output can be improved in two ways:

    1. Use the tessdata_best or tessdata_fast trained models.
    2. Improve the quality of the input image with a third-party library. In most cases, increasing the image resolution gives better results.

    Please find the tessdata link below.

    Point Tesseract at the downloaded tessdata directory with the --tessdata-dir argument.
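
The two suggestions above could be combined roughly as follows. This is only a sketch under some assumptions: the helper names (build_tesseract_config, upscale_for_ocr, read_log_text), the 2x scale factor, and the example folder path are made up for illustration, and OpenCV's cv2.resize is just one possible way to raise the resolution. The --tessdata-dir value is passed through the config argument of pytesseract.image_to_string.

```python
def build_tesseract_config(tessdata_dir, psm=None):
    """Build the config string that points Tesseract at a tessdata folder.

    tessdata_dir is wherever the downloaded .traineddata files live
    (the path is an assumption; use your own location).
    psm is an optional page segmentation mode; it is left out by default.
    """
    config = f'--tessdata-dir "{tessdata_dir}"'
    if psm is not None:
        config += f" --psm {psm}"
    return config


def upscale_for_ocr(image, scale=2.0):
    """Enlarge the screenshot before OCR (scale=2.0 is an arbitrary choice)."""
    import cv2  # imported lazily so the config helper works without OpenCV

    return cv2.resize(image, None, fx=scale, fy=scale,
                      interpolation=cv2.INTER_CUBIC)


def read_log_text(image, tessdata_dir, scale=2.0):
    """Run OCR on an upscaled image using the models in tessdata_dir."""
    import pytesseract  # imported lazily for the same reason as cv2

    enlarged = upscale_for_ocr(image, scale)
    return pytesseract.image_to_string(
        enlarged, config=build_tesseract_config(tessdata_dir)
    )
```

In the question's main loop, the call pytesseract.image_to_string(screenshot) could then be swapped for something like read_log_text(screenshot, r"C:\tessdata_best"), where that folder is a hypothetical location holding the downloaded models. Whether the larger image actually helps depends on the font and the capture size, so it is worth comparing the output at a few scale factors.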