Merged testing.

2025-08-22 00:31:32 -04:00
parent a044bfaade 1fed5922c3
commit ad6de50f9b
8 changed files with 728 additions and 2 deletions
--- a/src/cthulhu/cthulhuVersion.py
+++ b/src/cthulhu/cthulhuVersion.py
@@ -23,5 +23,5 @@
 # Fork of Orca Screen Reader (GNOME)
 # Original source: https://gitlab.gnome.org/GNOME/orca
-version = "2025.08.19"
+version = "2025.08.22"
 codeName = "master"
--- a/src/cthulhu/plugins/OCR/README.md
+++ b/src/cthulhu/plugins/OCR/README.md
@@ -0,0 +1,210 @@
 # OCR Plugin for Cthulhu Screen Reader
 A powerful OCR (Optical Character Recognition) plugin that enables Cthulhu users to extract text from visual content including windows, desktop areas, and clipboard images. Originally based on the ocrdesktop project by Chrys, this plugin integrates seamlessly with Cthulhu's accessibility framework.
 ## Features
 - **Window OCR**: Extract text from the currently active window
 - **Desktop OCR**: Extract text from the entire desktop screen
 - **Clipboard OCR**: Extract text from images copied to the clipboard
 - **Voice Announcements**: Clear audio feedback about OCR operations
 - **Request Guarding**: Ignores new OCR requests while one is still being processed
 - **Text Cleanup**: Automatic post-processing to improve OCR text quality
 ## Keybindings
 | Key Combination | Action | Description |
 |----------------|--------|-------------|
 | `Cthulhu+Control+W` | OCR Active Window | Performs OCR on the currently focused window |
 | `Cthulhu+Control+D` | OCR Desktop | Performs OCR on the entire desktop screen |
 | `Cthulhu+Control+Shift+C` | OCR Clipboard | Performs OCR on image data from clipboard |
 ## Dependencies
 ### Required Dependencies
 - **python3-pillow** (PIL) - Image processing library
 - **python-pytesseract** - Python wrapper for Tesseract OCR
 - **tesseract** - OCR engine (with language packs)
 - **GTK3/GDK/Wnck** - For screenshot capture (usually pre-installed)
 ### Installation Commands
 #### Arch Linux
 ```bash
 sudo pacman -S python-pillow python-pytesseract tesseract tesseract-data-eng
 ```
 #### Ubuntu/Debian
 ```bash
 sudo apt install python3-pil python3-pytesseract tesseract-ocr tesseract-ocr-eng
 ```
 #### Fedora
 ```bash
 sudo dnf install python3-pillow python3-pytesseract tesseract tesseract-langpack-eng
 ```
 ### Additional Language Support
 To add support for other languages, install additional Tesseract language packs:
 ```bash
 # Examples for different distributions:
 # Arch: sudo pacman -S tesseract-data-fra tesseract-data-deu tesseract-data-spa
 # Ubuntu: sudo apt install tesseract-ocr-fra tesseract-ocr-deu tesseract-ocr-spa
 # Fedora: sudo dnf install tesseract-langpack-fra tesseract-langpack-deu tesseract-langpack-spa
 ```
 ## Usage
 1. **Enable the Plugin**: The OCR plugin is enabled by default in Cthulhu. If disabled, you can enable it through:
   - Cthulhu Preferences → Plugins → Check "OCR"
   - Or ensure `'OCR'` is in the `activePlugins` list in settings.py
 2. **Basic OCR Workflow**:
   - Navigate to content you want to OCR
   - Press the appropriate key combination
   - Listen for "Performing OCR on [window/desktop/clipboard]"
   - Wait for processing to complete
   - OCR results will be announced via speech
 3. **Best Practices**:
   - Ensure good contrast between text and background for better results
   - Use window OCR for focused content (faster processing)
   - Use desktop OCR for content spanning multiple windows
   - Use clipboard OCR for images from web browsers or image viewers
 ## Configuration
 ### OCR Settings
 The plugin uses the following default settings (configurable in plugin.py):
 ```python
 self._languageCode = 'eng'          # Tesseract language code
 self._scaleFactor = 3               # Image scaling for better OCR
 self._grayscaleImg = False          # Convert to grayscale
 self._invertImg = False             # Invert image colors
 self._blackWhiteImg = False         # Convert to black/white
 self._blackWhiteImgValue = 200      # B/W threshold value
 ```
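As a rough illustration of what `_blackWhiteImgValue` controls, the sketch below builds the same kind of 256-entry lookup table that `_transformImg` passes to `point()`. The helper name `threshold_lut` is hypothetical and exists only for this example.

```python
def threshold_lut(cutoff=200):
    """Build a 256-entry lookup table: values above cutoff become white."""
    return [255 if value > cutoff else 0 for value in range(256)]


lut = threshold_lut(200)
print(lut[200], lut[201])  # 0 255
print(sum(1 for v in lut if v == 255))  # 55
```

With the default of 200, only the brightest 55 intensity levels map to white, so a lower cutoff may suit dim or low-contrast text.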
 ### Changing OCR Language
 To change the default OCR language, modify `self._languageCode` in the plugin's `__init__` method:
 ```python
 # Examples:
 self._languageCode = 'fra'  # French
 self._languageCode = 'deu'  # German
 self._languageCode = 'spa'  # Spanish
 ```
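Before switching `_languageCode`, it can help to confirm that the matching language pack is installed. The following is a minimal sketch, not part of the plugin: `parse_tesseract_langs` and `installed_langs` are hypothetical helpers built around the output of `tesseract --list-langs`, and the sample text is illustrative (the header line and data path vary by system).

```python
import shutil
import subprocess


def parse_tesseract_langs(output):
    """Return language codes from `tesseract --list-langs` output.

    The first line is a header ("List of available languages ..."),
    so it is skipped; blank lines are ignored.
    """
    lines = [line.strip() for line in output.splitlines()]
    return [line for line in lines[1:] if line]


def installed_langs():
    """Ask the tesseract binary for its languages; empty list if missing."""
    if shutil.which("tesseract") is None:
        return []
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True, text=True, check=False,
    )
    # Fall back to stderr in case the list is printed there.
    return parse_tesseract_langs(result.stdout or result.stderr)


# Illustrative sample, not captured from a real system.
sample = 'List of available languages in "/usr/share/tessdata/" (3):\neng\nfra\nosd\n'
print(parse_tesseract_langs(sample))  # ['eng', 'fra', 'osd']
```

A check such as `'fra' in installed_langs()` could then be run before setting `self._languageCode = 'fra'`.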
 ## Troubleshooting
 ### Common Issues
 #### "No text found in OCR scan"
 - **Cause**: Poor image quality, unsupported language, or no text in captured area
 - **Solutions**:
  - Try different OCR mode (window vs desktop)
  - Ensure text has good contrast
  - Check if correct language pack is installed
  - Verify text is actually visible in the captured area
 #### "Missing dependencies" message
 - **Cause**: Required Python packages or Tesseract not installed
 - **Solution**: Install missing packages using commands above
 #### OCR taking too long
 - **Cause**: Large desktop screenshots or complex images
 - **Solutions**:
  - Use window OCR instead of desktop OCR when possible
  - Close unnecessary windows before desktop OCR
  - Consider adjusting `_scaleFactor` (lower = faster)
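To see why a lower `_scaleFactor` is faster, note that both image dimensions are multiplied by it, so the number of pixels handed to Tesseract grows with the square of the factor. The sketch below assumes a 1920x1080 capture purely for illustration; `scaled_pixels` is a hypothetical helper.

```python
def scaled_pixels(width, height, scale_factor):
    """Pixel count after enlarging both dimensions by scale_factor."""
    return (width * scale_factor) * (height * scale_factor)


base = scaled_pixels(1920, 1080, 1)
for factor in (1, 2, 3):
    pixels = scaled_pixels(1920, 1080, factor)
    # Prints the factor, the pixel count, and the growth relative to 1x.
    print(factor, pixels, pixels // base)
# 1 2073600 1
# 2 8294400 4
# 3 18662400 9
```

Dropping from the default of 3 to 2 therefore cuts the pixel count by more than half, at some cost to recognition of small text.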
 #### No speech output
 - **Cause**: Cthulhu speech settings or audio issues
 - **Solutions**:
  - Check Cthulhu speech settings
  - Test other Cthulhu speech functions
  - Verify audio system is working
 ### Debug Information
 OCR plugin debug messages are logged to Cthulhu's debug output. To enable debug logging:
 ```bash
 cthulhu --debug > ocr_debug.log 2>&1
 ```
 Look for messages starting with "OCRDesktop:" in the log file.
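If the log is large, a short script can pull out only the plugin's messages. This is a minimal sketch under the assumption that each relevant line contains the `OCRDesktop:` tag somewhere (debug output may add its own prefix); `ocr_log_lines` is a hypothetical helper and the sample lines are illustrative.

```python
def ocr_log_lines(lines, tag="OCRDesktop:"):
    """Yield stripped log lines that contain the plugin's message tag."""
    for line in lines:
        if tag in line:
            yield line.strip()


# Illustrative log content; real debug output will differ.
sample_log = [
    "OCRDesktop: Plugin initialized\n",
    "some unrelated debug message\n",
    "OCRDesktop: OCR completed\n",
]
print(list(ocr_log_lines(sample_log)))
# ['OCRDesktop: Plugin initialized', 'OCRDesktop: OCR completed']
```

Because an open file is iterable line by line, the same function can be given the handle returned by `open("ocr_debug.log")`.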
 ## Technical Details
 ### Architecture
 - **Base Class**: Extends `cthulhu.plugin.Plugin`
 - **Execution**: OCR runs in the keybinding handler; an `_is_processing` flag ignores overlapping requests
 - **Image Processing**: PIL/Pillow for image manipulation and enhancement
 - **OCR Engine**: Tesseract via pytesseract wrapper
 - **Integration**: Uses Cthulhu's speech system for output
 ### Image Processing Pipeline
 1. **Capture**: Screenshot via GDK pixbuf system
 2. **Scale**: Enlarge image by scale factor (default 3x)
 3. **Transform**: Apply filters (grayscale, invert, etc.) if enabled
 4. **OCR**: Process with Tesseract OCR engine
 5. **Cleanup**: Remove extra whitespace and format text
 6. **Present**: Announce results via Cthulhu speech
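The order of the optional transforms matters when several are enabled. The sketch below lists the steps in the order `_transformImg` applies them (scale, invert, grayscale, threshold). `OcrSettings` and `transform_steps` are hypothetical names used only to model the plugin's defaults; no image is actually processed.

```python
from dataclasses import dataclass


@dataclass
class OcrSettings:
    """Mirror of the plugin's image-related defaults."""
    scale_factor: int = 3
    grayscale: bool = False
    invert: bool = False
    black_white: bool = False
    black_white_value: int = 200


def transform_steps(settings):
    """List the transforms in the order `_transformImg` applies them."""
    steps = [f"scale x{settings.scale_factor}"]
    if settings.invert:
        steps.append("invert")
    if settings.grayscale:
        steps.append("grayscale")
    if settings.black_white:
        steps.append(f"threshold > {settings.black_white_value}")
    return steps


print(transform_steps(OcrSettings()))
# ['scale x3']
print(transform_steps(OcrSettings(grayscale=True, black_white=True)))
# ['scale x3', 'grayscale', 'threshold > 200']
```

With the defaults, scaling is the only transform applied before the image reaches Tesseract.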
 ### Text Post-Processing
 The plugin automatically cleans OCR output by:
 - Removing multiple consecutive spaces
 - Eliminating empty lines
 - Trimming leading/trailing whitespace
 - Removing trailing newlines
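The cleanup rules above can be tried outside the plugin. This is a standalone sketch modeled on `_cleanOCRText`, not the method itself: `clean_ocr_text` is a hypothetical function name, and the final `strip()` stands in for the separate leading-space and trailing-newline steps.

```python
import re


def clean_ocr_text(text):
    """Normalize whitespace in OCR output."""
    # Collapse runs of spaces/tabs (but not newlines) into one space.
    text = re.sub(r"[^\S\r\n]{2,}", " ", text)
    # Drop empty or whitespace-only lines.
    text = re.sub(r"\n\s*\n", "\n", text)
    # Strip whitespace that precedes a line break.
    text = re.sub(r"\s*\n", "\n", text)
    # Trim leading and trailing whitespace, including newlines.
    return text.strip()


raw = "  Hello    world  \n\n\nsecond   line \n\n"
print(repr(clean_ocr_text(raw)))  # 'Hello world\nsecond line'
```

Note that the patterns are written as raw strings so that `\S` and `\s` reach the regex engine unchanged.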
 ## Development
 ### Plugin Structure
 ```
 src/cthulhu/plugins/OCR/
 ├── __init__.py          # Package import
 ├── plugin.py            # Main plugin implementation
 ├── plugin.info          # Plugin metadata
 ├── meson.build          # Build system integration
 └── README.md            # This documentation
 ```
 ### Key Methods
 - `_ocrActiveWindow()`: Captures and OCRs active window
 - `_ocrDesktop()`: Captures and OCRs entire desktop
 - `_ocrClipboard()`: OCRs image from clipboard
 - `_performOCR()`: Core OCR processing logic
 - `_presentOCRResult()`: Announces results via speech
 ### Extending the Plugin
 To add new OCR modes or features:
 1. Add new keybinding in `_registerKeybindings()`
 2. Create handler method following pattern `_ocrNewMode()`
 3. Implement image capture logic for new mode
 4. Use existing `_performOCR()` and `_presentOCRResult()` methods
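The steps above can be sketched with a stand-in class so the pattern runs without Cthulhu installed. Everything here is illustrative: `OcrJobRunner` replaces the real plugin, `_runOcrJob` is a hypothetical shared helper that the plugin does not currently have, and the capture callable simply returns `True` where real capture logic would go.

```python
class OcrJobRunner:
    """Stand-in for the plugin; only the handler pattern is modeled."""

    def __init__(self):
        self._is_processing = False
        self.events = []

    def _announceOCRStart(self, ocr_type):
        self.events.append(f"Performing OCR on {ocr_type}")

    def _performOCR(self):
        self.events.append("ocr")

    def _presentOCRResult(self):
        self.events.append("present")

    def _runOcrJob(self, ocr_type, capture):
        """Hypothetical shared body for every `_ocr*` handler."""
        if self._is_processing:
            return True  # ignore overlapping requests
        self._is_processing = True
        self._announceOCRStart(ocr_type)
        try:
            if capture():
                self._performOCR()
                self._presentOCRResult()
        finally:
            # Always clear the flag, even if capture or OCR raises.
            self._is_processing = False
        return True

    def _ocrNewMode(self, script=None, inputEvent=None):
        """New mode: capture logic goes in the callable passed below."""
        return self._runOcrJob("new mode", lambda: True)


runner = OcrJobRunner()
runner._ocrNewMode()
print(runner.events)
# ['Performing OCR on new mode', 'ocr', 'present']
```

Factoring the guard and the `try`/`finally` into one helper is a design choice for this sketch; the existing handlers repeat that body inline, and a new handler can follow either style.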
 ## Credits
 - **Original ocrdesktop**: Created by Chrys (chrys87@users.noreply.github.com)
 - **Cthulhu Integration**: Adapted by Storm Dragon for Cthulhu plugin system
 - **Cthulhu Screen Reader**: https://git.stormux.org/storm/cthulhu
 - **Tesseract OCR**: https://github.com/tesseract-ocr/tesseract
 ## License
 This plugin is distributed under the GNU Lesser General Public License (LGPL) version 2.1 or later, consistent with the Cthulhu screen reader project.
 ## Support
 For issues, questions, or contributions:
 - **Cthulhu Repository**: https://git.stormux.org/storm/cthulhu
 - **Community**: IRC #stormux on irc.stormux.org
 - **Email**: storm_dragon@stormux.org
 ---
 *Part of the Cthulhu Screen Reader project - Making the desktop accessible for everyone.*
--- a/src/cthulhu/plugins/OCR/__init__.py
+++ b/src/cthulhu/plugins/OCR/__init__.py
@@ -0,0 +1,23 @@
 #!/usr/bin/env python3
 #
 # Copyright (c) 2025 Stormux
 # Copyright (c) 2022 Chrys (original ocrdesktop)
 #
 # This library is free software; you can redistribute it and/or
 # modify it under the terms of the GNU Lesser General Public
 # License as published by the Free Software Foundation; either
 # version 2.1 of the License, or (at your option) any later version.
 #
 # This library is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 # Lesser General Public License for more details.
 #
 # You should have received a copy of the GNU Lesser General Public
 # License along with this library; if not, write to the
 # Free Software Foundation, Inc., Franklin Street, Fifth Floor,
 # Boston MA  02110-1301 USA.
 """OCRDesktop plugin package."""
 from .plugin import OCRDesktop
--- a/src/cthulhu/plugins/OCR/meson.build
+++ b/src/cthulhu/plugins/OCR/meson.build
@@ -0,0 +1,14 @@
 ocrdesktop_python_sources = files([
  '__init__.py',
  'plugin.py'
 ])
 python3.install_sources(
  ocrdesktop_python_sources,
  subdir: 'cthulhu/plugins/OCR'
 )
 install_data(
  'plugin.info',
  install_dir: python3.get_install_dir() / 'cthulhu' / 'plugins' / 'OCR'
 )
--- a/src/cthulhu/plugins/OCR/plugin.info
+++ b/src/cthulhu/plugins/OCR/plugin.info
@@ -0,0 +1,8 @@
 name = OCR Desktop
 version = 4.0.0
 description = OCR accessibility tool for reading inaccessible windows and dialogs using Tesseract OCR
 authors = Storm Dragon <storm_dragon@stormux.org>
 website = https://github.com/chrys87/ocrdesktop
 copyright = Copyright 2022 Chrys, Copyright 2025 Stormux
 builtin = false
 hidden = false
--- a/src/cthulhu/plugins/OCR/plugin.py
+++ b/src/cthulhu/plugins/OCR/plugin.py
@@ -0,0 +1,470 @@
 #!/usr/bin/env python3
 #
 # Copyright (c) 2025 Stormux
 # Copyright (c) 2022 Chrys (original ocrdesktop)
 #
 # This library is free software; you can redistribute it and/or
 # modify it under the terms of the GNU Lesser General Public
 # License as published by the Free Software Foundation; either
 # version 2.1 of the License, or (at your option) any later version.
 """OCRDesktop plugin for Cthulhu screen reader."""
 import logging
 import os
 import sys
 import locale
 import time
 import re
 import tempfile
 import threading
 from mimetypes import MimeTypes
 from cthulhu.plugin import Plugin, cthulhu_hookimpl
 from cthulhu import debug
 # Note: Removed complex beep system - simple announcements work perfectly!
 # PIL
 try:
    from PIL import Image
    from PIL import ImageOps
    PIL_AVAILABLE = True
 except ImportError:
    PIL_AVAILABLE = False
 # pytesseract
 try:
    import pytesseract
    from pytesseract import Output
    PYTESSERACT_AVAILABLE = True
 except ImportError:
    PYTESSERACT_AVAILABLE = False
 # pdf2image
 try:
    from pdf2image import convert_from_path
    PDF2IMAGE_AVAILABLE = True
 except ImportError:
    PDF2IMAGE_AVAILABLE = False
 # scipy
 try:
    from scipy.spatial import KDTree
    SCIPY_AVAILABLE = True
 except ImportError:
    SCIPY_AVAILABLE = False
 # webcolors
 try:
    from webcolors import CSS3_HEX_TO_NAMES
    from webcolors import hex_to_rgb
    WEBCOLORS_AVAILABLE = True
 except ImportError:
    WEBCOLORS_AVAILABLE = False
 # GTK/GDK/Wnck
 try:
    import gi
    gi.require_version("Gtk", "3.0")
    gi.require_version("Gdk", "3.0")
    gi.require_version("Wnck", "3.0")
    from gi.repository import Gtk, Gdk, Wnck
    GTK_AVAILABLE = True
 except ImportError:
    GTK_AVAILABLE = False
 logger = logging.getLogger(__name__)
 class OCRDesktop(Plugin):
    """OCR Desktop accessibility plugin for reading inaccessible windows."""
    def __init__(self, *args, **kwargs):
        """Initialize the plugin."""
        super().__init__(*args, **kwargs)
        debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Plugin initialized", True)
        # Keybinding storage
        self._kb_binding_window = None
        self._kb_binding_desktop = None
        self._kb_binding_clipboard = None
        # OCR settings
        self._languageCode = 'eng'
        self._scaleFactor = 3
        self._grayscaleImg = False
        self._invertImg = False
        self._blackWhiteImg = False
        self._blackWhiteImgValue = 200
        self._colorCalculation = False
        self._colorCalculationMax = 3
        # Internal state
        self._img = []
        self._modifiedImg = []
        self._OCRText = ''
        self._offsetXpos = 0
        self._offsetYpos = 0
        self._activated = False
        # Progress feedback
        self._is_processing = False
        # Color analysis
        self._kdtDB = None
        self.colorNames = []
        self.colorCache = {}
        # Set locale for tesseract
        locale.setlocale(locale.LC_ALL, 'C')
        # Check dependencies
        self._checkDependencies()
    def _checkDependencies(self):
        """Check if required dependencies are available."""
        missing_deps = []
        if not PIL_AVAILABLE:
            missing_deps.append("python3-pillow")
        if not PYTESSERACT_AVAILABLE:
            missing_deps.append("python-pytesseract")
        if not GTK_AVAILABLE:
            missing_deps.append("GTK3/GDK/Wnck")
        if missing_deps:
            debug.printMessage(debug.LEVEL_INFO, 
                f"OCRDesktop: Missing dependencies: {', '.join(missing_deps)}", True)
            return False
        return True
    @cthulhu_hookimpl
    def activate(self, plugin=None):
        """Activate the plugin."""
        if plugin is not None and plugin is not self:
            return
        if self._activated:
            debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Already activated", True)
            return
        try:
            debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Plugin activation starting", True)
            if not self.app:
                debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: ERROR - No app reference", True)
                return
            if not self._checkDependencies():
                debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Cannot activate - missing dependencies", True)
                return
            # Register keybindings
            self._registerKeybindings()
            self._activated = True
            debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Plugin activated successfully", True)
        except Exception as e:
            debug.printMessage(debug.LEVEL_INFO, f"OCRDesktop: Error activating: {e}", True)
            import traceback
            debug.printMessage(debug.LEVEL_INFO, f"OCRDesktop: {traceback.format_exc()}", True)
    @cthulhu_hookimpl
    def deactivate(self, plugin=None):
        """Deactivate the plugin."""
        if plugin is not None and plugin is not self:
            return
        self._activated = False
        debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Plugin deactivated", True)
    def _registerKeybindings(self):
        """Register plugin keybindings."""
        try:
            # OCR active window
            self._kb_binding_window = self.registerGestureByString(
                self._ocrActiveWindow,
                "OCR read active window",
                'kb:cthulhu+control+w'
            )
            # OCR entire desktop
            self._kb_binding_desktop = self.registerGestureByString(
                self._ocrDesktop,
                "OCR read entire desktop",
                'kb:cthulhu+control+d'
            )
            # OCR from clipboard
            self._kb_binding_clipboard = self.registerGestureByString(
                self._ocrClipboard,
                "OCR read image from clipboard",
                'kb:cthulhu+control+shift+c'
            )
            debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Keybindings registered", True)
        except Exception as e:
            debug.printMessage(debug.LEVEL_INFO, f"OCRDesktop: Error registering keybindings: {e}", True)
    def _announceOCRStart(self, ocr_type):
        """Announce the start of OCR operation."""
        try:
            message = f"Performing OCR on {ocr_type}"
            if self.app:
                state = self.app.getDynamicApiManager().getAPI('CthulhuState')
                if state and state.activeScript:
                    state.activeScript.presentMessage(message, resetStyles=False)
            debug.printMessage(debug.LEVEL_INFO, f"OCRDesktop: {message}", True)
        except Exception as e:
            debug.printMessage(debug.LEVEL_INFO, f"OCRDesktop: Error announcing OCR start: {e}", True)
    def _ocrActiveWindow(self, script=None, inputEvent=None):
        """OCR the active window."""
        try:
            debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: OCR active window requested", True)
            if self._is_processing:
                debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Already processing, ignoring request", True)
                return True
            self._is_processing = True
            self._announceOCRStart("window")
            try:
                if self._screenShotWindow():
                    self._performOCR()
                    self._presentOCRResult()
            finally:
                self._is_processing = False
            return True
        except Exception as e:
            self._is_processing = False
            debug.printMessage(debug.LEVEL_INFO, f"OCRDesktop: Error in OCR window: {e}", True)
            return False
    def _ocrDesktop(self, script=None, inputEvent=None):
        """OCR the entire desktop."""
        try:
            debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: OCR desktop requested", True)
            if self._is_processing:
                debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Already processing, ignoring request", True)
                return True
            self._is_processing = True
            self._announceOCRStart("desktop")
            try:
                if self._screenShotDesktop():
                    self._performOCR()
                    self._presentOCRResult()
            finally:
                self._is_processing = False
            return True
        except Exception as e:
            self._is_processing = False
            debug.printMessage(debug.LEVEL_INFO, f"OCRDesktop: Error in OCR desktop: {e}", True)
            return False
    def _ocrClipboard(self, script=None, inputEvent=None):
        """OCR image from clipboard."""
        try:
            debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: OCR clipboard requested", True)
            if self._is_processing:
                debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Already processing, ignoring request", True)
                return True
            self._is_processing = True
            self._announceOCRStart("clipboard")
            try:
                if self._readClipboard():
                    self._performOCR()
                    self._presentOCRResult()
            finally:
                self._is_processing = False
            return True
        except Exception as e:
            self._is_processing = False
            debug.printMessage(debug.LEVEL_INFO, f"OCRDesktop: Error in OCR clipboard: {e}", True)
            return False
    def _screenShotWindow(self):
        """Take screenshot of active window."""
        if not GTK_AVAILABLE:
            debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: GTK not available for screenshots", True)
            return False
        try:
            time.sleep(0.3)  # Brief delay
            gdkCurrDesktop = Gdk.get_default_root_window()
            currWnckScreen = Wnck.Screen.get_default()
            currWnckScreen.force_update()
            currWnckWindow = currWnckScreen.get_active_window()
            if not currWnckWindow:
                debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: No active window found", True)
                return False
            self._offsetXpos, self._offsetYpos, wnckWidth, wnckHeight = currWnckWindow.get_geometry()
            pixBuff = Gdk.pixbuf_get_from_window(gdkCurrDesktop, self._offsetXpos, self._offsetYpos, wnckWidth, wnckHeight)
            if pixBuff:
                self._img = [self._pixbuf2image(pixBuff)]
                debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Window screenshot captured", True)
                return True
            else:
                debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Failed to capture window screenshot", True)
                return False
        except Exception as e:
            debug.printMessage(debug.LEVEL_INFO, f"OCRDesktop: Error taking window screenshot: {e}", True)
            return False
    def _screenShotDesktop(self):
        """Take screenshot of entire desktop."""
        if not GTK_AVAILABLE:
            debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: GTK not available for screenshots", True)
            return False
        try:
            time.sleep(0.3)  # Brief delay
            currDesktop = Gdk.get_default_root_window()
            pixBuff = Gdk.pixbuf_get_from_window(currDesktop, 0, 0, currDesktop.get_width(), currDesktop.get_height())
            if pixBuff:
                self._img = [self._pixbuf2image(pixBuff)]
                debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Desktop screenshot captured", True)
                return True
            else:
                debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Failed to capture desktop screenshot", True)
                return False
        except Exception as e:
            debug.printMessage(debug.LEVEL_INFO, f"OCRDesktop: Error taking desktop screenshot: {e}", True)
            return False
    def _readClipboard(self):
        """Read image from clipboard."""
        if not GTK_AVAILABLE:
            debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: GTK not available for clipboard", True)
            return False
        try:
            clipboardObj = Gtk.Clipboard.get(Gdk.SELECTION_CLIPBOARD)
            pixBuff = clipboardObj.wait_for_image()
            if pixBuff:
                self._img = [self._pixbuf2image(pixBuff)]
                debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Image read from clipboard", True)
                return True
            else:
                debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: No image found in clipboard", True)
                return False
        except Exception as e:
            debug.printMessage(debug.LEVEL_INFO, f"OCRDesktop: Error reading clipboard: {e}", True)
            return False
    def _pixbuf2image(self, pix):
        """Convert GdkPixbuf to PIL Image."""
        data = pix.get_pixels()
        w = pix.props.width
        h = pix.props.height
        stride = pix.props.rowstride
        mode = "RGB"
        if pix.props.has_alpha:
            mode = "RGBA"
        im = Image.frombytes(mode, (w, h), data, "raw", mode, stride)
        return im
    def _scaleImg(self, img):
        """Scale image for better OCR results."""
        width_screen, height_screen = img.size
        width_screen = width_screen * self._scaleFactor
        height_screen = height_screen * self._scaleFactor
        scaledImg = img.resize((width_screen, height_screen), Image.Resampling.BICUBIC)
        return scaledImg
    def _transformImg(self, img):
        """Transform image with various filters for better OCR."""
        modifiedImg = self._scaleImg(img)
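        if self._invertImg:
            # ImageOps.invert() handles L and RGB images; convert so RGBA captures work too
            modifiedImg = ImageOps.invert(modifiedImg.convert('RGB'))
        if self._grayscaleImg:
            modifiedImg = ImageOps.grayscale(modifiedImg)
        if self._blackWhiteImg:
            # A 256-entry table fits a single-band image, so threshold in 'L' mode
            lut = [255 if v > self._blackWhiteImgValue else 0 for v in range(256)]
            modifiedImg = modifiedImg.convert('L').point(lut)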
        return modifiedImg
    def _performOCR(self):
        """Perform OCR on captured images."""
        if not PYTESSERACT_AVAILABLE:
            debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Tesseract not available", True)
            return
        debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: Starting OCR", True)
        self._OCRText = ''
        for img in self._img:
            modifiedImg = self._transformImg(img)
            try:
                # Simple text extraction
                text = pytesseract.image_to_string(modifiedImg, lang=self._languageCode, config='--psm 4')
                self._OCRText += text + '\n'
            except Exception as e:
                debug.printMessage(debug.LEVEL_INFO, f"OCRDesktop: OCR error: {e}", True)
        # Clean up text
        self._cleanOCRText()
        debug.printMessage(debug.LEVEL_INFO, "OCRDesktop: OCR completed", True)
    def _cleanOCRText(self):
        """Clean up OCR text output."""
        # Remove multiple spaces
        regexSpace = re.compile(r'[^\S\r\n]{2,}')
        self._OCRText = regexSpace.sub(' ', self._OCRText)
        # Remove empty lines
        regexSpace = re.compile(r'\n\s*\n')
        self._OCRText = regexSpace.sub('\n', self._OCRText)
        # Remove trailing spaces
        regexSpace = re.compile(r'\s*\n')
        self._OCRText = regexSpace.sub('\n', self._OCRText)
        # Remove leading spaces
        regexSpace = re.compile(r'^\s')
        self._OCRText = regexSpace.sub('', self._OCRText)
        # Remove trailing newlines
        self._OCRText = self._OCRText.strip()
    def _presentOCRResult(self):
        """Present OCR result to user via speech."""
        try:
            if not self._OCRText.strip():
                message = "No text found in OCR scan"
            else:
                message = f"OCR result: {self._OCRText}"
            if self.app:
                state = self.app.getDynamicApiManager().getAPI('CthulhuState')
                if state and state.activeScript:
                    state.activeScript.presentMessage(message, resetStyles=False)
            debug.printMessage(debug.LEVEL_INFO, f"OCRDesktop: Presented result: {len(self._OCRText)} characters", True)
        except Exception as e:
            debug.printMessage(debug.LEVEL_INFO, f"OCRDesktop: Error presenting result: {e}", True)
--- a/src/cthulhu/plugins/meson.build
+++ b/src/cthulhu/plugins/meson.build
@@ -5,6 +5,7 @@ subdir('Clipboard')
 subdir('DisplayVersion')
 subdir('HelloCthulhu')
 subdir('IndentationAudio')
 subdir('OCR')
 subdir('PluginManager')
 subdir('SimplePluginSystem')
 subdir('hello_world')
--- a/src/cthulhu/settings.py
+++ b/src/cthulhu/settings.py
@@ -431,7 +431,7 @@ presentChatRoomLast = False
 presentLiveRegionFromInactiveTab = False
 # Plugins
-activePlugins = ['AIAssistant', 'DisplayVersion', 'PluginManager', 'HelloCthulhu', 'ByeCthulhu']
+activePlugins = ['AIAssistant', 'DisplayVersion', 'OCR', 'PluginManager', 'HelloCthulhu', 'ByeCthulhu']
 # AI Assistant settings (disabled by default for opt-in behavior)
 aiAssistantEnabled = True