AI capabilities added. About 90 percent working with Ollama; more providers and functionality coming soon.

2025-08-03 00:07:59 -04:00
parent 9ead764b2e
commit a8672165d8
14 changed files with 1893 additions and 35 deletions
@@ -335,10 +335,417 @@ subprojects/spiel.wrap            # Subproject integration
 3. **Plugin System**: How to maintain Cthulhu's plugin advantage while integrating Orca improvements?
 4. **Version Strategy**: Selective feature backporting vs. major version sync?

+## AI Assistant Integration
+
+### **NEW FEATURE**: AI-Powered Accessibility Assistant
+Cthulhu now includes an optional AI assistant plugin for enhanced accessibility support:
+
+- **Vision Analysis**: Screenshots + AT-SPI data for understanding unlabeled UI elements
+- **Safe Actions**: Confirmed element clicking and navigation assistance  
+- **Multi-Provider Support**: Claude, ChatGPT, Gemini, and Ollama backends
+- **Privacy-First**: Disabled by default, requires explicit opt-in and API key configuration
+
+### AI Assistant Configuration
+```bash
+# Access via Cthulhu Preferences
+~/.local/bin/cthulhu -s  # Opens preferences dialog
+# Navigate to "AI Assistant" tab
+# 1. Check "Enable AI Assistant" 
+# 2. Select provider (Claude, ChatGPT, Gemini, Ollama)
+# 3. Set API key file path
+# 4. Configure safety and quality settings
+```
+
+### AI Provider Setup
+
+#### 1. Claude (Anthropic) - **Recommended**
+```bash
+# Get API key from: https://console.anthropic.com/
+# 1. Sign up/login → "Get API Keys" → Create new key
+# 2. Copy the key (starts with "sk-ant-...")
+# 3. Save to file:
+mkdir -p ~/.config/cthulhu
+echo "sk-ant-your-actual-key-here" > ~/.config/cthulhu/claude-api-key
+chmod 600 ~/.config/cthulhu/claude-api-key
+
+# Pricing: ~$3 per million input tokens, ~$15 per million output tokens
+# Best vision capabilities and safety for accessibility use
+```
+
+#### 2. ChatGPT (OpenAI)
+```bash
+# Get API key from: https://platform.openai.com/api-keys  
+# 1. Sign up/login → "Create new secret key"
+# 2. Copy immediately (can't view again, starts with "sk-...")
+# 3. Save to file:
+mkdir -p ~/.config/cthulhu
+echo "sk-your-actual-openai-key" > ~/.config/cthulhu/openai-api-key
+chmod 600 ~/.config/cthulhu/openai-api-key
+
+# Pricing: ~$2.50 per million input tokens, ~$10 per million output tokens
+# Good vision capabilities, widely supported
+```
+
+#### 3. Gemini (Google) 
+```bash
+# Get API key from: https://aistudio.google.com/app/apikey
+# 1. Sign up/login → "Create API key" 
+# 2. Copy the generated key
+# 3. Save to file:
+mkdir -p ~/.config/cthulhu
+echo "your-actual-gemini-key" > ~/.config/cthulhu/gemini-api-key
+chmod 600 ~/.config/cthulhu/gemini-api-key
+
+# Pricing: Free tier (15 requests/min), then ~$1.25 per million tokens
+# Good for testing, has generous free allowance
+```
+
+#### 4. Ollama (Local) - **Privacy-Focused**
+```bash
+# Install Ollama (no API key needed!)
+sudo pacman -S ollama  # Arch Linux
+# OR: curl -fsSL https://ollama.ai/install.sh | sh
+
+# Start service
+systemctl --user enable ollama
+systemctl --user start ollama
+
+# Download vision-capable model (required for AI assistant)
+ollama pull llama3.2-vision  # 7.9GB download
+# OR smaller model: ollama pull moondream    # 1.7GB
+
+# Verify installation
+ollama list  # Should show downloaded models
+
+# No API key needed - runs entirely offline!
+# Free to use, privacy-focused, but slower than cloud providers
+```
+
+### AI Assistant Usage Patterns
+- **Information Queries**: "What does this unlabeled button do?"
+- **Navigation Help**: "Where is the login form?" 
+- **Action Assistance**: "Click the submit button" (with confirmation)
+- **Layout Understanding**: "Describe the main sections of this page"
+
+### Safety Framework
+- **Confirmation Required**: All actions require user approval by default
+- **Action Descriptions**: Clear explanation of what will happen
+- **Safe Defaults**: Conservative timeouts and quality settings
+- **Privacy Protection**: API keys are read from user-owned files restricted to mode 600; no data logging
+
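The confirmation rule in the Safety Framework can be sketched as a small gate in front of every action. This is an illustration only: `perform_action`, `confirm`, and `confirmation_required` are hypothetical names, not the plugin's actual API.

```python
def perform_action(description, action, confirm, confirmation_required=True):
    """Run `action` only if the user approves it.

    description: plain-language explanation of what will happen.
    action: zero-argument callable that carries out the step.
    confirm: callable taking the description and returning True or False.
    """
    if confirmation_required and not confirm(description):
        return False  # user declined, nothing was executed
    action()
    return True


if __name__ == "__main__":
    clicked = []
    approved = perform_action(
        "Click the submit button",
        lambda: clicked.append("submit"),
        confirm=lambda text: True,  # stand-in for a spoken yes/no prompt
    )
    print(approved, clicked)
```

Keeping the prompt as an injected callable lets the same gate be exercised without speech or a GUI.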
+### Troubleshooting AI Assistant Setup
+
+#### Common Issues
+```bash
+# Check if AI settings loaded correctly
+~/.local/bin/cthulhu -s  # Open preferences, check AI Assistant tab
+
+# Verify API key file permissions and format
+ls -la ~/.config/cthulhu/*-api-key  # Should show 600 permissions
+cat ~/.config/cthulhu/claude-api-key  # Should contain only the API key
+
+# Test Ollama connection
+curl http://localhost:11434/api/version  # Should return Ollama version
+ollama ps  # Should show running models
+
+# Check dependencies
+python3 -c "import requests, PIL; print('Dependencies OK')"
+
+# Test screenshot capability (requires X11/Wayland)
+python3 -c "
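+import gi
+gi.require_version('Gdk', '3.0')  # pin the GTK 3 binding before importing Gdk
+from gi.repository import Gdk
+window = Gdk.get_default_root_window()
+print('Screenshot capability available' if window else 'No root window available')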
+"
+```
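The `curl` check against Ollama can also be run from Python. This is a minimal sketch; it assumes the `/api/version` endpoint answers with a JSON object containing a `version` field. The helper name `ollama_version` and its `fetch` parameter are hypothetical and exist only so the function can be tried without a running server.

```python
import json
import urllib.request


def ollama_version(base_url="http://localhost:11434", fetch=None, timeout=5):
    """Return the version string reported by an Ollama server, or None on failure."""

    def _http_get(url):
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8")

    fetch = fetch or _http_get
    try:
        payload = json.loads(fetch(f"{base_url}/api/version"))
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts (URLError subclasses it);
        # ValueError covers a response body that is not valid JSON.
        return None
    # Assumption: the body is a JSON object with a "version" key.
    if not isinstance(payload, dict):
        return None
    return payload.get("version")


if __name__ == "__main__":
    print(ollama_version() or "Ollama is not reachable on localhost:11434")
```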
+
+#### Required Permissions
+- **File Access**: API key files in `~/.config/cthulhu/`
+- **Screen Access**: Screenshot capture (automatic on most setups)
+- **Network Access**: HTTPS requests to cloud AI providers (Ollama only talks to `localhost:11434`)
+- **AT-SPI Access**: Accessibility tree traversal (enabled by default)
+
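The key-file checks from the troubleshooting commands (mode 600, file contains only the key) could also be performed in Python before a key is used. This is a sketch under assumptions: `load_api_key` is a hypothetical helper, not the plugin's actual loader.

```python
import os
import stat
import tempfile
from pathlib import Path


def load_api_key(path):
    """Return the key stored in `path`; raise ValueError if the file looks unsafe."""
    key_file = Path(path).expanduser()
    mode = stat.S_IMODE(key_file.stat().st_mode)
    # Any group/other permission bit means the file is more open than 600.
    if mode & 0o077:
        raise ValueError(f"{key_file} has mode {oct(mode)}; expected 0o600")
    key = key_file.read_text(encoding="utf-8").strip()
    # The file should hold exactly one token: the key, with no extra text.
    if len(key.split()) != 1:
        raise ValueError(f"{key_file} should contain only the API key")
    return key


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "claude-api-key"
        demo.write_text("sk-ant-example\n", encoding="utf-8")
        os.chmod(demo, 0o600)
        print(load_api_key(demo))
```

Failing loudly on loose permissions mirrors the `ls -la` check rather than silently using a key other users can read.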
+## Cthulhu Plugin System - Developer Reference
+
+### **Plugin Architecture Overview**
+
+Cthulhu uses a **pluggy-based plugin system** with the following components:
+
+1. **Plugin Manager**: `src/cthulhu/plugin_system_manager.py` - Central plugin loading/management
+2. **Base Plugin Class**: `src/cthulhu/plugin.py` - Provides common functionality
+3. **Hook System**: Uses `@cthulhu_hookimpl` decorators for lifecycle management
+4. **Plugin Discovery**: Automatic scanning of `src/cthulhu/plugins/` and `~/.local/share/cthulhu/plugins/`
+
+### **Plugin Directory Structure**
+
+Every plugin must follow this exact structure:
+```
+src/cthulhu/plugins/YourPlugin/
+├── __init__.py          # Import: from .plugin import YourPlugin
+├── plugin.py            # Main plugin class
+├── plugin.info          # Metadata (name, version, description)
+└── Makefile.am          # Build system integration
+```
+
+### **Essential Plugin Files**
+
+#### **`__init__.py`** - Package Import
+```python
+from .plugin import YourPlugin
+```
+
+#### **`plugin.info`** - Metadata
+```ini
+name = Your Plugin Name
+version = 1.0.0
+description = What your plugin does
+authors = Your Name <email@example.com>
+website = https://example.com
+copyright = Copyright 2025
+builtin = false
+hidden = false
+```
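To show how the `key = value` layout of `plugin.info` can be consumed, here is a small stand-alone parser. It is a sketch only: `parse_plugin_info` is a hypothetical name, and Cthulhu's plugin manager may read the file differently.

```python
def parse_plugin_info(text):
    """Parse 'key = value' lines into a dict, skipping blanks and '#' comments."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values such as URLs stay intact.
        key, value = line.split("=", 1)
        info[key.strip()] = value.strip()
    # Assumption: builtin and hidden are the only boolean fields.
    for flag in ("builtin", "hidden"):
        if flag in info:
            info[flag] = info[flag].lower() == "true"
    return info


if __name__ == "__main__":
    sample = """
    name = Your Plugin Name
    version = 1.0.0
    website = https://example.com
    builtin = false
    """
    print(parse_plugin_info(sample))
```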
+
+#### **`Makefile.am`** - Build Integration
+```makefile
+cthulhu_python_PYTHON = \
+	__init__.py \
+	plugin.info \
+	plugin.py
+
+cthulhu_pythondir=$(pkgpythondir)/plugins/YourPlugin
+```
+
+### **Plugin Class Template**
+
+```python
+#!/usr/bin/env python3
+import logging
+from cthulhu.plugin import Plugin, cthulhu_hookimpl
+
+logger = logging.getLogger(__name__)
+
+class YourPlugin(Plugin):
+    """Your plugin description."""
+    
+    def __init__(self, *args, **kwargs):
+        """Initialize the plugin."""
+        super().__init__(*args, **kwargs)
+        logger.info("YourPlugin initialized")
+        
+        # Keybinding storage - use individual variables, NOT dictionaries
+        self._kb_binding = None
+        
+    @cthulhu_hookimpl
+    def activate(self, plugin=None):
+        """Activate the plugin."""
+        if plugin is not None and plugin is not self:
+            return
+            
+        try:
+            logger.info("=== YourPlugin activation starting ===")
+            
+            # Register keybindings
+            self._register_keybinding()
+            
+            logger.info("YourPlugin activated successfully")
+            return True
+            
+        except Exception as e:
+            logger.error(f"Error activating YourPlugin: {e}")
+            return False
+    
+    @cthulhu_hookimpl
+    def deactivate(self, plugin=None):
+        """Deactivate the plugin."""
+        if plugin is not None and plugin is not self:
+            return
+            
+        logger.info("Deactivating YourPlugin")
+        self._kb_binding = None
+        return True
+        
+    def _register_keybinding(self):
+        """Register plugin keybindings."""
+        gesture_string = 'kb:cthulhu+your+keys'
+        try:
+            # CRITICAL: Use this exact parameter order!
+            self._kb_binding = self.registerGestureByString(
+                self._your_handler_method,    # Handler method (first)
+                "Description of action",      # Description (second)
+                gesture_string                # Gesture string (third)
+            )
+            
+            if self._kb_binding:
+                logger.info(f"Registered keybinding: {gesture_string}")
+            else:
+                logger.error(f"Failed to register keybinding: {gesture_string}")
+                
+        except Exception as e:
+            logger.error(f"Error registering keybinding: {e}")
+    
+    def _your_handler_method(self, script=None, inputEvent=None):
+        """Handle the keybinding activation."""
+        try:
+            logger.info("Keybinding triggered")
+            
+            # Your plugin logic here
+            
+            return True
+        except Exception as e:
+            logger.error(f"Error in handler: {e}")
+            return False
+```
+
+### **🚨 CRITICAL Keybinding Patterns**
+
+#### **✅ CORRECT Pattern (What Works)**
+```python
+# Individual binding storage (NOT dictionaries)
+self._kb_binding = None
+self._kb_binding_action1 = None  
+self._kb_binding_action2 = None
+
+# Correct registerGestureByString parameter order
+self._kb_binding = self.registerGestureByString(
+    self._handler_method,        # 1st: Handler method
+    "Action description",        # 2nd: Description  
+    'kb:cthulhu+your+keys'      # 3rd: Gesture string
+)
+```
+
+#### **❌ INCORRECT Patterns (What Fails)**
+```python
+# DON'T use dictionaries for keybinding storage
+self._kb_bindings = {}  # ❌ WRONG
+self._kb_bindings['action'] = self.registerGestureByString(...)  # ❌ WRONG
+
+# DON'T use wrong parameter order
+self.registerGestureByString(
+    'kb:cthulhu+keys',          # ❌ WRONG ORDER
+    "Description", 
+    self._handler_method
+)
+
+# DON'T swap the description and the gesture string
+self.registerGestureByString(
+    self._handler_method,
+    'kb:cthulhu+keys',          # ❌ WRONG ORDER
+    "Description"
+)
+```
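Because all three arguments are positional, a swapped call is easy to miss. One way to catch it early is a thin checking wrapper. This is a sketch: `register_gesture_checked` and `DemoPlugin` are hypothetical, and the `kb:` prefix test assumes every gesture string looks like the examples above.

```python
class DemoPlugin:
    """Hypothetical stand-in for a Cthulhu plugin; it only records the call."""

    def __init__(self):
        self.calls = []

    def registerGestureByString(self, handler, description, gesture):
        self.calls.append((handler, description, gesture))
        return object()  # placeholder for whatever the real method returns


def register_gesture_checked(plugin, handler, description, gesture):
    """Validate the argument order, then forward to registerGestureByString."""
    if not callable(handler):
        raise TypeError("first argument must be the handler method")
    if not isinstance(gesture, str) or not gesture.startswith("kb:"):
        raise ValueError("third argument must be a gesture string such as 'kb:...'")
    if not isinstance(description, str) or description.startswith("kb:"):
        raise ValueError("second argument must be a human-readable description")
    return plugin.registerGestureByString(handler, description, gesture)


if __name__ == "__main__":
    plugin = DemoPlugin()

    def handler(script=None, inputEvent=None):
        return True

    register_gesture_checked(plugin, handler, "Say hello", "kb:cthulhu+h")
    print(len(plugin.calls), "binding registered")
```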
+
+### **Plugin Registration & Activation**
+
+#### **Add to Build System**
+1. **Add to `src/cthulhu/plugins/Makefile.am`**:
+   ```makefile
+   SUBDIRS = YourPlugin OtherPlugin1 OtherPlugin2 ...
+   ```
+
+2. **Add to `configure.ac`**:
+   ```
+   src/cthulhu/plugins/YourPlugin/Makefile
+   ```
+
+#### **Add to Default Active Plugins**
+In `src/cthulhu/settings.py`:
+```python
+activePlugins = ['YourPlugin', 'DisplayVersion', 'PluginManager', ...]
+```
+
+### **Plugin Lifecycle Events**
+
+1. **`__init__`**: Plugin instance created
+2. **`activate`**: Plugin enabled (register keybindings, connect events)
+3. **`deactivate`**: Plugin disabled (cleanup, disconnect)
+
+**Note**: `activate()` may be called multiple times for different script contexts.
+
+### **Common Plugin Patterns**
+
+#### **Settings Integration**
+```python
+from cthulhu import settings_manager
+from cthulhu.plugin import Plugin, cthulhu_hookimpl
+
+class YourPlugin(Plugin):
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        self._settings_manager = settings_manager.getManager()
+    
+    @cthulhu_hookimpl
+    def activate(self, plugin=None):
+        # Check if plugin should be active
+        enabled = self._settings_manager.getSetting('yourPluginEnabled')
+        if not enabled:
+            return
+```
+
+#### **Message Presentation**
+```python
+def _present_message(self, message):
+    """Present a message to the user via speech."""
+    try:
+        if self.app:
+            state = self.app.getDynamicApiManager().getAPI('CthulhuState')
+            if state and state.activeScript:
+                state.activeScript.presentMessage(message, resetStyles=False)
+    except Exception as e:
+        logger.error(f"Error presenting message: {e}")
+```
+
+#### **Sound Generation**
+```python
+from cthulhu import sound
+from cthulhu.sound_generator import Tone
+
+def _play_sound(self):
+    player = sound.getPlayer()
+    tone = Tone(duration=0.15, frequency=400, volumeMultiplier=0.7)
+    player.play(tone, interrupt=False)
+```
+
+### **Debugging Plugin Issues**
+
+#### **Common Debug Techniques**
+1. **Add debug output to both logger and print**:
+   ```python
+   logger.info("Plugin message")
+   print("DEBUG: Plugin message")  # Shows in terminal
+   ```
+
+2. **Check plugin loading**:
+   ```python
+   # In __init__
+   with open('/tmp/your_plugin_debug.log', 'a') as f:
+       f.write("Plugin loaded\n")
+   ```
+
+3. **Verify keybinding registration**:
+   ```python
+   if self._kb_binding:
+       print(f"DEBUG: Keybinding registered: {self._kb_binding}")
+   else:
+       print("DEBUG: Keybinding registration FAILED")
+   ```
+
+#### **Common Issues & Solutions**
+
+| Issue | Symptom | Solution |
+|-------|---------|----------|
+| Plugin not loading | No `__init__` debug output | Check `activePlugins` list |
+| Keybindings not working | "stored for later registration" | Use correct parameter order |
+| Import errors | Plugin fails to activate | Check module imports and dependencies |
+| Settings not loading | Default values used | Verify settings key names |
+
+### **Working Plugin Examples**
+- **`DisplayVersion`**: Simple keybinding + message
+- **`PluginManager`**: GUI dialog + settings management  
+- **`IndentationAudio`**: Event listening + sound generation
+- **`AIAssistant`**: Complex settings + multi-keybinding + external APIs
+
 ## D-Bus Remote Controller Integration

-### **NEW FEATURE**: D-Bus Service for Remote Control
-Cthulhu now includes a D-Bus service (ported from Orca v49.alpha) for external control and automation:
+### **EXISTING FEATURE**: D-Bus Service for Remote Control
+Cthulhu includes a D-Bus service (ported from Orca v49.alpha) for external control and automation:

 - **Service Name**: `org.stormux.Cthulhu.Service`
 - **Object Path**: `/org/stormux/Cthulhu/Service`
@@ -133,6 +133,7 @@ src/cthulhu/scripts/toolkits/Qt/Makefile
 src/cthulhu/scripts/toolkits/WebKitGtk/Makefile
 src/cthulhu/scripts/toolkits/gtk/Makefile
 src/cthulhu/plugins/Makefile
+src/cthulhu/plugins/AIAssistant/Makefile
 src/cthulhu/plugins/ByeCthulhu/Makefile
 src/cthulhu/plugins/HelloCthulhu/Makefile
 src/cthulhu/plugins/Clipboard/Makefile
@@ -1,46 +1,59 @@
 [Desktop Entry]
 Type=Application
 Name[an]=Lector de pantalla Cthulhu
-Name[ast]=Llector de pantalla Cthulhu
-Name[be]=ÐÑÑÐÑÑÐ Cthulhu
-Name[bg]=Cthulhu âÐÐÐ ÑÑÑName[bs]=Cthulhu ÄtaÄekrana
+Name[ast]=Llector de pantalla d'Cthulhu
+Name[be]=Чытач з экрана Cthulhu
+Name[bg]=Cthulhu — екранен четец
+Name[bs]=Cthulhu čitač ekrana
 Name[ca]=Lector de pantalla Cthulhu
-Name[cs]=ÄeÄa obrazovky Cthulhu
-Name[da]=SkÃmlÃeren Cthulhu
+Name[cs]=Čtečka obrazovky Cthulhu
+Name[da]=Skærmlæseren Cthulhu
 Name[de]=Cthulhu-Bildschirmleser
-Name[el]=ÎÎÎÏÎ ÎÏÎ Cthulhu
+Name[el]=Αναγνώστης οθόνης Cthulhu
 Name[en_GB]=Cthulhu Screen Reader
-Name[eo]=Ekranlegilo Cthulhu
+Name[eo]=Ekranlegilo Orko
 Name[es]=Lector de pantalla Cthulhu
 Name[eu]=Cthulhu pantaila-irakurlea
-Name[fa]=ØØâØ Cthulhu
-Name[fi]=Cthulhu-nÃtÃlukija
-Name[fr]=Lecteur dâran Cthulhu
+Name[fa]=صفحه‌خوان اورکا
+Name[fi]=Cthulhu-näytönlukija
+Name[fr]=Lecteur d’écran Cthulhu
 Name[gl]=Lector da pantalla Cthulhu
-Name[he]=××××× Cthulhu
-Name[hi]=Cthulhu àààà
-Name[hu]=Cthulhu kÃernyÅlvasÃName[id]=Pembaca Layar Cthulhu
-Name[is]=Cthulhu skjÃestur
+Name[he]=מקריא המסך Cthulhu
+Name[hi]=Cthulhu स्क्रीन वाचक
+Name[hu]=Orka képernyőolvasó
+Name[id]=Pembaca Layar Cthulhu
+Name[is]=Cthulhu skjálestur
 Name[it]=Lettore schermo Cthulhu
-Name[ka]=Cthulhu - ááááááName[kk]=Cthulhu ÑÑÐÐ ÐÐÑ ÒÐÐÐÑ
-Name[lt]=Cthulhu ekrano skaityklÄName[lv]=Cthulhu ekrÄa lasÄÄs
-Name[mk]=Cthulhu ÑÑÑÐ ÐÑÐÑName[nb]=Cthulhu skjermleser
-Name[ne]=Cthulhu àà ààame[nl]=Cthulhu schermlezer
+Name[ka]=Cthulhu - ეკრანის მკითხველი
+Name[kk]=Cthulhu экраннан оқитын қолданбасы
+Name[lt]=Cthulhu ekrano skaityklė
+Name[lv]=Cthulhu ekrāna lasītājs
+Name[mk]=Cthulhu читач на екранот
+Name[nb]=Cthulhu skjermleser
+Name[ne]=ओर्का दृष्टि वाचक
+Name[nl]=Cthulhu schermlezer
 Name[oc]=Lector d'ecran Cthulhu
-Name[pa]=Cthulhu ààààame[pl]=Czytnik ekranowy Cthulhu
-Name[pt]=Leitor de ecrÃCthulhu
+Name[pa]=ਓਰਕਾ ਸਕਰੀਨ ਰੀਡਰ
+Name[pl]=Czytnik ekranowy Cthulhu
+Name[pt]=Leitor de ecrã Cthulhu
 Name[pt_BR]=Leitor de tela Cthulhu
 Name[ro]=Cititorul de ecran Cthulhu
-Name[ru]=ÐÑÐÑ ÐÐÐ Cthulhu
-Name[sl]=Zaslonski bralnik Cthulhu
-Name[sr]=ÐÑÑÐÑÐ Cthulhu
-Name[sv]=Cthulhu skÃmlÃare
-Name[ta]=Cthulhu àààà 
-Name[te]=Cthulhu ààà
-Name[tg]=ÐÐÐÐ ÑÑÐ Cthulhu
+Name[ru]=Экранный диктор Cthulhu
+Name[sl]=Zaslonski bralnik Orka
+Name[sr]=Читач екрана Орка
+Name[sv]=Cthulhu skärmläsare
+Name[ta]=ஆர்கா திரை படிப்பி 
+Name[te]=ఓర్కా తెరచదువరి
+Name[tg]=Хонандаи экрани Cthulhu
 Name[tr]=Cthulhu Ekran Okuyucu
-Name[ug]=Cthulhu ØÙØ ØÙØ
-Name[uk]=ÐÑÑÐÐ ÑÑÐÑÐÐÑÐ ÂthulhuÂName[zh_CN]=Cthulhu åèName[zh_HK]=Cthulhu èèName=Cthulhu Screen Reader
+Name[ug]=Cthulhu ئېكران ئوقۇغۇ
+Name[uk]=Інструмент читання з екрана «Cthulhu»
+Name[zh_CN]=Cthulhu 屏幕阅读器
+Name[zh_HK]=Cthulhu 螢幕閱讀器
+Name=Cthulhu Screen Reader
 Exec=cthulhu
 NoDisplay=true
+AutostartCondition=GSettings org.gnome.desktop.a11y.applications screen-reader-enabled
 X-GNOME-AutoRestart=true
+#X-GNOME-Autostart-Phase=Initialization
+OnlyShowIn=GNOME;MATE;Unity;Cinnamon;
@@ -31,6 +31,10 @@ depends=(
  python-dasbus
  libpeas
  
+  # AI Assistant dependencies (for screenshots and HTTP requests)
+  python-pillow
+  python-requests
+  
  # Desktop integration
  gsettings-desktop-schemas
  hicolor-icon-theme
@@ -49,6 +53,12 @@ optdepends=(
  'festival: Alternative TTS engine' 
  'flite: Lightweight TTS engine'
  'espeak: Legacy TTS engine'
+  
+  # AI Assistant providers (optional)
+  'python-anthropic: Claude AI provider support'
+  'python-openai: ChatGPT AI provider support' 
+  'python-google-generativeai: Gemini AI provider support'
+  'ollama: Local AI model support'
 )
 makedepends=(
  git
@@ -3412,6 +3412,211 @@
                <property name="tab_fill">False</property>
              </packing>
            </child>
+            <child>
+              <object class="GtkGrid" id="aiPage">
+                <property name="visible">True</property>
+                <property name="can_focus">False</property>
+                <property name="margin_left">12</property>
+                <property name="margin_right">12</property>
+                <property name="margin_top">12</property>
+                <property name="margin_bottom">12</property>
+                <property name="row_spacing">6</property>
+                <property name="column_spacing">12</property>
+                <child>
+                  <object class="GtkCheckButton" id="enableAICheckButton">
+                    <property name="label" translatable="yes">Enable AI Assistant</property>
+                    <property name="visible">True</property>
+                    <property name="can_focus">True</property>
+                    <property name="receives_default">False</property>
+                    <property name="use_underline">True</property>
+                    <property name="draw_indicator">True</property>
+                    <signal name="toggled" handler="enableAIToggled" swapped="no"/>
+                  </object>
+                  <packing>
+                    <property name="left_attach">0</property>
+                    <property name="top_attach">0</property>
+                    <property name="width">2</property>
+                  </packing>
+                </child>
+                <child>
+                  <object class="GtkLabel" id="aiProviderLabel">
+                    <property name="visible">True</property>
+                    <property name="can_focus">False</property>
+                    <property name="halign">start</property>
+                    <property name="label" translatable="yes">_Provider:</property>
+                    <property name="use_underline">True</property>
+                    <property name="mnemonic_widget">aiProviderCombo</property>
+                  </object>
+                  <packing>
+                    <property name="left_attach">0</property>
+                    <property name="top_attach">1</property>
+                  </packing>
+                </child>
+                <child>
+                  <object class="GtkComboBoxText" id="aiProviderCombo">
+                    <property name="visible">True</property>
+                    <property name="can_focus">False</property>
+                    <property name="hexpand">True</property>
+                    <items>
+                      <item translatable="yes">Claude (Anthropic)</item>
+                      <item translatable="yes">ChatGPT (OpenAI)</item>
+                      <item translatable="yes">Gemini (Google)</item>
+                      <item translatable="yes">Ollama (Local)</item>
+                    </items>
+                    <signal name="changed" handler="aiProviderChanged" swapped="no"/>
+                  </object>
+                  <packing>
+                    <property name="left_attach">1</property>
+                    <property name="top_attach">1</property>
+                  </packing>
+                </child>
+                <child>
+                  <object class="GtkLabel" id="aiApiKeyLabel">
+                    <property name="visible">True</property>
+                    <property name="can_focus">False</property>
+                    <property name="halign">start</property>
+                    <property name="label" translatable="yes">API _Key File:</property>
+                    <property name="use_underline">True</property>
+                    <property name="mnemonic_widget">aiApiKeyEntry</property>
+                  </object>
+                  <packing>
+                    <property name="left_attach">0</property>
+                    <property name="top_attach">2</property>
+                  </packing>
+                </child>
+                <child>
+                  <object class="GtkBox" id="aiApiKeyBox">
+                    <property name="visible">True</property>
+                    <property name="can_focus">False</property>
+                    <property name="spacing">6</property>
+                    <child>
+                      <object class="GtkEntry" id="aiApiKeyEntry">
+                        <property name="visible">True</property>
+                        <property name="can_focus">True</property>
+                        <property name="hexpand">True</property>
+                        <property name="placeholder_text" translatable="yes">Path to API key file</property>
+                        <signal name="changed" handler="aiApiKeyChanged" swapped="no"/>
+                      </object>
+                      <packing>
+                        <property name="expand">True</property>
+                        <property name="fill">True</property>
+                        <property name="position">0</property>
+                      </packing>
+                    </child>
+                    <child>
+                      <object class="GtkButton" id="aiApiKeyBrowseButton">
+                        <property name="label" translatable="yes">_Browse...</property>
+                        <property name="visible">True</property>
+                        <property name="can_focus">True</property>
+                        <property name="receives_default">True</property>
+                        <property name="use_underline">True</property>
+                        <signal name="clicked" handler="aiApiKeyBrowseClicked" swapped="no"/>
+                      </object>
+                      <packing>
+                        <property name="expand">False</property>
+                        <property name="fill">False</property>
+                        <property name="position">1</property>
+                      </packing>
+                    </child>
+                  </object>
+                  <packing>
+                    <property name="left_attach">1</property>
+                    <property name="top_attach">2</property>
+                  </packing>
+                </child>
+                <child>
+                  <object class="GtkCheckButton" id="aiConfirmationCheckButton">
+                    <property name="label" translatable="yes">Require confirmation before AI actions</property>
+                    <property name="visible">True</property>
+                    <property name="can_focus">True</property>
+                    <property name="receives_default">False</property>
+                    <property name="use_underline">True</property>
+                    <property name="draw_indicator">True</property>
+                    <property name="active">True</property>
+                    <signal name="toggled" handler="aiConfirmationToggled" swapped="no"/>
+                  </object>
+                  <packing>
+                    <property name="left_attach">0</property>
+                    <property name="top_attach">3</property>
+                    <property name="width">2</property>
+                  </packing>
+                </child>
+                <child>
+                  <object class="GtkLabel" id="aiOllamaModelLabel">
+                    <property name="visible">True</property>
+                    <property name="can_focus">False</property>
+                    <property name="halign">start</property>
+                    <property name="label" translatable="yes">Ollama _Model:</property>
+                    <property name="use_underline">True</property>
+                    <property name="mnemonic_widget">aiOllamaModelEntry</property>
+                  </object>
+                  <packing>
+                    <property name="left_attach">0</property>
+                    <property name="top_attach">4</property>
+                  </packing>
+                </child>
+                <child>
+                  <object class="GtkEntry" id="aiOllamaModelEntry">
+                    <property name="visible">True</property>
+                    <property name="can_focus">True</property>
+                    <property name="hexpand">True</property>
+                    <property name="text">llama3.2-vision</property>
+                    <property name="placeholder_text" translatable="yes">Model name for Ollama (e.g., llama3.2-vision)</property>
+                    <signal name="changed" handler="aiOllamaModelChanged" swapped="no"/>
+                  </object>
+                  <packing>
+                    <property name="left_attach">1</property>
+                    <property name="top_attach">4</property>
+                  </packing>
+                </child>
+                <child>
+                  <object class="GtkLabel" id="aiScreenshotQualityLabel">
+                    <property name="visible">True</property>
+                    <property name="can_focus">False</property>
+                    <property name="halign">start</property>
+                    <property name="label" translatable="yes">Screenshot _Quality:</property>
+                    <property name="use_underline">True</property>
+                    <property name="mnemonic_widget">aiScreenshotQualityCombo</property>
+                  </object>
+                  <packing>
+                    <property name="left_attach">0</property>
+                    <property name="top_attach">5</property>
+                  </packing>
+                </child>
+                <child>
+                  <object class="GtkComboBoxText" id="aiScreenshotQualityCombo">
+                    <property name="visible">True</property>
+                    <property name="can_focus">False</property>
+                    <property name="hexpand">True</property>
+                    <property name="active">1</property>
+                    <items>
+                      <item translatable="yes">Low</item>
+                      <item translatable="yes">Medium</item>
+                      <item translatable="yes">High</item>
+                    </items>
+                    <signal name="changed" handler="aiScreenshotQualityChanged" swapped="no"/>
+                  </object>
+                  <packing>
+                    <property name="left_attach">1</property>
+                    <property name="top_attach">5</property>
+                  </packing>
+                </child>
+              </object>
+              <packing>
+                <property name="position">8</property>
+              </packing>
+            </child>
+            <child type="tab">
+              <object class="GtkLabel" id="aiTabLabel">
+                <property name="visible">True</property>
+                <property name="can_focus">False</property>
+                <property name="label" translatable="yes">AI Assistant</property>
+              </object>
+              <packing>
+                <property name="position">8</property>
+                <property name="tab_fill">False</property>
+              </packing>
+            </child>
          </object>
          <packing>
            <property name="expand">True</property>
@@ -23,5 +23,5 @@
 # Fork of Orca Screen Reader (GNOME)
 # Original source: https://gitlab.gnome.org/GNOME/orca

-version = "2025.08.02"
+version = "2025.08.03"
 codeName = "testing"
@@ -1815,6 +1815,10 @@ class CthulhuSetupGUI(cthulhu_gtkbuilder.GtkBuilderWrapper):
        self.__initProfileCombo()
        if self.script.app:
            self.get_widget('profilesFrame').set_sensitive(False)
+            
+        # AI Assistant settings
+        #
+        self._initAIState()

    def __initProfileCombo(self):
        """Adding available profiles and setting active as the active one"""
@@ -1842,6 +1846,66 @@ class CthulhuSetupGUI(cthulhu_gtkbuilder.GtkBuilderWrapper):
        """Get available user profiles."""
        return _settingsManager.availableProfiles()

+    def _initAIState(self):
+        """Initialize AI Assistant tab widgets with current settings."""
+        prefs = self.prefsDict
+        
+        # Store widget references
+        self.enableAICheckButton = self.get_widget("enableAICheckButton")
+        self.aiProviderCombo = self.get_widget("aiProviderCombo")
+        self.aiApiKeyEntry = self.get_widget("aiApiKeyEntry")
+        self.aiOllamaModelEntry = self.get_widget("aiOllamaModelEntry")
+        self.aiConfirmationCheckButton = self.get_widget("aiConfirmationCheckButton")
+        self.aiScreenshotQualityCombo = self.get_widget("aiScreenshotQualityCombo")
+        
+        # Set enable AI checkbox
+        enabled = prefs.get("aiAssistantEnabled", settings.aiAssistantEnabled)
+        self.enableAICheckButton.set_active(enabled)
+        
+        # Set provider combo
+        provider = prefs.get("aiProvider", settings.aiProvider)
+        providerIndex = 0  # Default to Claude
+        if provider == settings.AI_PROVIDER_CHATGPT:
+            providerIndex = 1
+        elif provider == settings.AI_PROVIDER_GEMINI:
+            providerIndex = 2
+        elif provider == settings.AI_PROVIDER_OLLAMA:
+            providerIndex = 3
+        self.aiProviderCombo.set_active(providerIndex)
+        
+        # Set API key file
+        apiKeyFile = prefs.get("aiApiKeyFile", settings.aiApiKeyFile)
+        self.aiApiKeyEntry.set_text(apiKeyFile)
+        
+        # Set Ollama model
+        ollamaModel = prefs.get("aiOllamaModel", settings.aiOllamaModel)
+        self.aiOllamaModelEntry.set_text(ollamaModel)
+        
+        # Set confirmation checkbox
+        confirmationRequired = prefs.get("aiConfirmationRequired", settings.aiConfirmationRequired)
+        self.aiConfirmationCheckButton.set_active(confirmationRequired)
+        
+        # Set screenshot quality combo
+        quality = prefs.get("aiScreenshotQuality", settings.aiScreenshotQuality)
+        qualityIndex = 1  # Default to medium
+        if quality == settings.AI_SCREENSHOT_QUALITY_LOW:
+            qualityIndex = 0
+        elif quality == settings.AI_SCREENSHOT_QUALITY_HIGH:
+            qualityIndex = 2
+        self.aiScreenshotQualityCombo.set_active(qualityIndex)
+        
+        # Enable/disable controls based on AI enabled state
+        self._updateAIControlsState(enabled)
+        
+    def _updateAIControlsState(self, enabled):
+        """Enable or disable AI controls based on AI enabled state."""
+        self.aiProviderCombo.set_sensitive(enabled)
+        self.aiApiKeyEntry.set_sensitive(enabled)
+        self.aiOllamaModelEntry.set_sensitive(enabled)
+        self.aiConfirmationCheckButton.set_sensitive(enabled)
+        self.aiScreenshotQualityCombo.set_sensitive(enabled)
+        self.get_widget("aiApiKeyBrowseButton").set_sensitive(enabled)
+
    def _updateCthulhuModifier(self):
        combobox = self.get_widget("cthulhuModifierComboBox")
        keystring = ", ".join(self.prefsDict["cthulhuModifierKeys"])
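Review note on `_initAIState` and `aiProviderChanged`: the combo index is derived from the provider twice, once with an `if`/`elif` chain and once with a list. A minimal sketch of sharing one ordered list for both directions follows. The constant values are hypothetical stand-ins for `settings.AI_PROVIDER_*`, and the helper names are illustrative, not part of this commit.

```python
# Hypothetical stand-ins for settings.AI_PROVIDER_*; the real values live in
# cthulhu.settings and may differ.
AI_PROVIDER_CLAUDE = "claude"
AI_PROVIDER_CHATGPT = "chatgpt"
AI_PROVIDER_GEMINI = "gemini"
AI_PROVIDER_OLLAMA = "ollama"

# The order must match the rows of the provider combo box.
AI_PROVIDERS = [
    AI_PROVIDER_CLAUDE,
    AI_PROVIDER_CHATGPT,
    AI_PROVIDER_GEMINI,
    AI_PROVIDER_OLLAMA,
]


def provider_to_index(provider, default=0):
    """Return the combo row for a provider, or the default row if unknown."""
    try:
        return AI_PROVIDERS.index(provider)
    except ValueError:
        return default


def index_to_provider(index):
    """Return the provider for a combo row, or None if the row is out of range."""
    if 0 <= index < len(AI_PROVIDERS):
        return AI_PROVIDERS[index]
    return None
```

With helpers like these, adding a provider would mean touching one list instead of two code paths; the same idea would apply to the screenshot quality combo.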
@@ -3573,4 +3637,85 @@ class CthulhuSetupGUI(cthulhu_gtkbuilder.GtkBuilderWrapper):
        self._populateKeyBindings()

        self.__initProfileCombo()
+        
+    # AI Assistant signal handlers
+    
+    def enableAIToggled(self, widget):
+        """Enable AI Assistant checkbox toggled handler"""
+        enabled = widget.get_active()
+        self.prefsDict["aiAssistantEnabled"] = enabled
+        self._updateAIControlsState(enabled)
+        
+        # Auto-enable/disable the AIAssistant plugin based on preference
+        self._updateAIPluginState(enabled)
+    
+    def aiProviderChanged(self, widget):
+        """AI Provider combo box changed handler"""
+        providers = [settings.AI_PROVIDER_CLAUDE, settings.AI_PROVIDER_CHATGPT, 
+                    settings.AI_PROVIDER_GEMINI, settings.AI_PROVIDER_OLLAMA]
+        activeIndex = widget.get_active()
+        if 0 <= activeIndex < len(providers):
+            self.prefsDict["aiProvider"] = providers[activeIndex]
+    
+    def aiApiKeyChanged(self, widget):
+        """AI API key file entry changed handler"""
+        self.prefsDict["aiApiKeyFile"] = widget.get_text()
+    
+    def aiOllamaModelChanged(self, widget):
+        """AI Ollama model entry changed handler"""
+        self.prefsDict["aiOllamaModel"] = widget.get_text()
+    
+    def aiApiKeyBrowseClicked(self, widget):
+        """AI API key browse button clicked handler"""
+        dialog = Gtk.FileChooserDialog(
+            title="Select API Key File",
+            parent=widget.get_toplevel(),
+            action=Gtk.FileChooserAction.OPEN
+        )
+        dialog.add_buttons(
+            Gtk.STOCK_CANCEL, Gtk.ResponseType.CANCEL,
+            Gtk.STOCK_OPEN, Gtk.ResponseType.OK
+        )
+        
+        response = dialog.run()
+        if response == Gtk.ResponseType.OK:
+            filename = dialog.get_filename()
+            self.aiApiKeyEntry.set_text(filename)
+            self.prefsDict["aiApiKeyFile"] = filename
+        
+        dialog.destroy()
+    
+    def aiConfirmationToggled(self, widget):
+        """AI confirmation required checkbox toggled handler"""
+        self.prefsDict["aiConfirmationRequired"] = widget.get_active()
+    
+    def aiScreenshotQualityChanged(self, widget):
+        """AI screenshot quality combo box changed handler"""
+        qualities = [settings.AI_SCREENSHOT_QUALITY_LOW, 
+                    settings.AI_SCREENSHOT_QUALITY_MEDIUM,
+                    settings.AI_SCREENSHOT_QUALITY_HIGH]
+        activeIndex = widget.get_active()
+        if 0 <= activeIndex < len(qualities):
+            self.prefsDict["aiScreenshotQuality"] = qualities[activeIndex]
+    
+    def _updateAIPluginState(self, enabled):
+        """Enable or disable the AIAssistant plugin in activePlugins list."""
+        try:
+            activePlugins = self.prefsDict.get("activePlugins", settings.activePlugins[:])
+            
+            if enabled:
+                # Add AIAssistant to active plugins if not already there
+                if "AIAssistant" not in activePlugins:
+                    activePlugins.insert(0, "AIAssistant")  # Add at beginning for priority
+                    self.prefsDict["activePlugins"] = activePlugins
+                    print(f"DEBUG: Added AIAssistant to activePlugins: {activePlugins}")
+            else:
+                # Remove AIAssistant from active plugins
+                if "AIAssistant" in activePlugins:
+                    activePlugins.remove("AIAssistant")
+                    self.prefsDict["activePlugins"] = activePlugins
+                    print(f"DEBUG: Removed AIAssistant from activePlugins: {activePlugins}")
+                    
+        except Exception as e:
+            print(f"DEBUG: Error updating AI plugin state: {e}")

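Review note on `_updateAIPluginState`: the add/remove logic could be pulled into a small pure function so it can be tested without GTK or a live preferences dictionary. This is only a sketch; `toggle_plugin` is a hypothetical helper name, and it assumes the plugin list is a plain list of strings as the handler above suggests.

```python
def toggle_plugin(active_plugins, name, enabled):
    """Return a new plugin list with `name` enabled or disabled.

    Enabling inserts the plugin at the front if it is not already present;
    disabling removes every occurrence. The input list is not modified.
    """
    plugins = list(active_plugins)
    if enabled:
        if name not in plugins:
            plugins.insert(0, name)
    else:
        plugins = [p for p in plugins if p != name]
    return plugins
```

The handler would then assign the result to `self.prefsDict["activePlugins"]` and log the change, rather than mutating the list in place.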
@@ -0,0 +1,7 @@
+cthulhu_python_PYTHON = \
+	__init__.py \
+	plugin.info \
+	plugin.py \
+	ai_providers.py
+
+cthulhu_pythondir=$(pkgpythondir)/plugins/AIAssistant
@@ -0,0 +1,22 @@
+#!/usr/bin/env python3
+#
+# Copyright (c) 2025 Stormux
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the
+# Free Software Foundation, Inc., Franklin Street, Fifth Floor,
+# Boston MA  02110-1301 USA.
+
+"""AI Assistant plugin package."""
+
+from .plugin import AIAssistant
@@ -0,0 +1,285 @@
+#!/usr/bin/env python3
+#
+# Copyright (c) 2025 Stormux
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+
+"""AI providers for the AI Assistant plugin."""
+
+import logging
+import json
+import requests
+from abc import ABC, abstractmethod
+
+logger = logging.getLogger(__name__)
+
+class AIProvider(ABC):
+    """Abstract base class for AI providers."""
+    
+    def __init__(self, api_key=None, model=None, **kwargs):
+        self.api_key = api_key
+        self.model = model
+        self.kwargs = kwargs
+        
+    @abstractmethod
+    def describe_screen(self, screenshot_data, accessibility_data):
+        """Generate a description of the current screen."""
+        pass
+        
+    @abstractmethod
+    def answer_question(self, question, screenshot_data, accessibility_data):
+        """Answer a question about the current screen/focus."""
+        pass
+        
+    @abstractmethod
+    def suggest_actions(self, request, screenshot_data, accessibility_data):
+        """Suggest actions to accomplish a user's request."""
+        pass
+        
+    def _prepare_system_prompt(self, task_type):
+        """Prepare system prompt based on task type."""
+        base_prompt = """You are an AI assistant helping a screen reader user navigate and interact with computer applications. You have access to:
+
+1. A screenshot of the current screen
+2. Detailed accessibility tree information about UI elements
+3. Information about the currently focused element
+
+The user is using the Cthulhu screen reader, so they cannot see the screen visually. Your responses should be clear, concise, and focused on accessibility.
+
+"""
+        
+        if task_type == "describe":
+            return base_prompt + """Your task: Provide a clear, structured description of what's on the screen. Focus on:
+- Main UI elements and their layout
+- Current focus location
+- Available actions and navigation options
+- Any important visual information not captured in accessibility data
+
+Keep descriptions concise but informative."""
+
+        elif task_type == "question":
+            return base_prompt + """Your task: Answer the user's question about the current screen or focused element. Use both the screenshot and accessibility data to provide accurate, helpful information.
+
+Be specific and actionable in your responses."""
+
+        elif task_type == "action":
+            return base_prompt + """Your task: Analyze the user's action request and suggest specific steps to accomplish it. Consider:
+- Current focus and context
+- Available UI elements that can accomplish the task
+- Safest and most efficient approach
+- Any potential risks or confirmations needed
+
+Provide step-by-step instructions that can be executed via accessibility APIs."""
+
+        return base_prompt
+
+
+class ClaudeProvider(AIProvider):
+    """Claude AI provider using Anthropic's API."""
+    
+    def __init__(self, api_key, model="claude-3-5-sonnet-20241022", **kwargs):
+        super().__init__(api_key, model, **kwargs)
+        self.base_url = "https://api.anthropic.com/v1/messages"
+        self.headers = {
+            "Content-Type": "application/json",
+            "X-API-Key": self.api_key,
+            "anthropic-version": "2023-06-01"
+        }
+        
+    def describe_screen(self, screenshot_data, accessibility_data):
+        """Generate a description using Claude."""
+        try:
+            prompt = self._build_prompt("describe", None, accessibility_data)
+            return self._make_request(prompt, screenshot_data)
+        except Exception as e:
+            logger.error(f"Claude describe error: {e}")
+            return f"Error getting screen description: {e}"
+            
+    def answer_question(self, question, screenshot_data, accessibility_data):
+        """Answer a question using Claude."""
+        try:
+            prompt = self._build_prompt("question", question, accessibility_data)
+            return self._make_request(prompt, screenshot_data)
+        except Exception as e:
+            logger.error(f"Claude question error: {e}")
+            return f"Error answering question: {e}"
+            
+    def suggest_actions(self, request, screenshot_data, accessibility_data):
+        """Suggest actions using Claude."""
+        try:
+            prompt = self._build_prompt("action", request, accessibility_data)
+            return self._make_request(prompt, screenshot_data)
+        except Exception as e:
+            logger.error(f"Claude action error: {e}")
+            return f"Error suggesting actions: {e}"
+            
+    def _build_prompt(self, task_type, user_input, accessibility_data):
+        """Build the complete prompt for Claude."""
+        prompt = f"Current accessibility information:\n```json\n{json.dumps(accessibility_data, indent=2)}\n```\n\n"
+        
+        if task_type == "describe":
+            prompt += "Please describe what's on this screen."
+        elif task_type == "question":
+            prompt += f"User question: {user_input}"
+        elif task_type == "action":
+            prompt += f"User wants to: {user_input}\n\nPlease suggest specific steps to accomplish this."
+            
+        return prompt
+        
+    def _make_request(self, prompt, screenshot_data):
+        """Make request to Claude API."""
+        try:
+            # Prepare the message content
+            content = [
+                {
+                    "type": "text",
+                    "text": prompt
+                }
+            ]
+            
+            # Add screenshot if available
+            if screenshot_data:
+                content.append({
+                    "type": "image",
+                    "source": {
+                        "type": "base64",
+                        "media_type": f"image/{screenshot_data['format']}",
+                        "data": screenshot_data['data']
+                    }
+                })
+            
+            payload = {
+                "model": self.model,
+                "max_tokens": 1000,
+                "messages": [
+                    {
+                        "role": "user",
+                        "content": content
+                    }
+                ],
+                "system": self._prepare_system_prompt("describe")  # FIXME: always uses the "describe" prompt, even for question and action requests
+            }
+            
+            response = requests.post(
+                self.base_url,
+                headers=self.headers,
+                json=payload,
+                timeout=30
+            )
+            
+            if response.status_code == 200:
+                result = response.json()
+                return result['content'][0]['text']
+            else:
+                error_msg = f"Claude API error {response.status_code}: {response.text}"
+                logger.error(error_msg)
+                return error_msg
+                
+        except requests.RequestException as e:
+            error_msg = f"Network error contacting Claude: {e}"
+            logger.error(error_msg)
+            return error_msg
+        except Exception as e:
+            error_msg = f"Unexpected error with Claude API: {e}"
+            logger.error(error_msg)
+            return error_msg
+
+
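Review note on `ClaudeProvider._make_request`: the system prompt is currently fixed to the "describe" task. One possible way to make it follow the task is to pass the task type into the payload builder. The sketch below is standalone and makes several assumptions: `build_claude_payload` is a hypothetical helper, and the prompt texts are shortened placeholders for what `_prepare_system_prompt` returns.

```python
# Placeholder prompt texts; the real ones come from
# AIProvider._prepare_system_prompt in this commit.
SYSTEM_PROMPTS = {
    "describe": "Describe the screen for a screen reader user.",
    "question": "Answer the user's question about the screen.",
    "action": "Suggest steps to accomplish the user's request.",
}


def build_claude_payload(model, prompt, task_type, screenshot=None, max_tokens=1000):
    """Build a Messages API payload whose system prompt matches the task."""
    content = [{"type": "text", "text": prompt}]
    if screenshot:
        # Same image block layout the commit already uses.
        content.append({
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": f"image/{screenshot['format']}",
                "data": screenshot["data"],
            },
        })
    return {
        "model": model,
        "max_tokens": max_tokens,
        # Fall back to the describe prompt for unknown task types.
        "system": SYSTEM_PROMPTS.get(task_type, SYSTEM_PROMPTS["describe"]),
        "messages": [{"role": "user", "content": content}],
    }
```

In the provider itself this would likely mean giving `_make_request` a `task_type` parameter and having `describe_screen`, `answer_question`, and `suggest_actions` pass "describe", "question", and "action" respectively.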
+class OllamaProvider(AIProvider):
+    """Ollama local AI provider."""
+    
+    def __init__(self, model="llama3.2-vision", base_url="http://localhost:11434", **kwargs):
+        super().__init__(model=model, **kwargs)
+        self.base_url = base_url
+        
+    def describe_screen(self, screenshot_data, accessibility_data):
+        """Generate a description using Ollama."""
+        try:
+            prompt = self._build_prompt("describe", None, accessibility_data)
+            return self._make_request(prompt, screenshot_data)
+        except Exception as e:
+            logger.error(f"Ollama describe error: {e}")
+            return f"Error getting screen description: {e}"
+            
+    def answer_question(self, question, screenshot_data, accessibility_data):
+        """Answer a question using Ollama."""
+        try:
+            prompt = self._build_prompt("question", question, accessibility_data)
+            return self._make_request(prompt, screenshot_data)
+        except Exception as e:
+            logger.error(f"Ollama question error: {e}")
+            return f"Error answering question: {e}"
+            
+    def suggest_actions(self, request, screenshot_data, accessibility_data):
+        """Suggest actions using Ollama."""
+        try:
+            prompt = self._build_prompt("action", request, accessibility_data)
+            return self._make_request(prompt, screenshot_data)
+        except Exception as e:
+            logger.error(f"Ollama action error: {e}")
+            return f"Error suggesting actions: {e}"
+            
+    def _build_prompt(self, task_type, user_input, accessibility_data):
+        """Build the complete prompt for Ollama."""
+        system_prompt = self._prepare_system_prompt(task_type)
+        
+        prompt = f"{system_prompt}\n\nCurrent accessibility information:\n```json\n{json.dumps(accessibility_data, indent=2)}\n```\n\n"
+        
+        if task_type == "describe":
+            prompt += "Please describe what's on this screen."
+        elif task_type == "question":
+            prompt += f"User question: {user_input}"
+        elif task_type == "action":
+            prompt += f"User wants to: {user_input}\n\nPlease suggest specific steps to accomplish this."
+            
+        return prompt
+        
+    def _make_request(self, prompt, screenshot_data):
+        """Make request to Ollama API."""
+        try:
+            # For Ollama, we'll use the generate endpoint
+            payload = {
+                "model": self.model,
+                "prompt": prompt,
+                "stream": False
+            }
+            
+            # Note: Ollama vision support varies by model
+            # For now, we'll send text-only requests
+            # TODO: Add image support when Ollama vision models are more stable
+            
+            response = requests.post(
+                f"{self.base_url}/api/generate",
+                json=payload,
+                timeout=60  # Ollama can be slower
+            )
+            
+            if response.status_code == 200:
+                result = response.json()
+                return result.get('response', 'No response from Ollama')
+            else:
+                error_msg = f"Ollama API error {response.status_code}: {response.text}"
+                logger.error(error_msg)
+                return error_msg
+                
+        except requests.RequestException as e:
+            error_msg = f"Network error contacting Ollama: {e}"
+            logger.error(error_msg)
+            return error_msg
+        except Exception as e:
+            error_msg = f"Unexpected error with Ollama API: {e}"
+            logger.error(error_msg)
+            return error_msg
+
+
+def create_provider(provider_type, **kwargs):
+    """Factory function to create AI providers."""
+    if provider_type == "claude":
+        return ClaudeProvider(**kwargs)
+    elif provider_type == "ollama":
+        return OllamaProvider(**kwargs)
+    else:
+        raise ValueError(f"Unknown provider type: {provider_type}")
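Review note on the TODO in `OllamaProvider._make_request`: Ollama's `/api/generate` endpoint accepts an optional `images` list of base64-encoded images for multimodal models, so the screenshot captured by the plugin could probably be forwarded as is. A minimal sketch, assuming the screenshot dictionary has the `data` key produced by `_capture_screenshot` and that the configured model supports vision; `build_ollama_payload` is a hypothetical helper name.

```python
def build_ollama_payload(model, prompt, screenshot=None):
    """Build a non-streaming /api/generate payload, attaching an image if given."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
    }
    # screenshot["data"] is expected to be a base64 string without a data: prefix.
    if screenshot and screenshot.get("data"):
        payload["images"] = [screenshot["data"]]
    return payload
```

Text-only models would still receive the accessibility data through the prompt, so calling this without a screenshot keeps the current behavior.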
@@ -0,0 +1,8 @@
+name = AI Assistant
+version = 1.0.0
+description = AI-powered accessibility assistant for analyzing screens and taking actions
+authors = Stormux <storm_dragon@stormux.org>
+website = https://stormux.org
+copyright = Copyright 2025
+builtin = false
+hidden = false
@@ -0,0 +1,727 @@
+#!/usr/bin/env python3
+#
+# Copyright (c) 2025 Stormux
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+
+"""AI Assistant plugin for Cthulhu screen reader."""
+
+import logging
+import os
+import json
+import base64
+from io import BytesIO
+
+import gi
+gi.require_version('Gdk', '3.0')
+gi.require_version('GdkPixbuf', '2.0')
+gi.require_version('Atspi', '2.0')
+gi.require_version('Gtk', '3.0')
+from gi.repository import Gdk, GdkPixbuf, Atspi, Gtk
+
+from cthulhu.plugin import Plugin, cthulhu_hookimpl
+from cthulhu import settings
+from cthulhu import settings_manager
+from cthulhu import cthulhu_state
+from cthulhu import ax_object
+from cthulhu import ax_utilities
+from cthulhu.ax_utilities_state import AXUtilitiesState
+from cthulhu.plugins.AIAssistant.ai_providers import create_provider
+
+logger = logging.getLogger(__name__)
+
+class AIAssistant(Plugin):
+    """AI-powered accessibility assistant plugin.
+    
+    Provides AI-enhanced accessibility features including:
+    - Screen analysis using screenshots and AT-SPI data
+    - Natural language queries about UI elements
+    - Safe action assistance with user confirmation
+    - Multi-provider AI support (Claude, ChatGPT, Gemini, Ollama)
+    """
+    
+    def __init__(self, *args, **kwargs):
+        """Initialize the AI Assistant plugin."""
+        super().__init__(*args, **kwargs)
+        logger.info("AI Assistant plugin initialized")
+        print("DEBUG: AI Assistant plugin __init__ called")
+        
+        # Write to a debug file so we can see if the plugin is being loaded
+        try:
+            with open('/tmp/ai_assistant_debug.log', 'a') as f:
+                f.write("AI Assistant plugin __init__ called\n")
+        except OSError:
+            pass
+        
+        # Keybinding storage
+        self._kb_binding_activate = None
+        self._kb_binding_question = None
+        self._kb_binding_describe = None
+        
+        # AI provider and settings
+        self._provider_type = None
+        self._ai_provider = None
+        self._api_key = None
+        self._ollama_model = None
+        self._settings_manager = settings_manager.getManager()
+        
+        # Plugin enabled state
+        self._enabled = False
+        
+        # Pre-captured screen data (to avoid capturing dialog itself)
+        self._current_screen_data = None
+        
+    @cthulhu_hookimpl
+    def activate(self, plugin=None):
+        """Activate the AI Assistant plugin."""
+        if plugin is not None and plugin is not self:
+            return
+            
+        try:
+            logger.info("=== AI Assistant plugin activation starting ===")
+            print("DEBUG: AI Assistant plugin activation starting")
+            
+            # Check if AI Assistant is enabled in settings
+            enabled = self._settings_manager.getSetting('aiAssistantEnabled')
+            print(f"DEBUG: AI Assistant enabled setting: {enabled}")
+            if not enabled:
+                logger.info("AI Assistant is disabled in settings, skipping activation")
+                print("DEBUG: AI Assistant is disabled in settings, skipping activation")
+                return
+                
+            # Load AI settings
+            self._load_ai_settings()
+            
+            # Check if we have valid configuration
+            if not self._validate_configuration():
+                logger.warning("AI Assistant configuration invalid, skipping activation")
+                return
+                
+            # Initialize AI provider
+            self._initialize_ai_provider()
+                
+            # Register keybindings only if configuration is valid
+            self._register_keybindings()
+            
+            self._enabled = True
+            logger.info("AI Assistant plugin activated successfully")
+            print("DEBUG: AI Assistant plugin activated successfully")
+            
+        except Exception as e:
+            logger.error(f"Error activating AI Assistant plugin: {e}")
+            import traceback
+            logger.error(traceback.format_exc())
+    
+    @cthulhu_hookimpl
+    def deactivate(self, plugin=None):
+        """Deactivate the AI Assistant plugin."""
+        if plugin is not None and plugin is not self:
+            return
+            
+        logger.info("Deactivating AI Assistant plugin")
+        
+        # Unregister keybindings
+        self._unregister_keybindings()
+        
+        self._enabled = False
+        
+    def _load_ai_settings(self):
+        """Load AI Assistant settings from Cthulhu configuration."""
+        try:
+            # Get provider
+            provider = self._settings_manager.getSetting('aiProvider')
+            self._provider_type = provider or settings.AI_PROVIDER_CLAUDE
+            
+            # Load API key from file
+            api_key_file = self._settings_manager.getSetting('aiApiKeyFile')
+            if api_key_file and os.path.isfile(api_key_file):
+                with open(api_key_file, 'r') as f:
+                    self._api_key = f.read().strip()
+            else:
+                self._api_key = None
+                
+            # Load Ollama model
+            self._ollama_model = self._settings_manager.getSetting('aiOllamaModel')
+            if not self._ollama_model:
+                self._ollama_model = settings.aiOllamaModel
+                
+            logger.info(f"AI settings loaded: provider={self._provider_type}, "
+                       f"api_key_configured={bool(self._api_key)}, "
+                       f"ollama_model={self._ollama_model}")
+                       
+        except Exception as e:
+            logger.error(f"Error loading AI settings: {e}")
+            
+    def _validate_configuration(self):
+        """Validate AI Assistant configuration."""
+        if not self._provider_type:
+            logger.warning("No AI provider configured")
+            return False
+            
+        # Ollama doesn't need an API key
+        if self._provider_type == settings.AI_PROVIDER_OLLAMA:
+            return self._check_ollama_availability()
+            
+        # Other providers need API keys
+        if not self._api_key:
+            logger.warning(f"No API key configured for provider {self._provider_type}")
+            return False
+            
+        return True
+        
+    def _check_ollama_availability(self):
+        """Check whether the Ollama service is reachable."""
+        try:
+            import requests
+            # Check if Ollama is running
+            response = requests.get('http://localhost:11434/api/version', timeout=5)
+            if response.status_code == 200:
+                logger.info("Ollama service is available")
+                return True
+            else:
+                logger.warning("Ollama service not responding")
+                return False
+        except Exception as e:
+            logger.warning(f"Ollama not available: {e}")
+            return False
+            
+    def _initialize_ai_provider(self):
+        """Initialize the AI provider based on settings."""
+        try:
+            if self._provider_type == settings.AI_PROVIDER_CLAUDE:
+                self._ai_provider = create_provider("claude", api_key=self._api_key)
+            elif self._provider_type == settings.AI_PROVIDER_OLLAMA:
+                self._ai_provider = create_provider("ollama", model=self._ollama_model)
+            else:
+                logger.error(f"Unsupported provider type: {self._provider_type}")
+                return False
+                
+            logger.info(f"AI provider initialized: {self._provider_type}")
+            return True
+            
+        except Exception as e:
+            logger.error(f"Error initializing AI provider: {e}")
+            return False
+            
+    def _register_keybindings(self):
+        """Register AI Assistant keybindings."""
+        try:
+            # Main AI Assistant activation - avoid conflict with Actions
+            self._kb_binding_activate = self.registerGestureByString(
+                self._handle_ai_activate,
+                "Activate AI Assistant",
+                'kb:cthulhu+control+shift+a'
+            )
+            
+            # Ask question about current focus
+            self._kb_binding_question = self.registerGestureByString(
+                self._handle_ai_question,
+                "Ask AI about current focus",
+                'kb:cthulhu+control+shift+q'
+            )
+            
+            # Describe current screen
+            self._kb_binding_describe = self.registerGestureByString(
+                self._handle_ai_describe,
+                "AI describe current screen",
+                'kb:cthulhu+control+shift+d'
+            )
+            
+            logger.info("AI Assistant keybindings registered")
+            print(f"DEBUG: AI Assistant keybindings registered - activate: {self._kb_binding_activate}, question: {self._kb_binding_question}, describe: {self._kb_binding_describe}")
+            
+        except Exception as e:
+            logger.error(f"Error registering AI keybindings: {e}")
+            
+    def _unregister_keybindings(self):
+        """Unregister AI Assistant keybindings."""
+        # Keybindings are automatically cleaned up when plugin deactivates
+        self._kb_binding_activate = None
+        self._kb_binding_question = None
+        self._kb_binding_describe = None
+        
+    def _handle_ai_activate(self, script=None, inputEvent=None):
+        """Handle main AI Assistant activation."""
+        try:
+            logger.info("AI Assistant activation requested")
+            print("DEBUG: AI Assistant activation keybinding triggered!")
+            
+            if not self._enabled:
+                print("DEBUG: AI Assistant not enabled, presenting message")
+                self._present_message("AI Assistant is not enabled")
+                return True
+                
+            # For now, just show status until Phase 5 adds the action interface
+            if self._ai_provider:
+                provider_name = self._provider_type.title()
+                self._present_message(f"AI Assistant ready using {provider_name}. Press Cthulhu+Control+Shift+D to describe the screen, or Cthulhu+Control+Shift+Q to ask a question.")
+            else:
+                self._present_message("AI Assistant not properly configured. Check settings.")
+                
+            return True
+            
+        except Exception as e:
+            logger.error(f"Error in AI activate handler: {e}")
+            return False
+            
+    def _handle_ai_question(self, script=None, inputEvent=None):
+        """Handle AI question request."""
+        try:
+            logger.info("AI question requested")
+            
+            if not self._enabled:
+                self._present_message("AI Assistant is not enabled")
+                return True
+                
+            if not self._ai_provider:
+                self._present_message("AI provider not available. Check configuration.")
+                return True
+                
+            # IMPORTANT: Collect screen data BEFORE opening dialog
+            # This captures the actual window the user is asking about
+            self._present_message("AI Assistant capturing screen data...")
+            self._current_screen_data = self._collect_ai_data()
+            
+            if not self._current_screen_data:
+                self._present_message("Could not collect screen data for analysis")
+                return True
+                
+            # Now show question dialog
+            self._show_question_dialog()
+                
+            return True
+            
+        except Exception as e:
+            logger.error(f"Error in AI question handler: {e}")
+            return False
+            
+    def _handle_ai_describe(self, script=None, inputEvent=None):
+        """Handle AI screen description request."""
+        try:
+            logger.info("AI screen description requested")
+            
+            if not self._enabled:
+                self._present_message("AI Assistant is not enabled")
+                return True
+                
+            # Use AI to describe the current screen
+            if not self._ai_provider:
+                self._present_message("AI provider not available. Check configuration.")
+                return True
+                
+            self._present_message("AI Assistant analyzing screen...")
+            
+            # Collect data and get AI description
+            data = self._collect_ai_data()
+            if data:
+                try:
+                    response = self._ai_provider.describe_screen(
+                        data.get('screenshot'), 
+                        data.get('accessibility')
+                    )
+                    self._present_message(response)
+                except Exception as e:
+                    logger.error(f"Error getting AI screen description: {e}")
+                    self._present_message(f"Error getting AI screen description: {e}")
+            else:
+                self._present_message("Could not collect screen data for analysis")
+                
+            return True
+            
+        except Exception as e:
+            logger.error(f"Error in AI describe handler: {e}")
+            return False
+            
+    def _present_message(self, message):
+        """Present a message to the user via speech."""
+        try:
+            if self.app:
+                state = self.app.getDynamicApiManager().getAPI('CthulhuState')
+                if state and state.activeScript:
+                    state.activeScript.presentMessage(message, resetStyles=False)
+                else:
+                    logger.warning("No active script available for message presentation")
+            else:
+                logger.warning("No app reference available for message presentation")
+        except Exception as e:
+            logger.error(f"Error presenting message: {e}")
+            
+    def _capture_screenshot(self):
+        """Capture a screenshot of the current display."""
+        try:
+            # Get the default display and root window
+            display = Gdk.Display.get_default()
+            if not display:
+                logger.error("No display available for screenshot")
+                return None
+                
+            screen = display.get_default_screen()
+            root_window = screen.get_root_window()
+            
+            # Get screen dimensions
+            width = screen.get_width()
+            height = screen.get_height()
+            
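+            # Capture the screenshot by reading the root window (this relies on X11;
+            # Wayland compositors generally do not expose root window contents)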
+            pixbuf = Gdk.pixbuf_get_from_window(root_window, 0, 0, width, height)
+            
+            if not pixbuf:
+                logger.error("Failed to capture screenshot")
+                return None
+                
+            # Convert to base64 for AI transmission
+            success, buffer = pixbuf.save_to_bufferv("png", [], [])
+            if not success:
+                logger.error("Failed to save pixbuf to buffer")
+                return None
+            image_data = base64.b64encode(buffer).decode('utf-8')
+            
+            logger.info(f"Screenshot captured: {width}x{height}")
+            return {
+                'format': 'png',
+                'width': width,
+                'height': height,
+                'data': image_data
+            }
+            
+        except Exception as e:
+            logger.error(f"Error capturing screenshot: {e}")
+            return None
+            
+    def _get_accessibility_tree(self):
+        """Get accessibility tree information for the current focus."""
+        try:
+            # Get the current focus object
+            focus_obj = cthulhu_state.locusOfFocus
+            if not focus_obj:
+                logger.warning("No focus object available")
+                return None
+                
+            # Collect accessibility information
+            tree_data = {
+                'focus': self._serialize_ax_object(focus_obj),
+                'context': []
+            }
+            
+            # Get parent context (up to 3 levels)
+            parent = ax_object.AXObject.get_parent(focus_obj)
+            level = 0
+            while parent and level < 3:
+                tree_data['context'].append(self._serialize_ax_object(parent))
+                parent = ax_object.AXObject.get_parent(parent)
+                level += 1
+                
+            # Get children of focus (if any) 
+            child_count = ax_object.AXObject.get_child_count(focus_obj)
+            if child_count > 0:
+                children = []
+                for i in range(min(child_count, 10)):  # Limit to first 10
+                    child = ax_object.AXObject.get_child(focus_obj, i)
+                    if child:
+                        children.append(self._serialize_ax_object(child))
+                if children:
+                    tree_data['children'] = children
+                
+            logger.info(f"Accessibility tree collected for {ax_object.AXObject.get_name(focus_obj) or 'unnamed object'}")
+            return tree_data
+            
+        except Exception as e:
+            logger.error(f"Error getting accessibility tree: {e}")
+            return None
+            
+    def _serialize_ax_object(self, obj):
+        """Serialize an accessibility object to JSON-compatible format."""
+        try:
+            if not obj:
+                return None
+                
+            return {
+                'name': ax_object.AXObject.get_name(obj) or '',
+                'role': ax_object.AXObject.get_role_name(obj) or '',
+                'description': ax_object.AXObject.get_description(obj) or '',
+                'text': self._get_object_text(obj),
+                'value': self._get_object_value(obj),
+                'states': self._get_object_states(obj),
+                'attributes': self._get_object_attributes(obj),
+                'position': self._get_object_position(obj)
+            }
+            
+        except Exception as e:
+            logger.error(f"Error serializing accessibility object: {e}")
+            return None
+            
+    def _get_object_text(self, obj):
+        """Get text content from an accessibility object."""
+        try:
+            # Use script utilities to get displayed text if available
+            if cthulhu_state.activeScript and hasattr(cthulhu_state.activeScript, 'utilities'):
+                try:
+                    text = cthulhu_state.activeScript.utilities.displayedText(obj)
+                    if text:
+                        return text.strip()
+                except Exception:
+                    pass
+                    
+            # Fallback: try direct AT-SPI text interface
+            try:
+                if ax_object.AXObject.supports_text(obj):
+                    text_iface = obj.queryText()
+                    if text_iface:
+                        text = text_iface.getText(0, -1)
+                        if text:
+                            return text.strip()
+            except Exception:
+                pass
+                
+            return ""
+            
+        except Exception as e:
+            logger.error(f"Error getting object text: {e}")
+            return ""
+            
+    def _get_object_value(self, obj):
+        """Get value from an accessibility object."""
+        try:
+            if ax_object.AXObject.supports_value(obj):
+                try:
+                    value_iface = obj.queryValue()
+                    if value_iface:
+                        return str(value_iface.currentValue)
+                except Exception:
+                    pass
+            return ""
+            
+        except Exception as e:
+            logger.error(f"Error getting object value: {e}")
+            return ""
+            
+    def _get_object_states(self, obj):
+        """Get state information from an accessibility object."""
+        try:
+            states = []
+            if AXUtilitiesState.is_focused(obj):
+                states.append("focused")
+            if AXUtilitiesState.is_selected(obj):
+                states.append("selected")
+            if AXUtilitiesState.is_expanded(obj):
+                states.append("expanded")
+            if AXUtilitiesState.is_checked(obj):
+                states.append("checked")
+            if AXUtilitiesState.is_sensitive(obj):
+                states.append("sensitive")
+            if AXUtilitiesState.is_showing(obj):
+                states.append("showing")
+            if AXUtilitiesState.is_visible(obj):
+                states.append("visible")
+                
+            return states
+            
+        except Exception as e:
+            logger.error(f"Error getting object states: {e}")
+            return []
+            
+    def _get_object_attributes(self, obj):
+        """Get attributes from an accessibility object."""
+        try:
+            attrs = {}
+            
+            # Get object attributes from AT-SPI
+            try:
+                if hasattr(obj, 'get_attributes'):
+                    obj_attrs = obj.get_attributes()
+                    if obj_attrs:
+                        attrs['object_attributes'] = dict(obj_attrs)
+            except Exception:
+                pass
+                
+            return attrs
+            
+        except Exception as e:
+            logger.error(f"Error getting object attributes: {e}")
+            return {}
+            
+    def _get_object_position(self, obj):
+        """Get position and size information from an accessibility object."""
+        try:
+            if hasattr(obj, 'queryComponent'):
+                component = obj.queryComponent()
+                if component:
+                    extents = component.getExtents(Atspi.CoordType.SCREEN)
+                    return {
+                        'x': extents.x,
+                        'y': extents.y,
+                        'width': extents.width,
+                        'height': extents.height
+                    }
+            return None
+            
+        except Exception as e:
+            logger.error(f"Error getting object position: {e}")
+            return None
+            
+    def _collect_ai_data(self):
+        """Collect both screenshot and accessibility data for AI analysis."""
+        try:
+            logger.info("Collecting AI data (screenshot + accessibility tree)")
+            
+            # Collect both types of data
+            screenshot = self._capture_screenshot()
+            accessibility_tree = self._get_accessibility_tree()
+            
+            data = {
+                'timestamp': __import__('time').time(),
+                'screenshot': screenshot,
+                'accessibility': accessibility_tree
+            }
+            
+            # Add current application context
+            if cthulhu_state.activeScript:
+                app_name = getattr(cthulhu_state.activeScript, 'name', 'unknown')
+                data['application'] = app_name
+                
+            logger.info("AI data collection completed")
+            return data
+            
+        except Exception as e:
+            logger.error(f"Error collecting AI data: {e}")
+            return None
+            
+    def _show_question_dialog(self):
+        """Show a dialog for the user to enter their question."""
+        try:
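+            # Build the dialog from GObject properties and add_button();
+            # the flags=/buttons= keywords and Gtk.STOCK_* are deprecated in GTK 3.
+            dialog = Gtk.Dialog(title="AI Assistant Question", modal=True)
+            dialog.add_button("_Cancel", Gtk.ResponseType.CANCEL)
+            dialog.add_button("_OK", Gtk.ResponseType.OK)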
+            
+            dialog.set_default_size(500, 200)
+            
+            # Create the question entry
+            content_area = dialog.get_content_area()
+            
+            label = Gtk.Label(label="Enter your question about the current screen:")
+            label.set_halign(Gtk.Align.START)
+            content_area.pack_start(label, False, False, 10)
+            
+            entry = Gtk.Entry()
+            entry.set_placeholder_text("What would you like to know?")
+            entry.set_activates_default(True)
+            content_area.pack_start(entry, False, False, 10)
+            
+            dialog.set_default_response(Gtk.ResponseType.OK)
+            dialog.show_all()
+            
+            # Set focus to the entry
+            entry.grab_focus()
+            
+            response = dialog.run()
+            
+            if response == Gtk.ResponseType.OK:
+                question = entry.get_text().strip()
+                if question:
+                    # Transform dialog to show processing and response
+                    self._transform_dialog_for_response(dialog, question)
+                else:
+                    dialog.destroy()
+                    self._present_message("No question entered")
+            else:
+                dialog.destroy()
+                self._present_message("Question cancelled")
+                
+        except Exception as e:
+            logger.error(f"Error showing question dialog: {e}")
+            self._present_message(f"Error showing question dialog: {e}")
+            
+    def _transform_dialog_for_response(self, dialog, question):
+        """Transform the question dialog to show AI processing and response."""
+        try:
+            # Clear existing content
+            content_area = dialog.get_content_area()
+            for child in content_area.get_children():
+                content_area.remove(child)
+                
+            # Remove existing buttons
+            for child in dialog.get_action_area().get_children():
+                dialog.get_action_area().remove(child)
+                
+            # Change title
+            dialog.set_title("AI Assistant Response")
+            
+            # Show question and processing message
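+            # Escape the question so "<" or "&" are not parsed as Pango markup
+            from gi.repository import GLib
+            question_label = Gtk.Label()
+            question_label.set_markup(f"<b>Question:</b> {GLib.markup_escape_text(question, -1)}")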
+            question_label.set_line_wrap(True)
+            question_label.set_halign(Gtk.Align.START)
+            content_area.pack_start(question_label, False, False, 10)
+            
+            # Processing label (will be updated with response)
+            self._response_label = Gtk.Label(label="Processing your question...")
+            self._response_label.set_line_wrap(True)
+            self._response_label.set_halign(Gtk.Align.START)
+            self._response_label.set_selectable(True)  # Allow text selection
+            content_area.pack_start(self._response_label, True, True, 10)
+            
+            # Add close button
+            dialog.add_button(Gtk.STOCK_CLOSE, Gtk.ResponseType.CLOSE)
+            dialog.set_default_response(Gtk.ResponseType.CLOSE)
+            
+            # Resize for response content
+            dialog.set_default_size(600, 400)
+            dialog.show_all()
+            
+            # Focus the response label so screen reader announces it
+            self._response_label.grab_focus()
+            
+            # Process question asynchronously
+            self._process_user_question_async(dialog, question)
+            
+        except Exception as e:
+            logger.error(f"Error transforming dialog: {e}")
+            dialog.destroy()
+            self._present_message(f"Error showing response: {e}")
+            
+    def _process_user_question_async(self, dialog, question):
+        """Process the user's question and update the dialog with the response.
+
+        Note: despite the name, the provider call below runs synchronously.
+        """
+        from gi.repository import GLib
+
+        # Connect the close handler up front so the dialog can always be
+        # dismissed, including when the request fails.
+        def on_response(dialog, response_id):
+            dialog.destroy()
+
+        dialog.connect("response", on_response)
+
+        def escape(value):
+            # AI output and error text may contain "<" or "&", which would
+            # otherwise be parsed as Pango markup.
+            return GLib.markup_escape_text(str(value), -1)
+
+        try:
+            # Use the pre-captured screen data (captured before dialog opened)
+            data = self._current_screen_data
+            if data:
+                try:
+                    response = self._ai_provider.answer_question(
+                        question,
+                        data.get('screenshot'),
+                        data.get('accessibility')
+                    )
+
+                    # Update the response label
+                    self._response_label.set_markup(f"<b>Response:</b>\n{escape(response)}")
+
+                    # Also speak the response
+                    self._present_message(response)
+
+                except Exception as e:
+                    logger.error(f"Error getting AI response: {e}")
+                    self._response_label.set_markup(f"<b>Error:</b> {escape(e)}")
+                    self._present_message(f"Error getting AI response: {e}")
+            else:
+                self._response_label.set_markup("<b>Error:</b> No screen data available")
+                self._present_message("No screen data available")
+
+        except Exception as e:
+            logger.error(f"Error processing user question: {e}")
+            self._response_label.set_markup(f"<b>Error:</b> {escape(e)}")
+            self._present_message(f"Error processing question: {e}")
@@ -1,4 +1,4 @@
-SUBDIRS = Clipboard DisplayVersion IndentationAudio PluginManager hello_world self_voice ByeCthulhu HelloCthulhu SimplePluginSystem
+SUBDIRS = AIAssistant Clipboard DisplayVersion IndentationAudio PluginManager hello_world self_voice ByeCthulhu HelloCthulhu SimplePluginSystem

 cthulhu_pythondir=$(pkgpythondir)/plugins

@@ -148,7 +148,15 @@ userCustomizableSettings = [
    "sayAllContextLandmark",
    "sayAllContextNonLandmarkForm",
    "sayAllContextList",
-    "sayAllContextTable"
+    "sayAllContextTable",
+    "aiAssistantEnabled",
+    "aiProvider",
+    "aiApiKeyFile",
+    "aiOllamaModel",
+    "aiConfirmationRequired",
+    "aiActionTimeout",
+    "aiScreenshotQuality",
+    "aiMaxContextLength"
 ]

 GENERAL_KEYBOARD_LAYOUT_DESKTOP = 1
@@ -188,6 +196,16 @@ CHAT_SPEAK_ALL             = 0
 CHAT_SPEAK_ALL_IF_FOCUSED  = 1
 CHAT_SPEAK_FOCUSED_CHANNEL = 2

+# AI Assistant constants
+AI_PROVIDER_CLAUDE = "claude"
+AI_PROVIDER_CHATGPT = "chatgpt"
+AI_PROVIDER_GEMINI = "gemini"
+AI_PROVIDER_OLLAMA = "ollama"
+
+AI_SCREENSHOT_QUALITY_LOW = "low"
+AI_SCREENSHOT_QUALITY_MEDIUM = "medium"
+AI_SCREENSHOT_QUALITY_HIGH = "high"
+
 DEFAULT_VOICE           = "default"
 UPPERCASE_VOICE         = "uppercase"
 HYPERLINK_VOICE         = "hyperlink"
@@ -413,4 +431,14 @@ presentChatRoomLast = False
 presentLiveRegionFromInactiveTab = False

 # Plugins
-activePlugins = ['DisplayVersion', 'PluginManager', 'HelloCthulhu', 'ByeCthulhu']
+activePlugins = ['AIAssistant', 'DisplayVersion', 'PluginManager', 'HelloCthulhu', 'ByeCthulhu']
+
+# AI Assistant settings (disabled by default for opt-in behavior)
+aiAssistantEnabled = False
+aiProvider = AI_PROVIDER_CLAUDE
+aiApiKeyFile = ""
+aiOllamaModel = "llama3.2-vision"
+aiConfirmationRequired = True
+aiActionTimeout = 30
+aiScreenshotQuality = AI_SCREENSHOT_QUALITY_MEDIUM
+aiMaxContextLength = 4000