Add multi-provider support with OpenAI integration #11

Open · wants to merge 1 commit into `main`
75 changes: 75 additions & 0 deletions PROVIDER_DEBUGGING.md
@@ -0,0 +1,75 @@
# Grunty AI Multi-Provider Debugging Report

## Issue Summary
The Grunty AI application was experiencing problems with its multi-provider support, particularly when switching between Anthropic and OpenAI providers. The issues included:

1. Missing error handling during provider switching
2. Lack of proper error feedback to users
3. Initialization issues with the OpenAI provider
4. Missing log functionality in the UI

## Implemented Fixes

### 1. Enhanced Error Logging
- Added detailed logging throughout the application with file names and line numbers (a minimal setup is sketched below)
- Added console logging for immediate feedback during development
- Added stack trace logging for better debugging
- Improved log formatting for better readability
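
As a rough illustration (not the project's actual code), a logging setup along these lines puts file names and line numbers in every record and mirrors output to the console during development:

```python
import logging
import sys

def configure_logging(level: int = logging.DEBUG) -> None:
    """Attach a console handler whose format includes the file name and line number."""
    formatter = logging.Formatter(
        "%(asctime)s [%(levelname)s] %(filename)s:%(lineno)d - %(message)s"
    )
    console = logging.StreamHandler(sys.stdout)
    console.setFormatter(formatter)
    root = logging.getLogger()
    root.setLevel(level)
    root.addHandler(console)

# Stack traces can then be captured at call sites with logger.exception(...)
# or logger.error(..., exc_info=True).
```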

### 2. Improved Provider Initialization
- Added proper initialization checks in the OpenAI provider
- Added verification of API key availability
- Added an API test call during initialization to verify connectivity (sketched below)
- Better error handling during provider creation and initialization
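
The sketch below shows roughly what such an initialization check could look like; the class name, the `models.list()` test call, and the environment-variable fallback are assumptions, not the exact code in `src/openai_provider.py`:

```python
import os
import logging
from typing import Optional

logger = logging.getLogger(__name__)

class OpenAIProviderSketch:
    """Illustrative only -- not the actual OpenAIProvider shipped in this PR."""

    def __init__(self, api_key: Optional[str] = None):
        self.api_key = api_key or os.getenv("OPENAI_API_KEY")
        self.client = None

    def initialize(self) -> bool:
        # Verify API key availability before creating a client.
        if not self.api_key:
            logger.error("OPENAI_API_KEY is not set")
            return False
        try:
            from openai import OpenAI
            self.client = OpenAI(api_key=self.api_key)
            # Cheap test call to verify connectivity and credentials.
            self.client.models.list()
            return True
        except Exception as exc:
            logger.error("OpenAI initialization failed: %s", exc)
            return False
```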

### 3. Enhanced Provider Switching
- Added more robust provider switching logic in the store (illustrated below)
- Only recreate provider instances when necessary
- Proper error handling and recovery during provider switching
- Added user feedback through error dialogs when provider switching fails
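
A hypothetical sketch of this switching logic; the store class, its attribute names, and the error callback are made up for illustration, and the import assumes the repository root is on `sys.path`. Only `AIProviderManager` comes from this PR:

```python
from typing import Callable, Optional

from src.ai_providers import AIProvider, AIProviderManager

class ProviderStoreSketch:
    """Illustrative only -- the real store lives in the application code."""

    def __init__(self) -> None:
        self.provider_name: Optional[str] = None
        self.provider: Optional[AIProvider] = None

    def switch_provider(self, name: str, show_error: Callable[[str], None] = print) -> bool:
        # Reuse the current instance if the provider has not actually changed.
        if name == self.provider_name and self.provider is not None:
            return True
        new_provider = AIProviderManager.create_provider(name)
        if new_provider is None:
            # Keep the previous provider so the app stays usable, and surface
            # the failure to the user (e.g. through an error dialog).
            show_error(f"Failed to switch to provider '{name}'")
            return False
        self.provider, self.provider_name = new_provider, name
        return True
```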

### 4. OpenAI Provider Improvements
- Implemented proper computer control support
- Fixed message handling for the OpenAI API responses
- Added robust error handling for tool calls (see the tool-call sketch below)
- Improved response handling for different message formats
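
For the tool-call handling, a hedged sketch using the openai Python client's Chat Completions tool-calling interface; the `computer_action` tool schema and the model choice are placeholders, not the schema this PR ships:

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical "computer control" tool schema -- a placeholder for illustration.
tools = [{
    "type": "function",
    "function": {
        "name": "computer_action",
        "description": "Perform a mouse or keyboard action on the user's computer.",
        "parameters": {
            "type": "object",
            "properties": {
                "action": {"type": "string"},
                "x": {"type": "integer"},
                "y": {"type": "integer"},
                "text": {"type": "string"},
            },
            "required": ["action"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Open the browser"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:
    for call in message.tool_calls:
        # Guard against malformed arguments instead of crashing the agent loop.
        try:
            args = json.loads(call.function.arguments)
        except json.JSONDecodeError:
            args = {}
        print(call.function.name, args)
else:
    # Plain text reply with no tool call.
    print(message.content)
```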

### 5. UI Improvements
- Added the missing `log` method to the MainWindow class (a minimal version is sketched below)
- Improved error message display in the UI
- Added better user feedback during provider operations
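
A minimal sketch of what the added `log` method could look like with PyQt6; the `log_view` widget name and layout are assumptions, not the app's actual UI code:

```python
from PyQt6.QtWidgets import QMainWindow, QPlainTextEdit

class MainWindowSketch(QMainWindow):
    """Illustrative only -- the real MainWindow is defined in the app's UI module."""

    def __init__(self) -> None:
        super().__init__()
        self.log_view = QPlainTextEdit(self)  # hypothetical read-only log pane
        self.log_view.setReadOnly(True)
        self.setCentralWidget(self.log_view)

    def log(self, message: str) -> None:
        # Append a line to the in-app log so provider errors stay visible to the user.
        self.log_view.appendPlainText(message)
```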

### 6. Dependency Management
- Better handling of optional dependencies
- Clear error messages when required packages are missing
- Graceful degradation when non-essential packages are unavailable (example below)
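
One way to implement this, sketched here with the optional voice packages from `requirements.txt` as the example; the `optional_import` helper is hypothetical:

```python
import importlib
import logging

logger = logging.getLogger(__name__)

def optional_import(module_name: str, feature: str):
    """Return the module if available, otherwise log a warning and return None."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        logger.warning(
            "Optional package '%s' is not installed; %s will be disabled. "
            "Install it with: pip install %s", module_name, feature, module_name
        )
        return None

# Example: voice control degrades gracefully when speech packages are missing.
speech_recognition = optional_import("speech_recognition", "voice input")
pyttsx3 = optional_import("pyttsx3", "text-to-speech")
VOICE_AVAILABLE = speech_recognition is not None and pyttsx3 is not None
```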

## Configuration
The application requires proper configuration in a `.env` file:

```
ANTHROPIC_API_KEY=your_anthropic_key
OPENAI_API_KEY=your_openai_key
DEFAULT_AI_PROVIDER=anthropic
```
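
These values can be loaded with `python-dotenv`, which is already a dependency; a minimal sketch:

```python
import os

from dotenv import load_dotenv  # python-dotenv is already listed in requirements.txt

load_dotenv()  # reads the .env file from the current working directory

anthropic_key = os.getenv("ANTHROPIC_API_KEY")
openai_key = os.getenv("OPENAI_API_KEY")
# Falling back to "anthropic" here is an assumption, not necessarily the app's behavior.
default_provider = os.getenv("DEFAULT_AI_PROVIDER", "anthropic")
```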

## Testing

A new test script `test_providers.py` has been created to validate the provider functionality independently of the main application. This script tests:
- Anthropic provider creation and initialization
- OpenAI provider creation and initialization
- Provider manager functionality

All tests are passing, confirming that both providers are working correctly.
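
The test script itself is not included in this diff, so the outline below is only a guess at its shape based on the description above; it assumes valid API keys in `.env` and that the repository root is on `sys.path`:

```python
# test_providers.py -- hypothetical outline, not the script included in the PR
from dotenv import load_dotenv

from src.ai_providers import AIProviderManager

load_dotenv()

def test_provider(name: str) -> bool:
    provider = AIProviderManager.create_provider(name)
    if provider is None:
        print(f"[FAIL] {name}: creation or initialization failed")
        return False
    print(f"[OK] {name}: created and initialized")
    return True

if __name__ == "__main__":
    names = AIProviderManager.get_provider_names()
    assert "anthropic" in names and "openai" in names
    results = [test_provider(name) for name in names]
    print("All tests passed" if all(results) else "Some tests failed")
```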

## Recommendations for Future Work

1. **Comprehensive Error Handling**: Add more specific error checks for different API errors
2. **Provider Configuration UI**: Add a dedicated settings page for provider configuration
3. **API Key Management**: Implement secure storage and management of API keys
4. **Automated Testing**: Expand the test coverage to include more complex scenarios
5. **New Providers**: Create a template for adding new AI providers easily

## Conclusion

The multi-provider support in Grunty AI is now working correctly. Users can switch between Anthropic and OpenAI providers with proper error handling and feedback. The application is more robust and user-friendly.
33 changes: 28 additions & 5 deletions README.md
@@ -1,6 +1,6 @@
# 👨🏽‍💻 Grunty

Self-hosted desktop app to have AI control your computer, powered by the new Claude [computer use](https://www.anthropic.com/news/3-5-models-and-computer-use) capability. Allow Claude to take over your laptop and do your tasks for you (or at least attempt to, lol). Written in Python, using PyQt.
Self-hosted desktop app to have AI control your computer, powered by the Claude [computer use](https://www.anthropic.com/news/3-5-models-and-computer-use) capability and OpenAI's GPT models. Allow AI to take over your laptop and do your tasks for you (or at least attempt to, lol). Written in Python, using PyQt.

## Demo
Here, I asked it to use [vim](https://vim.rtorr.com/) to create a game in Python, run it, and play it.
@@ -15,31 +15,51 @@ Video was sped up 8x btw. [Computer use](https://www.anthropic.com/news/3-5-models-and-computer-use)

2. **Tread Lightly** - If it wipes your computer, sends weird emails, or orders 100 pizzas... that's on you.

Anthropic can see your screen through screenshots during actions. Hide sensitive information or private stuff.
AI providers can see your screen through screenshots during actions. Hide sensitive information or private stuff.

## 🎯 Features
- Literally ask AI to do ANYTHING on your computer that you do with a mouse and keyboard. Browse the web, write code, blah blah.
- **Multiple AI providers support**: Switch between Anthropic Claude and OpenAI models
- **Model selection**: Choose from various models for each provider
- **Theme toggling**: Light/Dark mode support
- **System tray integration**: Minimize to tray and run in background
- **Optional voice control**: Experimental voice input and text-to-speech support

# 💻 Platforms
- Anything you can run Python on: MacOS, Windows, Linux, etc.

## 🛠️ Setup

Get an Anthropic API key [here]([https://console.anthropic.com/keys](https://console.anthropic.com/dashboard)).
Get an Anthropic API key [here](https://console.anthropic.com/dashboard) and/or an OpenAI API key [here](https://platform.openai.com/api-keys).

```bash
# Python 3.10+ recommended
python -m venv venv
source venv/bin/activate # or `venv\Scripts\activate` on Windows
pip install -r requirements.txt

# Add API key to .env
# Add API keys to .env
echo "ANTHROPIC_API_KEY=your-key-here" > .env
echo "OPENAI_API_KEY=your-key-here" >> .env
echo "DEFAULT_AI_PROVIDER=anthropic" >> .env # or "openai"

# Run
python run.py
```

## 🧠 Supported AI Providers and Models

### Anthropic
- Claude 3.5 Sonnet
- Claude 3 Opus
- Claude 3 Sonnet
- Claude 3 Haiku

### OpenAI
- GPT-4o
- GPT-4 Turbo
- GPT-4

## 🔑 Productivity Keybindings
- `Ctrl + Enter`: Execute the current instruction
- `Ctrl + C`: Stop the current agent action
@@ -50,10 +70,13 @@ python run.py
- Claude really loves Firefox. You might want to install it for better UI detection and accurate mouse clicks.
- Be specific and explicit, help it out a bit
- Always monitor the agent's actions
- Different models have different computer-control capabilities; experiment to find the one that works best for your tasks

## 🐛 Known Issues

- Sometimes, it doesn't take a screenshot to validate that the input is selected, and types stuff in the wrong place.. Press CMD+C to end the action when this happens, and quit and restart the agent. I'm working on a fix.
- Sometimes, the AI doesn't take a screenshot to validate that the input is selected, and types stuff in the wrong place. Press CMD+C to end the action when this happens, and quit and restart the agent.
- Not all models support full computer control with the same level of capability
- Voice control is experimental and may not work reliably on all platforms

## 🤝 Contributing

18 changes: 12 additions & 6 deletions requirements.txt
@@ -1,12 +1,18 @@
# Core dependencies
PyQt6
pyautogui
requests
anthropic
python-dotenv
pillow
numpy
qtawesome
SpeechRecognition
pyttsx3
keyboard
pyaudio
requests

# AI Provider dependencies
anthropic>=0.15.0 # Required for Anthropic Claude
openai>=1.17.0 # Optional for OpenAI support

# Voice control dependencies (optional)
SpeechRecognition # Optional for voice input
pyttsx3 # Optional for text-to-speech
pyaudio # Optional for voice recording
keyboard # For keyboard shortcuts
172 changes: 172 additions & 0 deletions src/ai_providers.py
@@ -0,0 +1,172 @@
import os
import logging
from abc import ABC, abstractmethod
from typing import List, Dict, Any, Optional
from dotenv import load_dotenv

logger = logging.getLogger(__name__)

class AIProvider(ABC):
"""Base abstract class for AI providers that can control the computer."""

def __init__(self, api_key: Optional[str] = None):
self.api_key = api_key

@abstractmethod
def initialize(self) -> bool:
"""Initialize the client with API key and any needed setup.
Returns True if successful, False otherwise."""
pass

@abstractmethod
def get_next_action(self, run_history: List[Dict[str, Any]]) -> Any:
"""Get the next action from the AI based on the conversation history.

Args:
run_history: List of conversation messages.

Returns:
Response object from the AI provider.
"""
pass

@abstractmethod
def extract_action(self, response: Any) -> Dict[str, Any]:
"""Extract the action from the AI response.

Args:
response: Response object from the AI provider.

Returns:
Dict with the parsed action.
"""
pass

@abstractmethod
def display_assistant_message(self, message: Any, update_callback: callable) -> None:
"""Format and display the assistant's message.

Args:
message: The message from the assistant.
update_callback: Callback function to update the UI with the message.
"""
pass

@abstractmethod
def get_prompt_for_model(self, model_id: str) -> str:
"""Get the prompt formatted for the specific model.

Args:
model_id: The model ID to get the prompt for.

Returns:
Formatted prompt string.
"""
pass

@staticmethod
def get_available_models() -> List[Dict[str, str]]:
"""Get a list of available models for this provider.

Returns:
List of dictionaries with model information.
"""
return []

@staticmethod
def default_model() -> str:
"""Get the default model ID for this provider.

Returns:
Default model ID string.
"""
return ""

# Manager class to handle multiple AI providers
class AIProviderManager:
"""Manager for different AI provider integrations."""

PROVIDERS = {
"anthropic": "AnthropicProvider",
"openai": "OpenAIProvider"
# Add more providers here as they are implemented
}

@staticmethod
def get_provider_names() -> List[str]:
"""Get a list of available provider names.

Returns:
List of provider name strings.
"""
return list(AIProviderManager.PROVIDERS.keys())

@staticmethod
def create_provider(provider_name: str, **kwargs) -> Optional[AIProvider]:
"""Factory method to create an AI provider.

Args:
provider_name: Name of the provider to create.
**kwargs: Additional arguments to pass to the provider constructor.

Returns:
AIProvider instance or None if creation failed.
"""
logger.info(f"Creating AI provider: {provider_name} with kwargs: {kwargs}")

# Dynamically import providers without circular imports
if provider_name == "anthropic":
try:
from .anthropic_provider import AnthropicProvider
provider = AnthropicProvider(**kwargs)
success = provider.initialize()
if success:
logger.info(f"Successfully created and initialized AnthropicProvider")
return provider
else:
logger.error(f"Failed to initialize AnthropicProvider")
return None
except ImportError as e:
logger.error(f"Failed to import AnthropicProvider: {str(e)}")
return None
except Exception as e:
import traceback
logger.error(f"Error creating AnthropicProvider: {str(e)}\n{traceback.format_exc()}")
return None
elif provider_name == "openai":
try:
# First check if openai package is installed
try:
import openai
logger.info("OpenAI package found")
except ImportError as e:
logger.error(f"OpenAI package not installed: {str(e)}")
return None

# Then try to import our provider
from .openai_provider import OpenAIProvider
logger.info("Creating OpenAIProvider instance")
provider = OpenAIProvider(**kwargs)

# Initialize the provider
logger.info("Initializing OpenAIProvider")
success = provider.initialize()

if success:
logger.info("Successfully created and initialized OpenAIProvider")
return provider
else:
logger.error("Failed to initialize OpenAIProvider")
return None
except ImportError as e:
logger.error(f"Failed to import OpenAIProvider: {str(e)}")
return None
except Exception as e:
import traceback
logger.error(f"Error creating OpenAIProvider: {str(e)}\n{traceback.format_exc()}")
return None

# Add more provider imports here as they are implemented

logger.error(f"Unknown provider name: {provider_name}")
return None