Enhancement of documentation and improvement of error handling #62


Closed · wants to merge 3 commits
117 changes: 90 additions & 27 deletions README.md
@@ -72,48 +72,111 @@ This is a tutorial project of [Pocket Flow](https://github.com/The-Pocket/PocketFlow)

## 🚀 Getting Started

1. **Clone this repository**
```bash
git clone https://github.com/The-Pocket/Tutorial-Codebase-Knowledge.git
cd Tutorial-Codebase-Knowledge
```

2. **Set up a virtual environment (recommended)**
```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

3. **Install dependencies**
```bash
pip install -r requirements.txt
```
4. **Configure LLM access**

The tool supports multiple LLM providers. Configure at least one:

- **Google Gemini (default)**:
```bash
# For Vertex AI:
export GEMINI_PROJECT_ID="your-project-id"
export GEMINI_LOCATION="us-central1"
# OR for AI Studio:
export GEMINI_API_KEY="your-api-key"
```

- **Anthropic Claude**:
```bash
export ANTHROPIC_API_KEY="your-api-key"
# Uncomment Claude function in utils/call_llm.py
```

- **OpenAI**:
```bash
export OPENAI_API_KEY="your-api-key"
# Uncomment OpenAI function in utils/call_llm.py
```
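The steps above configure one provider via environment variables. As a rough sketch of how a tool might decide which provider is active (a hypothetical helper for illustration only — the actual `utils/call_llm.py` switches providers by commenting functions in and out), selection could look like:

```python
import os

def pick_provider(env=None):
    """Return the first configured provider, mirroring the order above.

    Hypothetical helper; checks environment variables in the same
    precedence as the list in this README.
    """
    env = os.environ if env is None else env
    if env.get("GEMINI_PROJECT_ID") or env.get("GEMINI_API_KEY"):
        return "gemini"
    if env.get("ANTHROPIC_API_KEY"):
        return "claude"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    return None
```

A helper like this makes the fallback order explicit instead of burying it in comments.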

5. **Set up GitHub token (recommended)**
```bash
export GITHUB_TOKEN="your-github-token"
```

6. **Verify your setup**
```bash
python utils/call_llm.py
```

7. **Generate a tutorial**
```bash
# From a GitHub repository
python main.py --repo https://github.com/username/repo --include "*.py" "*.js"

# Or from a local directory
python main.py --dir /path/to/your/codebase --include "*.py"
```
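The `--include` and `--exclude` flags take shell-style glob patterns. A minimal sketch of how such filters could be applied (hypothetical helper names and semantics; the project's actual matching logic may differ) is:

```python
from fnmatch import fnmatch

def should_keep(path, include=None, exclude=None, size=0, max_size=100_000):
    """Apply exclude globs first, then include globs, then the size cap."""
    if exclude and any(fnmatch(path, pat) for pat in exclude):
        return False
    if include and not any(fnmatch(path, pat) for pat in include):
        return False
    return size <= max_size
```

With `--include "*.py" --exclude "tests/*"`, a file matching an exclude pattern is dropped even if it also matches an include pattern.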

For detailed setup instructions, see [SETUP.md](./SETUP.md).
## 🚀 How to Run This Project

1. **Set up environment variables** (choose one option):

Option 1: For Google Gemini (default):
```bash
export GEMINI_PROJECT_ID="your-project-id"
export GEMINI_LOCATION="us-central1"
# OR for AI Studio instead of Vertex AI:
export GEMINI_API_KEY="your-api-key"
```

Option 2: For Anthropic Claude (uncomment in call_llm.py):
```bash
export ANTHROPIC_API_KEY="your-api-key"
```

Option 3: For OpenAI O1 (uncomment in call_llm.py):
```bash
export OPENAI_API_KEY="your-api-key"
```

2. **Test LLM connection**:
```bash
python utils/call_llm.py
```

3. **Generate a tutorial from a GitHub repository**:
```bash
python main.py --repo https://github.com/username/repo --include "*.py"
```

4. **Or analyze a local codebase**:
```bash
python main.py --dir /path/to/your/code --include "*.py" "*.js"
```

5. **Check the generated output**:
```bash
cd output
# View the generated tutorial files
```

## 💡 Development Tutorial

150 changes: 150 additions & 0 deletions SETUP.md
@@ -0,0 +1,150 @@
# Detailed Setup Guide for AI Codebase Knowledge Builder

This guide provides comprehensive instructions for setting up and configuring the AI Codebase Knowledge Builder tool.

## Prerequisites

- Python 3.8 or newer
- Git (for cloning repositories)
- Access to at least one of the supported LLM providers:
- Google Gemini (default)
- Anthropic Claude (optional)
- OpenAI (optional)

## Step 1: Clone the Repository

```bash
git clone https://github.com/The-Pocket/Tutorial-Codebase-Knowledge.git
cd Tutorial-Codebase-Knowledge
```

## Step 2: Create and Activate a Virtual Environment (Recommended)

### For Linux/macOS
```bash
python -m venv venv
source venv/bin/activate
```

### For Windows
```bash
python -m venv venv
venv\Scripts\activate
```

## Step 3: Install Dependencies

```bash
pip install -r requirements.txt
```

## Step 4: Configure LLM Access

You need to set up access to at least one Language Model provider. The project uses Google Gemini by default but supports others.

### Option 1: Google Gemini (Default)

Choose one of these methods:

#### Using Vertex AI
1. Create a Google Cloud project and enable Vertex AI
2. Set environment variables:
```bash
export GEMINI_PROJECT_ID="your-project-id"
export GEMINI_LOCATION="us-central1" # Or your preferred region
```
For Windows:
```
set GEMINI_PROJECT_ID=your-project-id
set GEMINI_LOCATION=us-central1
```

#### Using AI Studio
1. Get an API key from [Google AI Studio](https://makersuite.google.com/app/apikey)
2. Set the API key as an environment variable:
```bash
export GEMINI_API_KEY="your-api-key"
```
For Windows:
```
set GEMINI_API_KEY=your-api-key
```

### Option 2: Anthropic Claude

1. Get an API key from [Anthropic](https://console.anthropic.com/)
2. Set the API key:
```bash
export ANTHROPIC_API_KEY="your-api-key"
```
For Windows:
```
set ANTHROPIC_API_KEY=your-api-key
```
3. Edit `utils/call_llm.py` to uncomment the Claude implementation and comment out other implementations

### Option 3: OpenAI

1. Get an API key from [OpenAI](https://platform.openai.com/)
2. Set the API key:
```bash
export OPENAI_API_KEY="your-api-key"
```
For Windows:
```
set OPENAI_API_KEY=your-api-key
```
3. Edit `utils/call_llm.py` to uncomment the OpenAI implementation and comment out other implementations

## Step 5: GitHub Token (Optional but Recommended)

For accessing GitHub repositories, especially private ones or to avoid rate limits:

1. Generate a GitHub token at [GitHub Settings](https://github.com/settings/tokens)
- For public repositories: Select `public_repo` scope
- For private repositories: Select `repo` scope
2. Set the token:
```bash
export GITHUB_TOKEN="your-github-token"
```
For Windows:
```
set GITHUB_TOKEN=your-github-token
```

## Step 6: Verify Setup

Test your LLM configuration:

```bash
python utils/call_llm.py
```

You should see a response from the configured LLM provider.

## Troubleshooting

### LLM Connection Issues
- **Error**: "Failed to connect to LLM API"
- Check your API keys and environment variables
- Verify network connection
- Ensure the correct model name is specified
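Transient API failures (timeouts, rate limits) can also be absorbed in code. A minimal retry-with-backoff sketch — illustrative only, not the error handling actually implemented in `utils/call_llm.py` — might look like:

```python
import time

def call_with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(), retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the original error
            time.sleep(base_delay * (2 ** attempt))
```

Wrapping the LLM call this way turns one-off network hiccups into silent retries while still raising on persistent failures.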

### GitHub Access Issues
- **Error**: "Repository not found"
- Check if the repository exists and is accessible
- Verify GitHub token permissions
- For private repositories, ensure your token has the `repo` scope

### File Size Limitations
- **Error**: "Skipping file: size exceeds limit"
- Increase the `--max-size` parameter for larger files
- Or exclude large files using the `--exclude` parameter

## Additional Configuration

- Create a `.env` file in the project root to store environment variables permanently
- Customize logging by modifying the `LOG_DIR` environment variable
- Adjust caching behavior by editing the cache settings in `utils/call_llm.py`
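A `.env` file is just `KEY=VALUE` lines; libraries such as python-dotenv load it automatically. Purely for illustration (this loader is hypothetical and not part of the project), a minimal hand-rolled version could be:

```python
import os

def load_env_file(path=".env"):
    """Read KEY=VALUE lines into os.environ, skipping blanks and # comments."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # setdefault so real environment variables keep precedence
            os.environ.setdefault(key.strip(), value.strip().strip('"'))
```

Giving the real environment precedence over the file is the common convention, so exported variables always win.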

For more information, refer to the main [README.md](./README.md).