Enhancement of documentation and improvement of error handling #62


Closed · wants to merge 3 commits
117 changes: 90 additions & 27 deletions README.md
@@ -72,48 +72,111 @@ This is a tutorial project of [Pocket Flow](https://github.com/The-Pocket/PocketFlow)

## 🚀 Getting Started

1. **Clone this repository**
```bash
git clone https://github.com/The-Pocket/Tutorial-Codebase-Knowledge.git
cd Tutorial-Codebase-Knowledge
```

2. **Set up a virtual environment (recommended)**
```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

3. **Install dependencies**
```bash
pip install -r requirements.txt
```
4. **Configure LLM access**

The tool supports multiple LLM providers. Configure at least one:

- **Google Gemini (default)**:
```bash
# For Vertex AI:
export GEMINI_PROJECT_ID="your-project-id"
export GEMINI_LOCATION="us-central1"
# OR for AI Studio:
export GEMINI_API_KEY="your-api-key"
```

- **Anthropic Claude**:
```bash
export ANTHROPIC_API_KEY="your-api-key"
# Uncomment Claude function in utils/call_llm.py
```

- **OpenAI**:
```bash
export OPENAI_API_KEY="your-api-key"
# Uncomment OpenAI function in utils/call_llm.py
```
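The steps above configure one provider via environment variables. As a rough sketch of how a tool might decide which provider is active (a hypothetical helper for illustration only — the actual `utils/call_llm.py` switches providers by commenting functions in and out), selection could look like:

```python
import os

def pick_provider(env=None):
    """Return the first configured provider, mirroring the order above.

    Hypothetical helper; checks environment variables in the same
    precedence as the list in this README.
    """
    env = os.environ if env is None else env
    if env.get("GEMINI_PROJECT_ID") or env.get("GEMINI_API_KEY"):
        return "gemini"
    if env.get("ANTHROPIC_API_KEY"):
        return "claude"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    return None
```

A helper like this makes the fallback order explicit instead of burying it in comments.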

5. **Set up GitHub token (recommended)**
```bash
export GITHUB_TOKEN="your-github-token"
```

6. **Verify your setup**
```bash
python utils/call_llm.py
```

7. **Generate a tutorial**
```bash
# From a GitHub repository
python main.py --repo https://github.com/username/repo --include "*.py" "*.js"

# Or from a local directory
python main.py --dir /path/to/your/codebase --include "*.py"
```
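The `--include` and `--exclude` flags take shell-style glob patterns. A minimal sketch of how such filters could be applied (hypothetical helper names and semantics; the project's actual matching logic may differ) is:

```python
from fnmatch import fnmatch

def should_keep(path, include=None, exclude=None, size=0, max_size=100_000):
    """Apply exclude globs first, then include globs, then the size cap."""
    if exclude and any(fnmatch(path, pat) for pat in exclude):
        return False
    if include and not any(fnmatch(path, pat) for pat in include):
        return False
    return size <= max_size
```

With `--include "*.py" --exclude "tests/*"`, a file matching an exclude pattern is dropped even if it also matches an include pattern.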

For detailed setup instructions, see [SETUP.md](./SETUP.md).
## 🚀 How to Run This Project

1. **Set up environment variables** (choose one option):

Option 1: For Google Gemini (default):
```bash
export GEMINI_PROJECT_ID="your-project-id"
export GEMINI_LOCATION="us-central1"
# OR for AI Studio instead of Vertex AI:
export GEMINI_API_KEY="your-api-key"
```

Option 2: For Anthropic Claude (uncomment in call_llm.py):
```bash
export ANTHROPIC_API_KEY="your-api-key"
```

Option 3: For OpenAI O1 (uncomment in call_llm.py):
```bash
export OPENAI_API_KEY="your-api-key"
```

2. **Test LLM connection**:
```bash
python utils/call_llm.py
```

3. **Generate a tutorial from a GitHub repository**:
```bash
python main.py --repo https://github.com/username/repo --include "*.py"
```

4. **Or analyze a local codebase**:
```bash
python main.py --dir /path/to/your/code --include "*.py" "*.js"
```

5. **Check the generated output**:
```bash
cd output
# View the generated tutorial files
```

## 💡 Development Tutorial

150 changes: 150 additions & 0 deletions SETUP.md
@@ -0,0 +1,150 @@
# Detailed Setup Guide for AI Codebase Knowledge Builder

This guide provides comprehensive instructions for setting up and configuring the AI Codebase Knowledge Builder tool.

## Prerequisites

- Python 3.8 or newer
- Git (for cloning repositories)
- Access to at least one of the supported LLM providers:
- Google Gemini (default)
- Anthropic Claude (optional)
- OpenAI (optional)

## Step 1: Clone the Repository

```bash
git clone https://github.com/The-Pocket/Tutorial-Codebase-Knowledge.git
cd Tutorial-Codebase-Knowledge
```

## Step 2: Create and Activate a Virtual Environment (Recommended)

### For Linux/macOS
```bash
python -m venv venv
source venv/bin/activate
```

### For Windows
```bash
python -m venv venv
venv\Scripts\activate
```

## Step 3: Install Dependencies

```bash
pip install -r requirements.txt
```

## Step 4: Configure LLM Access

You need to set up access to at least one Language Model provider. The project uses Google Gemini by default but supports others.

### Option 1: Google Gemini (Default)

Choose one of these methods:

#### Using Vertex AI
1. Create a Google Cloud project and enable Vertex AI
2. Set environment variables:
```bash
export GEMINI_PROJECT_ID="your-project-id"
export GEMINI_LOCATION="us-central1" # Or your preferred region
```
For Windows:
```
set GEMINI_PROJECT_ID=your-project-id
set GEMINI_LOCATION=us-central1
```

#### Using AI Studio
1. Get an API key from [Google AI Studio](https://makersuite.google.com/app/apikey)
2. Set the API key as an environment variable:
```bash
export GEMINI_API_KEY="your-api-key"
```
For Windows:
```
set GEMINI_API_KEY=your-api-key
```

### Option 2: Anthropic Claude

1. Get an API key from [Anthropic](https://console.anthropic.com/)
2. Set the API key:
```bash
export ANTHROPIC_API_KEY="your-api-key"
```
For Windows:
```
set ANTHROPIC_API_KEY=your-api-key
```
3. Edit `utils/call_llm.py` to uncomment the Claude implementation and comment out other implementations

### Option 3: OpenAI

1. Get an API key from [OpenAI](https://platform.openai.com/)
2. Set the API key:
```bash
export OPENAI_API_KEY="your-api-key"
```
For Windows:
```
set OPENAI_API_KEY=your-api-key
```
3. Edit `utils/call_llm.py` to uncomment the OpenAI implementation and comment out other implementations

## Step 5: GitHub Token (Optional but Recommended)

For accessing GitHub repositories, especially private ones or to avoid rate limits:

1. Generate a GitHub token at [GitHub Settings](https://github.com/settings/tokens)
- For public repositories: Select `public_repo` scope
- For private repositories: Select `repo` scope
2. Set the token:
```bash
export GITHUB_TOKEN="your-github-token"
```
For Windows:
```
set GITHUB_TOKEN=your-github-token
```

## Step 6: Verify Setup

Test your LLM configuration:

```bash
python utils/call_llm.py
```

You should see a response from the configured LLM provider.

## Troubleshooting

### LLM Connection Issues
- **Error**: "Failed to connect to LLM API"
- Check your API keys and environment variables
- Verify network connection
- Ensure the correct model name is specified
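Transient API failures (timeouts, rate limits) can also be absorbed in code. A minimal retry-with-backoff sketch — illustrative only, not the error handling actually implemented in `utils/call_llm.py` — might look like:

```python
import time

def call_with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(), retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the original error
            time.sleep(base_delay * (2 ** attempt))
```

Wrapping the LLM call this way turns one-off network hiccups into silent retries while still raising on persistent failures.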

### GitHub Access Issues
- **Error**: "Repository not found"
- Check if the repository exists and is accessible
- Verify GitHub token permissions
- For private repositories, ensure your token has the `repo` scope

### File Size Limitations
- **Error**: "Skipping file: size exceeds limit"
- Increase the `--max-size` parameter for larger files
- Or exclude large files using the `--exclude` parameter

## Additional Configuration

- Create a `.env` file in the project root to store environment variables permanently
- Customize logging by modifying the `LOG_DIR` environment variable
- Adjust caching behavior by editing the cache settings in `utils/call_llm.py`
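A `.env` file is just `KEY=VALUE` lines; libraries such as python-dotenv load it automatically. Purely for illustration (this loader is hypothetical and not part of the project), a minimal hand-rolled version could be:

```python
import os

def load_env_file(path=".env"):
    """Read KEY=VALUE lines into os.environ, skipping blanks and # comments."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # setdefault so real environment variables keep precedence
            os.environ.setdefault(key.strip(), value.strip().strip('"'))
```

Giving the real environment precedence over the file is the common convention, so exported variables always win.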

For more information, refer to the main [README.md](./README.md).