MoeMoeCS/LLM-issue

GitHub Issue Summarizer

简体中文 | English

A CLI tool to fetch, filter, and summarize GitHub issues, with one-line summaries generated by LLMs (e.g., OpenAI).


🚀 Quick Start (Zero Hassle)

  1. Install dependencies:

    pip install -r requirements.txt
  2. Run with one command:

    python main.py owner/repo --max-issues 30
    • Replace owner/repo with the target GitHub repository, e.g. microsoft/vscode.
    • --max-issues limits how many open issues to fetch (default 50); lowering it is recommended for large repositories to speed things up.
    • No need to manually export environment variables!
    • If GH_TOKEN or OPENAI_API_KEY is missing, you will be prompted to enter them once, and they will be saved to .env for future runs.
  3. Results:

    • Output files are saved to the output/ directory automatically.
    • Includes Markdown, JSON, and CSV formats.
  4. Project Structure:

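The zero-hassle credential handling described in step 2 can be sketched roughly as below; the function name and prompt text are illustrative, not the project's actual code:

```python
import os

def ensure_credential(name: str, env_file: str = ".env") -> str:
    """Return a credential from the environment, prompting once and
    appending it to the .env file if it is missing."""
    value = os.environ.get(name)
    if not value:
        value = input(f"Enter {name}: ").strip()
        with open(env_file, "a", encoding="utf-8") as f:
            f.write(f"{name}={value}\n")  # persisted for future runs
        os.environ[name] = value  # available for the rest of this run
    return value
```

On subsequent runs, python-dotenv loads the saved values back into the environment, so the prompt only appears once.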

Features

  • Smart issue fetching: fetch open issues from any GitHub repository
  • Smart filtering: automatically filter out assigned, closed, or noisy issues
  • Automatic classification: classify issue type and priority from the title and content
  • LLM summarization: generate one-line summaries with a large language model
  • Multi-format output: Markdown, JSON, CSV, and more
  • High-performance caching: a two-tier cache (memory + SQLite) for better performance
  • Config validation: automatically validate configuration completeness and validity at startup
  • Error handling: robust exception handling and retry mechanisms
  • Batch processing: concurrent processing with automatic rate limiting
  • Quality assurance: summary quality checks with a local fallback
  • User-friendly: rich progress display and colored output
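The local fallback mentioned under quality assurance could look roughly like this sketch (the helper name and 80-character cutoff are assumptions, not the project's real logic):

```python
def summarize_with_fallback(title: str, body: str, llm_call=None) -> str:
    """Try the LLM summarizer first; on any failure, fall back to a
    locally built one-liner derived from the issue title."""
    if llm_call is not None:
        try:
            return llm_call(title, body)
        except Exception:
            pass  # network error, rate limit, malformed response, ...
    # Local fallback: use the (truncated) title as the one-line summary.
    return title.strip()[:80]
```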

Advanced Usage

Optional Arguments

  • --max-issues N Limit the number of open issues to fetch (default 50). Useful for large repositories to speed up the process.
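Although the project lists typer among its dependencies, the flag's behavior can be illustrated with a stdlib-only argparse sketch (the parser layout is assumed, not taken from main.py):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Summarize open GitHub issues")
    parser.add_argument("repo", help="Target repository, e.g. microsoft/vscode")
    parser.add_argument("--max-issues", type=int, default=50,
                        help="Max open issues to fetch (default: 50)")
    return parser

# Equivalent of: python main.py microsoft/vscode --max-issues 30
args = build_parser().parse_args(["microsoft/vscode", "--max-issues", "30"])
```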

Optional Environment Variables

You can still use environment variables or a .env file to customize behavior:

# Caching options
CACHE_DB_PATH=.cache/cache.db         # Cache database location
CACHE_MAX_MEMORY_ITEMS=1000           # Max items in memory cache
CACHE_CLEANUP_INTERVAL=3600           # Cache cleanup interval (seconds)

# LLM options
LLM_CONCURRENCY_LIMIT=10              # Max concurrent LLM requests
OPENAI_BASE_URL=https://api.openai.com/v1  # API endpoint
MODEL_NAME=gpt-3.5-turbo              # Model to use
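These variables could be read into a typed settings object along these lines (the Settings class below is an illustrative subset, not the project's actual config code):

```python
import os
from dataclasses import dataclass

@dataclass
class Settings:
    cache_db_path: str
    cache_max_memory_items: int
    llm_concurrency_limit: int
    model_name: str

def load_settings(env=os.environ) -> Settings:
    """Build settings from the environment, falling back to the
    defaults documented above when a variable is unset."""
    return Settings(
        cache_db_path=env.get("CACHE_DB_PATH", ".cache/cache.db"),
        cache_max_memory_items=int(env.get("CACHE_MAX_MEMORY_ITEMS", "1000")),
        llm_concurrency_limit=int(env.get("LLM_CONCURRENCY_LIMIT", "10")),
        model_name=env.get("MODEL_NAME", "gpt-3.5-turbo"),
    )
```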

Configuration

  • Edit config.py to customize:
    • Keywords for issue classification
    • Priority rules and labels
    • LLM prompt templates
    • Filter rules for issues

Output Formats

The tool generates multiple output formats:

1. Markdown Summary

  • Project overview with issue statistics
  • Detailed table with issue information
  • One-line summaries generated by LLM

2. JSON Export

  • Structured data with metadata
  • Complete issue information
  • Machine-readable format

3. CSV Export

  • Spreadsheet-compatible format
  • Easy to import into Excel/Google Sheets
  • All issue fields included

4. Console Display

  • Rich formatted tables
  • Color-coded output
  • Real-time progress indicators
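All three file formats can be produced from the same issue records, roughly as in this sketch (the function and field names are illustrative, not the project's writer code):

```python
import csv
import io
import json

def export_issues(issues):
    """Render one list of issue records as JSON, CSV, and a Markdown
    table, mirroring the three files the tool writes to output/."""
    as_json = json.dumps({"count": len(issues), "issues": issues},
                         ensure_ascii=False, indent=2)

    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(issues[0].keys()))
    writer.writeheader()
    writer.writerows(issues)
    as_csv = buf.getvalue()

    header = "| #Issue | Type | Priority | Title |\n|--------|------|----------|-------|"
    rows = "\n".join(
        f"| #{i['number']} | {i['type']} | {i['priority']} | {i['title']} |"
        for i in issues
    )
    as_markdown = header + "\n" + rows
    return as_json, as_csv, as_markdown
```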

Example Markdown output:

# microsoft/vscode Issues at a Glance

There are currently **50** open issues (20 bugs / 15 feature requests), average priority P1, most recently updated 2024-03-20.

| #Issue | Type | Priority | Title | One-line Summary | Key Labels | Created | Link |
|--------|------|----------|-------|------------------|------------|---------|------|
| #123 | Bug | P1 | Login fails | "Login page crashes under high concurrency" | backend, critical | 2024-03-19 | 🔗 |

Dependencies

  • Python 3.8+
  • httpx
  • typer
  • pydantic
  • rich
  • openai
  • python-dotenv

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

Apache License 2.0
