Commit e4c48e1

feat: Add GitBit intelligent issue bot
0 parents  commit e4c48e1

11 files changed: 443 additions and 0 deletions
.gitbit.yml

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
# .gitbit.yml

# --- Smart Tagging Configuration ---
# Map labels to keywords. The bot will suggest a label if an issue's
# title or body contains any of the associated keywords.
tag_keywords:
  bug:
    - error
    - exception
    - traceback
    - panic
    - crash
    - fail
  documentation:
    - docs
    - readme
    - help
    - example
    - tutorial
  feature-request:
    - feature
    - enhance
    - improvement
    - idea
  question:
    - how to
    - what is
    - why

# --- Assignee Recommendation Configuration ---
# The number of recently closed issues to scan to build an expertise profile.
# A higher number is more accurate but slower.
assignee_rec:
  max_issues_to_scan: 100

# --- Issue Linking Configuration ---
# The similarity score required to consider an issue "related".
# Value should be between 0.0 and 1.0. Higher is more strict.
# A good starting point is 0.7.
issue_linking:
  similarity_threshold: 0.7
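
The `tag_keywords` rule above ("suggest a label if the title or body contains any of the associated keywords") can be sketched in a few lines. `gitbit/tagger.py` is not shown in this diff, so `suggest_tags_sketch` below is a hypothetical stand-in that assumes case-insensitive substring matching, not the project's actual implementation.

```python
def suggest_tags_sketch(issue_text: str, tag_keywords: dict[str, list[str]]) -> list[str]:
    """Return every label that has at least one keyword present in the text.

    Hypothetical helper: assumes case-insensitive substring matching.
    """
    text = issue_text.lower()
    return [
        label
        for label, keywords in tag_keywords.items()
        if any(keyword.lower() in text for keyword in keywords)
    ]


# A trimmed-down version of the mapping from .gitbit.yml.
config = {"bug": ["error", "crash"], "question": ["how to", "why"]}
print(suggest_tags_sketch("App crashes on startup", config))  # ['bug']
```

With substring matching, a short keyword such as `fail` also matches `failure`, and `help` also matches `helpful`. If that proves too noisy, matching on word boundaries would be one alternative.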

.github/workflows/gitbit_bot.yml

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
# .github/workflows/gitbit_bot.yml

name: GitBit Bot

on:
  issues:
    types: [opened]

jobs:
  run-gitbit:
    runs-on: ubuntu-latest
    permissions:
      issues: write
      contents: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: Run GitBit Bot
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITHUB_EVENT_PATH: ${{ github.event_path }}
          GITHUB_REPOSITORY: ${{ github.repository }}
        run: python gitbit/main.py
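
The workflow hands the bot the path to the webhook payload through `GITHUB_EVENT_PATH`. `gitbit/main.py` is not shown in this diff, so the following is only a sketch of how an entry point could pull the issue number out of that file. `load_issue_number` is a hypothetical helper, and it assumes the payload of an `issues` event, which carries the issue under the `issue` key.

```python
import json


def load_issue_number(event_path: str) -> int:
    """Read the webhook payload file and return the number of the triggering issue.

    Hypothetical helper: assumes an `issues` event payload with an `issue` object.
    """
    with open(event_path, encoding="utf-8") as fh:
        event = json.load(fh)
    return event["issue"]["number"]


# Inside the workflow this would typically be called as:
#   load_issue_number(os.environ["GITHUB_EVENT_PATH"])
```

The Actions runner exports `GITHUB_EVENT_PATH` and `GITHUB_REPOSITORY` by default, so the explicit `env` mapping for those two is redundant but harmless. Only `GITHUB_TOKEN` has to be passed in explicitly.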

LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 [Your Name or Organization Here]

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
# GitBit: Intelligent Issue Management Bot for GitHub

<!-- You can create a simple logo -->

GitBit is a Python-based GitHub bot that streamlines issue management for repository maintainers. It uses natural language processing (NLP) and machine learning to automate repetitive triage tasks such as labeling, assignment, and duplicate detection, which is especially useful for large open-source projects.

## Key Features

- **🤖 Automatic Issue Linking:** Analyzes new issue descriptions to detect textually similar or duplicate issues and suggests them as related, reducing clutter and improving organization.
- **🏷️ Smart Tagging:** Suggests relevant labels (e.g., `bug`, `documentation`, `feature-request`) based on the issue's content, ensuring consistent and meaningful categorization.
- **👤 Assignee Recommendations:** Recommends contributors to assign issues to, based on the labels of the closed issues they were previously assigned.

## Why It’s Unique

While there are many tools for GitHub automation, GitBit combines NLP and machine learning to provide **intelligent, context-aware suggestions** tailored to each repository. Its focus on issue management, a critical yet time-consuming task, sets it apart from generic bots or static analysis tools.

## How It Works

GitBit is deployed as a GitHub Action that triggers whenever a new issue is opened in your repository. Here's the process:

1. **Trigger:** A new issue is created.
2. **Analysis:** The bot reads the issue's title and body.
3. **Processing:**
    * It compares the new issue's text to a keyword map in your config file to suggest labels.
    * It scans recently closed issues to find which assignees most often handled issues carrying the suggested labels.
    * It uses TF-IDF vectorization and cosine similarity to find other open issues with similar content.
4. **Comment:** The bot posts a single comment on the new issue with all its suggestions, so maintainers can review them in one place before applying them.

---

## 🚀 Setup Instructions

Setting up GitBit takes less than 5 minutes.

### Step 1: Create the Configuration File

In the root of your repository, create a file named `.gitbit.yml`. This file controls the bot's behavior.

**Copy and paste this template into `.gitbit.yml` and customize it for your project:**

```yaml
# .gitbit.yml

# --- Smart Tagging Configuration ---
# Map labels to keywords. The bot will suggest a label if an issue's
# title or body contains any of the associated keywords.
tag_keywords:
  bug:
    - error
    - exception
    - traceback
    - panic
    - crash
    - fail
  documentation:
    - docs
    - readme
    - help
    - example
    - tutorial
  feature-request:
    - feature
    - enhance
    - improvement
    - idea
  question:
    - how to
    - what is
    - why

# --- Assignee Recommendation Configuration ---
# The number of recently closed issues to scan to build an expertise profile.
# A higher number is more accurate but slower.
assignee_rec:
  max_issues_to_scan: 100

# --- Issue Linking Configuration ---
# The similarity score required to consider an issue "related".
# Value should be between 0.0 and 1.0. Higher is more strict.
# A good starting point is 0.7.
issue_linking:
  similarity_threshold: 0.7
```

gitbit/__init__.py

Whitespace-only changes.

gitbit/assignee.py

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
# gitbit/assignee.py

from collections import defaultdict

def recommend_assignees(repo, suggested_tags: list[str], max_issues_to_scan: int) -> list[str]:
    """
    Recommends assignees based on their history of closing issues with similar tags.

    Args:
        repo: The PyGithub repository object.
        suggested_tags: A list of tags suggested for the new issue.
        max_issues_to_scan: The number of closed issues to analyze.

    Returns:
        A list of recommended assignee usernames, sorted by relevance.
    """
    if not suggested_tags:
        return []

    # Build an expertise map: {label: {user: count}}
    expertise_map = defaultdict(lambda: defaultdict(int))

    # Scan recent closed issues to build expertise profile
    closed_issues = repo.get_issues(state='closed', sort='updated', direction='desc')

    count = 0
    for issue in closed_issues:
        if count >= max_issues_to_scan:
            break

        # Only consider issues that were actually assigned and have labels
        if issue.assignee and issue.labels:
            for label in issue.labels:
                expertise_map[label.name][issue.assignee.login] += 1
        count += 1

    if not expertise_map:
        return []

    # Find best assignees for the suggested tags
    recommendation_scores = defaultdict(int)
    for tag in suggested_tags:
        if tag in expertise_map:
            # Find the top expert for this tag
            top_expert = max(expertise_map[tag], key=expertise_map[tag].get)
            recommendation_scores[top_expert] += 1  # Give a point to this expert

    # Sort recommended assignees by their score (how many tags they are an expert in)
    sorted_recommendations = sorted(recommendation_scores.keys(), key=lambda user: recommendation_scores[user], reverse=True)

    return sorted_recommendations
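
To make the scoring rule easier to follow, here is the same logic applied to plain tuples instead of PyGithub objects. `rank_experts` and the `history` format are hypothetical and exist only for illustration. Each entry stands for one closed issue as `(assignee login, label names)`.

```python
from collections import defaultdict


def rank_experts(history, suggested_tags):
    """Rank logins by how many suggested tags they are the top assignee for.

    Hypothetical helper: `history` is an iterable of (login, [label, ...]) pairs.
    """
    # expertise[label][login] = number of closed issues with that label
    expertise = defaultdict(lambda: defaultdict(int))
    for login, labels in history:
        for label in labels:
            expertise[label][login] += 1

    # One point per suggested tag, awarded only to that tag's top assignee.
    scores = defaultdict(int)
    for tag in suggested_tags:
        if tag in expertise:
            top_expert = max(expertise[tag], key=expertise[tag].get)
            scores[top_expert] += 1
    return sorted(scores, key=scores.get, reverse=True)


history = [
    ("alice", ["bug"]),
    ("alice", ["bug", "documentation"]),
    ("bob", ["documentation"]),
    ("bob", ["documentation"]),
]
print(rank_experts(history, ["bug"]))            # ['alice']
print(rank_experts(history, ["documentation"]))  # ['bob']
```

Because only the single top assignee per tag earns a point, a contributor who is a close second on every label is never recommended. Summing the per-label counts instead would be one way to surface such contributors.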

gitbit/bot.py

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
# gitbit/bot.py

import yaml
from . import tagger, assignee, linker

class GitBitBot:
    """
    The main bot class that orchestrates the analysis of a GitHub issue.
    """
    def __init__(self, repo, issue):
        self.repo = repo
        self.issue = issue
        self.config = self._load_config()

    def _load_config(self):
        """Loads the .gitbit.yml configuration file from the repository root."""
        try:
            config_content = self.repo.get_contents(".gitbit.yml").decoded_content
            # An empty YAML file parses to None, so fall back to an empty dict.
            return yaml.safe_load(config_content) or {}
        except Exception as e:
            print(f"Error loading .gitbit.yml: {e}")
            raise FileNotFoundError(".gitbit.yml not found or is invalid.") from e

    def run(self):
        """
        Runs the full analysis pipeline and posts a summary comment on the issue.
        """
        # The body is None when the reporter leaves it empty.
        issue_text = f"{self.issue.title} {self.issue.body or ''}"

        # 1. Suggest tags
        suggested_tags = tagger.suggest_tags(issue_text, self.config.get('tag_keywords', {}))

        # 2. Recommend assignees
        max_scan = self.config.get('assignee_rec', {}).get('max_issues_to_scan', 100)
        recommended_assignees = assignee.recommend_assignees(self.repo, suggested_tags, max_scan)

        # 3. Find related issues
        threshold = self.config.get('issue_linking', {}).get('similarity_threshold', 0.7)
        related_issues = linker.find_related_issues(self.repo, self.issue, threshold)

        # 4. Format and post the comment
        comment = self._format_comment(suggested_tags, recommended_assignees, related_issues)

        if comment:
            self.issue.create_comment(comment)
            print("Successfully posted suggestions to the issue.")
        else:
            print("No suggestions to post.")

    def _format_comment(self, tags, assignees, issues):
        """Formats the bot's findings into a clean Markdown comment."""
        if not tags and not assignees and not issues:
            return None

        comment_parts = ["### 🤖 GitBit Bot Analysis\n\nI've analyzed this issue and have the following suggestions:\n"]

        if tags:
            tag_list = ", ".join([f"`{tag}`" for tag in tags])
            comment_parts.append(f"**🏷️ Suggested Labels:**\n{tag_list}\n")

        if assignees:
            assignee_list = ", ".join([f"@{user}" for user in assignees])
            comment_parts.append(f"**👤 Recommended Assignees:**\n{assignee_list} (based on their work on related issues)\n")

        if issues:
            issue_list = "\n".join([f"- Issue #{i.number}: {i.title}" for i in issues])
            comment_parts.append(f"**🔗 Potentially Related Issues:**\n{issue_list}\n")

        comment_parts.append("---\n*I am a bot. My suggestions are based on patterns in this repository. Please review them before applying.*")

        return "\n".join(comment_parts)

gitbit/linker.py

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
# gitbit/linker.py

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def find_related_issues(repo, new_issue, similarity_threshold: float) -> list:
    """
    Finds issues that are semantically similar to a new issue.

    Args:
        repo: The PyGithub repository object.
        new_issue: The new issue object to compare against.
        similarity_threshold: The minimum cosine similarity score to be considered related.

    Returns:
        A list of PyGithub Issue objects that are potentially related.
    """
    open_issues = repo.get_issues(state='open')

    # Create a corpus of documents (issue title + body)
    issue_map = {}
    corpus = []

    for issue in open_issues:
        # Exclude the new issue itself from the comparison list
        if issue.number == new_issue.number:
            continue

        # The body is None for issues filed without a description.
        issue_text = f"{issue.title} {issue.body or ''}"
        issue_map[len(corpus)] = issue  # Map corpus index to issue object
        corpus.append(issue_text)

    if not corpus:
        return []  # No other open issues to compare against

    # Add the new issue's text to the end of the corpus for vectorization
    new_issue_text = f"{new_issue.title} {new_issue.body or ''}"
    corpus.append(new_issue_text)

    # Vectorize the text using TF-IDF
    try:
        vectorizer = TfidfVectorizer(stop_words='english', min_df=1)
        tfidf_matrix = vectorizer.fit_transform(corpus)
    except ValueError:
        # This can happen if the corpus is empty or contains only stop words.
        return []

    # Calculate cosine similarity between the new issue and all others
    # The new issue is the last one in the matrix
    cosine_sim = cosine_similarity(tfidf_matrix[-1], tfidf_matrix[:-1])

    # Find issues that exceed the similarity threshold
    related_issues = []
    similar_indices = cosine_sim[0].argsort()[:-5:-1]  # Get top 4 most similar indices

    for i in similar_indices:
        if cosine_sim[0][i] > similarity_threshold:
            related_issues.append(issue_map[i])

    return related_issues
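
For readers who want to see what the TF-IDF and cosine-similarity step computes, the sketch below reproduces the idea in pure Python. It is an approximation, not the scikit-learn code path used above: tokenization is simplified, no stop words are removed, and `rank_related` is a hypothetical name. It does use the same smoothed IDF formula and L2 normalisation that `TfidfVectorizer` applies by default, and it keeps the "top 4 above the threshold" rule.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Lowercase and keep runs of two or more word characters.
    return re.findall(r"\b\w\w+\b", text.lower())


def tfidf_vectors(corpus: list[str]) -> list[dict[str, float]]:
    """Return one L2-normalised TF-IDF vector (as a sparse dict) per document."""
    tokenized = [tokenize(doc) for doc in corpus]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
        vec = {
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, tf in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    # Both vectors are unit length, so the dot product is the cosine.
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def rank_related(new_text: str, others: list[str], threshold: float, top_n: int = 4):
    """Return (index, score) for the top_n most similar texts above threshold."""
    vectors = tfidf_vectors(others + [new_text])
    new_vec = vectors[-1]
    scored = sorted(
        ((cosine(new_vec, vec), idx) for idx, vec in enumerate(vectors[:-1])),
        reverse=True,
    )
    return [(idx, score) for score, idx in scored[:top_n] if score > threshold]
```

Two texts that share no terms score 0.0 and identical texts score 1.0, so a threshold of 0.7 only surfaces issues whose wording overlaps heavily with the new one.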
