Conversation

@evanscastonguay
Contributor

@evanscastonguay evanscastonguay commented Jan 10, 2026

User description

Minimal Jira issue provider integration: issue provider abstraction (Jira/GitLab/GitHub), /similar_issue Jira support, ticket compliance Jira/GitLab path, Jira ADF parsing, embedding robustness, and unit tests. No deploy/values changes.


PR Type

Enhancement, Tests


Description

  • Add Jira issue provider support with abstraction layer for GitHub/GitLab/Jira

  • Implement /similar_issue tool for GitLab and Jira with vector DB integration

  • Add ticket compliance checking for Jira and GitLab issue providers

  • Support flexible embedding client with OpenAI-compatible endpoints

  • Improve GitLab provider robustness for issue handling and clone URLs
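The Jira ADF parsing mentioned in the user description refers to Atlassian Document Format, the nested JSON tree that Jira Cloud's v3 REST API uses for rich-text fields such as the issue description. A minimal sketch of flattening such a tree to plain text might look like the following. `adf_to_text` and `BLOCK_TYPES` are hypothetical names for illustration, not the code in `jira_issue_provider.py`, and only `text` and `hardBreak` leaves plus a few block types are handled.

```python
from typing import Any

# Block-level ADF node types after which a newline is emitted.
# This is a partial list chosen for illustration (an assumption).
BLOCK_TYPES = {"paragraph", "heading", "listItem", "codeBlock", "blockquote"}


def adf_to_text(node: Any) -> str:
    """Recursively flatten an ADF node (a dict) into plain text."""
    if not isinstance(node, dict):
        return ""
    node_type = node.get("type")
    if node_type == "text":
        return node.get("text", "")
    if node_type == "hardBreak":
        return "\n"
    children = node.get("content") or []
    text = "".join(adf_to_text(child) for child in children)
    # Terminate block nodes with a single newline so blocks do not run together.
    if node_type in BLOCK_TYPES and text and not text.endswith("\n"):
        text += "\n"
    return text


doc = {
    "version": 1,
    "type": "doc",
    "content": [
        {"type": "heading", "attrs": {"level": 1}, "content": [{"type": "text", "text": "Bug"}]},
        {"type": "paragraph", "content": [
            {"type": "text", "text": "Login fails "},
            {"type": "text", "text": "intermittently", "marks": [{"type": "strong"}]},
        ]},
    ],
}
print(adf_to_text(doc).strip())
```

Non-dict input returns an empty string, so a description that is already plain text would need a separate branch in the caller.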


Diagram Walkthrough

flowchart LR
  A["Issue Provider Abstraction"] --> B["Jira Provider"]
  A --> C["GitHub Provider"]
  A --> D["GitLab Provider"]
  B --> E["/similar_issue Tool"]
  C --> E
  D --> E
  E --> F["Vector DB Integration"]
  F --> G["Pinecone/LanceDB/Qdrant"]
  E --> H["Embedding Client"]
  H --> I["OpenAI-compatible Endpoint"]
  J["Ticket Compliance Check"] --> B
  J --> C
  J --> D

File Walkthrough

Relevant files
Enhancement
11 files
ticket_utils.py
Add Jira ticket key extraction utility                                     
+16/-0   
__init__.py
Create issue provider module exports                                         
+16/-0   
base.py
Define abstract issue provider interface                                 
+46/-0   
github_issue_provider.py
Implement GitHub issue provider adapter                                   
+21/-0   
gitlab_issue_provider.py
Implement GitLab issue provider adapter                                   
+20/-0   
jira_issue_provider.py
Implement Jira issue provider with ADF parsing                     
+220/-0 
resolver.py
Add issue provider resolution and factory logic                   
+53/-0   
embedding_client.py
Add OpenAI-compatible embedding client                                     
+75/-0   
gitlab_provider.py
Improve GitLab provider issue handling and clone URL robustness
+92/-19 
pr_similar_issue.py
Refactor /similar_issue tool for multi-provider support   
+374/-124
ticket_pr_compliance_check.py
Add Jira and GitLab ticket compliance extraction                 
+128/-33
Tests
4 files
test_issue_provider_resolver.py
Add issue provider resolver unit tests                                     
+17/-0   
test_jira_issue_provider.py
Add Jira issue provider unit tests                                             
+94/-0   
test_similar_issue_helpers.py
Add /similar_issue helper function tests                                 
+49/-0   
test_ticket_pr_compliance_check.py
Add ticket compliance check integration tests                       
+113/-0 
Documentation
2 files
fetching_ticket_context.md
Document Jira issue provider configuration                             
+10/-0   
similar_issues.md
Document GitLab and Jira /similar_issue support                   
+69/-3   
Configuration changes
1 file
configuration.toml
Add Jira and embedding configuration options                         
+39/-0   

Evans Castonguay added 11 commits January 10, 2026 16:10
(cherry picked from commit bfc288d0bd7e8277c7dc8f7033e6526c8cc308e6)
(cherry picked from commit 5c0ed614536b1ef7e118716f2b567d8b9fe1e87f)
(cherry picked from commit bbaec74feeafb31cc5794c0a82332792fd8bccb6)
(cherry picked from commit f90435f36a800a71529bf8e4fa7917f2375510c0)
(cherry picked from commit 2317af75758cca9c37f3916b2fbef9b27374a7c1)
(cherry picked from commit ef4a8a7930cbd39515a3010fff2a1ca66f8dd4fa)
(cherry picked from commit 358e5974d49ec4676154e669e68744e2c23a6c9c)
(cherry picked from commit ea28ff7)
(cherry picked from commit 98769ce)
@qodo-free-for-open-source-projects
Contributor

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🔴
Token exposure in URL

Description: The access token is embedded directly into the clone URL string without proper
sanitization. If this URL is logged, displayed in error messages, or stored in version
control, the token could be exposed.
gitlab_provider.py [1030-1030]

Referred Code
    return f"{parsed.scheme}://oauth2:{access_token}@{netloc}{parsed.path}"
except Exception as exc:
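One way to reduce the exposure described above is to redact the userinfo part of any URL before it reaches a log line or error message. The sketch below uses only the standard library. `redact_url_credentials` is a hypothetical helper, not an existing function in `gitlab_provider.py`, and the token in the example is a made-up placeholder.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url_credentials(url: str) -> str:
    """Return the URL with any userinfo (user:password@) replaced by a placeholder."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url  # nothing to hide
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literal: urlsplit strips the brackets, so restore them
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, f"***@{host}", parts.path, parts.query, parts.fragment))


# Placeholder token for illustration only.
clone_url = "https://oauth2:glpat-EXAMPLE@gitlab.example.com/group/repo.git"
print(redact_url_credentials(clone_url))
```

The tokenized URL would still be used for the actual clone; only the copy passed to logging or exception messages is redacted.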
API key exposure risk

Description: The OpenAI API key is directly assigned from settings without validation or sanitization,
potentially exposing sensitive credentials in logs or error messages if the embedding
client or OpenAI library logs request details.
pr_similar_issue.py [468-470]

Referred Code
openai.api_key = get_settings().openai.key
res = openai.Embedding.create(input=list_to_encode, engine=self.embedding_model)
return [record['embedding'] for record in res['data']]
Credential exposure in headers

Description: Jira API credentials (email and token) are concatenated and base64-encoded without
validation, then included in HTTP headers. If these credentials are logged or exposed
through error messages, they could be compromised.
jira_issue_provider.py [91-92]

Referred Code
auth_token = base64.b64encode(f"{self.api_email}:{self.api_token}".encode("utf-8")).decode("utf-8")
request = urllib.request.Request(url)
API key in request headers

Description: The API key is included in the Authorization header without validation. If the requests
library or application logs HTTP headers, the bearer token could be exposed in logs.
embedding_client.py [35-35]

Referred Code
headers["Authorization"] = f"Bearer {self.api_key}"
Missing SSL verification

Description: The Jira API request uses urllib.request.urlopen without certificate verification
configuration. This could allow man-in-the-middle attacks if the HTTPS connection is not
properly validated, potentially exposing API credentials and sensitive issue data.
jira_issue_provider.py [96-102]

Referred Code
    with urllib.request.urlopen(request, timeout=self.timeout_seconds) as response:
        payload = response.read().decode("utf-8")
        return json.loads(payload)
except Exception as exc:
    if not suppress_warning:
        get_logger().warning("Failed to fetch Jira issues", artifact={"error": str(exc), "url": url})
    return {}
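A point worth checking against this finding: since Python 3.4.3, `urllib.request.urlopen` verifies HTTPS certificates and hostnames by default, so the call above should only be unverified if a permissive context is installed elsewhere. Passing an explicit context still makes the intent visible. A minimal sketch, where `build_verified_context` and `fetch_json` are hypothetical stand-ins and not the PR's `_request_json`:

```python
import json
import ssl
import urllib.request


def build_verified_context() -> ssl.SSLContext:
    """Create a context that requires a valid certificate chain and a matching hostname."""
    return ssl.create_default_context()


def fetch_json(url: str, timeout: float = 15.0) -> dict:
    """GET a JSON document over HTTPS using the explicit, verifying context."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout, context=build_verified_context()) as response:
        return json.loads(response.read().decode("utf-8"))


context = build_verified_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```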
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Consistent Naming Conventions

Objective: All new variables, functions, and classes must follow the project's established naming
standards

Status: Passed

No Dead or Commented-Out Code

Objective: Keep the codebase clean by ensuring all submitted code is active and necessary

Status: Passed

Robust Error Handling

Objective: Ensure potential errors and edge cases are anticipated and handled gracefully throughout
the code

Status: Passed

When relevant, utilize early return

Objective: In a code snippet containing multiple logic conditions (such as 'if-else'), prefer an
early return on edge cases over deep nesting

Status: Passed

Single Responsibility for Functions

Objective: Each function should have a single, well-defined responsibility

Status:
Large function scope: The __init__ method in PRSimilarIssue handles multiple concerns, including initialization,
context resolution, embedding setup, and vector database configuration, which may violate
the single responsibility principle

Referred Code
def __init__(self, issue_url: str, ai_handler, args: list = None):
    self.issue_url = issue_url
    self.resource_url = issue_url.split('=')[-1] if issue_url else ""
    self.provider_name = get_settings().config.git_provider
    self.issue_provider_name = resolve_issue_provider_name(
        get_settings().get("CONFIG.ISSUE_PROVIDER", "auto"),
        self.provider_name,
    )
    self.supported = self.provider_name in ("github", "gitlab")
    self.git_provider = get_git_provider_with_context(self.resource_url)
    if not self.supported:
        return

    self._init_embedding_settings()
    self.repo_obj = None
    self.issue_iid = None
    self.project_path = None
    self.issue_context = False
    self.output_target = None
    self.issue_provider = None
    self.jira_keys = []


 ... (clipped 14 lines)
Compliance status legend 🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-free-for-open-source-projects
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Fix logic for limiting ticket links

Fix the logic for limiting ticket links in
extract_ticket_links_from_pr_description. Move the check to the beginning of the
loop and break when the limit is reached to correctly enforce it.

pr_agent/tools/ticket_pr_compliance_check.py [58-86]

 def extract_ticket_links_from_pr_description(pr_description, repo_path, base_url_html='https://github.com'):
     """
     Extract all ticket links from PR description
     """
     ticket_links = set()
     try:
         # Use the updated pattern to find matches
         matches = ISSUE_LINK_PATTERN.findall(pr_description)
 
         for match in matches:
+            if len(ticket_links) >= 3:
+                get_logger().info(f"Found more than 3 tickets in PR description, limiting to 3.")
+                break
+
             if match[0]:  # Full URL match
                 ticket_links.add(match[0])
             elif match[1]:  # Shorthand notation match: owner/repo#issue_number
                 owner, repo, issue_number = match[2], match[3], match[4]
                 ticket_links.add(f'{base_url_html.strip("/")}/{owner}/{repo}/issues/{issue_number}')
             else:  # #123 format
                 issue_number = match[5][1:]  # remove #
                 if issue_number.isdigit() and len(issue_number) < 5 and repo_path:
                     ticket_links.add(f'{base_url_html.strip("/")}/{repo_path}/issues/{issue_number}')
 
-            if len(ticket_links) > 3:
-                get_logger().info(f"Too many tickets found in PR description: {len(ticket_links)}")
-                # Limit the number of tickets to 3
-                ticket_links = set(list(ticket_links)[:3])
     except Exception as e:
         get_logger().error(f"Error extracting tickets error= {e}",
                            artifact={"traceback": traceback.format_exc()})
 
     return list(ticket_links)
Suggestion importance[1-10]: 8


Why: The suggestion correctly identifies a logical flaw in how the number of ticket links is limited. The current implementation inside the loop is incorrect and does not guarantee the limit. The proposed change to check the count at the start of the loop and break is the correct way to enforce the limit, fixing a clear bug.

Medium
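The break-at-limit approach from the suggestion above can be sketched in isolation. This version uses an insertion-ordered container, so the links kept are the first three distinct ones rather than an arbitrary subset of a set. `collect_limited` and `MAX_TICKETS` are hypothetical names for illustration, not part of `ticket_pr_compliance_check.py`.

```python
from typing import Iterable

MAX_TICKETS = 3  # hypothetical constant mirroring the hard-coded limit in the suggestion


def collect_limited(links: Iterable[str], limit: int = MAX_TICKETS) -> list[str]:
    """Keep the first `limit` distinct links, preserving the order in which they appear."""
    kept: dict[str, None] = {}  # dict keys are unique and keep insertion order
    for link in links:
        if len(kept) >= limit:
            break  # stop scanning once the cap is reached
        kept.setdefault(link, None)
    return list(kept)


links = ["u/1", "u/2", "u/1", "u/3", "u/4"]
print(collect_limited(links))  # → ['u/1', 'u/2', 'u/3']
```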
Improve error handling for comment fetching

Improve error handling in get_issue_comments by explicitly checking if self.mr
exists before fetching merge request comments and by adding a try...except block
when fetching comments for a specific issue.

pr_agent/git_providers/gitlab_provider.py [927-933]

 def get_issue_comments(self, issue=None):
     if issue is None:
+        if not self.mr:
+            get_logger().warning("No merge request context to get comments from.")
+            return []
         try:
             return self.mr.notes.list(get_all=True)[::-1]
-        except Exception:
+        except Exception as e:
+            get_logger().error(f"Failed to get merge request comments: {e}")
             return []
-    return list(issue.notes.list(iterator=True))
+    try:
+        return list(issue.notes.list(iterator=True))
+    except Exception as e:
+        get_logger().error(f"Failed to get issue comments: {e}")
+        return []
Suggestion importance[1-10]: 7


Why: The suggestion correctly identifies two potential failure points in the get_issue_comments function and proposes adding more robust error handling. Explicitly checking for self.mr and adding a try/except block for issue comment fetching makes the function more resilient and improves debuggability.

Medium
High-level
Use a dedicated Jira library

The suggestion recommends replacing the manual urllib-based Jira API
implementation with a dedicated library like jira-python. This would simplify
the code and improve robustness by abstracting away authentication, endpoint
management, and data parsing.

Examples:

pr_agent/issue_providers/jira_issue_provider.py [82-102]
    def _request_json(self, path: str, params: dict, api_version: Optional[int] = None, suppress_warning: bool = False) -> dict:
        if not self.is_configured():
            get_logger().warning("Jira client is not configured; skipping issue fetch")
            return {}
        query = urllib.parse.urlencode(params)
        version = api_version or self.api_version
        url = f"{self.base_url}/rest/api/{version}/{path}"
        if query:
            url = f"{url}?{query}"
        auth_token = base64.b64encode(f"{self.api_email}:{self.api_token}".encode("utf-8")).decode("utf-8")

 ... (clipped 11 lines)

Solution Walkthrough:

Before:

# pr_agent/issue_providers/jira_issue_provider.py
class JiraIssueProvider(IssueProvider):
    def _request_json(self, path, params, ...):
        url = f"{self.base_url}/rest/api/{self.api_version}/{path}"
        if params:
            url = f"{url}?{urllib.parse.urlencode(params)}"

        auth_token = base64.b64encode(f"{self.api_email}:{self.api_token}".encode()).decode()
        request = urllib.request.Request(url)
        request.add_header("Authorization", f"Basic {auth_token}")

        with urllib.request.urlopen(request) as response:
            payload = response.read().decode("utf-8")
            return json.loads(payload)

    def get_issue(self, issue_id, ...):
        data = self._request_json(f"issue/{issue_id}", ...)
        return self._issue_from_payload(data)

After:

# pr_agent/issue_providers/jira_issue_provider.py
from jira import JIRA

class JiraIssueProvider(IssueProvider):
    def __init__(self, ...):
        # ...
        self.client = JIRA(
            server=self.base_url,
            basic_auth=(self.api_email, self.api_token)
        )

    def get_issue(self, issue_id, ...):
        jira_issue = self.client.issue(issue_id)
        return self._issue_from_jira_object(jira_issue)

    def _issue_from_jira_object(self, jira_issue):
        return Issue(
            key=jira_issue.key,
            title=jira_issue.fields.summary,
            ...
        )
Suggestion importance[1-10]: 7


Why: The suggestion correctly identifies that the Jira integration is built from scratch using urllib, and proposes a valid, more robust alternative using a dedicated library, which is a significant architectural improvement.

Medium
General
Improve error handling for API requests

Refactor the _request_json function to specifically handle
urllib.error.HTTPError and log the status code, improving debugging for Jira API
requests.

pr_agent/issue_providers/jira_issue_provider.py [82-102]

 def _request_json(self, path: str, params: dict, api_version: Optional[int] = None, suppress_warning: bool = False) -> dict:
     if not self.is_configured():
         get_logger().warning("Jira client is not configured; skipping issue fetch")
         return {}
     query = urllib.parse.urlencode(params)
     version = api_version or self.api_version
     url = f"{self.base_url}/rest/api/{version}/{path}"
     if query:
         url = f"{url}?{query}"
     auth_token = base64.b64encode(f"{self.api_email}:{self.api_token}".encode("utf-8")).decode("utf-8")
     request = urllib.request.Request(url)
     request.add_header("Authorization", f"Basic {auth_token}")
     request.add_header("Accept", "application/json")
     try:
         with urllib.request.urlopen(request, timeout=self.timeout_seconds) as response:
             payload = response.read().decode("utf-8")
             return json.loads(payload)
+    except urllib.error.HTTPError as exc:
+        if not suppress_warning:
+            get_logger().warning(
+                "Failed to fetch Jira issues due to HTTP error",
+                artifact={"error": str(exc), "status_code": exc.code, "url": url},
+            )
+        return {}
     except Exception as exc:
         if not suppress_warning:
             get_logger().warning("Failed to fetch Jira issues", artifact={"error": str(exc), "url": url})
         return {}
Suggestion importance[1-10]: 6


Why: The suggestion correctly points out that handling specific HTTP errors is better than a generic except Exception. This improves logging and debuggability for network requests to the Jira API, which is a valuable improvement for robustness.

Low
Optimize the individual embedding fallback logic

Optimize the fallback logic in _embed_texts_with_fallback by directly calling
the embedding client for individual texts, avoiding the overhead of using the
batch-oriented wrapper in a loop.

pr_agent/tools/pr_similar_issue.py [472-485]

 def _embed_texts_with_fallback(self, list_to_encode: list[str]) -> tuple[list[list[float]], list[int]]:
     try:
         return self._embed_texts(list_to_encode), list(range(len(list_to_encode)))
     except Exception:
         get_logger().error('Failed to embed entire list, embedding one by one...')
         embeds = []
         successful_indices = []
         for idx, text in enumerate(list_to_encode):
             try:
-                embeds.append(self._embed_texts([text])[0])
+                if self.embedding_client:
+                    embedding = self.embedding_client.embed([text])[0]
+                else:
+                    openai.api_key = get_settings().openai.key
+                    res = openai.Embedding.create(input=[text], engine=self.embedding_model)
+                    embedding = res['data'][0]['embedding']
+                embeds.append(embedding)
                 successful_indices.append(idx)
             except Exception:
                 get_logger().warning("Failed to embed text segment; skipping.", artifact={"index": idx})
         return embeds, successful_indices
Suggestion importance[1-10]: 4


Why: The suggestion correctly identifies a minor performance inefficiency in the error-handling path for embedding texts. While the proposed fix is valid, it introduces code duplication from the _embed_texts method, and the performance gain is likely minimal as it only affects a fallback scenario.

Low
Learned
best practice
Capture exception objects for logging

In _request_json the exception is captured as 'exc' and referenced correctly.
However, the _normalize_description method has a bare 'except Exception:'
that does not capture the exception object, so the error cannot be logged
if needed.

pr_agent/issue_providers/jira_issue_provider.py [181-184]

 try:
-    with urllib.request.urlopen(request, timeout=self.timeout_seconds) as response:
-        payload = response.read().decode("utf-8")
-        return json.loads(payload)
-except Exception as exc:
-    if not suppress_warning:
-        get_logger().warning("Failed to fetch Jira issues", artifact={"error": str(exc), "url": url})
-    return {}
+    return str(description)
+except Exception as e:
+    get_logger().debug(f"Failed to convert description to string: {e}")
+    return ""
Suggestion importance[1-10]: 5


Why:
Relevant best practice - When catching exceptions in try-except blocks, always capture the exception object using 'as e' syntax (e.g., 'except TypeError as e:') to enable proper error logging and debugging. This prevents NameError when trying to reference the exception in the except block.

Low
  • Author self-review: I have reviewed the PR code suggestions, and addressed the relevant ones.

@atmajitsahu100

@CodiumAI-Agent /improve

@naorpeled
Collaborator

/agentic_review

@qodo-free-for-open-source-projects
Contributor

Code Review by Qodo

🐞 Bugs (4) 📘 Rule violations (6) 📎 Requirement gaps (0)



Action required

1. Missing _init_embedding_settings method definition 📘 Rule violation ✓ Correctness
Description
• The __init__ method calls self._init_embedding_settings() on line 35, but this method is not
  defined anywhere in the class.
• This will cause an AttributeError at runtime when the PRSimilarIssue class is instantiated.
• The method should initialize self.cli_mode, self.embedding_client, self.embedding_model,
  self.embedding_dim, self.embedding_max_tokens, and self.token_handler, which are all
  referenced later in the code.
Code

pr_agent/tools/pr_similar_issue.py[35]

+        self._init_embedding_settings()
Evidence
Line 35 calls self._init_embedding_settings(), but a search through the entire file shows this
method is never defined. The code references self.cli_mode (lines 69, 219),
self.embedding_client (line 465), self.embedding_model (line 469), self.embedding_dim (lines
243, 247, 252, 257, 264), self.embedding_max_tokens (lines 693, 711, 777, 795, 862, 880), and
self.token_handler (lines 693, 711, 777, 795, 862, 880), all of which should be initialized by
this missing method.

Rule 3: Robust Error Handling
pr_agent/tools/pr_similar_issue.py[35-35]
pr_agent/tools/pr_similar_issue.py[69-69]
pr_agent/tools/pr_similar_issue.py[465-466]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The `PRSimilarIssue.__init__` method calls `self._init_embedding_settings()` on line 35, but this method is not defined, causing an AttributeError at runtime.

## Issue Context
The method should initialize:
- `self.cli_mode` from `get_settings().CONFIG.CLI_MODE`
- `self.token_handler` as `TokenHandler()`
- `self.embedding_model`, `self.embedding_dim`, `self.embedding_max_tokens` from `get_settings().pr_similar_issue`
- `self.embedding_client` conditionally based on whether a custom embedding base URL is configured

## Fix Focus Areas
- pr_agent/tools/pr_similar_issue.py[35-35]



2. Missing _init_repo_context method definition 📘 Rule violation ✓ Correctness
Description
• The __init__ method calls self._init_repo_context() on line 44, but this method is not defined
  anywhere in the class.
• This will cause an AttributeError at runtime when the PRSimilarIssue class is instantiated.
• The method should return a repository name string and initialize context-specific attributes like
  self.repo_obj, self.project_path, and self.issue_iid.
Code

pr_agent/tools/pr_similar_issue.py[44]

+        repo_name_for_index = self._init_repo_context()
Evidence
Line 44 calls repo_name_for_index = self._init_repo_context(), but a search through the entire
file shows this method is never defined. The file does define _init_github_context (line 587) and
_init_gitlab_context (line 594), which suggests _init_repo_context should dispatch to these
provider-specific methods based on self.provider_name.

Rule 3: Robust Error Handling
pr_agent/tools/pr_similar_issue.py[44-44]
pr_agent/tools/pr_similar_issue.py[587-592]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The `PRSimilarIssue.__init__` method calls `self._init_repo_context()` on line 44, but this method is not defined, causing an AttributeError at runtime.

## Issue Context
The method should check `self.provider_name` and dispatch to either `_init_github_context()` or `_init_gitlab_context()` accordingly, returning the repository name string.

## Fix Focus Areas
- pr_agent/tools/pr_similar_issue.py[44-44]
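The dispatch described in the issue context could take roughly the following shape. This is a sketch under assumptions: `RepoContextSketch` is a hypothetical stand-in for `PRSimilarIssue`, the two provider methods return placeholder strings instead of resolving real repositories, and raising on an unknown provider is one possible choice rather than the project's confirmed behavior.

```python
class RepoContextSketch:
    """Stand-in for PRSimilarIssue that shows only the dispatch step."""

    def __init__(self, provider_name: str):
        self.provider_name = provider_name

    def _init_github_context(self) -> str:
        return "github-repo"  # placeholder; the real method would also set repo_obj etc.

    def _init_gitlab_context(self) -> str:
        return "gitlab-project"  # placeholder; the real method would set project_path, issue_iid

    def _init_repo_context(self) -> str:
        """Pick the provider-specific initializer and return the repository name."""
        handlers = {
            "github": self._init_github_context,
            "gitlab": self._init_gitlab_context,
        }
        handler = handlers.get(self.provider_name)
        if handler is None:
            raise ValueError(f"Unsupported git provider: {self.provider_name!r}")
        return handler()


print(RepoContextSketch("gitlab")._init_repo_context())
```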



3. _embed_texts_with_fallback returns wrong type 📘 Rule violation ✓ Correctness
Description
• The method _embed_texts_with_fallback returns a tuple (embeds, successful_indices), but
  callers in _update_index_with_issues and _update_table_with_issues assign the result directly to
  embeds without unpacking.
• This will cause the embeds variable to be a tuple instead of a list, leading to errors when
  trying to access embeddings.
• Lines 726, 810 assign the tuple directly to embeds or df["values"]/df["vector"], which
  expect a list of embeddings.
Code

pr_agent/tools/pr_similar_issue.py[726]

+        embeds = self._embed_texts_with_fallback(list_to_encode)
Evidence
The method signature at line 472 shows _embed_texts_with_fallback returns
tuple[list[list[float]], list[int]], but lines 726 and 810 assign the result directly without
unpacking. Line 729 then tries to access len(embeds[0]), which would fail if embeds is a tuple.
The _update_qdrant_with_issues method correctly unpacks the tuple at line 896.

Rule 3: Robust Error Handling
pr_agent/tools/pr_similar_issue.py[472-472]
pr_agent/tools/pr_similar_issue.py[726-726]
pr_agent/tools/pr_similar_issue.py[810-810]
pr_agent/tools/pr_similar_issue.py[896-896]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The `_embed_texts_with_fallback` method returns a tuple `(embeds, successful_indices)`, but lines 726 and 810 assign the result directly without unpacking, causing type errors.

## Issue Context
The correct pattern is shown at line 896 in `_update_qdrant_with_issues`: `embeds, successful_indices = self._embed_texts_with_fallback(list_to_encode)`. The same unpacking should be applied in the other two methods.

## Fix Focus Areas
- pr_agent/tools/pr_similar_issue.py[726-726]
- pr_agent/tools/pr_similar_issue.py[810-810]
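To illustrate why the unpacking matters, here is a self-contained sketch of the same return contract with a fake embedder. `embed_with_fallback` and `fake_embed` are hypothetical stand-ins, not the PR's methods. The point is that callers need both values: the embeddings and the indices that say which inputs they belong to.

```python
from typing import Callable

Vector = list[float]


def embed_with_fallback(
    texts: list[str],
    embed_batch: Callable[[list[str]], list[Vector]],
) -> tuple[list[Vector], list[int]]:
    """Try one batch call; on failure embed one by one and report which indices succeeded."""
    try:
        return embed_batch(texts), list(range(len(texts)))
    except Exception:
        embeds: list[Vector] = []
        successful_indices: list[int] = []
        for idx, text in enumerate(texts):
            try:
                embeds.append(embed_batch([text])[0])
                successful_indices.append(idx)
            except Exception:
                continue  # skip inputs that cannot be embedded
        return embeds, successful_indices


def fake_embed(batch: list[str]) -> list[Vector]:
    """Toy embedder: rejects empty strings, otherwise returns a 1-d vector of the text length."""
    if any(not text for text in batch):
        raise ValueError("empty input")
    return [[float(len(text))] for text in batch]


ids = ["issue-1", "issue-2", "issue-3"]
texts = ["alpha", "", "be"]
embeds, successful_indices = embed_with_fallback(texts, fake_embed)
kept_ids = [ids[i] for i in successful_indices]  # realign metadata with surviving vectors
print(list(zip(kept_ids, embeds)))  # → [('issue-1', [5.0]), ('issue-3', [2.0])]
```

Assigning the return value to a single name would bind the whole two-element tuple, so indexing it with `[0]` would yield the list of vectors rather than the first vector.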



4. Missing cli_mode initialization causes error handling failure 🐞 Bug ⛯ Reliability
Description
• The code references self.cli_mode on lines 69 and 219 to determine whether to publish error
  messages
• This attribute is never initialized in __init__, causing AttributeError when Pinecone or
  Qdrant credentials are missing
• The error only occurs in error paths (missing credentials), not during normal operation
• This prevents users from seeing helpful error messages about missing configuration
Code

pr_agent/tools/pr_similar_issue.py[69]

                if not self.cli_mode:
-                    repo_name, original_issue_number = self.git_provider._parse_issue_url(self.issue_url.split('=')[-1])
-                    issue_main = self.git_provider.repo_obj.get_issue(original_issue_number)
-                    issue_main.create_comment("Please set pinecone api key and environment in secrets file")
Evidence
The cli_mode attribute is used in error handling blocks but never initialized. The old code had
self.cli_mode = get_settings().CONFIG.CLI_MODE which was removed.

pr_agent/tools/pr_similar_issue.py[69-80]
pr_agent/tools/pr_similar_issue.py[219-230]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue Description
The `cli_mode` attribute is used in error handling but never initialized, causing AttributeError when credentials are missing.

## Issue Context
The attribute determines whether to publish error messages to the PR/issue or just log them.

## Fix Focus Areas
- pr_agent/tools/pr_similar_issue.py[35-35] - Add initialization in `_init_embedding_settings` method
- pr_agent/tools/pr_similar_issue.py[69-69] - First usage location

Add: `self.cli_mode = get_settings().config.publish_output == False` or similar logic to determine CLI mode



5. Tuple unpacking error in LanceDB embedding 🐞 Bug ✓ Correctness
Description
• The _embed_texts_with_fallback method returns a tuple (embeds, successful_indices) as defined
  on line 472
• Line 810 assigns this tuple directly to df['vector'] without unpacking: `embeds =
  self._embed_texts_with_fallback(list_to_encode)`
• This causes the DataFrame column to contain tuples instead of embedding vectors
• Subsequent operations expecting a list of vectors will fail, breaking LanceDB table creation
Code

pr_agent/tools/pr_similar_issue.py[R810-811]

+        embeds = self._embed_texts_with_fallback(list_to_encode)
        df["vector"] = embeds
Evidence
Same issue as with Pinecone - the method returns a tuple but the caller doesn't unpack it. The
Qdrant implementation (line 896) shows the correct pattern.

pr_agent/tools/pr_similar_issue.py[472-485]
pr_agent/tools/pr_similar_issue.py[810-811]
pr_agent/tools/pr_similar_issue.py[896-898]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue Description
Line 810 doesn't unpack the tuple returned by `_embed_texts_with_fallback`, causing incorrect data structure in the DataFrame.

## Issue Context
The method returns both embeddings and indices of successfully embedded items. The Qdrant implementation (line 896) shows the correct pattern.

## Fix Focus Areas
- pr_agent/tools/pr_similar_issue.py[810-811] - Fix tuple unpacking

Change:
```python
embeds = self._embed_texts_with_fallback(list_to_encode)
df["vector"] = embeds
```

To:
```python
embeds, successful_indices = self._embed_texts_with_fallback(list_to_encode)
if len(successful_indices) != len(list_to_encode):
   df = df.iloc[successful_indices].reset_index(drop=True)
df["vector"] = embeds
```



6. Missing embedding configuration attributes 🐞 Bug ✓ Correctness
Description
• The code uses self.embedding_client (line 466), self.embedding_model (line 469),
  self.embedding_max_tokens (lines 693, 711, 777, 795, 862, 880), and self.embedding_dim (lines
  243, 247, 252, 257, 264)
• None of these attributes are initialized in __init__, causing AttributeError when accessed
• These should be initialized from configuration settings:
  get_settings().pr_similar_issue.embedding_*
• Without these, the embedding functionality cannot work with either custom endpoints or OpenAI
Code

pr_agent/tools/pr_similar_issue.py[R466-469]

+            return self.embedding_client.embed(list_to_encode)
+
+        openai.api_key = get_settings().openai.key
+        res = openai.Embedding.create(input=list_to_encode, engine=self.embedding_model)
Evidence
Multiple embedding-related attributes are used throughout the code but never initialized. The
configuration.toml file defines the settings structure, but there's no code to read them into
instance attributes.

pr_agent/settings/configuration.toml[387-391]
pr_agent/tools/pr_similar_issue.py[466-469]
pr_agent/tools/pr_similar_issue.py[243-243]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue Description
Multiple embedding configuration attributes are used but never initialized, causing AttributeError throughout the code.

## Issue Context
The configuration.toml defines embedding settings, but they're not loaded into instance attributes.

## Fix Focus Areas
- pr_agent/tools/pr_similar_issue.py[35-35] - Implement `_init_embedding_settings` method
- pr_agent/settings/configuration.toml[387-391] - Read these settings

The method should initialize:
1. `self.embedding_model = get_settings().pr_similar_issue.embedding_model`
2. `self.embedding_dim = get_settings().pr_similar_issue.embedding_dim`
3. `self.embedding_max_tokens = get_settings().pr_similar_issue.embedding_max_tokens`
4. `self.embedding_client = EmbeddingClient(...)` if base_url is configured
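The initialization listed above could look roughly like the sketch below. It is illustrative only: `FakeEmbeddingClient`, the `embedding_base_url` key, the `SimpleNamespace` settings object, and the sample values are all assumptions standing in for the PR's `EmbeddingClient` and `get_settings().pr_similar_issue`.

```python
from types import SimpleNamespace


class FakeEmbeddingClient:
    """Hypothetical stand-in for the PR's EmbeddingClient."""

    def __init__(self, base_url, model):
        self.base_url = base_url
        self.model = model


class SimilarIssueSketch:
    """Illustrative class: every attribute is set before it is used."""

    def __init__(self, settings):
        self._init_embedding_settings(settings)

    def _init_embedding_settings(self, settings):
        cfg = settings.pr_similar_issue
        self.embedding_model = cfg.embedding_model
        self.embedding_dim = cfg.embedding_dim
        self.embedding_max_tokens = cfg.embedding_max_tokens
        # Assumed key name: build a custom client only when a base URL is set.
        base_url = getattr(cfg, "embedding_base_url", None)
        self.embedding_client = (
            FakeEmbeddingClient(base_url, self.embedding_model) if base_url else None
        )


# Illustrative values, not taken from the PR's configuration.toml.
settings = SimpleNamespace(
    pr_similar_issue=SimpleNamespace(
        embedding_model="text-embedding-ada-002",
        embedding_dim=1536,
        embedding_max_tokens=8000,
        embedding_base_url="",
    )
)
tool = SimilarIssueSketch(settings)
```

With an empty base URL, `tool.embedding_client` stays `None`, so callers can branch on it before falling back to the default OpenAI path.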




Remediation recommended

7. Hard-coded timeout in JiraIssueProvider.__init__ 📘 Rule violation ✓ Correctness
Description
• The JiraIssueProvider.__init__ method has a hard-coded default timeout of 15 seconds in the
  parameter timeout_seconds: int = 15.
• This violates the configuration principle that all configuration values should be loaded from
  configuration files, not hard-coded in source code.
• The timeout should be configurable through the [jira] section in configuration.toml.
Code

pr_agent/issue_providers/jira_issue_provider.py[15]

+    def __init__(self, settings=None, project_path: Optional[str] = None, timeout_seconds: int = 15):
Evidence
The compliance rule requires all configuration to be adjustable through configuration files rather
than hard-coded in source code. The timeout_seconds: int = 15 parameter has a hard-coded default
value. While it can be overridden at instantiation, the default should come from configuration
settings.

AGENTS.md
pr_agent/issue_providers/jira_issue_provider.py[15-15]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The `timeout_seconds` parameter has a hard-coded default value of 15, which should instead be loaded from configuration settings.

## Issue Context
The `[jira]` section in `configuration.toml` should include a `timeout_seconds` setting, and the `__init__` method should read it from `jira_settings.get("TIMEOUT_SECONDS", 15)`.

## Fix Focus Areas
- pr_agent/issue_providers/jira_issue_provider.py[15-28]
- pr_agent/settings/configuration.toml[193-202]
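One way to resolve the timeout is sketched below, assuming `jira_settings` behaves like a plain dict. The helper name `resolve_jira_timeout` and the `override` parameter are hypothetical, introduced only for illustration.

```python
from typing import Optional

# Fallback used only when the configuration omits the key.
DEFAULT_TIMEOUT_SECONDS = 15


def resolve_jira_timeout(jira_settings: dict, override: Optional[int] = None) -> int:
    """Prefer an explicit argument, then the config value, then the fallback."""
    if override is not None:
        return int(override)
    return int(jira_settings.get("TIMEOUT_SECONDS", DEFAULT_TIMEOUT_SECONDS))


from_config = resolve_jira_timeout({"TIMEOUT_SECONDS": 30})
from_fallback = resolve_jira_timeout({})
from_override = resolve_jira_timeout({"TIMEOUT_SECONDS": 30}, override=5)
```

The constructor would then take `timeout_seconds: Optional[int] = None` and call the helper, so the literal default lives in one named place instead of the signature.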



8. Hard-coded timeout in EmbeddingClient.__init__ 📘 Rule violation ✓ Correctness
Description
• The EmbeddingClient.__init__ method has a hard-coded default timeout of 30 seconds in the
  parameter timeout_sec: int = 30.
• This violates the configuration principle that all configuration values should be loaded from
  configuration files, not hard-coded in source code.
• The timeout should be configurable through the [pr_similar_issue] section in
  configuration.toml.
Code

pr_agent/tools/embedding_client.py[13]

+    def __init__(self, base_url: str, model: str, api_key: str | None = None, timeout_sec: int = 30):
Evidence
The compliance rule requires all configuration to be adjustable through configuration files rather
than hard-coded in source code. The timeout_sec: int = 30 parameter has a hard-coded default
value. While it can be overridden at instantiation, the default should come from configuration
settings.

AGENTS.md
pr_agent/tools/embedding_client.py[13-13]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The `timeout_sec` parameter has a hard-coded default value of 30, which should instead be loaded from configuration settings when the `EmbeddingClient` is instantiated.

## Issue Context
The `[pr_similar_issue]` section in `configuration.toml` should include an `embedding_timeout_sec` setting, and the caller should read it from settings before instantiating `EmbeddingClient`.

## Fix Focus Areas
- pr_agent/tools/embedding_client.py[13-17]
- pr_agent/settings/configuration.toml[384-391]



9. Duplicated ticket content building logic 📘 Rule violation ✓ Correctness
Description
• The extract_tickets function has duplicated logic for building ticket content dictionaries
  across Jira, GitLab, and GitHub providers.
• Each provider block (lines 124-131 for Jira, lines 155-162 for GitLab, lines 219-226 for GitHub)
  constructs similar dictionaries with ticket_id, ticket_url, title, body, labels, and
  sub_issues.
• This violates the DRY principle and makes maintenance harder when the ticket content structure
  needs to change.
Code

pr_agent/tools/ticket_pr_compliance_check.py[R124-131]

+                    tickets_content.append({
+                        "ticket_id": issue_main.key,
+                        "ticket_url": issue_main.url,
+                        "title": issue_main.title,
+                        "body": issue_body_str,
+                        "labels": ", ".join(issue_main.labels) if hasattr(issue_main, "labels") else "",
+                        "sub_issues": [],
+                    })
Evidence
The compliance rule requires duplicated conditional logic or code blocks to be extracted into helper
methods. The ticket content dictionary construction is repeated three times with only minor
variations in how fields are accessed (e.g., issue_main.key vs issue_main.iid vs
issue_main.number).

pr_agent/tools/ticket_pr_compliance_check.py[124-131]
pr_agent/tools/ticket_pr_compliance_check.py[155-162]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The ticket content dictionary construction is duplicated across three provider-specific blocks, violating the DRY principle.

## Issue Context
Create a helper method that accepts an issue object and optional parameters (like `ticket_url`, `sub_issues`) and returns a standardized dictionary. The method should handle provider-specific attribute access patterns using `getattr` with fallbacks.

## Fix Focus Areas
- pr_agent/tools/ticket_pr_compliance_check.py[124-131]
- pr_agent/tools/ticket_pr_compliance_check.py[155-162]
- pr_agent/tools/ticket_pr_compliance_check.py[219-226]
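A possible shape for that helper is sketched below. The name `build_ticket_content` is hypothetical, and the sketch assumes labels are plain strings; provider objects that expose label objects would need their names extracted first.

```python
from types import SimpleNamespace
from typing import Any, Optional


def build_ticket_content(
    issue: Any,
    body: str,
    ticket_url: Optional[str] = None,
    sub_issues: Optional[list] = None,
) -> dict:
    """Build the shared ticket dictionary from any provider's issue object."""
    # Try provider-specific identifiers in turn: Jira key, GitLab iid, GitHub number.
    ticket_id = None
    for attr in ("key", "iid", "number"):
        ticket_id = getattr(issue, attr, None)
        if ticket_id is not None:
            break
    labels = getattr(issue, "labels", None) or []
    return {
        "ticket_id": ticket_id,
        "ticket_url": ticket_url or getattr(issue, "url", ""),
        "title": getattr(issue, "title", ""),
        "body": body,
        "labels": ", ".join(str(label) for label in labels),
        "sub_issues": sub_issues or [],
    }


# Stand-in issue objects for illustration.
jira_issue = SimpleNamespace(key="ABC-1", url="https://jira.example/ABC-1",
                             title="Fix login", labels=["bug", "auth"])
gitlab_issue = SimpleNamespace(iid=7, title="Add cache")

jira_ticket = build_ticket_content(jira_issue, body="details")
gitlab_ticket = build_ticket_content(gitlab_issue, body="", ticket_url="https://gitlab.example/issues/7")
```

Each provider block would then reduce to a single `tickets_content.append(build_ticket_content(...))` call.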



10. Potential None access on GitLab id_project 🐞 Bug ⛯ Reliability
Description
• When initializing with a GitLab issue URL (not MR URL), the conditional on line 74 skips calling
  _set_merge_request, leaving id_project as None
• Line 612 in _init_gitlab_context accesses self.git_provider.id_project without checking if
  it's None
• While the code does set id_project for issue URLs on line 602, there's a code path where MR
  context is required (lines 607-608) but id_project might still be None
• This could cause issues when calling gl.projects.get(self.project_path) with a None value
Code

pr_agent/tools/pr_similar_issue.py[R612-613]

+        self.project_path = self.git_provider.id_project
+        self.repo_obj = self.git_provider.gl.projects.get(self.project_path)
Evidence
The GitLab provider conditionally initializes id_project, and the similar_issue code accesses it
without null checking. While the code has logic to handle issue URLs, the MR path might not always
have id_project set.

pr_agent/git_providers/gitlab_provider.py[66-75]
pr_agent/tools/pr_similar_issue.py[607-613]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue Description
The code accesses `git_provider.id_project` without checking if it's None, which could happen if the GitLab provider wasn't initialized with an MR URL.

## Issue Context
The GitLab provider conditionally sets id_project only when _set_merge_request is called. The similar_issue code checks for mr but not id_project.

## Fix Focus Areas
- pr_agent/tools/pr_similar_issue.py[612-613] - Add null check

Add validation:
```python
if not self.git_provider.id_project:
    raise Exception("GitLab project context is required for /similar_issue")
self.project_path = self.git_provider.id_project
```





if not self.supported:
    return

self._init_embedding_settings()


Action required

1. Missing _init_embedding_settings method definition 📘 Rule violation ✓ Correctness

• The __init__ method calls self._init_embedding_settings() on line 35, but this method is not
  defined anywhere in the class.
• This will cause an AttributeError at runtime when the PRSimilarIssue class is instantiated.
• The method should initialize self.cli_mode, self.embedding_client, self.embedding_model,
  self.embedding_dim, self.embedding_max_tokens, and self.token_handler, which are all
  referenced later in the code.
Agent prompt
## Issue description
The `PRSimilarIssue.__init__` method calls `self._init_embedding_settings()` on line 35, but this method is not defined, causing an AttributeError at runtime.

## Issue Context
The method should initialize:
- `self.cli_mode` from `get_settings().CONFIG.CLI_MODE`
- `self.token_handler` as `TokenHandler()`
- `self.embedding_model`, `self.embedding_dim`, `self.embedding_max_tokens` from `get_settings().pr_similar_issue`
- `self.embedding_client` conditionally based on whether a custom embedding base URL is configured

## Fix Focus Areas
- pr_agent/tools/pr_similar_issue.py[35-35]


self.issue_provider = None
self.jira_keys = []

repo_name_for_index = self._init_repo_context()


Action required

2. Missing _init_repo_context method definition 📘 Rule violation ✓ Correctness

• The __init__ method calls self._init_repo_context() on line 44, but this method is not defined
  anywhere in the class.
• This will cause an AttributeError at runtime when the PRSimilarIssue class is instantiated.
• The method should return a repository name string and initialize context-specific attributes like
  self.repo_obj, self.project_path, and self.issue_iid.
Agent prompt
## Issue description
The `PRSimilarIssue.__init__` method calls `self._init_repo_context()` on line 44, but this method is not defined, causing an AttributeError at runtime.

## Issue Context
The method should check `self.provider_name` and dispatch to either `_init_github_context()` or `_init_gitlab_context()` accordingly, returning the repository name string.

## Fix Focus Areas
- pr_agent/tools/pr_similar_issue.py[44-44]
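The dispatch described above might be sketched as follows. The class is a self-contained stand-in, and the return values of the two context methods are placeholders rather than what the PR's implementations would produce.

```python
class RepoContextSketch:
    """Illustrative stand-in for the dispatch inside PRSimilarIssue."""

    def __init__(self, provider_name: str):
        self.provider_name = provider_name

    def _init_github_context(self) -> str:
        # Placeholder: the real method would also set self.repo_obj, etc.
        return "owner/repo"

    def _init_gitlab_context(self) -> str:
        # Placeholder: the real method would also set self.project_path, etc.
        return "group/project"

    def _init_repo_context(self) -> str:
        handlers = {
            "github": self._init_github_context,
            "gitlab": self._init_gitlab_context,
        }
        handler = handlers.get(self.provider_name)
        if handler is None:
            # Fail loudly instead of leaving context attributes unset.
            raise ValueError(f"Unsupported provider: {self.provider_name}")
        return handler()


github_name = RepoContextSketch("github")._init_repo_context()
gitlab_name = RepoContextSketch("gitlab")._init_repo_context()
```

Raising on an unknown provider is a design choice of this sketch; the actual tool may prefer to mark itself unsupported and return early.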


embeds.append(res['data'][0]['embedding'])
except:
embeds.append([0] * 1536)
embeds = self._embed_texts_with_fallback(list_to_encode)


Action required

3. _embed_texts_with_fallback returns wrong type 📘 Rule violation ✓ Correctness

• The method _embed_texts_with_fallback returns a tuple (embeds, successful_indices), but
  callers in _update_index_with_issues and _update_table_with_issues assign the result directly to
  embeds without unpacking.
• This will cause the embeds variable to be a tuple instead of a list, leading to errors when
  trying to access embeddings.
• Lines 726 and 810 assign the tuple directly to embeds or df["values"]/df["vector"], which
  expect a list of embeddings.
Agent prompt
## Issue description
The `_embed_texts_with_fallback` method returns a tuple `(embeds, successful_indices)`, but lines 726 and 810 assign the result directly without unpacking, causing type errors.

## Issue Context
The correct pattern is shown at line 896 in `_update_qdrant_with_issues`: `embeds, successful_indices = self._embed_texts_with_fallback(list_to_encode)`. The same unpacking should be applied in the other two methods.

## Fix Focus Areas
- pr_agent/tools/pr_similar_issue.py[726-726]
- pr_agent/tools/pr_similar_issue.py[810-810]
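To make the failure mode concrete, here is a small self-contained sketch. `embed_texts_with_fallback` is a hypothetical stand-in that mimics only the return shape `(embeds, successful_indices)`; the embedding values are fake.

```python
def embed_texts_with_fallback(texts):
    """Stand-in: 'embeds' each non-empty text and records which inputs succeeded."""
    embeds, successful_indices = [], []
    for i, text in enumerate(texts):
        if not text:  # simulate an embedding failure for empty input
            continue
        embeds.append([float(len(text))])  # fake one-dimensional vector
        successful_indices.append(i)
    return embeds, successful_indices


texts = ["alpha", "", "gamma"]

# Buggy pattern: the whole tuple lands in one variable.
result = embed_texts_with_fallback(texts)

# Correct pattern: unpack, then keep only the inputs that were embedded.
embeds, successful_indices = embed_texts_with_fallback(texts)
kept_texts = [texts[i] for i in successful_indices]
```

After unpacking, `embeds` and `kept_texts` have the same length, which is the property that row-aligned stores rely on.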


@@ -45,9 +67,17 @@ def __init__(self, issue_url: str, ai_handler, args: list = None):
environment = get_settings().pinecone.environment
except Exception:
if not self.cli_mode:


Action required

4. Missing cli_mode initialization causes error handling failure 🐞 Bug ⛯ Reliability

• The code references self.cli_mode on lines 69 and 219 to determine whether to publish error
  messages
• This attribute is never initialized in __init__, causing AttributeError when Pinecone or
  Qdrant credentials are missing
• The error only occurs in error paths (missing credentials), not during normal operation
• This prevents users from seeing helpful error messages about missing configuration
Agent prompt
## Issue Description
The `cli_mode` attribute is used in error handling but never initialized, causing AttributeError when credentials are missing.

## Issue Context
The attribute determines whether to publish error messages to the PR/issue or just log them.

## Fix Focus Areas
- pr_agent/tools/pr_similar_issue.py[35-35] - Add initialization in `_init_embedding_settings` method
- pr_agent/tools/pr_similar_issue.py[69-69] - First usage location

Add: `self.cli_mode = not get_settings().config.publish_output` or similar logic to determine CLI mode


Comment on lines +810 to 811
embeds = self._embed_texts_with_fallback(list_to_encode)
df["vector"] = embeds


Action required

5. Tuple unpacking error in lancedb embedding 🐞 Bug ✓ Correctness

• The _embed_texts_with_fallback method returns a tuple (embeds, successful_indices) as defined
  on line 472
• Line 810 stores this tuple in embeds without unpacking (`embeds =
  self._embed_texts_with_fallback(list_to_encode)`), and line 811 then assigns it to df["vector"]
• This causes the DataFrame column to contain tuples instead of embedding vectors
• Subsequent operations expecting a list of vectors will fail, breaking LanceDB table creation
Agent prompt
## Issue Description
Line 810 doesn't unpack the tuple returned by `_embed_texts_with_fallback`, causing incorrect data structure in the DataFrame.

## Issue Context
The method returns both embeddings and indices of successfully embedded items. The Qdrant implementation (line 896) shows the correct pattern.

## Fix Focus Areas
- pr_agent/tools/pr_similar_issue.py[810-811] - Fix tuple unpacking

Change:
```python
embeds = self._embed_texts_with_fallback(list_to_encode)
df["vector"] = embeds
```

To:
```python
embeds, successful_indices = self._embed_texts_with_fallback(list_to_encode)
if len(successful_indices) != len(list_to_encode):
    df = df.iloc[successful_indices].reset_index(drop=True)
df["vector"] = embeds
```


Comment on lines +466 to +469
return self.embedding_client.embed(list_to_encode)

openai.api_key = get_settings().openai.key
res = openai.Embedding.create(input=list_to_encode, engine=self.embedding_model)


Action required

6. Missing embedding configuration attributes 🐞 Bug ✓ Correctness

• The code uses self.embedding_client (line 466), self.embedding_model (line 469),
  self.embedding_max_tokens (lines 693, 711, 777, 795, 862, 880), and self.embedding_dim (lines
  243, 247, 252, 257, 264)
• None of these attributes are initialized in __init__, causing AttributeError when accessed
• These should be initialized from configuration settings:
  get_settings().pr_similar_issue.embedding_*
• Without these, the embedding functionality cannot work with either custom endpoints or OpenAI
Agent prompt
## Issue Description
Multiple embedding configuration attributes are used but never initialized, causing AttributeError throughout the code.

## Issue Context
The configuration.toml defines embedding settings, but they're not loaded into instance attributes.

## Fix Focus Areas
- pr_agent/tools/pr_similar_issue.py[35-35] - Implement `_init_embedding_settings` method
- pr_agent/settings/configuration.toml[387-391] - Read these settings

The method should initialize:
1. `self.embedding_model = get_settings().pr_similar_issue.embedding_model`
2. `self.embedding_dim = get_settings().pr_similar_issue.embedding_dim`
3. `self.embedding_max_tokens = get_settings().pr_similar_issue.embedding_max_tokens`
4. `self.embedding_client = EmbeddingClient(...)` if base_url is configured




3 participants