
Commit cf83593

Add recipe for Llama Triaging & Reporting Tool (meta-llama#651)
2 parents: fe7ac56 + e7e41e1

File tree: 21 files changed, +3552 -0 lines changed


.github/scripts/spellcheck_conf/wordlist.txt

Lines changed: 4 additions & 0 deletions
@@ -1451,6 +1451,10 @@ openhathi
 sarvam
 subtask
 acc
+Triaging
+matplotlib
+remediations
+walkthrough
 OCRVQA
 OCRVQADataCollator
 ocrvqa

recipes/use_cases/README.md

Lines changed: 3 additions & 0 deletions
@@ -1,3 +1,6 @@
+## [Automatic Triaging of Github Repositories](./github_triage/walkthrough.ipynb): Use Llama to automatically triage issues in an OSS repository and generate insights to improve community experience
+This tool utilizes an off-the-shelf Llama model to analyze issue threads, generate insights, and create a report for a better understanding of the state of a repository. It serves as a reference implementation for using Llama to develop custom reporting and data analytics applications.
+
 ## [VideoSummary](video_summary.ipynb): Ask Llama 3 to Summarize a Long YouTube Video (using Replicate or [OctoAI](../3p_integrations/octoai/video_summary.ipynb))
 This demo app uses Llama 3 to return a text summary of a YouTube video. It shows how to retrieve the caption of a YouTube video and how to ask Llama to summarize the content in different ways, from the simplest naive way that works for short text to more advanced methods of using LangChain's map_reduce and refine to overcome the 8K context length limit of Llama 3.

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
# Automatic Issues Triaging with Llama

This tool utilizes an off-the-shelf Llama model to analyze issue threads, generate insights, and create a report for a better understanding of the state of a repository. It serves as a reference implementation for using Llama to develop custom reporting and data analytics applications.

## Features

The tool performs the following tasks:

* Fetches issue threads from a specified repository
* Analyzes issue discussions and generates annotations such as category, severity, component affected, etc.
* Categorizes all issues by theme
* Synthesizes key challenges faced by users, along with probable causes and remediations
* Generates a high-level executive summary providing insights on diagnosing and improving the developer experience

For a step-by-step look, check out the [walkthrough notebook](walkthrough.ipynb).

## Getting Started

### Installation

```bash
pip install -r requirements.txt
```

### Setup

1. **API Keys and Model Service**: Set your GitHub token for API calls. Some privileged information may not be available if you don't have push access to the target repository.
2. **Model Configuration**: Set the appropriate values in the `model` section of [config.yaml](config.yaml) for using Llama via VLLM or Groq.
3. **JSON Schemas**: Edit the output JSON schemas in [config.yaml](config.yaml) to ensure consistency in outputs. VLLM supports JSON decoding via the `guided_json` generation argument, while Groq requires passing the schema in the system prompt (see the sketch below).
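As a quick illustration of that difference, here is a minimal sketch (not part of the recipe files) of both request styles, assuming a vLLM OpenAI-compatible server at the endpoint configured in config.yaml and the `groq` Python client; the schema and issue text are abbreviated placeholders:

```python
from openai import OpenAI
import groq

# Abbreviated placeholder schema; the real schemas live in config.yaml.
schema = '{"type": "object", "properties": {"summary": {"type": "string"}}, "required": ["summary"]}'
messages = [{"role": "user", "content": "Summarize this issue thread: ..."}]

# vLLM: pass the schema via `extra_body` as `guided_json` for guided decoding.
vllm_client = OpenAI(base_url="http://localhost:8000/v1", api_key="token")
vllm_resp = vllm_client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-70B-Instruct",
    messages=messages,
    extra_body={"guided_json": schema},
)

# Groq: no guided decoding, so embed the schema in a system prompt and ask for a JSON object.
groq_client = groq.Groq(api_key="<groq token>")
groq_resp = groq_client.chat.completions.create(
    model="llama-3.1-70b-versatile",
    messages=[{"role": "system", "content": f"Respond with JSON matching this schema:\n{schema}"}] + messages,
    response_format={"type": "json_object"},
)

print(vllm_resp.choices[0].message.content)
print(groq_resp.choices[0].message.content)
```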
### Running the Tool

```bash
python triage.py --repo_name='meta-llama/llama-recipes' --start_date='2024-08-14' --end_date='2024-08-27'
```

### Output

The tool generates:

* CSV files with `annotations`, `challenges`, and `overview` data, which can be persisted in SQL tables for downstream analyses and reporting (see the sketch below).
* Graphical matplotlib plots of repository traffic, maintenance activity, and issue attributes.
* A PDF report for easier reading and sharing.
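For instance, a minimal sketch of loading those CSVs into SQLite for downstream querying; the file names and the `severity` column are assumed here for illustration rather than taken from the tool's documented output paths:

```python
import sqlite3
import pandas as pd

# Hypothetical output paths; substitute the CSVs produced by your run.
csv_files = {
    "annotations": "output/annotations.csv",
    "challenges": "output/challenges.csv",
    "overview": "output/overview.csv",
}

conn = sqlite3.connect("triage.db")
for table, path in csv_files.items():
    # Persist each CSV as a SQL table for downstream analyses and reporting.
    pd.read_csv(path).to_sql(table, conn, if_exists="replace", index=False)

# Example downstream query: issue counts by severity (assumes the annotations
# table has a `severity` column, as in the parse_issue schema).
print(pd.read_sql("SELECT severity, COUNT(*) AS n FROM annotations GROUP BY severity", conn))
conn.close()
```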
## Config

The tool's configuration is stored in [config.yaml](config.yaml). The following sections can be edited:

* **GitHub token**: Use a token that has push access to the target repo.
* **model**: Specify the model service (`vllm` or `groq`) and set the endpoints and API keys as applicable.
* **prompts**: For each of the three tasks Llama performs in this tool, we specify a prompt and an output JSON schema:
  * `parse_issue`: Parses the issue discussion and generates annotations
  * `assign_category`: Assigns each issue to a category specified in an enum in the corresponding JSON schema
  * `get_overview`: Generates a high-level executive summary and analysis of all the parsed and generated data

## Troubleshooting

* If you encounter issues with API calls, ensure that your GitHub token is set correctly and that you have the necessary permissions (see the check below).
* If you encounter issues with the model service, check the configuration values in [config.yaml](config.yaml).
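One quick, illustrative way to verify the token outside the tool is to call the GitHub REST API directly:

```python
import requests

# Use the same token configured under `github_token` in config.yaml.
token = "<github token>"
resp = requests.get(
    "https://api.github.com/user",
    headers={"Authorization": f"token {token}", "Accept": "application/vnd.github+json"},
)
print(resp.status_code)          # 200 means the token authenticates
print(resp.json().get("login"))  # the account the token belongs to
```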
Lines changed: 144 additions & 0 deletions
@@ -0,0 +1,144 @@
github_token: <github token>
model:
  use: groq
  vllm:
    endpoint: "http://localhost:8000/v1"
    model_id: "meta-llama/Meta-Llama-3.1-70B-Instruct"
  groq:
    key: <groq token>
    model_id: llama-3.1-70b-versatile

prompts:
  parse_issue:
    system: You are an expert maintainer of an open source project. Given some discussion threads, you must respond with a report in JSON. Your response should only contain English, and you may translate if you can.
    json_schema: '{
      "type": "object",
      "properties": {
        "summary": {
          "description": "Summary of the issue and discussion along with any details about platform or tooling.",
          "type": "string"
        },
        "possible_causes": {
          "type": "array",
          "items": {
            "type": "string"
          }
        },
        "remediations": {
          "description": "How we can improve the code or documentation to prevent this issue.",
          "type": "array",
          "maxItems": 2,
          "items": {
            "type": "string"
          }
        },
        "component": {
          "description": "The specific module or component affected by the issue",
          "type": "string"
        },
        "sentiment": {
          "type": "string",
          "enum": ["positive", "negative", "neutral"]
        },
        "issue_type": {
          "description": "Any issue not related to LLMs, Llama or code in this repository should be marked as \"invalid\"",
          "type": "string",
          "enum": ["bug_report", "feature_request", "documentation", "installation", "discussion", "invalid"]
        },
        "severity": {
          "type": "string",
          "enum": ["critical", "major", "minor", "trivial"]
        },
        "op_expertise": {
          "description": "Assess the reporters level of expertise.",
          "type": "string",
          "enum": ["beginner", "intermediate", "advanced"]
        }
      },
      "required": ["summary", "possible_causes", "remediations", "component", "sentiment", "issue_type", "severity", "op_expertise"]
    }'
  assign_category:
    system: "You are the lead maintainer of an open source project. Given a list of issues, generate a JSON that categorizes the issues by common themes. For every theme include a description and cite the relevant issue numbers. All issues must be categorized into at least one theme."
    json_schema: '{
      "type": "object",
      "properties": {
        "report": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "theme": {
                "description": "key theme identified from the issues",
                "type": "string",
                "enum": ["Cloud Compute", "Installation and Environment", "Model Loading", "Model Fine-tuning and Training", "Model Conversion", "Model Inference", "Distributed Training and Multi-GPU", "Performance and Optimization", "Quantization and Mixed Precision", "Documentation", "CUDA Compatibility", "Model Evaluation and Benchmarking", "Miscellaneous", "Invalid"]
              },
              "description": {
                "type": "string"
              },
              "related_issues": {
                "description": "Issue numbers related to this theme",
                "type": "array",
                "items": {
                  "type": "number"
                }
              }
            },
            "required": ["theme", "description", "related_issues"]
          }
        }
      },
      "required": ["report"]
    }'
  get_overview:
    system: You are not only an experienced Open Source maintainer, but also an expert at paraphrasing raw data into clear succinct reports. Draft a concise report about the issues in this open source repository. Include an executive summary that provides an overview of the challenges faced, any open questions or decisions to be made, or actions that we can take. Group issues together if they ladder up to the same overall challenge, summarize the challenges and include any actionable resolutions we can take (more information in the \"remediations\" sections). Use your experience and judgement to ignore issues that are clearly unrelated to the open source project. Ensure the output is in JSON.
    json_schema: '{
      "type": "object",
      "properties": {
        "executive_summary": {
          "description": "An executive summary of the analysis",
          "type": "string"
        },
        "open_questions": {
          "description": "Any open questions or decisions that the product team needs to make in light of these issues",
          "type": "array",
          "items": {
            "type": "string"
          }
        },
        "issue_analysis": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "key_challenge": {
                "description": "A description of the challenge reported in these issues",
                "type": "string"
              },
              "affected_issues": {
                "description": "A list of issues that are related to this challenge",
                "type": "array",
                "items": {
                  "type": "number"
                }
              },
              "possible_causes": {
                "description": "A list of possible causes or reasons for this challenge to occur",
                "type": "array",
                "items": {
                  "type": "string"
                }
              },
              "remediations": {
                "description": "Steps we can take to address this challenge",
                "type": "array",
                "items": {
                  "type": "string"
                }
              }
            },
            "required": ["key_challenge", "affected_issues", "possible_causes", "remediations"]
          }
        }
      },
      "required": ["issue_analysis", "open_questions", "actions", "executive_summary"]
    }'
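As an optional sanity check on these schemas and on model outputs, here is a small sketch using the `jsonschema` package; it is not a dependency declared by this recipe, and the sample response is made up:

```python
import json
import yaml
import jsonschema

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

# The schema is stored as a JSON string inside the YAML config.
schema = json.loads(cfg["prompts"]["parse_issue"]["json_schema"])

# A hypothetical model response for one issue thread.
candidate = {
    "summary": "Fine-tuning crashes with CUDA out-of-memory on a single GPU.",
    "possible_causes": ["Batch size too large for the available VRAM"],
    "remediations": ["Document recommended batch sizes per GPU"],
    "component": "finetuning",
    "sentiment": "neutral",
    "issue_type": "bug_report",
    "severity": "major",
    "op_expertise": "intermediate",
}

# Raises jsonschema.exceptions.ValidationError if the response drifts from the schema.
jsonschema.validate(instance=candidate, schema=schema)
print("response matches the parse_issue schema")
```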
Lines changed: 165 additions & 0 deletions
@@ -0,0 +1,165 @@
import logging
from typing import Any, Dict, List, Optional, Union
import yaml
import time
import json

from tqdm import tqdm
from openai import OpenAI
import groq

logger = logging.getLogger(__name__)
logger.addHandler(logging.StreamHandler())
CFG = yaml.safe_load(open("config.yaml", "r"))

class LlamaVLLM():
    def __init__(self, endpoint, model_id):
        self.model_id = model_id
        self.client = OpenAI(base_url=endpoint, api_key='token')

    def chat(
        self,
        inputs: List[Dict[str, str]],
        generation_kwargs: Optional[Dict[str, Any]] = None,
        guided_decode_json_schema: Optional[str] = None
    ) -> List[str]:

        if generation_kwargs is None:
            generation_kwargs = {}

        try:
            response = self.client.chat.completions.create(
                model=self.model_id,  # was `self.model`, which is never set in __init__
                messages=inputs,
                extra_body={
                    "guided_json": guided_decode_json_schema
                },
                **generation_kwargs,
            )
            # Return the message text so JSON decoding downstream works,
            # mirroring LlamaGroq.chat.
            output = response.choices[0].message.content
        except Exception as e:
            logger.error(
                f"FAILED to generate inference for input {inputs}\nError: {str(e)}"
            )
            output = None
        return output


class LlamaGroq():
    def __init__(self, key, model_id):
        self.model_id = model_id
        self.client = groq.Groq(api_key=key)
        logger.debug(f"Using Groq:{self.model_id} for inference")

    def chat(
        self,
        inputs: List[Dict[str, str]],
        generation_kwargs: Optional[Dict[str, Any]] = None,
        guided_decode_json_schema: Optional[str] = None
    ) -> str:

        if generation_kwargs is None:
            generation_kwargs = {}

        # Currently Groq doesn't support guided JSON decoding. Workaround:
        # append the schema to the system prompt and request a JSON response.
        if guided_decode_json_schema is not None:
            inputs[0]['content'] += f"\n\nEnsure your response aligns with the following JSON schema:\n{guided_decode_json_schema}\n\n"

        output = None

        while True:
            try:
                response = self.client.chat.completions.with_raw_response.create(
                    model=self.model_id,
                    messages=inputs,
                    stream=False,
                    **generation_kwargs,
                    response_format={"type": 'json_object' if guided_decode_json_schema is not None else 'text'}
                )
                completion = response.parse()
                output = completion.choices[0].message.content
                break
            except groq.RateLimitError as e:
                # Back off until the reset time reported in the response headers
                # (the header value is a string, so convert before sleeping).
                wait = e.response.headers['X-Ratelimit-Reset']
                response = e.response
                print(e)
                print(f"[groq] waiting for {wait}s to prevent ratelimiting")
                time.sleep(float(wait))
            except Exception as e:
                logger.error(f"INFERENCE FAILED with Error: {e} for input:\n{inputs[-1]['content'][:300]}")
                break

        return output


def run_llm_inference(
    prompt_name: str,
    inputs: Union[str, List[str]],
    generation_kwargs: Optional[Dict] = None,
    guided_decode_json_schema=None,
) -> Union[List[str], List[Dict[str, Any]]]:
    """
    Run the LLM inference on the given inputs.

    Args:
    - prompt_name (str): The name of the prompt to use.
    - inputs (str or List[str]): The input(s) to the LLM.
    - generation_kwargs (Dict): Additional keyword arguments to pass to the LLM.
    - guided_decode_json_schema (str): The JSON schema to use for guided decoding.

    Returns:
    - Union[str, List[str]]: The response(s) from the LLM.
    """

    # initialize appropriate LLM accessor
    if CFG['model']['use'] == 'vllm':
        LLM = LlamaVLLM(**CFG['model']['vllm'])
    elif CFG['model']['use'] == 'groq':
        LLM = LlamaGroq(**CFG['model']['groq'])
    else:
        raise ValueError("Invalid model type in config.yaml")

    logger.debug(f"Running `{prompt_name}` inference with {CFG['model']['use']}")

    _batch = True
    if isinstance(inputs, str):
        _batch = False
        inputs = [inputs]

    inputs = [
        [
            {"role": "system", "content": CFG["prompts"][prompt_name]["system"]},
            {"role": "user", "content": i},
        ]
        for i in inputs
    ]

    if (
        guided_decode_json_schema is None
        and "json_schema" in CFG["prompts"][prompt_name]
    ):
        # Collapse whitespace so the schema from config.yaml fits on one line.
        guided_decode_json_schema = " ".join(
            CFG["prompts"][prompt_name]["json_schema"].split()
        )

    responses = [
        LLM.chat(i, generation_kwargs, guided_decode_json_schema)
        for i in tqdm(inputs, desc=f"Inference[{prompt_name}]")
    ]

    if guided_decode_json_schema is not None:
        responses_json = []
        for r in responses:
            if r is not None:
                try:
                    responses_json.append(json.loads(r, strict=False))
                    continue
                except json.JSONDecodeError:
                    logger.error(f"Error decoding JSON: {r}")
            responses_json.append(None)
        responses = responses_json

    if not _batch:
        responses = responses[0]

    return responses
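As a usage sketch, `run_llm_inference` takes a prompt name from config.yaml plus one or more raw issue threads; the issue text below is a made-up placeholder, and the config is assumed to hold a valid key for the selected service:

```python
issue_thread = (
    "Issue #123: Fine-tuning crashes with CUDA out of memory on a single GPU.\n"
    "Comment: lowering the batch size avoids the crash."
)

annotation = run_llm_inference(
    "parse_issue",
    issue_thread,
    generation_kwargs={"temperature": 0.4, "max_tokens": 2048},
)

# With a json_schema configured for the prompt, the response is parsed into a dict.
if annotation is not None:
    print(annotation["issue_type"], annotation["severity"])
    print(annotation["summary"])
```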
