Commit 79033bb

Add analyze_and_label link documentation with config and usage examples

1 parent ba794f0 commit 79033bb

1 file changed: 111 additions, 0 deletions
# Analyze and Label Link

## Overview

The `analyze_and_label` link is a component of the vCon server that automatically analyzes dialog content and generates relevant labels/tags for categorization. It uses OpenAI's language models to process various dialog formats (transcripts, messages, chats, emails) and extract meaningful labels, which are then applied as tags to the vCon.

## How It Works

1. The link retrieves a vCon from Redis storage.
2. For each dialog in the vCon, it checks whether a source analysis (typically of type "transcript") is present.
3. It extracts the text content from the source analysis, at the location specified in the configuration.
4. It sends that text to OpenAI's API with a customizable prompt.
5. It processes the API response to extract labels.
6. It adds the result to the vCon as a new analysis object.
7. It applies each extracted label as a tag to the vCon (see the sketch after this list).

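
In rough pseudocode, that flow looks something like the sketch below. This is an illustration only: the helper names (`find_analysis`, `extract_text`, `add_analysis`, `add_tag`) are hypothetical placeholders rather than the link's actual internals, the vCon object is assumed to expose a `dialog` list, and the nested `opts["source"]` layout is assumed from the configuration options described later.

```python
import json

from openai import OpenAI


def analyze_and_label_sketch(vcon, opts):
    """Hedged sketch of the flow described above; helper functions are hypothetical."""
    client = OpenAI(api_key=opts["OPENAI_API_KEY"])
    for dialog_index, _dialog in enumerate(vcon.dialog):
        # Step 2: look for a source analysis of the configured type for this dialog.
        source = find_analysis(vcon, dialog_index, opts["source"]["analysis_type"])
        if source is None:
            continue
        # Step 3: pull the text out of the configured location,
        # e.g. "body.paragraphs.transcript".
        text = extract_text(source, opts["source"]["text_location"])
        # Steps 4-5: ask the model for labels and parse the JSON response.
        response = client.chat.completions.create(
            model=opts["model"],
            temperature=opts["temperature"],
            response_format=opts["response_format"],
            messages=[{"role": "user", "content": f"{opts['prompt']}\n\n{text}"}],
        )
        labels = json.loads(response.choices[0].message.content).get("labels", [])
        # Step 6: record the result as a new analysis object on the vCon.
        add_analysis(vcon, dialog_index, opts["analysis_type"], labels)
        # Step 7: apply each label as a tag.
        for label in labels:
            add_tag(vcon, label)
```
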
## Supported Dialog Formats

The link is designed to handle the various text formats that can appear in dialogs, including:

- **Standard Transcripts**: Plain text transcripts of conversations
- **Email Format**: Text with headers, subject, body, etc.
- **Chat Format**: Text with timestamps and speaker identification
- **Message Format**: Text with headers and body

The link processes each of these formats and extracts appropriate labels regardless of the structure.

## Configuration Options

The link accepts the following configuration options:

| Option | Description | Default |
|--------|-------------|---------|
| `prompt` | The prompt sent to OpenAI for analysis | "Analyze this transcript and provide a list of relevant labels for categorization..." |
| `analysis_type` | The type assigned to the analysis output | "labeled_analysis" |
| `model` | The OpenAI model to use | "gpt-4-turbo" |
| `sampling_rate` | Rate at which to run the analysis (1 = 100%, 0.5 = 50%, etc.) | 1 |
| `temperature` | The temperature parameter for the OpenAI API | 0.2 |
| `source.analysis_type` | The type of analysis to use as the source | "transcript" |
| `source.text_location` | The JSON path to the text within the source analysis | "body.paragraphs.transcript" |
| `response_format` | Format specification for the OpenAI API response | `{"type": "json_object"}` |
| `OPENAI_API_KEY` | The OpenAI API key (required; no default) | None |

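
Expressed as a Python mapping, the defaults from the table look roughly like the following. Treating the dotted `source.*` keys as a nested `source` mapping is an assumption for illustration, and the default prompt text is abbreviated here just as it is in the table.

```python
# Defaults as listed in the table above (illustrative; prompt abbreviated).
default_options = {
    "prompt": "Analyze this transcript and provide a list of relevant labels for categorization...",
    "analysis_type": "labeled_analysis",
    "model": "gpt-4-turbo",
    "sampling_rate": 1,        # 1 = analyze every vCon, 0.5 = roughly half, etc.
    "temperature": 0.2,
    "source": {                # assumed nesting for the dotted source.* options
        "analysis_type": "transcript",
        "text_location": "body.paragraphs.transcript",
    },
    "response_format": {"type": "json_object"},
    # OPENAI_API_KEY is required and has no default; it must be supplied in opts.
}
```
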
## Usage Example

```python
from server.links.analyze_and_label import run

# Run with default options (requires OPENAI_API_KEY in the options)
run(
    vcon_uuid="your-vcon-uuid",
    link_name="analyze_and_label",
    opts={
        "OPENAI_API_KEY": "your-openai-api-key",
        # Optionally override other defaults
        "prompt": "Identify key topics, sentiments, and issues in this conversation. Return your response as a JSON object with a single key 'labels' containing an array of strings.",
        "model": "gpt-3.5-turbo"
    }
)
```

## Customizing Label Generation

You can customize the label generation process by modifying the `prompt` parameter. The prompt should instruct the model to return labels in a specific format: a JSON object with a "labels" key containing an array of strings.

Example specialized prompts:

- **Support Issues**: "Analyze this transcript and identify the specific support issues mentioned. Return your response as a JSON object with a single key 'labels' containing an array of issue categories."
- **Sentiment Analysis**: "Analyze this conversation and identify the customer's sentiments and emotional states. Return your response as a JSON object with a single key 'labels' containing an array of sentiment descriptors."
- **Product Mentions**: "Identify all products or services mentioned in this transcript. Return your response as a JSON object with a single key 'labels' containing an array of product names."

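
With any of these prompts, a well-formed model response carries the labels under a single `labels` key. The snippet below shows the shape the link would parse; the label values are made up for illustration.

```python
import json

# Illustrative response body for the "Support Issues" prompt (values are examples only).
raw_response = '{"labels": ["billing dispute", "refund request", "escalation"]}'
labels = json.loads(raw_response)["labels"]
print(labels)  # ['billing dispute', 'refund request', 'escalation']
```
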
## Error Handling

The link includes robust error handling:

- An exponential backoff retry mechanism for API calls (a generic sketch follows this list)
- JSON parsing error handling
- Logging of errors and performance metrics

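
As a rough illustration of the retry behavior, a generic exponential-backoff wrapper might look like this. It is a sketch only, not the link's actual implementation.

```python
import random
import time


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0):
    """Call fn(), retrying failed attempts with exponentially growing, jittered delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Wait base_delay * 2^(attempt - 1), capped at max_delay, plus some jitter.
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            time.sleep(delay + random.uniform(0, delay / 2))
```
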
## Testing

The link includes comprehensive tests for all functionality. To run the tests with actual OpenAI API calls (optional):

```bash
# Set environment variables
export OPENAI_API_KEY="your-api-key"
export RUN_OPENAI_ANALYZE_LABEL_TESTS=1

# Run the tests
pytest server/links/analyze_and_label/tests/test_analyze_and_label.py
```

Without setting `RUN_OPENAI_ANALYZE_LABEL_TESTS=1`, the tests run with mocked API responses.

## Metrics and Monitoring

The link emits several metrics for monitoring:

- `conserver.link.openai.labels_added`: Number of labels added per run
- `conserver.link.openai.analysis_time`: Time taken for analysis
- `conserver.link.openai.json_parse_failures`: Count of JSON parsing failures
- `conserver.link.openai.analysis_failures`: Count of overall analysis failures

## Integration with vCon Structure

The link integrates with the vCon structure in two ways:

1. It adds a new analysis object with the `labeled_analysis` type (or the configured type).
2. It adds tags to the vCon based on the extracted labels.

This allows for both structured access to the full analysis and quick filtering/categorization using the applied tags.
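
As a rough picture of the result (field names other than `type` are assumptions based on the general vCon analysis shape, not taken from this link's code), the added analysis entry and the applied tags might look like:

```python
# Illustrative only; the exact fields the link writes are not documented here.
example_analysis_entry = {
    "type": "labeled_analysis",   # or whatever analysis_type is configured
    "dialog": 0,                  # assumed: index of the dialog that was analyzed
    "vendor": "openai",           # assumed vendor field
    "body": {"labels": ["billing", "refund request", "negative sentiment"]},
}

# Each extracted label is also applied to the vCon as a tag, e.g.:
example_tags = ["billing", "refund request", "negative sentiment"]
```
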
