Commit 48b5da9

Author: roy
Enhance prompts and input structure for improved analysis; clarify segment ID extraction and relevance guidelines. Update test input to include additional segment IDs and user input fields for better context. Refactor utils for cleaner Directus client initialization.
1 parent fcdb650 commit 48b5da9

3 files changed: +32 -8 lines changed

prompts.py

Lines changed: 21 additions & 4 deletions
@@ -268,9 +268,17 @@
 
 **references**: ARRAY
 For each data segment that supports your analysis, provide:
-- **segment_id**: int - The numerical identifier of the data segment (must be accurate)
+- **segment_id**: int - The numerical identifier of the data segment (must be accurate). Each segment_id looks like "SEGMENT_ID_<number>" in the input data - extract only the number portion (e.g., from "SEGMENT_ID_123" use 123)
 - **description**: string - Explain how this segment contributes to your overall analysis and its specific relevance to the topic
 
+## Critical Reference Guidelines
+- **ONLY include segment IDs that are explicitly mentioned in the input data with the format "SEGMENT_ID_<number>"**
+- **ONLY include segments that directly support claims, insights, or evidence in your analysis**
+- **DO NOT include a segment reference unless you can clearly explain its specific relevance to your findings**
+- **Extract the numeric ID correctly**: From "SEGMENT_ID_123" use 123, from "SEGMENT_ID_456" use 456
+- **Quality over quantity**: It's better to have fewer, highly relevant references than many irrelevant ones
+- **Verify relevance**: Each reference must correspond to content you actually analyzed and cited in your summary
+
 ## Quality Standards
 - **Depth**: Provide comprehensive analysis that goes beyond surface-level summarization
 - **Variety**: Use diverse language and avoid repetitive phrases
@@ -348,7 +356,7 @@
 - Explain what the investigation covers and why it matters
 - Should orient readers to the scope and purpose of the multi-aspect analysis
 
-**summary**: string (4-6 paragraphs with markdown formatting)
+**summary**: string (2-3 paragraphs with markdown formatting)
 - Develop an in-depth, multi-section analysis with proper markdown formatting
 - Include clear subsections with descriptive headings
 - Present findings in logical progression from key insights to supporting details
@@ -451,8 +459,17 @@
 - Address the broader implications and significance of the findings
 
 ### Segments
-- For each relevant document summary, provide accurate segment_id and description
-- Explain how each segment contributes to the overall analysis and its specific relevance
+For each data segment that supports your analysis, provide:
+- **segment_id**: int - The numerical identifier of the data segment (must be accurate). Each segment_id looks like "SEGMENT_ID_<number>" in the input data - extract only the number portion (e.g., from "SEGMENT_ID_123" use 123)
+- **description**: string - Explain how this segment contributes to your overall analysis and its specific relevance to the topic
+
+## Critical Segment Reference Guidelines
+- **ONLY include segment IDs that are explicitly mentioned in the document summaries with the format "SEGMENT_ID_<number>"**
+- **Extract the numeric ID correctly**: From "SEGMENT_ID_123" use 123, from "SEGMENT_ID_456" use 456
+- **ONLY include segments that directly support claims, insights, or evidence in your analysis**
+- **DO NOT include a segment reference unless you can clearly explain its specific relevance to your findings**
+- **Quality over quantity**: It's better to have fewer, highly relevant references than many irrelevant ones
+- **Verify relevance**: Each reference must correspond to content you actually analyzed and cited in your summary
 
 ## Quality Standards
 - **Depth**: Provide comprehensive analysis that goes beyond surface-level summarization
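
The prompt changes in prompts.py ask the model to turn "SEGMENT_ID_<number>" tags into bare integers and to cite only IDs that actually occur in the input. The same rule could also be enforced on the model's output after the fact. The following is a minimal sketch of such a check, not code from this commit: `extract_segment_ids`, `filter_references`, and the sample data are hypothetical names, and the only thing taken from the source is the tag format and the `segment_id` / `description` reference fields.

```python
import re

# Tag format taken from the prompt text: "SEGMENT_ID_<number>".
SEGMENT_TAG = re.compile(r"SEGMENT_ID_(\d+)")


def extract_segment_ids(text: str) -> set[int]:
    """Return every numeric ID that appears as SEGMENT_ID_<number> in text."""
    return {int(match) for match in SEGMENT_TAG.findall(text)}


def filter_references(references: list[dict], input_text: str) -> list[dict]:
    """Keep only references whose segment_id is an int present in the input."""
    known_ids = extract_segment_ids(input_text)
    return [
        ref
        for ref in references
        if isinstance(ref.get("segment_id"), int) and ref["segment_id"] in known_ids
    ]


# Hypothetical sample data for illustration only.
sample_input = "SEGMENT_ID_123: pricing feedback. SEGMENT_ID_456: onboarding notes."
sample_references = [
    {"segment_id": 123, "description": "Supports the pricing finding."},
    {"segment_id": 999, "description": "Not present in the input."},
]

kept = filter_references(sample_references, sample_input)
print(kept)
```

A post-hoc filter like this only catches IDs that do not exist in the input; it cannot judge whether a cited segment is actually relevant, which is what the "Quality over quantity" and "Verify relevance" guidelines address in the prompt itself.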

test_input.json

Lines changed: 4 additions & 3 deletions
@@ -1,9 +1,10 @@
 {
 "input": {
 "response_language": "en",
-"segment_ids": [1,2,3,4],
-"user_prompt": "Please summarise all the topics.",
-"project_analysis_run_id": "1b15b167-166c-4c0e-8fb9-c3bf5d930f3e"
+"segment_ids": [1,2,3,4,5,6,7],
+"user_input": "Please summarise all the topics.",
+"user_input_description": "Please summarise all the topics.",
+"project_analysis_run_id": "39742451-b083-4c3e-a214-4431cce3957b"
 }
 }
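
The test input renames `user_prompt` to `user_input` and adds `user_input_description`, so any consumer still reading the old key would break silently. A small shape check can surface that early. The sketch below is an assumption about how such a check might look, not the handler used in this repository: `AnalysisInput` and `parse_input` are hypothetical names, and the field list is copied from the updated test_input.json.

```python
import json
import uuid
from dataclasses import dataclass


@dataclass
class AnalysisInput:
    # Field names copied from the updated test_input.json.
    response_language: str
    segment_ids: list[int]
    user_input: str
    user_input_description: str
    project_analysis_run_id: str


def parse_input(raw: str) -> AnalysisInput:
    """Parse a payload shaped like test_input.json and check the basics."""
    fields = json.loads(raw)["input"]
    parsed = AnalysisInput(**fields)  # raises TypeError on missing or unknown keys
    if not all(isinstance(i, int) for i in parsed.segment_ids):
        raise ValueError("segment_ids must be a list of integers")
    uuid.UUID(parsed.project_analysis_run_id)  # raises ValueError if malformed
    return parsed


payload = """
{
  "input": {
    "response_language": "en",
    "segment_ids": [1,2,3,4,5,6,7],
    "user_input": "Please summarise all the topics.",
    "user_input_description": "Please summarise all the topics.",
    "project_analysis_run_id": "39742451-b083-4c3e-a214-4431cce3957b"
  }
}
"""

parsed = parse_input(payload)
print(parsed.segment_ids)
```

With this shape, a payload that still uses the old `user_prompt` key fails with a `TypeError` at construction time instead of passing through with empty fields.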

utils.py

Lines changed: 7 additions & 1 deletion
@@ -1,3 +1,7 @@
+# [ ] TODO: Add retry logic in rag calls
+# [ ] TODO: Check why user_input and user_input_description are not being populated in directus
+# [ ] TODO: Change backend of echo to respond only with data not prompt
+
 import os
 import json
 import uuid
@@ -45,7 +49,9 @@
 DIRECTUS_USERNAME = str(os.getenv("DIRECTUS_USERNAME"))
 DIRECTUS_PASSWORD = str(os.getenv("DIRECTUS_PASSWORD"))
 
-directus = DirectusClient(url=DIRECTUS_BASE_URL, email=DIRECTUS_USERNAME, password=DIRECTUS_PASSWORD)
+directus = DirectusClient(
+url=DIRECTUS_BASE_URL, email=DIRECTUS_USERNAME, password=DIRECTUS_PASSWORD
+)
 
 
 def generate_uuid() -> str:
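
One detail in the unchanged context lines of utils.py is worth noting: `str(os.getenv(...))` converts a missing variable into the literal text "None", so the Directus client would be built with the string "None" as a credential rather than failing at startup. A fail-fast helper is one possible alternative. This is a sketch only: `require_env` and the `DEMO_*` variable names are hypothetical and do not appear in the commit.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, or fail fast if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# str() hides a missing variable by turning None into the text "None".
os.environ.pop("DEMO_UNSET_VARIABLE", None)
masked = str(os.getenv("DEMO_UNSET_VARIABLE"))
print(repr(masked))

# A variable that is set is returned unchanged.
os.environ["DEMO_SET_VARIABLE"] = "admin@example.com"
print(require_env("DEMO_SET_VARIABLE"))
```

Under this approach the three `DIRECTUS_*` assignments would call the helper instead of `str(os.getenv(...))`, which moves a misconfiguration error from the first Directus request to import time.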
