Commit b7b3e96

reasoning node refactoring
1 parent 0b12589 commit b7b3e96

3 files changed: 81 additions and 72 deletions

scrapegraphai/nodes/reasoning_node.py

Lines changed: 7 additions & 71 deletions
@@ -12,10 +12,15 @@
 from tqdm import tqdm
 from .base_node import BaseNode
 from ..utils import transform_schema
+from ..prompts import (
+    TEMPLATE_REASONING, TEMPLATE_REASONING_WITH_CONTEXT
+)
 
 class ReasoningNode(BaseNode):
     """
-    ...
+    A node that refines the user prompt using the schema and additional context, and
+    creates a precise prompt for subsequent steps that explicitly links elements in the user's
+    original input to their corresponding representations in the JSON schema.
 
     Attributes:
         llm_model: An instance of a language model client, configured for generating answers.
@@ -55,7 +60,7 @@ def __init__(
 
     def execute(self, state: dict) -> dict:
         """
-        ...
+        Generate a refined prompt for the reasoning task based on the user's input and the JSON schema.
 
         Args:
             state (dict): The current state of the graph. The input keys will be used
@@ -70,75 +75,6 @@ def execute(self, state: dict) -> dict:
         """
 
         self.logger.info(f"--- Executing {self.node_name} Node ---")
-
-        TEMPLATE_REASONING = """
-        **Task**: Analyze the user's request and the provided JSON schema to guide an LLM in extracting information directly from HTML.
-
-        **User's Request**:
-        {user_input}
-
-        **Target JSON Schema**:
-        ```json
-        {json_schema}
-        ```
-
-        **Analysis Instructions**:
-        1. **Interpret User Request:**
-        * Identify the key information types or entities the user is seeking.
-        * Note any specific attributes, relationships, or constraints mentioned.
-
-        2. **Map to JSON Schema**:
-        * For each identified element in the user request, locate its corresponding field in the JSON schema.
-        * Explain how the schema structure represents the requested information.
-        * Highlight any relevant schema elements not explicitly mentioned in the user's request.
-
-        3. **Data Transformation Guidance**:
-        * Provide guidance on any necessary transformations to align extracted data with the JSON schema requirements.
-
-        This analysis will be used to instruct an LLM that has the HTML content in its context. The LLM will use this guidance to extract the information and return it directly in the specified JSON format.
-
-        **Reasoning Output**:
-        [Your detailed analysis based on the above instructions]
-        """
-
-        TEMPLATE_REASONING_WITH_CONTEXT = """
-        **Task**: Analyze the user's request, provided JSON schema, and additional context to guide an LLM in extracting information directly from HTML.
-
-        **User's Request**:
-        {user_input}
-
-        **Target JSON Schema**:
-        ```json
-        {json_schema}
-        ```
-
-        **Additional Context**:
-        {additional_context}
-
-        **Analysis Instructions**:
-        1. **Interpret User Request and Context:**
-        * Identify the key information types or entities the user is seeking.
-        * Note any specific attributes, relationships, or constraints mentioned.
-        * Incorporate insights from the additional context to refine understanding of the task.
-
-        2. **Map to JSON Schema**:
-        * For each identified element in the user request, locate its corresponding field in the JSON schema.
-        * Explain how the schema structure represents the requested information.
-        * Highlight any relevant schema elements not explicitly mentioned in the user's request.
-
-        3. **Extraction Strategy**:
-        * Based on the additional context, suggest specific strategies for locating and extracting the required information from the HTML.
-        * Highlight any potential challenges or special considerations mentioned in the context.
-
-        4. **Data Transformation Guidance**:
-        * Provide guidance on any necessary transformations to align extracted data with the JSON schema requirements.
-        * Note any special formatting, validation, or business logic considerations from the additional context.
-
-        This analysis will be used to instruct an LLM that has the HTML content in its context. The LLM will use this guidance to extract the information and return it directly in the specified JSON format.
-
-        **Reasoning Output**:
-        [Your detailed analysis based on the above instructions, incorporating insights from the additional context]
-        """
 
         user_prompt = state['user_prompt']
 
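The diff does not show how `execute` chooses between the two imported templates, so the following is only a minimal sketch of one plausible approach: fall back to the plain template when no additional context is supplied. The helper name `build_reasoning_prompt` and the use of `str.format` are assumptions made for illustration (the real node may well use a prompt-templating library instead), and the template bodies are shortened stand-ins that keep only the placeholder names seen in the diff.

```python
# Abbreviated stand-ins for the real templates; only the placeholder names
# ({user_input}, {json_schema}, {additional_context}) come from the diff.
TEMPLATE_REASONING = (
    "**User's Request**:\n{user_input}\n\n"
    "**Target JSON Schema**:\n{json_schema}\n"
)
TEMPLATE_REASONING_WITH_CONTEXT = (
    TEMPLATE_REASONING + "\n**Additional Context**:\n{additional_context}\n"
)


def build_reasoning_prompt(user_input, json_schema, additional_context=None):
    """Hypothetical helper: pick a template and fill in its placeholders."""
    if additional_context:
        return TEMPLATE_REASONING_WITH_CONTEXT.format(
            user_input=user_input,
            json_schema=json_schema,
            additional_context=additional_context,
        )
    return TEMPLATE_REASONING.format(
        user_input=user_input, json_schema=json_schema
    )


# Braces inside the *values* are safe: str.format only parses the template.
prompt = build_reasoning_prompt("List all product names", '{"name": "string"}')
print(prompt)
```

Keeping the selection logic in one small function means the node body no longer needs to know what the templates contain, which is the main benefit of moving them out of `execute`.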
scrapegraphai/prompts/__init__.py

Lines changed: 2 additions & 1 deletion
@@ -18,4 +18,5 @@
     TEMPLATE_EXECUTION_ANALYSIS, TEMPLATE_EXECUTION_CODE_GENERATION,
     TEMPLATE_VALIDATION_ANALYSIS, TEMPLATE_VALIDATION_CODE_GENERATION,
     TEMPLATE_SEMANTIC_COMPARISON, TEMPLATE_SEMANTIC_ANALYSIS,
-    TEMPLATE_SEMANTIC_CODE_GENERATION)
+    TEMPLATE_SEMANTIC_CODE_GENERATION)
+from .reasoning_node_prompts import TEMPLATE_REASONING, TEMPLATE_REASONING_WITH_CONTEXT
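The added import re-exports the two templates from the package root, which is what lets `reasoning_node.py` write `from ..prompts import (...)` without naming the submodule. A small, self-contained sketch of that re-export pattern follows; the package name `demo_prompts` and its one-line template are hypothetical stand-ins created in a temporary directory, not part of the repository.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (demo_prompts is a hypothetical name).
pkg_root = Path(tempfile.mkdtemp())
pkg = pkg_root / "demo_prompts"
pkg.mkdir()
(pkg / "reasoning_node_prompts.py").write_text(
    'TEMPLATE_REASONING = "request: {user_input}"\n'
)
# The package __init__ re-exports the name, mirroring the line added above.
(pkg / "__init__.py").write_text(
    "from .reasoning_node_prompts import TEMPLATE_REASONING\n"
)

sys.path.insert(0, str(pkg_root))
demo_prompts = importlib.import_module("demo_prompts")

# The constant is now reachable from the package root as well as the submodule.
print(demo_prompts.TEMPLATE_REASONING)
```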
scrapegraphai/prompts/reasoning_node_prompts.py

Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
+"""
+Reasoning prompts helper
+"""
+
+TEMPLATE_REASONING = """
+**Task**: Analyze the user's request and the provided JSON schema to guide an LLM in extracting information directly from HTML.
+
+**User's Request**:
+{user_input}
+
+**Target JSON Schema**:
+```json
+{json_schema}
+```
+
+**Analysis Instructions**:
+1. **Interpret User Request:**
+* Identify the key information types or entities the user is seeking.
+* Note any specific attributes, relationships, or constraints mentioned.
+
+2. **Map to JSON Schema**:
+* For each identified element in the user request, locate its corresponding field in the JSON schema.
+* Explain how the schema structure represents the requested information.
+* Highlight any relevant schema elements not explicitly mentioned in the user's request.
+
+3. **Data Transformation Guidance**:
+* Provide guidance on any necessary transformations to align extracted data with the JSON schema requirements.
+
+This analysis will be used to instruct an LLM that has the HTML content in its context. The LLM will use this guidance to extract the information and return it directly in the specified JSON format.
+
+**Reasoning Output**:
+[Your detailed analysis based on the above instructions]
+"""
+
+TEMPLATE_REASONING_WITH_CONTEXT = """
+**Task**: Analyze the user's request, provided JSON schema, and additional context to guide an LLM in extracting information directly from HTML.
+
+**User's Request**:
+{user_input}
+
+**Target JSON Schema**:
+```json
+{json_schema}
+```
+
+**Additional Context**:
+{additional_context}
+
+**Analysis Instructions**:
+1. **Interpret User Request and Context:**
+* Identify the key information types or entities the user is seeking.
+* Note any specific attributes, relationships, or constraints mentioned.
+* Incorporate insights from the additional context to refine understanding of the task.
+
+2. **Map to JSON Schema**:
+* For each identified element in the user request, locate its corresponding field in the JSON schema.
+* Explain how the schema structure represents the requested information.
+* Highlight any relevant schema elements not explicitly mentioned in the user's request.
+
+3. **Extraction Strategy**:
+* Based on the additional context, suggest specific strategies for locating and extracting the required information from the HTML.
+* Highlight any potential challenges or special considerations mentioned in the context.
+
+4. **Data Transformation Guidance**:
+* Provide guidance on any necessary transformations to align extracted data with the JSON schema requirements.
+* Note any special formatting, validation, or business logic considerations from the additional context.
+
+This analysis will be used to instruct an LLM that has the HTML content in its context. The LLM will use this guidance to extract the information and return it directly in the specified JSON format.
+
+**Reasoning Output**:
+[Your detailed analysis based on the above instructions, incorporating insights from the additional context]
+"""

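Because the templates now live apart from the code that fills them, a quick check that each template exposes exactly the placeholders its caller passes can catch a mismatch early. The sketch below uses the standard library's `string.Formatter` to list named fields; the `placeholders` helper is a hypothetical name, and `sample` is a shortened stand-in for `TEMPLATE_REASONING_WITH_CONTEXT` that keeps only its three placeholder names and the JSON fence.

```python
from string import Formatter


def placeholders(template: str) -> set:
    """Return the set of named {fields} that str.format would expect."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text, so filter those out.
    return {name for _, name, _, _ in Formatter().parse(template) if name}


# Shortened stand-in for TEMPLATE_REASONING_WITH_CONTEXT (not the full text).
sample = (
    "**User's Request**:\n{user_input}\n\n"
    "**Target JSON Schema**:\n```json\n{json_schema}\n```\n\n"
    "**Additional Context**:\n{additional_context}\n"
)

print(sorted(placeholders(sample)))
```

A check like this could sit in a unit test next to the prompts module, so that renaming a placeholder in the template without updating the node fails fast instead of raising a `KeyError` at run time.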