Commit feacfa2

Merge pull request #1723 from ysocarras-oracle/sentiment+category-analysis
Sentiment+category analysis
2 parents 73476e0 + bcbca77 commit feacfa2

12 files changed: +924 -0 lines changed
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
Copyright (c) 2025 Oracle and/or its affiliates.

The Universal Permissive License (UPL), Version 1.0

Subject to the condition set forth below, permission is hereby granted to any
person obtaining a copy of this software, associated documentation and/or data
(collectively the "Software"), free of charge and under any and all copyright
rights in the Software, and any and all patent rights owned or freely
licensable by each licensor hereunder covering either (i) the unmodified
Software as contributed to or provided by such licensor, or (ii) the Larger
Works (as defined below), to deal in both

(a) the Software, and
(b) any piece of software and/or hardware listed in the lrgrwrks.txt file if
one is included with the Software (each a "Larger Work" to which the Software
is contributed by such licensors),

without restriction, including without limitation the rights to copy, create
derivative works of, display, perform, and distribute the Software and make,
use, sell, offer for sale, import, export, have made, and have sold the
Software and the Larger Work(s), and to sublicense the foregoing rights on
either these or other terms.

This license is subject to the following condition:
The above copyright notice and either this complete permission notice or at
a minimum a reference to the UPL must be included in all copies or
substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
# Customer Message Analyzer

The Customer Message Analyzer is a tool designed to analyze customer messages through unsupervised categorization, sentiment analysis, and summary reporting. It helps businesses understand customer feedback without requiring extensive manual labeling or analysis.


Reviewed: 01.04.2025

# When to use this asset?

Customer service teams, product managers, and marketing professionals would use this asset when they need to quickly understand large volumes of customer feedback, identify trends, and make data-driven decisions to improve products or services.

# How to use this asset?

To use the Customer Message Analyzer, follow these steps:

1. Input the customer messages into the system.
2. The system will automatically cluster the messages into categories based on their content.
3. Each message will receive a sentiment score indicating its emotional tone.
4. Review the generated summary report highlighting dominant themes, sentiment trends, and actionable insights.

# Useful Links (Optional)

- [Confluence](https://confluence.oraclecorp.com/confluence/x/DaCEoAE)
- Internal Reusable Assets

# License

Copyright (c) 2025 Oracle and/or its affiliates.

Licensed under the Universal Permissive License (UPL), Version 1.0.

See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details.
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
# Batch Message Analysis and Categorization Demo
This demo showcases an AI-powered solution for analyzing batches of customer messages, categorizing them into hierarchical levels, extracting sentiment scores, and generating structured reports.

## Key Features
* **Hierarchical Categorization**: Automatically categorizes messages into three levels of hierarchy:
  + Primary Category: High-level categorization
  + Secondary Category: Mid-level categorization, building upon primary categories
  + Tertiary Category: Low-level categorization, providing increased specificity and detail
* **Sentiment Analysis**: Extracts a sentiment score for each message, ranging from very negative (1) to very positive (10)
* **Structured Reporting**: Generates a comprehensive report analyzing the batch of messages, including:
  + Category distribution across all three levels
  + Sentiment score distribution
  + Summaries of key findings and insights

## Data Requirements
* Customer messages should be stored in one or more CSV files within a folder named `data`.
* Each CSV file should contain a column with the message text.
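
As a rough illustration of the data requirement above, the following sketch loads the message text from a CSV source and fails early when the text column is absent. The column name `Message` follows the sample file in this commit; the `load_messages` helper and the in-memory sample are hypothetical and are not part of the demo code.

```python
import csv
import io

# Hypothetical two-row sample in the same shape as the demo's CSV (ID, Message).
SAMPLE_CSV = (
    "ID,Message\n"
    "1,Customer support was not helpful at all.\n"
    '2,"The delivery was late, and the packaging was damaged."\n'
)


def load_messages(stream, text_column="Message"):
    """Return the text of every row whose text column is non-empty."""
    reader = csv.DictReader(stream)
    # fieldnames is read from the header row on first access.
    if text_column not in (reader.fieldnames or []):
        raise ValueError(f"CSV is missing the '{text_column}' column")
    return [row[text_column] for row in reader if (row[text_column] or "").strip()]


messages = load_messages(io.StringIO(SAMPLE_CSV))
print(messages)
```

To read a real file instead, pass an open file handle, for example `open(path, newline="", encoding="utf-8")`, in place of the `io.StringIO` object.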

## Getting Started
To run the demo, follow these steps:
1. Clone the repository using `git clone`.
2. Place your CSV files containing customer messages in the `data` folder.
3. Install dependencies using `pip install -r requirements.txt`.
4. Run the application using `streamlit run app.py`.

## Example Use Cases
* Analyze customer feedback from surveys, reviews, or social media platforms to identify trends and patterns.
* Inform product development and customer support strategies by understanding customer sentiment and preferences.
* Optimize marketing campaigns by targeting specific customer segments based on their interests and concerns.

## Technical Details
* The solution uses Oracle Cloud Infrastructure (OCI) GenAI, a suite of AI services designed to simplify AI adoption.
* Specifically, this demo uses the Cohere R+ model for all language tasks.
* All aspects of the demo (hierarchical categorization, sentiment analysis, and structured report generation) are powered by GenAI.

## Output
The demo displays an interactive dashboard with the generated report, which includes:
* Category distribution across all three levels
* Sentiment score distribution
* Summaries of key findings and insights
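
The two distributions listed above can be approximated with a plain frequency count. This is a minimal sketch, not the demo's reporting code: in the demo the report itself is generated by the model. The records and category names are made up, `sentiment_score` is an assumed field name, and only `primary_category` and `id` appear in the demo's own code.

```python
from collections import Counter

# Hypothetical per-message results; "sentiment_score" is an assumed field name.
results = [
    {"id": "1", "primary_category": "Delivery", "sentiment_score": 2},
    {"id": "2", "primary_category": "Delivery", "sentiment_score": 3},
    {"id": "3", "primary_category": "Product Quality", "sentiment_score": 2},
]


def distribution(records, key):
    """Map each distinct value of `key` to its share of all records."""
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


category_share = distribution(results, "primary_category")
score_counts = Counter(record["sentiment_score"] for record in results)

print(category_share)
print(score_counts)
```

The same `distribution` helper can be applied to the secondary and tertiary category fields to cover all three levels.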

## Contributing
We welcome contributions to improve and expand the capabilities of this demo. Please fork the repository and submit a pull request with your changes.

## License
Copyright (c) 2025 Oracle and/or its affiliates.

Licensed under the Universal Permissive License (UPL), Version 1.0.

See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details.
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
import streamlit as st

st.set_page_config(
    page_title="Hello",
    page_icon="👋",
)

st.write("# Welcome to Streamlit! 👋")

st.sidebar.success("Select a demo above.")

st.markdown(
    """
    This is a demo!
    """
)
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
ID,Message
1,I had to cancel my order because of poor service.
2,"The delivery was late, and the packaging was damaged."
3,I was sent the wrong color of the product.
4,My order was incomplete when it arrived.
5,The product I received was damaged.
6,The quality of the product is much worse than expected.
7,The product stopped working after a short period of time.
8,The product doesn’t match the description on the website.
9,I’ve had to contact customer service multiple times for the same issue.
10,Customer support was not helpful at all.
11,The quality of the product was poor.
12,The product was much smaller than I expected.
13,I had trouble finding the product on your website.
14,The instructions were unclear and hard to follow.
15,The website was difficult to navigate during my purchase.
16,I received the wrong size and need a replacement.
17,I was given false information about the product.
18,The product stopped working after a short period of time.
19,The product arrived damaged and unusable.
20,The product arrived in terrible condition.
21,The product arrived damaged and unusable.
22,The customer service was slow to respond.
23,The product was missing some essential accessories.
24,I didn’t receive any confirmation email for my order.
25,The product wasn’t compatible with my other appliances.
26,The product is faulty and doesn’t work properly.
27,The product didn’t fit as expected.
28,The product was extremely hard to set up.
29,I am unhappy with the design of the product.
30,The website was difficult to navigate during my purchase.
Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
import json
import logging
from typing import List

from langchain_community.chat_models.oci_generative_ai import ChatOCIGenAI
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.pydantic_v1 import BaseModel
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, StateGraph

import backend.message_handler as handler
import backend.utils.llm_config as llm_config

# Set up logging
logging.getLogger("oci").setLevel(logging.DEBUG)
messages_path = "ai/generative-ai-service/sentiment+categorization/demo_code/backend/data/complaints_messages.csv"


class AgentState(BaseModel):
    """State shared by the graph nodes."""

    messages_info: List = []
    categories: List = []
    reports: List = []


class FeedbackAgent:
    def __init__(self, model_name: str = "cohere_oci"):
        self.model_name = model_name
        self.model = self.initialize_model()
        self.memory = MemorySaver()
        self.builder = self.setup_graph()
        self.messages = self.read_messages()

    def initialize_model(self):
        if self.model_name not in llm_config.MODEL_REGISTRY:
            raise ValueError(f"Unknown model: {self.model_name}")

        model_config = llm_config.MODEL_REGISTRY[self.model_name]

        return ChatOCIGenAI(
            model_id=model_config["model_id"],
            service_endpoint=model_config["service_endpoint"],
            compartment_id=model_config["compartment_id"],
            provider=model_config["provider"],
            auth_type=model_config["auth_type"],
            auth_profile=model_config["auth_profile"],
            model_kwargs=model_config["model_kwargs"],
        )

    def read_messages(self):
        # Read the CSV rows and split them into batches of 30 messages.
        messages = handler.read_messages(filepath=messages_path)
        return handler.batchify(messages, 30)

    def summarization_node(self, state: AgentState):
        batch = self.messages
        response = self.model.invoke(
            [
                SystemMessage(
                    content=llm_config.get_prompt(self.model_name, "SUMMARIZATION")
                ),
                HumanMessage(content=f"Message batch: {batch}"),
            ]
        )
        # The model is expected to reply with JSON; append the parsed result.
        return {"messages_info": state.messages_info + [json.loads(response.content)]}

    def categorization_node(self, state: AgentState):
        batch = state.messages_info
        response = self.model.invoke(
            [
                SystemMessage(
                    content=llm_config.get_prompt(
                        self.model_name, "CATEGORIZATION_SYSTEM"
                    )
                ),
                HumanMessage(
                    content=llm_config.get_prompt(
                        self.model_name, "CATEGORIZATION_USER"
                    ).format(MESSAGE_BATCH=batch)
                ),
            ]
        )
        content = [json.loads(response.content)]
        # Attach the three category levels to the summarized messages.
        return {"categories": state.categories + handler.match_categories(batch, content)}

    def generate_report_node(self, state: AgentState):
        response = self.model.invoke(
            [
                SystemMessage(
                    content=llm_config.get_prompt(self.model_name, "REPORT_GEN")
                ),
                HumanMessage(content=f"Message info: {state.categories}"),
            ]
        )
        # `reports` is declared as a list, so wrap the report text in one.
        return {"reports": [response.content]}

    def setup_graph(self):
        # Linear pipeline: summarize -> categorize -> generate_report -> END.
        builder = StateGraph(AgentState)
        builder.add_node("summarize", self.summarization_node)
        builder.add_node("categorize", self.categorization_node)
        builder.add_node("generate_report", self.generate_report_node)

        builder.set_entry_point("summarize")
        builder.add_edge("summarize", "categorize")
        builder.add_edge("categorize", "generate_report")

        builder.add_edge("generate_report", END)
        return builder.compile(checkpointer=self.memory)

    def get_graph(self):
        return self.builder.get_graph()

    def run(self):
        thread = {"configurable": {"thread_id": "1"}}
        # stream() requires the initial graph state as its first argument.
        initial_state = {"messages_info": [], "categories": [], "reports": []}
        for s in self.builder.stream(initial_state, config=thread):
            print(f"\n \n{s}")

    def run_step_by_step(self):
        thread = {"configurable": {"thread_id": "1"}}
        initial_state = {
            "messages_info": [],
            "categories": [],
            "reports": [],
        }
        for state in self.builder.stream(initial_state, thread):
            yield state  # Yield each intermediate step to allow step-by-step execution
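
The summarization and categorization nodes above pass `response.content` straight to `json.loads`, which fails if the model wraps its JSON in a Markdown code fence or adds stray text. The following is a minimal sketch of a more defensive parser, using only the standard library. The name `parse_json_response` is hypothetical and does not exist in the demo; whether the configured model actually emits fenced output is an assumption.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker, built this way to keep the example readable


def parse_json_response(text):
    """Parse model output as JSON, tolerating a surrounding Markdown code fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE) and cleaned.endswith(FENCE):
        # Drop the opening and closing fence markers.
        cleaned = cleaned[len(FENCE):-len(FENCE)]
        # The opening fence line may carry a language tag such as "json".
        first_line, _, rest = cleaned.partition("\n")
        if first_line.strip().lower() in ("", "json"):
            cleaned = rest
    try:
        return json.loads(cleaned.strip())
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model response is not valid JSON: {exc}") from exc


fenced = FENCE + 'json\n[{"id": "1", "summary": "late delivery"}]\n' + FENCE
print(parse_json_response(fenced))
print(parse_json_response('{"id": "2"}'))
```

Raising `ValueError` with the decoder's message makes a malformed reply easier to trace to the node that produced it than a bare `JSONDecodeError` from deep inside the graph run.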
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
from backend.feedback_agent import FeedbackAgent


class FeedbackAgentWrapper:
    def __init__(self):
        self.agent = FeedbackAgent()
        self.run_graph = self.agent.run_step_by_step()

    def get_nodes_edges(self):
        graph_data = self.agent.get_graph()
        nodes = list(graph_data.nodes.keys())
        edges = [(edge.source, edge.target) for edge in graph_data.edges]
        return nodes, edges

    def run_step_by_step(self):
        try:
            action_output = next(self.run_graph)
            current_node = list(action_output.keys())[0]
        except StopIteration:
            action_output = {}
            current_node = "FINALIZED"
        return current_node, action_output

    def get_graph(self):
        return self.agent.get_graph()
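
The wrapper's `run_step_by_step` advances a generator one node at a time and reports `"FINALIZED"` once it is exhausted. The sketch below reproduces that pattern without LangGraph or OCI: `fake_stream` is a hypothetical stand-in for the agent's generator, and it assumes, as the wrapper does, that each yielded item is a dict keyed by the name of the node that just ran.

```python
def fake_stream():
    """Stand-in for the agent's step generator: one dict per finished node."""
    yield {"summarize": {"messages_info": []}}
    yield {"categorize": {"categories": []}}
    yield {"generate_report": {"reports": []}}


def next_step(steps):
    """Advance the generator by one step, or report that it is finished."""
    try:
        output = next(steps)
    except StopIteration:
        return "FINALIZED", {}
    # The single key of the update dict is the node that produced it.
    return next(iter(output)), output


steps = fake_stream()
visited = []
while True:
    node, output = next_step(steps)
    visited.append(node)
    if node == "FINALIZED":
        break

print(visited)
```

Because the generator is created once and kept on the wrapper, a UI can call the step function from a button handler and resume exactly where the previous call stopped.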
Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
import csv
from typing import List, Optional


def read_messages(
    filepath: str, columns: Optional[List[str]] = None
) -> List[List[str]]:
    # Use None as the default to avoid sharing one mutable list across calls.
    if columns is None:
        columns = ["ID", "Message"]

    with open(filepath, newline="", encoding="utf-8") as file:
        reader = csv.DictReader(file)
        extracted_data = []

        for row in reader:
            extracted_row = [row[col] for col in columns if col in row]
            extracted_data.append(extracted_row)

    return extracted_data


def batchify(lst, batch_size):
    return [lst[i : i + batch_size] for i in range(0, len(lst), batch_size)]


def match_categories(summaries, categories):
    result = []
    # Pair items positionally; zip stops at the shorter list, so a short
    # categorization reply cannot raise an IndexError.
    for elem, category in zip(summaries[0], categories[0]):
        if elem["id"] == category["id"]:
            elem["primary_category"] = category["primary_category"]
            elem["secondary_category"] = category["secondary_category"]
            elem["tertiary_category"] = category["tertiary_category"]
            result.append(elem)
    return result


def group_by_category_level(categories_list):
    # Build a primary -> secondary -> tertiary -> [message ids] tree.
    result = {}

    for category in categories_list:
        primary = category["primary_category"]
        secondary = category["secondary_category"]
        tertiary = category["tertiary_category"]

        if primary not in result:
            result[primary] = {}

        if secondary not in result[primary]:
            result[primary][secondary] = {}

        if tertiary not in result[primary][secondary]:
            result[primary][secondary][tertiary] = []

        result[primary][secondary][tertiary].append(category["id"])

    return result
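
To make the helpers above concrete, here is a small self-contained run of the batching and grouping logic. The functions are restated compactly (the grouping uses `dict.setdefault` in place of the explicit membership checks) so the block runs on its own. The category names are invented for illustration and are not output from the model.

```python
def batchify(items, batch_size):
    """Split a list into consecutive chunks of at most batch_size items."""
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]


def group_by_category_level(categories_list):
    """Build a primary -> secondary -> tertiary -> [message ids] tree."""
    result = {}
    for category in categories_list:
        leaf = (
            result.setdefault(category["primary_category"], {})
            .setdefault(category["secondary_category"], {})
            .setdefault(category["tertiary_category"], [])
        )
        leaf.append(category["id"])
    return result


# Hypothetical categorized messages; the category labels are made up.
categorized = [
    {"id": "2", "primary_category": "Delivery",
     "secondary_category": "Timing", "tertiary_category": "Late delivery"},
    {"id": "5", "primary_category": "Product",
     "secondary_category": "Condition", "tertiary_category": "Damaged on arrival"},
    {"id": "19", "primary_category": "Product",
     "secondary_category": "Condition", "tertiary_category": "Damaged on arrival"},
]

tree = group_by_category_level(categorized)
batches = batchify(list(range(1, 8)), 3)

print(tree)
print(batches)
```

With seven items and a batch size of three, the last batch holds the single leftover item, which is why the sample file of 30 messages fits in exactly one batch of 30.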
