35 changes: 35 additions & 0 deletions ai/generative-ai-service/sentiment+categorization/LICENSE
@@ -0,0 +1,35 @@
Copyright (c) 2025 Oracle and/or its affiliates.

The Universal Permissive License (UPL), Version 1.0

Subject to the condition set forth below, permission is hereby granted to any
person obtaining a copy of this software, associated documentation and/or data
(collectively the "Software"), free of charge and under any and all copyright
rights in the Software, and any and all patent rights owned or freely
licensable by each licensor hereunder covering either (i) the unmodified
Software as contributed to or provided by such licensor, or (ii) the Larger
Works (as defined below), to deal in both

(a) the Software, and
(b) any piece of software and/or hardware listed in the lrgrwrks.txt file if
one is included with the Software (each a "Larger Work" to which the Software
is contributed by such licensors),

without restriction, including without limitation the rights to copy, create
derivative works of, display, perform, and distribute the Software and make,
use, sell, offer for sale, import, export, have made, and have sold the
Software and the Larger Work(s), and to sublicense the foregoing rights on
either these or other terms.

This license is subject to the following condition:
The above copyright notice and either this complete permission notice or at
a minimum a reference to the UPL must be included in all copies or
substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
32 changes: 32 additions & 0 deletions ai/generative-ai-service/sentiment+categorization/README.md
@@ -0,0 +1,32 @@
# Customer Message Analyzer

The Customer Message Analyzer is a tool designed to analyze customer messages through unsupervised categorization, sentiment analysis, and summary reporting. It helps businesses understand customer feedback without requiring extensive manual labeling or analysis.


Reviewed: 01.04.2025

# When to use this asset?

Customer service teams, product managers, and marketing professionals can use this asset when they need to quickly understand large volumes of customer feedback, identify trends, and make data-driven decisions to improve products or services.

# How to use this asset?

To use the Customer Message Analyzer, follow these steps:

1. Input the customer messages into the system.
2. The system automatically groups the messages into categories based on their content.
3. Each message receives a sentiment score from 1 (very negative) to 10 (very positive).
4. Review the generated summary report, which highlights dominant themes, sentiment trends, and actionable insights.

# Useful Links (Optional)

- [Confluence](https://confluence.oraclecorp.com/confluence/x/DaCEoAE)
- Internal Reusable Assets

# License

Copyright (c) 2025 Oracle and/or its affiliates.

Licensed under the Universal Permissive License (UPL), Version 1.0.

See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details.
54 changes: 54 additions & 0 deletions ai/generative-ai-service/sentiment+categorization/files/README.md
@@ -0,0 +1,54 @@
# Batch Message Analysis and Categorization Demo
This demo showcases an AI-powered solution for analyzing batches of customer messages, categorizing them into hierarchical levels, extracting sentiment scores, and generating structured reports.

## Key Features
* **Hierarchical Categorization**: Automatically categorizes messages into three levels of hierarchy:
    + Primary Category: High-level categorization
    + Secondary Category: Mid-level categorization, building upon primary categories
    + Tertiary Category: Low-level categorization, providing increased specificity and detail
* **Sentiment Analysis**: Extracts sentiment scores for each message, ranging from very negative (1) to very positive (10)
* **Structured Reporting**: Generates a comprehensive report analyzing the batch of messages, including:
    + Category distribution across all three levels
    + Sentiment score distribution
    + Summaries of key findings and insights
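
To make the feature list concrete, here is a minimal sketch of what one analyzed message could look like as a record. The `AnalyzedMessage` class, its field names, and the example category labels are hypothetical and are not taken from the demo code; only the three category levels and the 1 to 10 sentiment scale come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalyzedMessage:
    """Hypothetical record for one analyzed message (all names are assumptions)."""

    message_id: str
    primary_category: str
    secondary_category: str
    tertiary_category: str
    sentiment_score: int  # 1 = very negative, 10 = very positive

    def __post_init__(self) -> None:
        # Reject scores outside the 1-10 scale described in the feature list.
        if not 1 <= self.sentiment_score <= 10:
            raise ValueError(
                f"sentiment_score must be in 1..10, got {self.sentiment_score}"
            )

    def category_path(self) -> str:
        """Join the three levels from most general to most specific."""
        return " > ".join(
            [self.primary_category, self.secondary_category, self.tertiary_category]
        )


# Illustrative values only; the real categories are produced by the model.
example = AnalyzedMessage(
    message_id="5",
    primary_category="Product",
    secondary_category="Condition",
    tertiary_category="Damaged on arrival",
    sentiment_score=2,
)
print(example.category_path())
```

Validating the score at construction time is one possible design choice: it surfaces out-of-range model output early instead of letting it skew the report.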

## Data Requirements
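* Customer messages should be stored in one or more CSV files within a folder named `data`.
* Each CSV file should contain a column with the message text. The bundled sample file uses the columns `ID` and `Message`, which are also the defaults read by `read_messages` in the backend.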
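
As a rough illustration of these requirements, the following sketch collects rows from every CSV file in a folder and fails early when the text column is missing. The `load_messages` helper and the extra `file` key are hypothetical; the demo's own reader is `read_messages` in the backend, and the `Message` column name is taken from the sample file.

```python
import csv
from pathlib import Path
from typing import Dict, List


def load_messages(data_dir: str, text_column: str = "Message") -> List[Dict[str, str]]:
    """Collect rows from every CSV file in data_dir (hypothetical helper)."""
    rows: List[Dict[str, str]] = []
    for csv_path in sorted(Path(data_dir).glob("*.csv")):
        with open(csv_path, newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            # fieldnames is None for an empty file; treat that as a missing column.
            if reader.fieldnames is None or text_column not in reader.fieldnames:
                raise ValueError(f"{csv_path.name} has no '{text_column}' column")
            for row in reader:
                # Remember which file each row came from.
                rows.append({"file": csv_path.name, **row})
    return rows
```

Calling `load_messages("data")` would then return one dictionary per message, keyed by the CSV header names.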

## Getting Started
To run the demo, follow these steps:
1. Clone the repository using `git clone`.
2. Place your CSV files containing customer messages in the `data` folder.
3. Install dependencies using `pip install -r requirements.txt`.
4. Run the application using `streamlit run app.py`.

## Example Use Cases
* Analyze customer feedback from surveys, reviews, or social media platforms to identify trends and patterns.
* Inform product development and customer support strategies by understanding customer sentiment and preferences.
* Optimize marketing campaigns by targeting specific customer segments based on their interests and concerns.

## Technical Details
* The solution uses Oracle Cloud Infrastructure (OCI) Generative AI, accessed through LangChain's `ChatOCIGenAI` chat model.
* Specifically, this demo uses the Cohere Command R+ model.
* All stages of the demo are powered by this model:
    + Hierarchical categorization
    + Sentiment analysis
    + Structured report generation
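
The backend looks up its model settings in `llm_config.MODEL_REGISTRY`, which is not part of this README. A minimal sketch of what one entry might look like follows. The keys mirror the ones `FeedbackAgent.initialize_model` reads and `cohere_oci` is the agent's default model name, but every value is a placeholder or an assumption, and `get_model_config` is a hypothetical helper rather than the demo's actual code.

```python
from typing import Any, Dict

# Hypothetical registry: replace every placeholder with values from your own tenancy.
MODEL_REGISTRY: Dict[str, Dict[str, Any]] = {
    "cohere_oci": {
        "model_id": "<model-id>",
        "service_endpoint": "<service-endpoint-url>",
        "compartment_id": "<compartment-ocid>",
        "provider": "cohere",
        "auth_type": "API_KEY",  # assumption: config-file authentication
        "auth_profile": "DEFAULT",  # assumption: default profile name
        "model_kwargs": {"temperature": 0, "max_tokens": 4000},  # assumed values
    }
}

REQUIRED_KEYS = {
    "model_id",
    "service_endpoint",
    "compartment_id",
    "provider",
    "auth_type",
    "auth_profile",
    "model_kwargs",
}


def get_model_config(name: str) -> Dict[str, Any]:
    """Return the registry entry for name, checking that no key is missing."""
    if name not in MODEL_REGISTRY:
        raise ValueError(f"Unknown model: {name}")
    config = MODEL_REGISTRY[name]
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"Model '{name}' is missing keys: {sorted(missing)}")
    return config
```

Checking for missing keys up front is optional, but it turns a later `KeyError` inside the model constructor call into a clearer configuration error.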

## Output
The demo will display an interactive dashboard with the generated report, providing valuable insights into customer messages, including:
* Category distribution across all three levels
* Sentiment score distribution
* Summaries of key findings and insights
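
The two distributions listed above can be computed with simple counting. The sketch below assumes each analyzed message is a dictionary; `primary_category` matches the key used in the backend's `message_handler`, while `sentiment_score`, both function names, and the sample records are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List


def category_distribution(records: List[Dict], level: str) -> Dict[str, int]:
    """Count how many records fall into each category at the given level."""
    return dict(Counter(record[level] for record in records))


def sentiment_distribution(records: List[Dict]) -> Dict[int, int]:
    """Count records per sentiment score, covering the full 1-10 scale."""
    counts = Counter(record["sentiment_score"] for record in records)
    return {score: counts.get(score, 0) for score in range(1, 11)}


# Illustrative records only; real values come from the model's output.
records = [
    {"id": "2", "primary_category": "Delivery", "sentiment_score": 2},
    {"id": "5", "primary_category": "Product", "sentiment_score": 1},
    {"id": "6", "primary_category": "Product", "sentiment_score": 2},
]
print(category_distribution(records, "primary_category"))
print(sentiment_distribution(records))
```

Filling in zero counts for unused scores keeps the sentiment chart's axis stable across batches.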

## Contributing
We welcome contributions to improve and expand the capabilities of this demo. Please fork the repository and submit a pull request with your changes.

## License
Copyright (c) 2025 Oracle and/or its affiliates.

Licensed under the Universal Permissive License (UPL), Version 1.0.

See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details.
16 changes: 16 additions & 0 deletions ai/generative-ai-service/sentiment+categorization/files/app.py
@@ -0,0 +1,16 @@
import streamlit as st

st.set_page_config(
    page_title="Hello",
    page_icon="👋",
)

st.write("# Welcome to Streamlit! 👋")

st.sidebar.success("Select a demo above.")

st.markdown(
    """
    This is a demo!
    """
)
@@ -0,0 +1,31 @@
ID,Message
1,I had to cancel my order because of poor service.
2,"The delivery was late, and the packaging was damaged."
3,I was sent the wrong color of the product.
4,My order was incomplete when it arrived.
5,The product I received was damaged.
6,The quality of the product is much worse than expected.
7,The product stopped working after a short period of time.
8,The product doesn’t match the description on the website.
9,I’ve had to contact customer service multiple times for the same issue.
10,Customer support was not helpful at all.
11,The quality of the product was poor.
12,The product was much smaller than I expected.
13,I had trouble finding the product on your website.
14,The instructions were unclear and hard to follow.
15,The website was difficult to navigate during my purchase.
16,I received the wrong size and need a replacement.
17,I was given false information about the product.
18,The product stopped working after a short period of time.
19,The product arrived damaged and unusable.
20,The product arrived in terrible condition.
21,The product arrived damaged and unusable.
22,The customer service was slow to respond.
23,The product was missing some essential accessories.
24,I didn’t receive any confirmation email for my order.
25,The product wasn’t compatible with my other appliances.
26,The product is faulty and doesn’t work properly.
27,The product didn’t fit as expected.
28,The product was extremely hard to set up.
29,I am unhappy with the design of the product.
30,The website was difficult to navigate during my purchase.
@@ -0,0 +1,130 @@
import json
import logging
from typing import List

from langchain_community.chat_models.oci_generative_ai import ChatOCIGenAI
from langchain_community.embeddings import OCIGenAIEmbeddings
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.pydantic_v1 import BaseModel
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, StateGraph

import backend.message_handler as handler
import backend.utils.llm_config as llm_config

# Set up logging
logging.getLogger("oci").setLevel(logging.DEBUG)
messages_path = "ai/generative-ai-service/sentiment+categorization/demo_code/backend/data/complaints_messages.csv"


class AgentState(BaseModel):
    messages_info: List = []
    categories: List = []
    reports: List = []


class FeedbackAgent:
    def __init__(self, model_name: str = "cohere_oci"):
        self.model_name = model_name
        self.model = self.initialize_model()
        self.memory = MemorySaver()
        self.builder = self.setup_graph()
        self.messages = self.read_messages()

    def initialize_model(self):
        if self.model_name not in llm_config.MODEL_REGISTRY:
            raise ValueError(f"Unknown model: {self.model_name}")

        model_config = llm_config.MODEL_REGISTRY[self.model_name]

        return ChatOCIGenAI(
            model_id=model_config["model_id"],
            service_endpoint=model_config["service_endpoint"],
            compartment_id=model_config["compartment_id"],
            provider=model_config["provider"],
            auth_type=model_config["auth_type"],
            auth_profile=model_config["auth_profile"],
            model_kwargs=model_config["model_kwargs"],
        )

    def read_messages(self):
        messages = handler.read_messages(filepath=messages_path)
        return handler.batchify(messages, 30)

    def summarization_node(self, state: AgentState):
        # Summarize the raw message batches; the model is expected to reply with JSON.
        batch = self.messages
        response = self.model.invoke(
            [
                SystemMessage(
                    content=llm_config.get_prompt(self.model_name, "SUMMARIZATION")
                ),
                HumanMessage(content=f"Message batch: {batch}"),
            ]
        )
        state.messages_info = state.messages_info + [json.loads(response.content)]
        return {"messages_info": state.messages_info}

    def categorization_node(self, state: AgentState):
        # Assign three category levels to each summarized message.
        batch = state.messages_info
        response = self.model.invoke(
            [
                SystemMessage(
                    content=llm_config.get_prompt(
                        self.model_name, "CATEGORIZATION_SYSTEM"
                    )
                ),
                HumanMessage(
                    content=llm_config.get_prompt(
                        self.model_name, "CATEGORIZATION_USER"
                    ).format(MESSAGE_BATCH=batch)
                ),
            ]
        )
        content = [json.loads(response.content)]
        state.categories = state.categories + handler.match_categories(batch, content)
        return {"categories": state.categories}

    def generate_report_node(self, state: AgentState):
        # Produce the final report from the categorized messages.
        response = self.model.invoke(
            [
                SystemMessage(
                    content=llm_config.get_prompt(self.model_name, "REPORT_GEN")
                ),
                HumanMessage(content=f"Message info: {state.categories}"),
            ]
        )
        # `reports` is declared as a list, so return the report wrapped in one
        # rather than also assigning a bare string to the state field.
        return {"reports": [response.content]}

    def setup_graph(self):
        builder = StateGraph(AgentState)
        builder.add_node("summarize", self.summarization_node)
        builder.add_node("categorize", self.categorization_node)
        builder.add_node("generate_report", self.generate_report_node)

        builder.set_entry_point("summarize")
        builder.add_edge("summarize", "categorize")
        builder.add_edge("categorize", "generate_report")

        builder.add_edge("generate_report", END)
        return builder.compile(checkpointer=self.memory)

    def get_graph(self):
        return self.builder.get_graph()

    def run(self):
        thread = {"configurable": {"thread_id": "1"}}
        # stream() requires an input state; start from an empty one,
        # as run_step_by_step does.
        initial_state = {
            "messages_info": [],
            "categories": [],
            "reports": [],
        }
        for s in self.builder.stream(initial_state, config=thread):
            print(f"\n \n{s}")

    def run_step_by_step(self):
        thread = {"configurable": {"thread_id": "1"}}
        initial_state = {
            "messages_info": [],
            "categories": [],
            "reports": [],
        }
        for state in self.builder.stream(initial_state, thread):
            yield state  # Yield each intermediate step to allow step-by-step execution
@@ -0,0 +1,25 @@
from backend.feedback_agent import FeedbackAgent


class FeedbackAgentWrapper:
    def __init__(self):
        self.agent = FeedbackAgent()
        self.run_graph = self.agent.run_step_by_step()

    def get_nodes_edges(self):
        graph_data = self.agent.get_graph()
        nodes = list(graph_data.nodes.keys())
        edges = [(edge.source, edge.target) for edge in graph_data.edges]
        return nodes, edges

    def run_step_by_step(self):
        try:
            action_output = next(self.run_graph)
            current_node = list(action_output.keys())[0]
        except StopIteration:
            action_output = {}
            current_node = "FINALIZED"
        return current_node, action_output

    def get_graph(self):
        return self.agent.get_graph()
@@ -0,0 +1,53 @@
import csv
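from typing import List, Optional


def read_messages(
    filepath: str, columns: Optional[List[str]] = None
) -> List[List[str]]:
    # Avoid a mutable default argument; fall back to the sample file's columns.
    if columns is None:
        columns = ["ID", "Message"]

    with open(filepath, newline="", encoding="utf-8") as file:
        reader = csv.DictReader(file)
        extracted_data = []

        for row in reader:
            # Keep only the requested columns that are present in this row.
            extracted_row = [row[col] for col in columns if col in row]
            extracted_data.append(extracted_row)

    return extracted_data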


def batchify(lst, batch_size):
    return [lst[i : i + batch_size] for i in range(0, len(lst), batch_size)]


def match_categories(summaries, categories):
    result = []
    for i, elem in enumerate(summaries[0]):
        if elem["id"] == categories[0][i]["id"]:
            elem["primary_category"] = categories[0][i]["primary_category"]
            elem["secondary_category"] = categories[0][i]["secondary_category"]
            elem["tertiary_category"] = categories[0][i]["tertiary_category"]
            result.append(elem)
    return result


def group_by_category_level(categories_list):
    result = {}

    for category in categories_list:
        primary = category["primary_category"]
        secondary = category["secondary_category"]
        tertiary = category["tertiary_category"]

        if primary not in result:
            result[primary] = {}

        if secondary not in result[primary]:
            result[primary][secondary] = {}

        if tertiary not in result[primary][secondary]:
            result[primary][secondary][tertiary] = []

        result[primary][secondary][tertiary].append(category["id"])

    return result