Commit 3936dc3

Merge pull request #1684 from oracle-devrel/ao-langchain-video-image-analysis
Ao langchain video image analysis
2 parents f199549 + b921592 commit 3936dc3

3 files changed: +233 -0 lines changed

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
# Video Content PG Rating Analyzer

This is a Generative AI-powered application that analyzes uploaded images, or frames extracted from uploaded videos, to determine their suitability for PG-rated audiences. It uses an Oracle Cloud Infrastructure (OCI) Generative AI vision model, Llama 3.2 in this case, to evaluate visual content for explicit or age-inappropriate material. The same approach can be adapted to other use cases, such as extracting specific elements or text (for example, license plates).

## Features
- Upload image or video files (`.jpg`, `.jpeg`, `.png`, `.mp4`, `.avi`, `.mov`).
- Automatically extract frames from videos at a user-defined interval.
- Use OCI Generative AI to assess frame content for PG-appropriateness.
- Highlight frames flagged as inappropriate with detailed reasons and timecodes.
- Adjust the AI confidence threshold and the frame extraction interval. (The threshold determines which flagged frames are displayed.)

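To make the interval setting concrete, here is a minimal sketch of which frames get sampled and what their timecodes look like. It assumes the interval counts frames (every Nth frame is kept) and that a timecode is the whole-second position at the video's frame rate. `sampled_timecodes` is a hypothetical helper for illustration; it only does the arithmetic and does not read a video.

```python
from datetime import timedelta


def sampled_timecodes(total_frames, interval, frame_rate):
    """Return (frame_index, timecode) pairs for every `interval`-th frame.

    Hypothetical helper: it mirrors the sampling arithmetic only, no video I/O.
    """
    if interval < 1 or frame_rate < 1:
        raise ValueError("interval and frame_rate must be at least 1")
    return [
        (index, str(timedelta(seconds=index // frame_rate)))
        for index in range(0, total_frames, interval)
    ]


# A 3-second clip at 24 fps, keeping every 24th frame
samples = sampled_timecodes(total_frames=72, interval=24, frame_rate=24)
print(samples)
```

Under these assumptions, an interval of 24 on a 24 fps source keeps about one frame per second, and a larger interval means fewer model calls per video.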
## Prerequisites
Before running the application, ensure you have:
- Python 3.8 or later installed
- An active Oracle Cloud Infrastructure (OCI) account
- Required Python dependencies installed (`ChatOCIGenAI` also relies on the `oci` Python SDK, which `requirements.txt` does not list)
- OCI Generative AI model name and compartment ID

## How It Works
1. **Upload Media:**
- Users upload a video or image file for analysis.
2. **Frame Extraction:**
- For videos, the app extracts every Nth frame, where N is the selected interval.
3. **AI Analysis:**
- Each frame is base64-encoded and sent to an OCI Generative AI vision model for analysis.
- The model replies with a dictionary-formatted string indicating whether the content is PG-appropriate, with a short reason and a confidence level.
4. **Result Display:**
- Frames judged inappropriate with a confidence at or above the threshold are displayed along with the reason and timecode.
- A final PG-rating verdict is shown at the end.

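The result-display step can be sketched as a filter followed by a single verdict. This is a minimal illustration, assuming each per-frame result is a dictionary with the keys shown in this README plus a `timecode`; `flagged_frames` and `verdict` are hypothetical names, and the sample results are made up.

```python
def flagged_frames(results, threshold):
    """Keep results marked not-appropriate with confidence at or above threshold."""
    return [
        result
        for result in results
        if result["AgeAppropriate"] == "not-appropriate"
        and float(result["ConfidenceLevel"]) >= threshold
    ]


def verdict(results, threshold):
    """One flagged frame is enough to fail the PG check."""
    return "not PG" if flagged_frames(results, threshold) else "PG"


# Made-up per-frame results, for illustration only
results = [
    {"timecode": "0:00:00", "AgeAppropriate": "appropriate",
     "response": "A quiet street.", "ConfidenceLevel": 0.90},
    {"timecode": "0:00:02", "AgeAppropriate": "not-appropriate",
     "response": "Graphic violence.", "ConfidenceLevel": 0.95},
    {"timecode": "0:00:04", "AgeAppropriate": "not-appropriate",
     "response": "Possibly a weapon.", "ConfidenceLevel": 0.40},
]

for frame in flagged_frames(results, threshold=0.5):
    print(frame["timecode"], frame["response"])
print(verdict(results, threshold=0.5))
```

Note that with a threshold of 0.0, every frame the model marks as not appropriate is flagged, regardless of its reported confidence.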
## Example Output
```json
{
  "AgeAppropriate": "not-appropriate",
  "response": "Shows intense violence and blood spatter.",
  "ConfidenceLevel": 0.97
}
```

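Model replies do not always arrive as strict JSON; a reply may use single quotes like a Python dictionary. The following is a minimal sketch of a tolerant parser, assuming the three keys shown in the example output. `parse_model_reply` and `REQUIRED_KEYS` are hypothetical names used only for illustration.

```python
import ast
import json

REQUIRED_KEYS = ("AgeAppropriate", "response", "ConfidenceLevel")


def parse_model_reply(text):
    """Parse a reply as JSON first, then as a Python dict literal."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # literal_eval accepts single-quoted dicts and never executes code;
        # it raises ValueError or SyntaxError on anything else.
        data = ast.literal_eval(text)
    if not isinstance(data, dict):
        raise ValueError("expected a dictionary")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    data["ConfidenceLevel"] = float(data["ConfidenceLevel"])
    return data


reply = "{'AgeAppropriate': 'not-appropriate', 'response': 'Shows intense violence.', 'ConfidenceLevel': 0.97}"
parsed = parse_model_reply(reply)
print(parsed["AgeAppropriate"], parsed["ConfidenceLevel"])
```

Validating the keys up front means a malformed reply fails with a clear error instead of surfacing later as a missing value.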
## Installation
Clone this repository and navigate to the project directory:
```bash
git clone <repository-url>
cd <repository-folder>
```

Install the required dependencies:
```bash
pip install -r requirements.txt
```

## Configuration
To integrate with OCI Generative AI, update the following parameters in the code:
```python
from langchain_community.chat_models.oci_generative_ai import ChatOCIGenAI

llm = ChatOCIGenAI(
    model_id="Add your model name",
    compartment_id="Add your compartment ID",
    model_kwargs={"temperature": 0, "max_tokens": 2000},
)
```

Replace the placeholder values of `model_id` and `compartment_id` with the appropriate values from your OCI console.

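One way to avoid editing the source for each environment is to read these values from environment variables. The sketch below is an assumption, not part of the application: `load_llm_settings` and the variable names `OCI_GENAI_MODEL_ID` and `OCI_COMPARTMENT_ID` are hypothetical, and the default model name is the one used in the script in this commit.

```python
import os


def load_llm_settings(env=os.environ):
    """Build keyword arguments for ChatOCIGenAI from environment variables.

    OCI_GENAI_MODEL_ID and OCI_COMPARTMENT_ID are hypothetical variable names.
    """
    compartment_id = env.get("OCI_COMPARTMENT_ID", "")
    if not compartment_id:
        raise ValueError("Set OCI_COMPARTMENT_ID to your compartment ID")
    return {
        "model_id": env.get("OCI_GENAI_MODEL_ID", "meta.llama-3.2-90b-vision-instruct"),
        "compartment_id": compartment_id,
        "model_kwargs": {"temperature": 0, "max_tokens": 2000},
    }


# Example with an explicit mapping instead of the real environment
settings = load_llm_settings({"OCI_COMPARTMENT_ID": "example-compartment-id"})
print(settings["model_id"])
# llm = ChatOCIGenAI(**settings)  # requires the langchain-community import
```

Failing early on a missing compartment ID gives a clearer message than an error from the first model call.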
## Running the Application
Run the Streamlit app using:
```bash
streamlit run <script-name>.py
```

Replace `<script-name>.py` with the filename of your main script (e.g., `video_analyzer.py`).

Lines changed: 155 additions & 0 deletions
@@ -0,0 +1,155 @@
# ========================================
# Imports
# ========================================
# Only the imports this script actually uses are kept.
from langchain_community.chat_models.oci_generative_ai import ChatOCIGenAI
from langchain_core.messages import HumanMessage, SystemMessage

import streamlit as st
import base64
import cv2
import os
import ast
from datetime import timedelta

# ========================================
# Helper Functions
# ========================================

def encode_image(image_path):
    """Encodes an image to base64 format for LLM input."""
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


def extract_frames(video_path, interval, output_folder="frames"):
    """
    Extracts frames from a video at a specified interval.

    Args:
        video_path (str): Path to the video file.
        interval (int): Frame extraction interval, in frames (every Nth frame is kept).
        output_folder (str): Directory to store extracted frames.

    Returns:
        list: Tuples containing frame file paths and corresponding timecodes.
    """
    os.makedirs(output_folder, exist_ok=True)
    video_capture = cv2.VideoCapture(video_path)
    frame_count = 0
    extracted_frames = []
    # A reported frame rate below 1 truncates to 0; fall back to 1 to avoid dividing by zero
    frame_rate = int(video_capture.get(cv2.CAP_PROP_FPS)) or 1

    while True:
        ret, frame = video_capture.read()
        if not ret:
            break

        if frame_count % interval == 0:
            frame_path = os.path.join(output_folder, f"frame_{frame_count}.jpg")
            cv2.imwrite(frame_path, frame)
            # Timecode is the whole-second position of this frame
            timecode = str(timedelta(seconds=frame_count // frame_rate))
            extracted_frames.append((frame_path, timecode))

        frame_count += 1

    video_capture.release()
    return extracted_frames

# ========================================
# Streamlit App UI and Logic
# ========================================

def videoAnalyze():
    # Title of the app
    st.title("Analyze Images and Videos with OCI Generative AI")

    # Sidebar inputs
    with st.sidebar:
        st.title("Parameters")
        st.selectbox("Output Language", ["English", "French"])
        confidenceThreshold = st.slider("Confidence Threshold", 0.0, 1.0)
        st.caption("Adjust the corresponding parameters to control the AI's responses and accuracy")
        interval = st.slider("Select the desired interval: ", 1, 48)

    # Optional: custom styling (skipped when style.css is not present)
    if os.path.exists("style.css"):
        with open("style.css") as f:
            st.markdown(f"<style>{f.read()}</style>", unsafe_allow_html=True)

    # File upload
    uploaded_file = st.file_uploader("Upload an image or video", type=["png", "jpg", "jpeg", "mp4", "avi", "mov"])
    user_prompt = st.text_input("Enter your prompt for analysis:", value="Is this frame suitable for PG-rated movies?")

    if uploaded_file is not None:
        # Save the uploaded file locally, keeping its original extension
        extension = os.path.splitext(uploaded_file.name)[1]
        temp_path = f"temp_uploaded_file{extension}"
        with open(temp_path, "wb") as f:
            f.write(uploaded_file.getbuffer())

        # Check if file is a video
        if uploaded_file.type.startswith("video"):
            # Extract frames at defined interval
            with st.spinner("Extracting frames from the video..."):
                frames_with_timecodes = extract_frames(temp_path, interval)
            st.success(f"Extracted {len(frames_with_timecodes)} frames for analysis.")
            mime_type = "image/jpeg"  # extract_frames writes .jpg files
        else:
            # A single image is analyzed as one frame at timecode zero
            frames_with_timecodes = [(temp_path, "0:00:00")]
            mime_type = uploaded_file.type

        # Instantiate the OCI Generative AI Vision model
        llm = ChatOCIGenAI(
            model_id="meta.llama-3.2-90b-vision-instruct",
            compartment_id="",  # <-- Add your compartment ID here
            model_kwargs={"max_tokens": 2000, "temperature": 0}
        )

        # Loop through each frame for analysis
        violence_detected = False
        for frame_path, timecode in frames_with_timecodes:
            with st.spinner("Analyzing the frame..."):
                try:
                    # Prepare the frame and messages
                    encoded_frame = encode_image(frame_path)
                    human_message = HumanMessage(
                        content=[
                            {"type": "text", "text": user_prompt},
                            {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded_frame}"}},
                        ]
                    )

                    system_message = SystemMessage(
                        content="You are an expert in assessing the age-appropriateness of visual content. Your task is to analyze the provided image and provide a detailed assessment of its suitability for a PG-rated audience. "
                        "Respond only in dictionary format. Examples:\n"
                        "If the frame contains elements unsuitable for a PG-rating: "
                        "{'AgeAppropriate': 'not-appropriate', 'response': 'Brief description of the scene (e.g., shows graphic violence, explicit nudity).', 'ConfidenceLevel': 0.95}\n"
                        "If the frame complies with PG-rating guidelines: "
                        "{'AgeAppropriate': 'appropriate', 'response': 'Brief description of the scene (e.g., depicts a serene landscape, no concerning elements).', 'ConfidenceLevel': 0.90}\n"
                        "Ensure your responses are concise and focused on the image's content. Avoid unnecessary details or conversations unrelated to the task."
                    )

                    # LLM call (system message first, then the user message)
                    ai_response = llm.invoke(input=[system_message, human_message])
                    print(ai_response.content)
                    response_dict = ast.literal_eval(ai_response.content)

                    # Parse and validate the response
                    violence_status = response_dict.get("AgeAppropriate")
                    detailed_response = response_dict.get("response")
                    confidence = float(response_dict.get("ConfidenceLevel"))

                    # Display flagged frames
                    if violence_status == "not-appropriate" and confidence >= confidenceThreshold:
                        st.write(f"Frame Analysis: {detailed_response}")
                        st.write(f"Timecode: {timecode}")
                        st.image(frame_path, caption="Analyzing Frame", width=500)
                        violence_detected = True

                except Exception as e:
                    # Surface the failure in the UI so a failed frame is not silently treated as clean
                    st.error(f"Error analyzing frame at {timecode}: {str(e)}")

        # Final result
        if violence_detected:
            st.warning("This movie is NOT PG Rated!")
        else:
            st.success("This movie is PG Rated!")


# Entry point so that `streamlit run` on this file renders the app
if __name__ == "__main__":
    videoAnalyze()
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
streamlit==1.32.2
opencv-python==4.9.0.80
langchain==0.1.13
langchain-community==0.0.30
langchain-core==0.1.32
oracledb==1.4.0
tiktoken==0.6.0
pydantic==1.10.12
