Commit 2598c9b

Merge branch 'main' into witold-swierzy-patch-8
2 parents 1160c3b + 19754f2 commit 2598c9b

540 files changed

+63048
-2522
lines changed

.DS_Store

-6 KB
Binary file not shown.

.gitignore

Lines changed: 11 additions & 1 deletion
@@ -33,4 +33,14 @@ override.tf.json
 terraform.rc
 
 # Terraform lock files
-*.lock.hcl
+*.lock.hcl
+
+# Apple Users DS_Store
+.DS_Store
+
+#VSC files
+.vscode
+
+# Exclude cached Python binary files
+*.pyc
+__pycache__

.vscode/extensions.json

Lines changed: 0 additions & 3 deletions
This file was deleted.

LICENSE renamed to LICENSE.txt

File renamed without changes.

Oracle Cloud Migration Service Template.md

Lines changed: 0 additions & 718 deletions
This file was deleted.

README.md

Lines changed: 42 additions & 7 deletions
@@ -1,22 +1,57 @@
-# Welcome to the Oracle Technology Specialists GitHub Repository
+# Technology Engineering GitHub Repository
 
-## Who are we?
+## Introduction
 
-We are a team of Oracle specialists focussing on technology cloud and software products. We answer questions, create demos and workshops, do hands-on guides, create code snippets and examples, and do health checks on existing solutions. We help our customers to find solutions to business challenges and to move workloads to Oracle cloud.
+### Who are we?
 
-## Why do we have a Git Repository?
+We are a team of Oracle specialists focusing on cloud and software products. We answer questions, create demos and workshops, provide hands-on guides, develop code snippets and examples, and perform health checks on existing solutions. We help our customers find solutions to business challenges and migrate workloads to the Oracle Cloud.
 
-We create all kinds of interesting assets in our line of work, which we believe should be open-source and accessible to everyone. We want to share our architecture patterns for solutions, technical step-by-step guides, and code examples. We believe this will simplify how we work internally in Oracle, while also sharing access to the same assets with our customers, implementation partners, and everyone interested. We want to be transparent on how we work and also grow the quality of our assets and best practices with the contribution coming from the community.
+### Why do we have a Git Repository?
 
-## How to use this Repository?
+We create various interesting assets in our line of work, which we believe should be open-source and accessible to everyone. We want to share our architecture patterns for solutions, technical step-by-step guides, and code examples. We simplify how we work internally in Oracle, while also sharing access to the same assets with our customers, implementation partners, and everyone interested. We want to be transparent on how we work and also grow the quality of our assets and best practices, with contributions coming from the community.
+
+## Installation
+
+This is not a single software product repository, but a collection of various assets (code or not) in one single 'larger' repository. Thus, you won't 'install' this repository; rather, search it to find various interesting assets.
 
 We structure our assets by our internal product areas and reflect this in the folder structure of this repository. The repository will have at least four levels of folders, starting with the first representing a wider product area (like Infrastructure Cloud), followed by a specific product area (like Compute), followed by a product or cloud service (like Bare Metal Compute), and finally by an asset for that product (like How-to-Guides).
 
+As there is a lot of individual content within this repository and under a wider folder hierarchy, discovering that content is not as easy. Here's how you can find content or navigate the folder structure:
+
+- Manually search via navigation through the folder hierarchy
+- Use the search feature of GitHub on the top right-hand side of the browser - it works better if you are logged into your GitHub account. You can search specific code languages and filter content that way. Also, you can search for keywords such as 'Workshop'. Lastly, if you enter the search from a subfolder, you can choose to only search in that folder, hopefully resulting in more useful search results.
+- Use direct links to share a specific asset or folder of the repository.
+
+Some of our assets are code-based and can also be installed. There is always a README file within the asset folder, explaining the installation process.
+
+## Documentation
+
+As per the previous chapter on how to install, we have various assets in this one larger repository. If you find a code asset, it will have a README file for installation, documentation, and example purposes.
+
+## Examples
+
+As per the previous chapter on how to install, we have various assets in this one larger repository. If you find a code asset, it will have a README file for installation, documentation, and example purposes.
+
+See the bullet points from the chapter *Installation* list as an example on how to navigate the repository.
+
+## Help
+
+If you find an error, can't find what you were looking for, or would like to suggest a new asset, please use the Issues feature of GitHub (Account login required).
+
+## Contributing
+
+This project welcomes contributions from the community. Before submitting a pull request, please [review our contribution guide](./CONTRIBUTING.md).
+
+## Security
+
+Please consult the [security guide](./SECURITY.md) for our responsible security vulnerability disclosure process.
+
 ## License
+
 Copyright (c) 2025 Oracle and/or its affiliates.
 
 Licensed under the Universal Permissive License (UPL), Version 1.0.
 
 See [LICENSE](LICENSE) for more details.
 
-ORACLE AND ITS AFFILIATES DO NOT PROVIDE ANY WARRANTY WHATSOEVER, EXPRESS OR IMPLIED, FOR ANY SOFTWARE, MATERIAL OR CONTENT OF ANY KIND CONTAINED OR PRODUCED WITHIN THIS REPOSITORY, AND IN PARTICULAR SPECIFICALLY DISCLAIM ANY AND ALL IMPLIED WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. FURTHERMORE, ORACLE AND ITS AFFILIATES DO NOT REPRESENT THAT ANY CUSTOMARY SECURITY REVIEW HAS BEEN PERFORMED WITH RESPECT TO ANY SOFTWARE, MATERIAL OR CONTENT CONTAINED OR PRODUCED WITHIN THIS REPOSITORY. IN ADDITION, AND WITHOUT LIMITING THE FOREGOING, THIRD PARTIES MAY HAVE POSTED SOFTWARE, MATERIAL OR CONTENT TO THIS REPOSITORY WITHOUT ANY REVIEW. USE AT YOUR OWN RISK.
+ORACLE AND ITS AFFILIATES DO NOT PROVIDE ANY WARRANTY WHATSOEVER, EXPRESS OR IMPLIED, FOR ANY SOFTWARE, MATERIAL OR CONTENT OF ANY KIND CONTAINED OR PRODUCED WITHIN THIS REPOSITORY, AND IN PARTICULAR SPECIFICALLY DISCLAIM ANY AND ALL IMPLIED WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.  FURTHERMORE, ORACLE AND ITS AFFILIATES DO NOT REPRESENT THAT ANY CUSTOMARY SECURITY REVIEW HAS BEEN PERFORMED WITH RESPECT TO ANY SOFTWARE, MATERIAL OR CONTENT CONTAINED OR PRODUCED WITHIN THIS REPOSITORY. IN ADDITION, AND WITHOUT LIMITING THE FOREGOING, THIRD PARTIES MAY HAVE POSTED SOFTWARE, MATERIAL OR CONTENT TO THIS REPOSITORY WITHOUT ANY REVIEW. USE AT YOUR OWN RISK.
Lines changed: 157 additions & 0 deletions
@@ -0,0 +1,157 @@
# MCP Document Understanding Invoice Agent

The **Document Understanding Agent** is an AI-powered assistant designed to extract and understand text from documents (e.g., PDFs, images) using Oracle Cloud Infrastructure (OCI) Generative AI Agents and Document Understanding services.

This tool demonstrates an end-to-end workflow involving:

- File upload (via React frontend)
- File storage in OCI Object Storage
- Text extraction with OCI Document Understanding
- Summary and reasoning via a GenAI Agent powered by MCP (Model Context Protocol)

The architecture is modular and can be easily extended by adding tools directly from the OCI Console, such as a RAG (Retrieval-Augmented Generation) tool or any other custom MCP-compatible tool. This enables more advanced workflows beyond document extraction, such as contextual question answering, validation, enrichment, or classification.

---

## When to use this asset?

Use this assistant when you want to:

- Automatically extract text from scanned documents (images, PDFs)
- Invoke OCI Document Understanding tools through an AI agent
- Demonstrate AI-based document orchestration and validation on OCI

### Ideal for:

- AI developers building document understanding pipelines
- Oracle Cloud users integrating Generative AI with Object Storage
- Showing document AI capabilities

---

## How to use this asset?

### Start the Backend

Navigate to the backend directory and run:

```bash
cd backend
python mcp_server_docunderstandingobjectextract.py
# In a different terminal, also from the backend directory:
uvicorn apiserverdocunderstandingobjectextract:app --reload --port 8001
```

This does the following:

- Starts a local MCP server with a tool (`ocr_extract_from_object_storage2`) that wraps OCI Document Understanding.
- Starts a FastAPI server that handles file uploads and routes them to Object Storage + the agent.
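
Before starting the frontend, it can be useful to confirm that both backend processes are listening. The following is a minimal sketch using only the standard library. The helper name `port_open` is hypothetical; port 8001 comes from the `uvicorn` command above, and port 8000 is assumed from the MCP URL (`http://localhost:8000/mcp`) used in the API server code.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed ports: 8000 for the MCP server, 8001 for the FastAPI app.
    for name, port in (("MCP server", 8000), ("FastAPI app", 8001)):
        state = "listening" if port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

A successful TCP connection only shows that something is listening on the port; it does not verify that the agent itself was set up correctly.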

### Start the Frontend

In a separate terminal:

```bash
cd oci-genai-agent-llama-react-frontend
npm install
npm run dev
```

You will see a chat interface at [http://localhost:3000](http://localhost:3000), with support for file uploads (PDF, PNG, etc.).

When you send a file:

- It is shown as a preview
- It is uploaded to the backend
- It is saved in OCI Object Storage
- It is routed to the GenAI Agent with an instruction like:

```
Extract text from object storage. Namespace: <namespace>, Bucket: <bucket>, Name: <filename>
```
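
The instruction format above can be sketched as a small pair of helpers. This is an illustration only: `build_instruction` mirrors the f-string used in the API server, while `parse_instruction` is a hypothetical helper for tests or logging (the agent itself interprets the message, it does not run this regex).

```python
import re

# Same wording as the instruction shown above.
INSTRUCTION_TEMPLATE = (
    "Extract text from object storage. "
    "Namespace: {namespace}, Bucket: {bucket}, Name: {name}"
)

# Hypothetical pattern for reading the three fields back out of a message.
_PATTERN = re.compile(
    r"Namespace: (?P<namespace>[^,]+), Bucket: (?P<bucket>[^,]+), Name: (?P<name>.+)$"
)


def build_instruction(namespace: str, bucket: str, name: str) -> str:
    """Fill the instruction template with the object's location."""
    return INSTRUCTION_TEMPLATE.format(namespace=namespace, bucket=bucket, name=name)


def parse_instruction(message: str) -> dict:
    """Recover namespace, bucket, and name from an instruction message."""
    match = _PATTERN.search(message)
    if match is None:
        raise ValueError("message does not follow the expected instruction format")
    return match.groupdict()


if __name__ == "__main__":
    msg = build_instruction("mytenancy", "bucket-20250714-1419", "invoice.pdf")
    print(msg)
    print(parse_instruction(msg))
```

The pattern assumes namespace and bucket names contain no commas; the object name may contain them because it is matched to the end of the message.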

---

## ⚙️ Setup Instructions

### 1. OCI Config

Set the following in `~/.oci/config`:

```ini
[DEFAULT]
user=ocid1.user.oc1..exampleuniqueID
fingerprint=xx:xx:xx:xx
key_file=~/.oci/oci_api_key.pem
tenancy=ocid1.tenancy.oc1..exampleuniqueID
region=us-chicago-1
```
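
A quick way to catch a missing entry before starting the backend is to check the file for the five keys listed above. The sketch below uses the standard-library `configparser` rather than the OCI SDK, so it only checks that the keys are present, not that the credentials are valid. The helper name `missing_keys` is hypothetical; the backend itself loads this file with `oci.config.from_file()`.

```python
import configparser

# The five keys shown in the sample profile above.
REQUIRED_KEYS = ("user", "fingerprint", "key_file", "tenancy", "region")


def missing_keys(config_text: str, profile: str = "DEFAULT") -> list:
    """Return the required keys that are absent or empty in the given profile."""
    # Interpolation is disabled so that '%' characters in values are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    section = parser[profile]
    return [key for key in REQUIRED_KEYS if not section.get(key)]


if __name__ == "__main__":
    import os

    path = os.path.expanduser("~/.oci/config")
    with open(path, encoding="utf-8") as handle:
        print("Missing keys:", missing_keys(handle.read()))
```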

### 2. Object Storage Setup

- Create a bucket (e.g., `bucket-20250714-1419`)
- Make sure the user has permission to `put_object` and `get_namespace`
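
These two permissions correspond to the two Object Storage calls the backend makes when a file arrives. A minimal sketch of that upload step follows. The function name `upload_bytes` and the `FakeObjectStorageClient` class are hypothetical; the fake is an in-memory stand-in for local testing and is not part of the OCI SDK. With real credentials you would pass `oci.object_storage.ObjectStorageClient(oci.config.from_file())` instead, as the API server does.

```python
from types import SimpleNamespace


def upload_bytes(obj_client, bucket: str, name: str, data: bytes) -> str:
    """Upload data to the bucket and return the namespace that was used."""
    # Requires permission for get_namespace.
    namespace = obj_client.get_namespace().data
    # Requires permission for put_object.
    obj_client.put_object(namespace, bucket, name, data)
    return namespace


class FakeObjectStorageClient:
    """Hypothetical in-memory stand-in exposing the two methods used above."""

    def __init__(self, namespace: str):
        self._namespace = namespace
        self.objects = {}

    def get_namespace(self):
        # The real client returns a response object whose .data is the namespace.
        return SimpleNamespace(data=self._namespace)

    def put_object(self, namespace, bucket, name, body):
        self.objects[(namespace, bucket, name)] = body


if __name__ == "__main__":
    fake = FakeObjectStorageClient("mytenancy")
    print(upload_bytes(fake, "bucket-20250714-1419", "invoice.pdf", b"%PDF"))
```

If either permission is missing, the corresponding call on the real client raises an error, which the API server reports as an upload failure.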

### 3. MCP Tooling

- `mcp_server_docunderstandingobjectextract.py` exposes a tool `ocr_extract_from_object_storage2`
- This is picked up by the agent automatically on FastAPI startup

---

## ✨ Key Features

| Feature                | Description                                                       |
| ---------------------- | ----------------------------------------------------------------- |
| File Upload            | Upload images or PDFs through the React UI                        |
| OCI Object Storage     | All files are stored securely in your OCI bucket                  |
| OCR Extraction         | Uses OCI Document Understanding to extract text from scanned docs |
| GenAI Agent            | Routes and responds to user requests intelligently                |
| Tool Orchestration     | Agent can invoke tools dynamically (via MCP)                      |
| Natural Language Reply | AI explains extracted results in human-readable format            |

---

## Prompt Customization

The main agent prompt is set as:

```
If the user wants to extract text from a document in Object Storage,
call the `ocr_extract_from_object_storage2` tool with the `namespace`, `bucket`, and `name`.
```

You can modify this in `apiserverdocunderstandingobjectextract.py` under `Agent(...)` setup.

---

## Useful for:

- Demos of OCI Document Understanding + GenAI Agents
- Building your own document processing pipeline
- AI chatbots that take file input and analyze content

---

## Directory Structure

```bash
backend/
├── apiserverdocunderstandingobjectextract.py   # FastAPI app
├── mcp_server_docunderstandingobjectextract.py # MCP server with document OCR tool

oci-genai-agent-llama-react-frontend/
├── src/
│   └── app/
│       └── contexts/
│           └── ChatContext.js                  # Hooks into backend API
```

---

## License

Copyright (c) 2025 Oracle and/or its affiliates.
Licensed under the Universal Permissive License (UPL), Version 1.0.

See LICENSE for more details.
@@ -0,0 +1,86 @@
# backend/apiserver.py
import logging
from fastapi import FastAPI, UploadFile, File, Form
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from rich.logging import RichHandler
from rich.markup import escape

import oci
from mcp.client.session_group import StreamableHttpParameters
from oci.addons.adk import Agent, AgentClient
from oci.addons.adk.mcp import MCPClientStreamableHttp

# --- Logging ---
logging.basicConfig(level=logging.INFO, format="%(message)s", handlers=[RichHandler()])
logger = logging.getLogger(__name__)

# --- FastAPI Setup ---
app = FastAPI()
app.add_middleware(CORSMiddleware, allow_origins=["*"], allow_methods=["*"], allow_headers=["*"])

BUCKET_NAME = "bucket-20250714-1419"

@app.on_event("startup")
async def startup_event():
    logger.info("Connecting to MCP…")
    mcp_params = StreamableHttpParameters(url="http://localhost:8000/mcp")
    mcp_client = await MCPClientStreamableHttp(params=mcp_params, name="Smart Toolbox MCP").__aenter__()

    client = AgentClient(auth_type="api_key", debug=False, region="us-chicago-1")
    agent = Agent(
        client=client,
        agent_endpoint_id="ocid1.genaiagentendpoint",
        instructions=(
            "If the user wants to extract text from a document in Object Storage, "
            "call the `ocr_extract_from_object_storage2` tool with the `namespace`, `bucket`, and `name`."

        ),
        tools=[await mcp_client.as_toolkit()],
    )
    agent.setup()
    app.state.agent = agent
    logger.info(" Agent ready.")

@app.post("/chat")
async def chat(message: str = Form(...), file: UploadFile = File(None)):
    if file:
        try:
            file_bytes = await file.read()
            file_name = file.filename

            config = oci.config.from_file()
            obj_client = oci.object_storage.ObjectStorageClient(config)
            namespace = obj_client.get_namespace().data

            # Upload file
            obj_client.put_object(namespace, BUCKET_NAME, file_name, file_bytes)
            logger.info(f" Uploaded {file_name} to {BUCKET_NAME}")

            # Inject the correct call instruction to the agent
            message = f"Extract text from object storage. Namespace: {namespace}, Bucket: {BUCKET_NAME}, Name: {file_name}"

        except Exception as e:
            logger.exception(" Upload failed")
            return JSONResponse({"error": f"Upload failed: {str(e)}"}, status_code=500)

    try:
        result = await app.state.agent.run_async(message)

        if isinstance(result.output, dict) and result.output.get("type") == "function":
            name = result.output["name"]
            args = result.output["parameters"]
            logger.info(" Agent calls tool: %s(%r)", name, args)

            tool_out = await app.state.agent.invoke_tool(name, args)
            followup = await app.state.agent.run_async({"type": "tool", "name": name, "output": tool_out})
            out = followup.output if not isinstance(followup.output, dict) else followup.output.get("text", "")
        else:
            out = result.output if not isinstance(result.output, dict) else result.output.get("text", "")

        logger.info(" Replying: %s", escape(out))
        return JSONResponse({"text": out})

    except Exception:
        logger.exception(" Chat error")
        return JSONResponse({"error": "internal error"}, status_code=500)
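
The `/chat` handler above inspects the agent output in two places: once to detect a tool call and once to pull out the reply text. That logic can be sketched as two small pure functions, which makes it easy to test without an agent. The names `is_tool_call` and `output_text` are hypothetical, and the output shapes (a plain string, or a dict with `type`, `name`, `parameters`, or `text`) are assumed only from how the handler reads them, not from the ADK documentation.

```python
def is_tool_call(output) -> bool:
    """True when the output looks like a function-call request, as checked in the handler."""
    return isinstance(output, dict) and output.get("type") == "function"


def output_text(output):
    """Return the reply text: the output itself, or its 'text' field when it is a dict."""
    return output if not isinstance(output, dict) else output.get("text", "")


if __name__ == "__main__":
    print(is_tool_call({"type": "function", "name": "ocr_extract_from_object_storage2", "parameters": {}}))
    print(output_text({"text": "Invoice total: 120.00"}))
    print(output_text("plain reply"))
```

Keeping these checks in one place would also avoid repeating the same conditional expression for both `result.output` and `followup.output`.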
