Commit dc6015e

Update architecture diagrams for hybrid Florence2+PaddleOCR flow
1 parent 4a266a5 commit dc6015e

1 file changed: +13 -0 lines changed
README.md

Lines changed: 13 additions & 0 deletions
@@ -95,6 +95,12 @@ MCP-Server :5000 (FastAPI + JSON-RPC)
  ├─ tool_registry_v2 / Schemas
  ├─ Tool validation (server-side)
  └─ Tools: Browser, Vision, OCR, Mouse, Search, File, Memory, Voice, ...
+ |
+ +--> florence2_hybrid_analysis (VisualNemotron v4 vision path)
+ |   ├─ Florence-2: <CAPTION> + <OD> (UI elements + bboxes)
+ |   ├─ PaddleOCR (CPU): text + bboxes + confidence
+ |   ├─ Merge: summary_prompt + ocr_backend status
+ |   └─ Nemotron decision -> PyAutoGUI/MCP action execution
  |
  +--> External systems: Desktop (PyAutoGUI), Browser (Playwright), APIs
  |
@@ -123,6 +129,13 @@ flowchart TD
  B --> M["MCP server 5000 json-rpc"]
  M --> TR["tool_registry_v2 and validation"]
  M --> T["tool modules"]
+ T --> FH["florence2_hybrid_analysis"]
+ FH --> FC["Florence-2 CAPTION + OD"]
+ FH --> PO["PaddleOCR CPU text+bbox+conf"]
+ FC --> FM["merge summary_prompt + ocr_backend"]
+ PO --> FM
+ FM --> ND["Nemotron decision"]
+ ND --> PA["PyAutoGUI and MCP actions"]
  T --> E["desktop browser apis"]
  T --> MM["memory/memory_system.py"]
  MM --> IE["interaction events"]
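
The merge node in the diagrams (Florence-2 caption and detections plus PaddleOCR text feeding a `summary_prompt` and an `ocr_backend` status) could look roughly like the sketch below. It is not taken from the repository: `Box`, `UIElement`, `OCRLine`, `merge_hybrid`, the `min_conf` threshold, and the status strings are hypothetical names chosen for illustration, and the inputs are assumed to be already-parsed model outputs, so neither Florence-2 nor PaddleOCR is actually called. Only the keys `summary_prompt` and `ocr_backend` come from the diagram.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def center(self) -> tuple[float, float]:
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)

    def contains(self, x: float, y: float) -> bool:
        return self.x1 <= x <= self.x2 and self.y1 <= y <= self.y2

    def fmt(self) -> str:
        return f"({self.x1:.0f},{self.y1:.0f},{self.x2:.0f},{self.y2:.0f})"


@dataclass(frozen=True)
class UIElement:
    """One object-detection result: a label and its box."""
    label: str
    box: Box


@dataclass(frozen=True)
class OCRLine:
    """One OCR result: text, box, and confidence in [0, 1]."""
    text: str
    box: Box
    confidence: float


def merge_hybrid(
    caption: str,
    elements: list[UIElement],
    ocr_lines: Optional[list[OCRLine]],
    min_conf: float = 0.5,  # assumed threshold, not from the source
) -> dict[str, str]:
    """Combine caption, detections, and OCR text into one prompt string."""
    if ocr_lines is None:
        # OCR did not run; report that instead of failing the whole analysis.
        backend, remaining = "unavailable", []
    else:
        backend = "paddleocr-cpu"
        remaining = [ln for ln in ocr_lines if ln.confidence >= min_conf]

    parts = [f"Caption: {caption}"]
    for el in elements:
        inside, outside = [], []
        # Attach a text line to the first element whose box contains its centre.
        for ln in remaining:
            (inside if el.box.contains(*ln.box.center()) else outside).append(ln)
        remaining = outside
        text = " ".join(ln.text for ln in inside) or "(no text)"
        parts.append(f"- {el.label} {el.box.fmt()}: {text}")
    # Text that fell outside every detected element is still listed.
    for ln in remaining:
        parts.append(f"- text {ln.box.fmt()}: {ln.text}")

    return {"summary_prompt": "\n".join(parts), "ocr_backend": backend}


if __name__ == "__main__":
    demo = merge_hybrid(
        "A dialog window",
        [UIElement("button", Box(0, 0, 100, 40))],
        [OCRLine("Save", Box(10, 10, 60, 30), 0.97)],
    )
    print(demo["summary_prompt"])
    print(demo["ocr_backend"])
```

Returning the backend status alongside the prompt, as the diagram suggests, lets the decision step tell "no text on screen" apart from "OCR did not run".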
