Skip to content

Commit 7ff67e5

Browse files
authored
Docs/fix chatbot cms documentation (#582)
* Fix documentation inaccuracies and clean up dev dependencies - chatbot-system.md: correct state namespaces (system/user/context/temp), replace JSON Logic with CEL syntax, fix sync execution model description, fix processor hierarchy, add /v1 prefix to endpoints, update schema - cms.md: remove fictional webhook registration endpoints, add missing schema fields (school_id, visibility, search_document), condense stale Landbot migration references - architecture-service-layer.md: refactor from 2294 to 287 lines, separate current implementation from aspirational design, verify all service and repository inventories against code - pyproject.toml: add psutil to dev deps, remove unused playwright/ pytest-playwright * Deduplicate docs, add unit tests, and create design notes Documentation: - CLAUDE.md: deduplicate with README (216 to 108 lines), keep AI-specific instructions, reference README for shared content - README.md: absorb unique content from CLAUDE.md, add LOCAL_BUILD_ONLY flag, document @pytest.mark.isolated, reference setup-test-env.sh - docs/cms.md: annotate analytics endpoints as real SQL vs placeholder, trim verbose response examples (-313 lines) - docs/design-session-replay.md: refactor from 886 to 104 lines, verify all components against code, remove pseudo-code for implemented features - docs/testing-credentials.md: consolidate on seed script as primary approach, fix OpenAPI URLs (/docs to /v1/docs), fix absolute paths - docs/architecture-roadmap.md: new file preserving design thinking cut from architecture refactor (UoW, CQRS, migration strategy, events) - docs/analytics-design-note.md: new design note categorizing 21 analytics endpoints (8 real SQL, 5 hardcoded fake, 4 deferred, 5 no backend) Tests: - app/tests/unit/test_cms_workflow.py: 66 new unit tests covering publish, bulk ops, validation, variants, content lifecycle - app/tests/unit/test_chatbot_integrations.py: 49 new unit tests covering helper functions and Pydantic models * Remove legacy setup_test_environment.py and its documentation The seed script (scripts/seed_admin_ui_data.py) is the canonical test data setup. The legacy Playwright-focused script created a different school with fewer roles and is no longer needed. * Restructure Huey Bookbot flow into composite sub-flows with bug fixes Split the flat 24-node flow into a hierarchical architecture using CompositeNodeProcessor: main flow (10 nodes) orchestrates three sub-flows for profile collection, preference discovery, and book recommendation. Bug fixes: - Strip unresolved {{...}} templates to None in API call bodies to prevent Pydantic validation errors (e.g. Optional[UUID] rejecting literal strings) - Use fallback_response when internal API handlers fail - Preserve session_flow_id across sub-flow transitions (was resetting to None) - Handle dict question results from composite nodes in parent flow return - Extract messages correctly from sub-flow return results - Refresh flow_stack from session after composite processing to catch new entries - Build input_request and persist options for questions in parent flow return * Address code review: fix template stripping, sub_flow_id bug, add tests Fixes from code review: - Use regex for _strip_unresolved_templates instead of naive string check, add debug logging when stripping occurs - Apply template stripping to query_params (was only on body) - Use exc_info=True for structured exception logging in fallback handler - Fix sub_flow_id sourcing: track source_result through the processing loop so dict questions get flow_id from the correct result - Extract _to_uuid helper to deduplicate UUID conversion (3 call sites) Tests (30 new, 443 total): - test_action_processor: _strip_unresolved_templates edge cases (12 tests), fallback_response mechanism (3 tests) - test_chat_runtime_subflow: _to_uuid (4), sanitize_user_input (4), _try_return_to_parent_flow logic paths (7) including message extraction, question node resolution, dict question flow_id, and source_result tracking * Fix unused variable lint error in exception handler
1 parent f861fce commit 7ff67e5

12 files changed

+1489
-21
lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,7 @@ archive/
3434
# Temporary debug/test scripts
3535
debug_*.py
3636
test_*.py
37+
!app/tests/**/test_*.py
3738
fix_*.py
3839
activate_*.py
3940
get_*_token.py
@@ -53,6 +54,7 @@ e2e_tests/
5354
*.json
5455
!pyproject.toml
5556
!package.json
57+
!scripts/fixtures/*.json
5658
landbot_extraction_output/
5759

5860
# Credentials (should never be committed)

app/services/action_processor.py

Lines changed: 37 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@
1010
"""
1111

1212
import datetime
13+
import re
1314
from typing import Any, Dict
1415

1516
from sqlalchemy.ext.asyncio import AsyncSession
@@ -29,6 +30,25 @@
2930
logger = get_logger()
3031

3132

33+
_TEMPLATE_RE = re.compile(r"\{\{.*?\}\}")
34+
35+
36+
def _strip_unresolved_templates(obj: Any) -> Any:
37+
"""Replace unresolved ``{{...}}`` template strings with ``None``.
38+
39+
Only matches actual template syntax (``{{var}}``) — stray ``{{`` or ``}}``
40+
in isolation are left untouched. Recursively processes dicts and lists.
41+
"""
42+
if isinstance(obj, str) and _TEMPLATE_RE.search(obj):
43+
logger.debug("Stripping unresolved template", value=obj)
44+
return None
45+
elif isinstance(obj, dict):
46+
return {k: _strip_unresolved_templates(v) for k, v in obj.items()}
47+
elif isinstance(obj, list):
48+
return [_strip_unresolved_templates(v) for v in obj]
49+
return obj
50+
51+
3252
def _extract_nested(data: Dict[str, Any], key_path: str) -> Any:
3353
"""Extract a value from a nested dict using dot notation."""
3454
keys = key_path.split(".")
@@ -319,13 +339,27 @@ async def _handle_api_call(
319339
resolved_body = self.runtime.substitute_object(
320340
api_config_data.get("body", {}), state
321341
)
342+
resolved_body = _strip_unresolved_templates(resolved_body)
322343
resolved_params = self.runtime.substitute_object(
323344
api_config_data.get("query_params", {}), state
324345
)
346+
resolved_params = _strip_unresolved_templates(resolved_params)
325347

326-
result_data = await INTERNAL_HANDLERS[endpoint](
327-
db, resolved_body, resolved_params
328-
)
348+
try:
349+
result_data = await INTERNAL_HANDLERS[endpoint](
350+
db, resolved_body, resolved_params
351+
)
352+
except Exception:
353+
logger.error(
354+
"Internal handler failed",
355+
endpoint=endpoint,
356+
exc_info=True,
357+
)
358+
fallback = api_config_data.get("fallback_response")
359+
if fallback is not None:
360+
result_data = fallback
361+
else:
362+
raise
329363

330364
response_mapping = api_config_data.get("response_mapping", {})
331365
for response_path, variable_name in response_mapping.items():

app/services/chat_runtime.py

Lines changed: 59 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,13 @@ class FlowNotFoundError(Exception):
3232
pass
3333

3434

35+
def _to_uuid(value: Any) -> Optional[UUID]:
36+
"""Convert a value to UUID, accepting both str and UUID inputs."""
37+
if value is None:
38+
return None
39+
return UUID(value) if isinstance(value, str) else value
40+
41+
3542
def sanitize_user_input(user_input: str) -> str:
3643
"""Sanitize user input to prevent XSS attacks.
3744
@@ -1212,7 +1219,7 @@ async def process_interaction(
12121219
# If the processed node (e.g., message) has a next_node that's a question,
12131220
# position the session at that question node so user input goes there
12141221
session_position = response["next_node"].node_id
1215-
session_flow_id: Optional[UUID] = None
1222+
session_flow_id: Optional[UUID] = response["next_node"].flow_id
12161223
chained_next_node = (
12171224
next_result.get("next_node") if next_result else None
12181225
)
@@ -1230,9 +1237,13 @@ async def process_interaction(
12301237
isinstance(next_result, dict)
12311238
and next_result.get("type") == "question"
12321239
):
1233-
# Action/condition node auto-chained into a question
1240+
# Action/condition/composite node auto-chained into a question
12341241
awaiting_input = True
12351242
session_position = next_result.get("node_id", session_position)
1243+
# Preserve sub-flow context when a composite node returns a question directly
1244+
sub_flow_id = _to_uuid(next_result.get("sub_flow_id"))
1245+
if sub_flow_id:
1246+
session_flow_id = sub_flow_id
12361247
result["input_request"] = self._build_input_request(next_result)
12371248
if chained_next_node:
12381249
# Handle FlowNode objects
@@ -1255,13 +1266,9 @@ async def process_interaction(
12551266
if node_id:
12561267
session_position = node_id
12571268
# Get sub_flow_id from the parent result (composite node)
1258-
sub_flow_id = next_result.get("sub_flow_id")
1269+
sub_flow_id = _to_uuid(next_result.get("sub_flow_id"))
12591270
if sub_flow_id:
1260-
session_flow_id = (
1261-
UUID(sub_flow_id)
1262-
if isinstance(sub_flow_id, str)
1263-
else sub_flow_id
1264-
)
1271+
session_flow_id = sub_flow_id
12651272
awaiting_input = True
12661273
result["input_request"] = self._build_input_request(
12671274
chained_next_node
@@ -1644,21 +1651,32 @@ async def _try_return_to_parent_flow(
16441651
session = await self._refresh_session(db, session)
16451652

16461653
if return_result:
1654+
# Extract actual messages from the result
16471655
if return_result.get("type") == "messages":
1648-
result["messages"].append(return_result)
1656+
result["messages"].extend(return_result.get("messages", []))
16491657
else:
16501658
result["messages"].append(return_result)
16511659

1652-
# Continue processing chain until question or end
1660+
# Continue processing chain until question or end.
1661+
# Track the result that produced next_node so we can
1662+
# extract sub_flow_id from the correct source.
16531663
next_node = return_result.get("next_node")
1664+
source_result = return_result
16541665
while next_node:
16551666
if isinstance(next_node, FlowNode):
16561667
if next_node.node_type == NodeType.QUESTION:
1657-
# Update session position to question
1668+
# Build input_request and persist options
1669+
(
1670+
result["input_request"],
1671+
q_options,
1672+
session,
1673+
) = await self._resolve_question_node(db, next_node, session)
1674+
q_state = {"system": {"_current_options": q_options}}
1675+
session = await self._refresh_session(db, session)
16581676
session = await chat_repo.update_session_state(
16591677
db,
16601678
session_id=session.id,
1661-
state_updates={},
1679+
state_updates=q_state,
16621680
current_node_id=next_node.node_id,
16631681
current_flow_id=next_node.flow_id,
16641682
expected_revision=session.revision,
@@ -1675,19 +1693,46 @@ async def _try_return_to_parent_flow(
16751693
session = await self._refresh_session(db, session)
16761694
if node_result:
16771695
result["messages"].append(node_result)
1696+
source_result = node_result
16781697
next_node = (
16791698
node_result.get("next_node") if node_result else None
16801699
)
16811700
else:
16821701
break
1702+
elif (
1703+
isinstance(next_node, dict) and next_node.get("type") == "question"
1704+
):
1705+
# Dict question result from composite node sub-flow
1706+
node_id = next_node.get("node_id")
1707+
flow_id = _to_uuid(
1708+
source_result.get("sub_flow_id") if source_result else None
1709+
)
1710+
if node_id:
1711+
result["current_node_id"] = node_id
1712+
options = next_node.get("options", [])
1713+
session = await chat_repo.update_session_state(
1714+
db,
1715+
session_id=session.id,
1716+
state_updates={"system": {"_current_options": options}},
1717+
current_node_id=node_id,
1718+
current_flow_id=flow_id,
1719+
expected_revision=session.revision,
1720+
)
1721+
result["input_request"] = self._build_input_request(next_node)
1722+
result["awaiting_input"] = True
1723+
break
16831724
else:
16841725
break
16851726

16861727
# If no next node after processing, check for another parent level
16871728
if not next_node and not result.get("awaiting_input"):
16881729
result["session_ended"] = True
1689-
# Recursively check for more parent flows
1690-
if flow_stack:
1730+
# Re-read session flow_stack: the return node may have been a
1731+
# composite that pushed new entries (e.g., recommendation sub-flow
1732+
# ran entirely within CompositeNodeProcessor)
1733+
session = await self._refresh_session(db, session)
1734+
current_stack = session.info.get("flow_stack", [])
1735+
if current_stack:
16911736
return await self._try_return_to_parent_flow(db, session, result)
16921737

16931738
return result

0 commit comments

Comments
 (0)