Commit 577a764

Add high-level query() method. Update message index in add_messages_with_indexing
1 parent a74c94d commit 577a764

6 files changed: +405 -4 lines changed

TADA.md

Lines changed: 2 additions & 3 deletions
@@ -4,7 +4,7 @@ Talk at PyBay is on Sat, Oct 18 in SF

 ## Software

-- Design and implement high-level API to support ingestion and querying
+- Rename utool.py to query.py
 - Unify Podcast and VTT ingestion (use shared message and metadata classes)?
 - Code structure (do podcasts and transcripts need to be under typeagent?)?
 - Distinguish between release deps and build/dev deps?
@@ -33,11 +33,10 @@ Talk at PyBay is on Sat, Oct 18 in SF

 - Getting Started
 - Document the high-level API
-- Document the MCP API [NOT YET]
 - Document what should go in `.env` and where it should live
 - And alternatively (first?) what to put in shell env directly
 - Document test/build/release process
-- Document how to run evals (but don't reveal all the data)
+- Document how to run evaluations (but don't reveal all the data)

 ## Demos

docs/query-method.md

Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
# Conversation Query Method

The `query()` method provides a simple, end-to-end API for querying conversations using natural language.

## Usage

```python
from typeagent import create_conversation
from typeagent.transcripts.transcript import TranscriptMessage

# Create a conversation
conv = await create_conversation(
    "my_conversation.db",
    TranscriptMessage,
    name="My Conversation",
)

# Add messages
messages: list[TranscriptMessage] = [...]
await conv.add_messages_with_indexing(messages)

# Query the conversation
question: str = input("typeagent> ")
answer: str = await conv.query(question)
print(answer)
```

## How It Works

The `query()` method encapsulates the full TypeAgent query pipeline:

1. **Natural Language Understanding**: Uses TypeChat to translate the natural language question into a structured search query
2. **Search**: Executes the search across the conversation's messages and knowledge base
3. **Answer Generation**: Uses an LLM to generate a natural language answer based on the search results
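
The three stages above can be pictured as a chain of awaitable steps. The following is only a minimal sketch with stand-in logic: `SearchQuery`, `translate`, `search`, `generate_answer`, and `query_sketch` are hypothetical names invented for illustration, and none of them is part of the TypeAgent or TypeChat API. Keyword matching replaces both the TypeChat translation and the LLM call.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SearchQuery:
    """Hypothetical structured form of a question (not a TypeAgent type)."""

    terms: list[str]


async def translate(question: str) -> SearchQuery:
    # Stand-in for stage 1: keep the longer words as search terms.
    words = [w.strip("?.,!").lower() for w in question.split()]
    return SearchQuery(terms=[w for w in words if len(w) > 3])


async def search(query: SearchQuery, messages: list[str]) -> list[str]:
    # Stand-in for stage 2: plain substring matching over the messages.
    return [m for m in messages if any(t in m.lower() for t in query.terms)]


async def generate_answer(question: str, hits: list[str]) -> str:
    # Stand-in for stage 3: join the hits instead of calling an LLM.
    if not hits:
        return "No answer found: nothing in the conversation matched."
    return " ".join(hits)


async def query_sketch(question: str, messages: list[str]) -> str:
    """Run the three stand-in stages in order and return a string."""
    structured = await translate(question)
    hits = await search(structured, messages)
    return await generate_answer(question, hits)
```

Each stage only consumes the output of the previous one, which is why the whole pipeline can sit behind a single method that takes a question and returns a string.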

## Method Signature

```python
async def query(self, question: str) -> str:
    """
    Run an end-to-end query on the conversation.

    Args:
        question: The natural language question to answer

    Returns:
        A natural language answer string. If the answer cannot be determined,
        returns an explanation of why no answer was found.
    """
```

## Behavior

- **Success**: Returns a natural language answer synthesized from the conversation content
- **No Answer Found**: Returns a message explaining why the answer couldn't be determined
- **Search Failure**: Returns an error message describing the failure
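
Because all three outcomes come back as plain strings, a caller that needs to tell them apart has to inspect the text. The sketch below is a heuristic only: `NO_ANSWER_MARKERS` and `looks_like_no_answer` are hypothetical names, and the marker phrases are borrowed from the assertions in `test/test_query_method.py` in this commit. The exact wording of the returned messages is not specified here, so treat the phrases as assumptions.

```python
# Phrases assumed to signal "no answer"; taken from this commit's tests,
# not from a documented contract.
NO_ANSWER_MARKERS = ("no answer", "unable to find")


def looks_like_no_answer(answer: str) -> bool:
    """Heuristically detect a 'no answer' reply by its wording."""
    lowered = answer.lower()
    return any(marker in lowered for marker in NO_ANSWER_MARKERS)
```

A check like this is fragile if the message wording changes, so it suits logging or tests better than control flow.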

## Performance Considerations

The `query()` method caches its TypeChat translators on the conversation instance, so repeated queries on the same conversation reuse them instead of creating new ones.
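
The general pattern behind this kind of per-instance caching is lazy initialization: build the expensive object on first use and keep it on the instance. The sketch below illustrates that pattern only. `CachedTranslatorConversation`, `_get_translator`, and `build_count` are hypothetical names, and the real implementation may differ.

```python
import asyncio
from typing import Optional


class CachedTranslatorConversation:
    """Hypothetical stand-in for a conversation; not the TypeAgent class."""

    def __init__(self) -> None:
        self._translator: Optional[object] = None  # built lazily, then reused
        self.build_count = 0  # counts how often the expensive setup ran

    def _get_translator(self) -> object:
        # Build the translator on first use only; later calls reuse it.
        if self._translator is None:
            self._translator = object()  # placeholder for real translator setup
            self.build_count += 1
        return self._translator

    async def query(self, question: str) -> str:
        self._get_translator()
        return f"(stub answer to: {question})"


async def demo() -> int:
    """Issue two queries on one instance and report how many builds ran."""
    conv = CachedTranslatorConversation()
    await conv.query("What was discussed?")
    await conv.query("Who were the speakers?")
    return conv.build_count
```

In this sketch the cache lives on the instance, so a second conversation object would build its own translator.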

## Example: Interactive Loop

```python
while True:
    question: str = input("typeagent> ")
    if not question.strip():
        continue
    # Strip whitespace so that "quit " also exits
    if question.strip().lower() in ("quit", "exit"):
        break

    answer: str = await conv.query(question)
    print(answer)
```

## Example: Batch Processing

```python
questions = [
    "What was discussed?",
    "Who were the speakers?",
    "What topics came up?",
]

for question in questions:
    answer = await conv.query(question)
    print(f"Q: {question}")
    print(f"A: {answer}")
    print()
```

## Related APIs

For more control over the query pipeline, you can use the lower-level APIs:

- `searchlang.search_conversation_with_language()` - Search only
- `answers.generate_answers()` - Answer generation from search results

See `tools/utool.py` for examples of using these lower-level APIs with debugging options.

examples/simple_query_demo.py

Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
#!/usr/bin/env python3
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

"""
Simple demo of the conversation.query() method.

This demonstrates the end-to-end query pattern:
    question = input("typeagent> ")
    answer = await conv.query(question)
    print(answer)
"""

import asyncio

from typeagent import create_conversation
from typeagent.aitools.embeddings import AsyncEmbeddingModel
from typeagent.aitools.utils import load_dotenv
from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.transcripts.transcript import TranscriptMessage, TranscriptMessageMeta


async def main():
    """Demo the simple query API."""
    # Load API keys
    load_dotenv()

    # Create a conversation with some sample content
    print("Creating conversation...")
    conv = await create_conversation(
        None,
        TranscriptMessage,
        name="Demo Conversation",
    )

    # Add some sample messages
    messages = [
        TranscriptMessage(
            text_chunks=["Welcome to the Python programming tutorial."],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
        TranscriptMessage(
            text_chunks=["Today we'll learn about async/await in Python."],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
        TranscriptMessage(
            text_chunks=[
                "Python is a great language for beginners and experts alike."
            ],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
        TranscriptMessage(
            text_chunks=["The async keyword is used to define asynchronous functions."],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
        TranscriptMessage(
            text_chunks=[
                "You use await to wait for asynchronous operations to complete."
            ],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
    ]

    print("Adding messages and building indexes...")
    result = await conv.add_messages_with_indexing(messages)
    print(f"Conversation ready with {await conv.messages.size()} messages.")
    print(f"Added {result.messages_added} messages, {result.semrefs_added} semantic refs")

    # Check indexes
    if conv.secondary_indexes:
        if conv.secondary_indexes.message_index:
            msg_index_size = await conv.secondary_indexes.message_index.size()
            print(f"Message index has {msg_index_size} entries")
    print()

    # Interactive query loop
    print("You can now ask questions about the conversation.")
    print("Type 'quit' or 'exit' to stop.\n")

    while True:
        try:
            question: str = input("typeagent> ")
            if not question.strip():
                continue
            if question.strip().lower() in ("quit", "exit", "q"):
                break

            # This is the simple API pattern
            answer: str = await conv.query(question)
            print(answer)
            print()

        except EOFError:
            print()
            break
        except KeyboardInterrupt:
            print("\nExiting...")
            break


if __name__ == "__main__":
    asyncio.run(main())

test/test_query_method.py

Lines changed: 85 additions & 0 deletions
@@ -0,0 +1,85 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

"""Test the conversation.query() method."""

import pytest

from typeagent import create_conversation
from typeagent.aitools.embeddings import AsyncEmbeddingModel, TEST_MODEL_NAME
from typeagent.aitools.utils import load_dotenv
from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.transcripts.transcript import TranscriptMessage, TranscriptMessageMeta


@pytest.fixture(scope="session")
def needs_auth() -> None:
    """Load environment variables for authentication."""
    load_dotenv()


@pytest.mark.asyncio
async def test_query_method_basic(needs_auth: None):
    """Test the basic query method workflow."""
    # Create a conversation with some test data
    test_model = AsyncEmbeddingModel(model_name=TEST_MODEL_NAME)
    settings = ConversationSettings(model=test_model)
    conversation = await create_conversation(
        None,
        TranscriptMessage,
        name="Test Conversation",
        settings=settings,
    )

    # Add some test messages
    messages = [
        TranscriptMessage(
            text_chunks=["Welcome to the Python programming tutorial."],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
        TranscriptMessage(
            text_chunks=["Today we'll learn about async/await in Python."],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
        TranscriptMessage(
            text_chunks=["Python is a great language for beginners."],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
    ]

    await conversation.add_messages_with_indexing(messages)

    # Test the query method
    answer = await conversation.query("What programming language is discussed?")

    # Verify we got a response (content depends on indexing and LLM behavior)
    assert isinstance(answer, str)
    assert len(answer) > 0
    # The answer should either mention Python or indicate no answer was found
    # Both are valid since indexing might not extract all knowledge
    assert (
        "python" in answer.lower()
        or "no answer" in answer.lower()
        or "unable to find" in answer.lower()
    )


@pytest.mark.asyncio
async def test_query_method_empty_conversation(needs_auth: None):
    """Test query method on an empty conversation."""
    test_model = AsyncEmbeddingModel(model_name=TEST_MODEL_NAME)
    settings = ConversationSettings(model=test_model)
    conversation = await create_conversation(
        None,
        TranscriptMessage,
        name="Empty Conversation",
        settings=settings,
    )

    # Query should handle empty conversation gracefully
    answer = await conversation.query("What was discussed?")

    assert isinstance(answer, str)
    assert len(answer) > 0
    # Should indicate no answer found or no relevant information
    assert "no answer" in answer.lower() or "unable to find" in answer.lower()
