Commit bf05a3b

adilei and adilei authored
Add functional testing with DeepEval and AgentsSDK (#323)
* PytestAgentsSDK
* Added pytest-html to requirements

Co-authored-by: adilei <[email protected]>
1 parent c8a9965 commit bf05a3b

11 files changed
+1489 -0 lines changed
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
# Copilot Studio and MSAL configuration
APP_CLIENT_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
TENANT_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
ENVIRONMENT_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
# This is the schema name of the agent, found under Settings → Advanced → Metadata
AGENT_IDENTIFIER=cr26e_dMyAgent
Lines changed: 138 additions & 0 deletions
@@ -0,0 +1,138 @@
# PytestAgentsSDK

This project provides a sample test harness for evaluating Copilot Studio agents using [**Pytest**](https://docs.pytest.org/en/stable/) and [**DeepEval**](https://github.com/confident-ai/deepeval). It uses the [Microsoft 365 Agents SDK](https://github.com/microsoft/agents) to communicate with Copilot Studio and focuses on **semantic evaluation** of agent responses using DeepEval’s `GEval` metric.

## Features

- Multi-turn conversation testing against a Copilot Studio agent
- Semantic response evaluation using DeepEval’s `GEval` metric
- Loads test cases from a CSV file
- Custom HTML reporting with detailed metadata (user input, actual and expected output, score, reason)
- Authentication via MSAL, supporting [“Authenticate with Microsoft”](https://learn.microsoft.com/en-us/microsoft-copilot-studio/configuration-end-user-authentication#authenticate-with-microsoft) in Copilot Studio
- Easily extensible for use with additional metrics and long-term result tracking using DeepEval and Pytest plugins

---

## Setup

### **1. Clone the repository**

```bash
git clone https://github.com/microsoft/CopilotStudioSamples.git
cd CopilotStudioSamples/FunctionalTesting/PytestAgentsSDK
```

### **2. Create and activate a virtual environment**

```bash
python3 -m venv .venv
source .venv/bin/activate # On Windows use `.venv\Scripts\activate`
```

### **3. Install required dependencies**

```bash
pip install -r requirements.txt
```

### **4. Create an app registration**

You will need to register an application in Azure for the SDK to authenticate with Copilot Studio:

- Create a **single-tenant** app registration in Azure
- Under **Authentication → Platform configurations**, click **Add a platform**, and select **Mobile and desktop applications**
- Add these redirect URIs:
  - `msal40347a26-35bb-48f3-bdc4-7f4f209aecb1://auth` (MSAL only)
  - `http://localhost`
- Under **API permissions**, click **Add a permission**
- Choose **APIs my organization uses**, then search for **Power Platform API**
- Choose **Delegated permissions**, then add `CopilotStudio.Copilots.Invoke`

### **5. Authentication and Agent details**

Create a `.env` file (you can copy from `.env.template`) and populate it with your MSAL and Copilot Studio agent configuration:

```env
APP_CLIENT_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
TENANT_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
ENVIRONMENT_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
AGENT_IDENTIFIER=cr26e_dMyAgent # This is the schema name, found under Settings > Advanced > Metadata
```
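
A quick way to catch a half-filled `.env` before a test run is to check that all four keys are present and no longer hold the template placeholder. The sketch below is a standard-library illustration only: `parse_env` and `find_problems` are hypothetical helper names, and the sample may load its configuration differently (for example through a dotenv library).

```python
# Hypothetical helpers, not part of the sample: a stdlib-only check of `.env` contents.
REQUIRED_KEYS = ("APP_CLIENT_ID", "TENANT_ID", "ENVIRONMENT_ID", "AGENT_IDENTIFIER")
PLACEHOLDER = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines, comment lines and trailing ' # ...' comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop an inline comment such as "cr26e_dMyAgent # This is the schema name".
        values[key.strip()] = value.split(" #", 1)[0].strip()
    return values


def find_problems(values: dict[str, str]) -> list[str]:
    """Return one message per missing or placeholder value; an empty list means all keys are set."""
    problems = []
    for key in REQUIRED_KEYS:
        value = values.get(key, "")
        if not value:
            problems.append(f"{key} is missing")
        elif value == PLACEHOLDER:
            problems.append(f"{key} still holds the template placeholder")
    return problems
```

Reading the file with `parse_env(open(".env", encoding="utf-8").read())` and failing fast when `find_problems` returns anything keeps configuration mistakes from surfacing later as authentication errors.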

### **6. Configure Azure OpenAI or OpenAI details**

You can use either OpenAI or Azure OpenAI with DeepEval.

#### To configure Azure OpenAI using the DeepEval CLI:

```bash
# Example values:
#   <endpoint>            https://example-resource.openai.azure.com/
#   <model_name>          gpt-4o
#   <deployment_name>     Test Deployment
#   <openai_api_version>  2025-01-01-preview
deepeval set-azure-openai \
    --openai-endpoint="<endpoint>" \
    --openai-api-key="<api_key>" \
    --openai-model-name="<model_name>" \
    --deployment-name="<deployment_name>" \
    --openai-api-version="<openai_api_version>"
```

> These values will be stored in a local `.deepeval` configuration file.

Alternatively, if you're using OpenAI (not Azure), set the following environment variable:

```bash
export OPENAI_API_KEY=<your-openai-key>
```

### **7. Publish and set agent authentication**

Before running tests, ensure that your Copilot Studio agent is:

- **Published** in the Copilot Studio portal
- Configured to use **[Authenticate with Microsoft](https://learn.microsoft.com/en-us/microsoft-copilot-studio/configuration-end-user-authentication#authenticate-with-microsoft)** under **Settings > Security > Authentication**

### **8. Prepare Test Cases (CSV Input)**

Before running the tests, populate the CSV file at `input/test_cases.csv` with your test cases.

The CSV file must contain two columns:

- `input_text`: The message sent to the Copilot Studio agent
- `expected_output`: The ideal response you'd expect from the agent

Wrap any field that contains a comma in double quotes so that it is read as a single column.

#### Example:

```csv
input_text,expected_output
What is the capital of France?,"The capital of France is Paris, which is known for its historical landmarks like the Eiffel Tower and the Louvre Museum."
Who wrote 'Hamlet'?,"William Shakespeare wrote the play 'Hamlet', which is considered one of the greatest works of English literature."
What is the chemical symbol for water?,H2O is the correct chemical symbol for water.
```
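
As an illustration of how a two-column file like this can be read and checked, here is a minimal sketch using Python's standard `csv` module. The function name `load_test_cases` and its error messages are assumptions made for this example; the sample's own loader may differ. The check for extra columns matters because an unquoted comma inside `expected_output` would otherwise split the answer into several fields.

```python
import csv
import io

EXPECTED_COLUMNS = ["input_text", "expected_output"]


def load_test_cases(handle) -> list[tuple[str, str]]:
    """Read (input_text, expected_output) pairs from an open CSV file object."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"expected header {EXPECTED_COLUMNS}, got {reader.fieldnames}")
    cases = []
    for row in reader:
        # DictReader stores surplus fields under the key None, which signals an unquoted comma.
        if None in row:
            raise ValueError(
                f"line {reader.line_num}: too many columns; quote fields that contain commas"
            )
        if row["expected_output"] is None:
            raise ValueError(f"line {reader.line_num}: missing expected_output")
        cases.append((row["input_text"].strip(), row["expected_output"].strip()))
    return cases


# In-memory stand-in for input/test_cases.csv.
sample = io.StringIO(
    "input_text,expected_output\n"
    "Who wrote 'Hamlet'?,\"William Shakespeare wrote the play 'Hamlet', a tragedy.\"\n"
)
cases = load_test_cases(sample)
```

For the real file, pass `open("input/test_cases.csv", newline="", encoding="utf-8")` instead of the `StringIO` object.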

---

## Running the Tests

From the `PytestAgentsSDK` directory, run:

```bash
pytest tests/multi_turn_eval_openai.py --html=reports/multi_turn_eval_openai.html --self-contained-html
```

This will:

- Start a conversation with your Copilot Studio agent
- Send test questions and capture responses
- Evaluate the responses using DeepEval
- Generate a self-contained HTML report in the `reports/` folder
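
The steps in the list above can be pictured as a simple loop: send each question, score the reply against the expected answer, and record a pass or fail against a threshold. The sketch below is not the code in `tests/multi_turn_eval_openai.py`. `run_cases`, `CaseResult`, `fake_agent` and `fake_scorer` are hypothetical names, the agent and scorer are plain callables standing in for the Agents SDK client and DeepEval's `GEval` metric, and the `0.5` threshold is an assumed value.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class CaseResult:
    input_text: str
    expected_output: str
    actual_output: str
    score: float
    reason: str
    passed: bool


def run_cases(
    cases: Iterable[tuple[str, str]],
    ask_agent: Callable[[str], str],
    score_response: Callable[[str, str], tuple[float, str]],
    threshold: float = 0.5,  # assumed pass mark, not taken from the sample
) -> list[CaseResult]:
    """Send each input to the agent, score the reply, and record pass/fail."""
    results = []
    for input_text, expected in cases:
        actual = ask_agent(input_text)
        score, reason = score_response(actual, expected)
        results.append(
            CaseResult(input_text, expected, actual, score, reason, passed=score >= threshold)
        )
    return results


# Stand-ins so the sketch runs offline; a real run would call the agent and an LLM-based metric.
def fake_agent(text: str) -> str:
    return {"Who wrote 'Hamlet'?": "William Shakespeare wrote Hamlet."}.get(text, "I don't know.")


def fake_scorer(actual: str, expected: str) -> tuple[float, str]:
    # Crude substring check, used only as a placeholder for semantic scoring.
    if expected.lower() in actual.lower():
        return 1.0, "expected text found in the reply"
    return 0.0, "expected text not found in the reply"
```

Passing the agent and scorer in as callables keeps the loop independent of any one SDK or metric, so each piece can be swapped or stubbed in tests.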

---

## Output

The HTML report includes:

- Pass/Fail status based on semantic threshold
- User message and expected answer
- Actual response from the agent
- DeepEval score
- Explanation for the result
- Conversation ID (for debugging)
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
input_text,expected_output
What is the capital of France?,"The capital of France is Paris, which is known for its historical landmarks like the Eiffel Tower and the Louvre Museum."
Who wrote 'Hamlet'?,"William Shakespeare wrote the play 'Hamlet', which is considered one of the greatest works of English literature."
What is the chemical symbol for water?,H2O is the correct chemical symbol for water.
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
[pytest]
pythonpath = .
testpaths = tests
render_collapsed = True
