Stabilize disabled-tool real API CI test by cursor[bot] · Pull Request #587 · mattermost/mattermost-plugin-agents

cursor · 2026-04-01T18:16:58Z

Summary

The failing CI check was e2e-tool-config-real-apis. The failed job log for job 69552028161 shows that the actual failure was not the Node 20 deprecation warning in the summary output but a flaky assertion in e2e/tests/tool-config/real-api/disabled-tool.spec.ts.

This PR updates that real-API Playwright test to assert the behavior we actually need to guarantee:

read_post is filtered out of the embedded tool list
the disabled Read Post tool does not surface as a tool invocation in the RHS

It no longer fails when a real provider answers by using other still-enabled tools (for example channel-history retrieval), which is a valid outcome and was causing the OpenAI variant to fail intermittently.
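The relaxed check described above can be sketched as a small helper. This is only an illustration, not the code in disabled-tool.spec.ts: the `ToolInvocation` shape, the `findDisabledToolViolations` function, and the `get_channel_history` tool name are all hypothetical, and the real spec would collect the tool list and RHS invocations through Playwright.

```typescript
// Hypothetical shape for a tool invocation rendered in the RHS.
interface ToolInvocation {
  name: string;
}

// Returns a list of violations; an empty list means the run is acceptable.
// Invocations of other, still-enabled tools are deliberately not checked.
function findDisabledToolViolations(
  embeddedTools: string[],
  invocations: ToolInvocation[],
  disabledTool: string,
): string[] {
  const violations: string[] = [];
  if (embeddedTools.includes(disabledTool)) {
    violations.push(`${disabledTool} is still present in the embedded tool list`);
  }
  if (invocations.some((inv) => inv.name === disabledTool)) {
    violations.push(`${disabledTool} surfaced as a tool invocation`);
  }
  return violations;
}

// A provider answering through another enabled tool (name assumed here)
// produces no violations for the disabled read_post tool.
const viaOtherTool = findDisabledToolViolations(
  ['get_channel_history'],
  [{ name: 'get_channel_history' }],
  'read_post',
);
console.log(viaOtherTool);
```

Checking only for the disabled tool, rather than asserting that no tool is used at all, is what keeps the test stable when a real provider picks a different valid path to answer.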

Validation:

Reviewed failed CI log with gh run view --job 69552028161 --log-failed --repo mattermost/mattermost-plugin-agents
Confirmed there was no existing open PR for branch cursor/ci-pipeline-failure-c8fa
cd e2e && npx playwright test tests/tool-config/real-api/disabled-tool.spec.ts --project=chromium --list

Ticket Link

None

Screenshots

None

Release Note

NONE

Co-authored-by: Christopher Speller <crspeller@users.noreply.github.com>

github-actions · 2026-04-01T18:19:00Z

🤖 LLM Evaluation Results

OpenAI

⚠️ Overall: 18/19 tests passed (94.7%)

Provider	Total	Passed	Failed	Pass Rate
⚠️ OPENAI	19	18	1	94.7%

❌ Failed Evaluations

OPENAI

1. TestReactEval/[openai]_react_cat_message

Score: 0.00
Rubric: The word/emoji is a cat emoji or a heart/love emoji
Reason: The output is the text "smile_cat", which is neither a cat emoji (e.g., 🐱/😺) nor a heart/love emoji (e.g., ❤️/😍).

Anthropic

✅ Overall: 19/19 tests passed (100.0%)

Provider	Total	Passed	Failed	Pass Rate
✅ ANTHROPIC	19	19	0	100.0%

This comment was automatically generated by the eval CI pipeline.

Stabilize disabled-tool real API e2e test

e404635

Co-authored-by: Christopher Speller <crspeller@users.noreply.github.com>

cursor[bot] wants to merge 1 commit into master from cursor/ci-pipeline-failure-c8fa
