
Commit 7ef726d

MQ37 and MichalKalita authored
feat: improve actor tool output (#260)
* feat: improve actor tool output
* update readme
* fix output tool, write test for that
* add test based on Zuzka suggestion
* lint
* fix output response order so LLM does not lose the instructions
* refactor: unify string list parsing logic
* fix the tests - order of the Actor run response messages
* Update src/utils/schema-generation.ts

Co-authored-by: Michal Kalita <[email protected]>

* address review comments
* add get-actor-output tools note about when its loaded

---------

Co-authored-by: Michal Kalita <[email protected]>
1 parent 279293f commit 7ef726d

29 files changed, +851 -274 lines changed

README.md

Lines changed: 5 additions & 0 deletions
@@ -169,6 +169,11 @@ Here is an overview list of all the tools provided by the Apify MCP Server.
 | `get-dataset-list` | storage | List all available datasets for the user. | |
 | `get-key-value-store-list`| storage | List all available key-value stores for the user. | |
 | `add-actor` | experimental | Add an Actor as a new tool for the user to call. | |
+| `get-actor-output`* | - | Retrieve the output from an Actor call which is not included in the output preview of the Actor tool. ||
+
+> **Note:**
+>
+> The `get-actor-output` tool is automatically included with any Actor-related tool, such as `call-actor`, `add-actor`, or any specific Actor tool like `apify-slash-rag-web-browser`. When you call an Actor - either through the `call-actor` tool or directly via an Actor tool (e.g., `apify-slash-rag-web-browser`) - you receive a preview of the output. The preview depends on the Actor's output format and length; for some Actors and runs, it may include the entire output, while for others, only a limited version is returned to avoid overwhelming the LLM. To retrieve the full output of an Actor run, use the `get-actor-output` tool (supports limit, offset, and field filtering) with the `datasetId` provided by the Actor call.

 ### Tools configuration
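The note above describes fetching the full output with `get-actor-output` after a preview. As a rough sketch of what a client-side request might look like, the snippet below builds MCP `tools/call` payloads for two consecutive pages. The argument names `offset`, `limit`, and `fields` (and the comma-separated format of `fields`) are assumptions inferred from the phrase "supports limit, offset, and field filtering"; only `datasetId` is named explicitly in the README. The dataset ID is a placeholder.

```typescript
// Hypothetical argument shape; only `datasetId` is confirmed by the README note.
interface GetActorOutputArgs {
    datasetId: string;
    offset?: number; // assumed name
    limit?: number; // assumed name
    fields?: string; // assumed: comma-separated list of fields to keep
}

// JSON-RPC envelope used by MCP for tool calls.
interface ToolCallRequest {
    jsonrpc: '2.0';
    id: number;
    method: 'tools/call';
    params: { name: string; arguments: GetActorOutputArgs };
}

function buildGetActorOutputRequest(id: number, args: GetActorOutputArgs): ToolCallRequest {
    return {
        jsonrpc: '2.0',
        id,
        method: 'tools/call',
        params: { name: 'get-actor-output', arguments: args },
    };
}

// Page through a dataset 100 items at a time, keeping only two fields.
const firstPage = buildGetActorOutputRequest(1, {
    datasetId: 'EXAMPLE_DATASET_ID',
    offset: 0,
    limit: 100,
    fields: 'url,title',
});
const secondPage = buildGetActorOutputRequest(2, {
    datasetId: 'EXAMPLE_DATASET_ID',
    offset: 100,
    limit: 100,
    fields: 'url,title',
});

console.log(JSON.stringify(firstPage, null, 2));
```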

src/const.ts

Lines changed: 13 additions & 6 deletions
@@ -8,6 +8,15 @@ export const ACTOR_RUN_DATASET_OUTPUT_MAX_ITEMS = 5;
 // Actor run const
 export const ACTOR_MAX_MEMORY_MBYTES = 4_096; // If the Actor requires 8GB of memory, free users can't run actors-mcp-server and requested Actor

+// Tool output
+/**
+ * Usual tool output limit is 25k tokens, let's use 20k
+ * just in case where 1 token =~ 4 characters thus 80k chars.
+ * This is primarily used for Actor tool call output, but we can then
+ * reuse this in other tools as well.
+ */
+export const TOOL_MAX_OUTPUT_CHARS = 80000;
+
 // MCP Server
 export const SERVER_NAME = 'apify-mcp-server';
 export const SERVER_VERSION = '1.0.0';
@@ -18,9 +27,8 @@ export const USER_AGENT_ORIGIN = 'Origin/mcp-server';
 export enum HelperTools {
     ACTOR_ADD = 'add-actor',
     ACTOR_CALL = 'call-actor',
-    ACTOR_GET = 'get-actor',
     ACTOR_GET_DETAILS = 'fetch-actor-details',
-    ACTOR_REMOVE = 'remove-actor',
+    ACTOR_OUTPUT_GET = 'get-actor-output',
     ACTOR_RUNS_ABORT = 'abort-actor-run',
     ACTOR_RUNS_GET = 'get-actor-run',
     ACTOR_RUNS_LOG = 'get-actor-log',
@@ -33,7 +41,6 @@ export enum HelperTools {
     KEY_VALUE_STORE_GET = 'get-key-value-store',
     KEY_VALUE_STORE_KEYS_GET = 'get-key-value-store-keys',
     KEY_VALUE_STORE_RECORD_GET = 'get-key-value-store-record',
-    APIFY_MCP_HELP_TOOL = 'apify-actor-help-tool',
     STORE_SEARCH = 'search-actors',
     DOCS_SEARCH = 'search-apify-docs',
     DOCS_FETCH = 'fetch-apify-docs',
@@ -54,12 +61,12 @@ export const APIFY_DOCS_CACHE_MAX_SIZE = 500;
 export const APIFY_DOCS_CACHE_TTL_SECS = 60 * 60; // 1 hour

 export const ACTOR_PRICING_MODEL = {
-    /** Rental actors */
+    /** Rental Actors */
     FLAT_PRICE_PER_MONTH: 'FLAT_PRICE_PER_MONTH',
     FREE: 'FREE',
-    /** Pay per result (PPR) actors */
+    /** Pay per result (PPR) Actors */
     PRICE_PER_DATASET_ITEM: 'PRICE_PER_DATASET_ITEM',
-    /** Pay per event (PPE) actors */
+    /** Pay per event (PPE) Actors */
     PAY_PER_EVENT: 'PAY_PER_EVENT',
 } as const;
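The comment on `TOOL_MAX_OUTPUT_CHARS` packs a small calculation into one sentence. Spelled out as a sketch, using the comment's own rule of thumb of roughly 4 characters per token (the constant and function names below are illustrative and do not appear in the commit):

```typescript
// Rule-of-thumb figures taken from the comment in src/const.ts.
const APPROX_CHARS_PER_TOKEN = 4;
const TYPICAL_TOOL_TOKEN_LIMIT = 25_000;
const SAFE_TOKEN_BUDGET = 20_000;

// 20,000 tokens * 4 chars/token = 80,000 chars, matching TOOL_MAX_OUTPUT_CHARS.
const maxOutputChars = SAFE_TOKEN_BUDGET * APPROX_CHARS_PER_TOKEN;

// Margin left below the typical limit, in tokens.
const headroomTokens = TYPICAL_TOOL_TOKEN_LIMIT - SAFE_TOKEN_BUDGET;

// Rough token estimate for a string under the same approximation.
function estimateTokens(text: string): number {
    return Math.ceil(text.length / APPROX_CHARS_PER_TOKEN);
}

console.log({ maxOutputChars, headroomTokens, sample: estimateTokens('hello world') });
```

The 4-characters-per-token ratio is only an approximation; real tokenizers vary by model and by content, which is presumably why the constant leaves a 5k-token margin.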

src/main.ts

Lines changed: 4 additions & 4 deletions
@@ -44,11 +44,11 @@ if (STANDBY_MODE) {
         await Actor.fail('If you need to debug a specific Actor, please provide the debugActor and debugActorInput fields in the input');
     }
     const options = { memory: input.maxActorMemoryBytes } as ActorCallOptions;
-    const result = await callActorGetDataset(input.debugActor!, input.debugActorInput!, process.env.APIFY_TOKEN, options);
+    const callResult = await callActorGetDataset(input.debugActor!, input.debugActorInput!, process.env.APIFY_TOKEN, options);

-    if (result && result.items) {
-        await Actor.pushData(result.items);
-        log.info('Pushed items to dataset', { itemCount: result.items.count });
+    if (callResult && callResult.previewItems.length > 0) {
+        await Actor.pushData(callResult.previewItems);
+        log.info('Pushed items to dataset', { itemCount: callResult.previewItems.length });
     }
     await Actor.exit();
 }

src/mcp/actors.ts

Lines changed: 2 additions & 1 deletion
@@ -3,6 +3,7 @@ import type { ActorDefinition } from 'apify-client';
 import { ApifyClient } from '../apify-client.js';
 import { MCP_STREAMABLE_ENDPOINT } from '../const.js';
 import type { ActorDefinitionPruned } from '../types.js';
+import { parseCommaSeparatedList } from '../utils/generic.js';

 /**
  * Returns the MCP server path for the given Actor ID.
@@ -13,7 +14,7 @@ export function getActorMCPServerPath(actorDefinition: ActorDefinition | ActorDe
     if ('webServerMcpPath' in actorDefinition && typeof actorDefinition.webServerMcpPath === 'string') {
         const webServerMcpPath = actorDefinition.webServerMcpPath.trim();

-        const paths = webServerMcpPath.split(',').map((path) => path.trim());
+        const paths = parseCommaSeparatedList(webServerMcpPath);
         // If there is only one path, return it directly
         if (paths.length === 1) {
             return paths[0];
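The body of `parseCommaSeparatedList` is not part of this excerpt. A minimal sketch, assuming it mirrors the inline split/trim/filter logic that the same commit removes from `src/stdio.ts`, might look like this:

```typescript
// Sketch only: assumed to match the inline logic previously used in src/stdio.ts.
function parseCommaSeparatedList(input: string): string[] {
    return input
        .split(',')
        .map((part) => part.trim())
        .filter((part) => part.length > 0);
}

console.log(parseCommaSeparatedList(' /mcp, /sse ,, '));
```

If the helper does filter out empty entries, `getActorMCPServerPath` changes behavior slightly: the removed line only split and trimmed, so a trailing comma used to produce an empty path entry.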

src/mcp/proxy.ts

Lines changed: 1 addition & 3 deletions
@@ -1,8 +1,8 @@
 import type { Client } from '@modelcontextprotocol/sdk/client/index.js';
-import Ajv from 'ajv';

 import { fixedAjvCompile } from '../tools/utils.js';
 import type { ActorMcpTool, ToolEntry } from '../types.js';
+import { ajv } from '../utils/ajv.js';
 import { getMCPServerID, getProxyMCPServerToolName } from './utils.js';

 export async function getMCPServerTools(
@@ -14,8 +14,6 @@ export async function getMCPServerTools(
     const res = await client.listTools();
     const { tools } = res;

-    const ajv = new Ajv({ coerceTypes: 'array', strict: false });
-
     const compiledTools: ToolEntry[] = [];
     for (const tool of tools) {
         const mcpTool: ActorMcpTool = {

src/mcp/server.ts

Lines changed: 4 additions & 12 deletions
@@ -31,6 +31,7 @@ import { prompts } from '../prompts/index.js';
 import { callActorGetDataset, defaultTools, getActorsAsTools, toolCategories } from '../tools/index.js';
 import { decodeDotPropertyNames } from '../tools/utils.js';
 import type { ActorMcpTool, ActorTool, HelperTool, ToolEntry } from '../types.js';
+import { buildActorResponseContent } from '../utils/actor-response.js';
 import { createProgressTracker } from '../utils/progress.js';
 import { getToolPublicFieldOnly } from '../utils/tools.js';
 import { connectMCPClient } from './client.js';
@@ -524,7 +525,7 @@ export class ActorsMcpServer {

             try {
                 log.info('Calling Actor', { actorName: actorTool.actorFullName, input: args });
-                const result = await callActorGetDataset(
+                const callResult = await callActorGetDataset(
                     actorTool.actorFullName,
                     args,
                     apifyToken as string,
@@ -533,22 +534,13 @@ export class ActorsMcpServer {
                     extra.signal,
                 );

-                if (!result) {
+                if (!callResult) {
                     // Receivers of cancellation notifications SHOULD NOT send a response for the cancelled request
                     // https://modelcontextprotocol.io/specification/2025-06-18/basic/utilities/cancellation#behavior-requirements
                     return { };
                 }

-                const { runId, datasetId, items } = result;
-
-                const content = [
-                    { type: 'text', text: `Actor finished with runId: ${runId}, datasetId ${datasetId}` },
-                ];
-
-                const itemContents = items.items.map((item: Record<string, unknown>) => {
-                    return { type: 'text', text: JSON.stringify(item) };
-                });
-                content.push(...itemContents);
+                const content = buildActorResponseContent(actorTool.actorFullName, callResult);
                 return { content };
             } finally {
                 if (progressTracker) {
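The implementation of `buildActorResponseContent` in `src/utils/actor-response.ts` is not shown in this excerpt. The following is a hypothetical sketch of how such a helper could turn the new call result (run ID, dataset ID, item count, schema, preview items) into MCP text content. The function name, message wording, and the ordering of the parts are all assumptions; the commit message only says the response order was adjusted so the LLM does not lose the instructions.

```typescript
type DatasetItem = Record<string, unknown>;

// Mirrors the fields of CallActorGetDatasetResult introduced in this commit;
// `schema` is loosely typed here for illustration.
interface CallResultSketch {
    runId: string;
    datasetId: string;
    itemCount: number;
    schema: Record<string, unknown>;
    previewItems: DatasetItem[];
}

interface TextContent {
    type: 'text';
    text: string;
}

// Hypothetical stand-in for buildActorResponseContent.
function buildResponseContentSketch(actorName: string, result: CallResultSketch): TextContent[] {
    const { runId, datasetId, itemCount, schema, previewItems } = result;
    const summary = [
        `Actor "${actorName}" finished. runId: ${runId}, datasetId: ${datasetId}, total items: ${itemCount}.`,
        `Item schema: ${JSON.stringify(schema)}`,
        `The preview contains ${previewItems.length} of ${itemCount} items.`,
        'Use the get-actor-output tool with this datasetId to retrieve the full output.',
    ].join('\n');

    return [
        { type: 'text', text: summary },
        { type: 'text', text: JSON.stringify(previewItems) },
    ];
}

const content = buildResponseContentSketch('apify/rag-web-browser', {
    runId: 'EXAMPLE_RUN_ID',
    datasetId: 'EXAMPLE_DATASET_ID',
    itemCount: 3,
    schema: { type: 'object', properties: { url: { type: 'string' } } },
    previewItems: [{ url: 'https://example.com' }],
});
console.log(content[0].text);
```

Moving this logic into one helper also removes the duplicated response-building code that previously lived in both `src/mcp/server.ts` and `src/tools/actor.ts`.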

src/stdio.ts

Lines changed: 3 additions & 6 deletions
@@ -25,6 +25,7 @@ import log from '@apify/log';
 import { processInput } from './input.js';
 import { ActorsMcpServer } from './mcp/server.js';
 import type { Input, ToolSelector } from './types.js';
+import { parseCommaSeparatedList } from './utils/generic.js';
 import { loadToolsFromInput } from './utils/tools-loader.js';

 // Keeping this interface here and not types.ts since
@@ -86,13 +87,9 @@ For more details visit https://mcp.apify.com`,
 // Respect either the new flag or the deprecated one
 const enableAddingActors = Boolean(argv.enableAddingActors || argv.enableActorAutoLoading);
 // Split actors argument, trim whitespace, and filter out empty strings
-const actorList = argv.actors !== undefined
-    ? argv.actors.split(',').map((a: string) => a.trim()).filter((a: string) => a.length > 0)
-    : undefined;
+const actorList = argv.actors !== undefined ? parseCommaSeparatedList(argv.actors) : undefined;
 // Split tools argument, trim whitespace, and filter out empty strings
-const toolCategoryKeys = argv.tools !== undefined
-    ? argv.tools.split(',').map((t: string) => t.trim()).filter((t: string) => t.length > 0)
-    : undefined;
+const toolCategoryKeys = argv.tools !== undefined ? parseCommaSeparatedList(argv.tools) : undefined;

 // Propagate log.error to console.error for easier debugging
 const originalError = log.error.bind(log);

src/tools/actor.ts

Lines changed: 46 additions & 39 deletions
@@ -1,6 +1,5 @@
 import type { Client } from '@modelcontextprotocol/sdk/client/index.js';
-import { Ajv } from 'ajv';
-import type { ActorCallOptions, ActorRun, PaginatedList } from 'apify-client';
+import type { ActorCallOptions, ActorRun } from 'apify-client';
 import { z } from 'zod';
 import zodToJsonSchema from 'zod-to-json-schema';

@@ -11,37 +10,43 @@ import {
     ACTOR_ADDITIONAL_INSTRUCTIONS,
     ACTOR_MAX_MEMORY_MBYTES,
     HelperTools,
+    TOOL_MAX_OUTPUT_CHARS,
 } from '../const.js';
 import { getActorMCPServerPath, getActorMCPServerURL } from '../mcp/actors.js';
 import { connectMCPClient } from '../mcp/client.js';
 import { getMCPServerTools } from '../mcp/proxy.js';
 import { actorDefinitionPrunedCache } from '../state.js';
-import type { ActorDefinitionStorage, ActorInfo, ToolEntry } from '../types.js';
-import { getActorDefinitionStorageFieldNames } from '../utils/actor.js';
+import type { ActorDefinitionStorage, ActorInfo, DatasetItem, ToolEntry } from '../types.js';
+import { ensureOutputWithinCharLimit, getActorDefinitionStorageFieldNames } from '../utils/actor.js';
 import { fetchActorDetails } from '../utils/actor-details.js';
-import { getValuesByDotKeys } from '../utils/generic.js';
+import { buildActorResponseContent } from '../utils/actor-response.js';
+import { ajv } from '../utils/ajv.js';
 import type { ProgressTracker } from '../utils/progress.js';
+import type { JsonSchemaProperty } from '../utils/schema-generation.js';
+import { generateSchemaFromItems } from '../utils/schema-generation.js';
 import { getActorDefinition } from './build.js';
 import { actorNameToToolName, fixedAjvCompile, getToolSchemaID, transformActorInputSchemaProperties } from './utils.js';

-const ajv = new Ajv({ coerceTypes: 'array', strict: false });
-
 // Define a named return type for callActorGetDataset
 export type CallActorGetDatasetResult = {
     runId: string;
     datasetId: string;
-    items: PaginatedList<Record<string, unknown>>;
+    itemCount: number;
+    schema: JsonSchemaProperty;
+    previewItems: DatasetItem[];
 };

 /**
- * Calls an Apify actor and retrieves the dataset items.
+ * Calls an Apify Actor and retrieves metadata about the dataset results.
  *
+ * This function executes an Actor and returns summary information instead with a result items preview of the full dataset
+ * to prevent overwhelming responses. The actual data can be retrieved using the get-actor-output tool.
  *
  * It requires the `APIFY_TOKEN` environment variable to be set.
  * If the `APIFY_IS_AT_HOME` the dataset items are pushed to the Apify dataset.
  *
- * @param {string} actorName - The name of the actor to call.
- * @param {ActorCallOptions} callOptions - The options to pass to the actor.
+ * @param {string} actorName - The name of the Actor to call.
+ * @param {ActorCallOptions} callOptions - The options to pass to the Actor.
  * @param {unknown} input - The input to pass to the actor.
  * @param {string} apifyToken - The Apify token to use for authentication.
  * @param {ProgressTracker} progressTracker - Optional progress tracker for real-time updates.
@@ -58,6 +63,7 @@ export async function callActorGetDataset(
     abortSignal?: AbortSignal,
 ): Promise<CallActorGetDatasetResult | null> {
     const CLIENT_ABORT = Symbol('CLIENT_ABORT'); // Just internal symbol to identify client abort
+    // TODO: we should remove this throw, we are just catching and then rethrowing with generic message
     try {
         const client = new ApifyClient({ token: apifyToken });
         const actorClient = client.actor(actorName);
@@ -98,34 +104,45 @@ export async function callActorGetDataset(

         // Process the completed run
         const dataset = client.dataset(completedRun.defaultDatasetId);
-        const [items, defaultBuild] = await Promise.all([
+        const [datasetItems, defaultBuild] = await Promise.all([
             dataset.listItems(),
             (await actorClient.defaultBuild()).get(),
         ]);

-        // Get important properties from storage view definitions and if available return only those properties
+        // Generate schema using the shared utility
+        const generatedSchema = generateSchemaFromItems(datasetItems.items, {
+            clean: true,
+            arrayMode: 'all',
+        });
+        const schema = generatedSchema || { type: 'object', properties: {} };
+
+        /**
+         * Get important fields that are using in any dataset view as they MAY be used in filtering to ensure the output fits
+         * the tool output limits. Client has to use the get-actor-output tool to retrieve the full dataset or filtered out fields.
+         */
         const storageDefinition = defaultBuild?.actorDefinition?.storages?.dataset as ActorDefinitionStorage | undefined;
         const importantProperties = getActorDefinitionStorageFieldNames(storageDefinition || {});
-        if (importantProperties.length > 0) {
-            items.items = items.items.map((item) => {
-                return getValuesByDotKeys(item, importantProperties);
-            });
-        }
-
-        log.debug('Actor finished', { actorName, itemCount: items.count });
-        return { runId: actorRun.id, datasetId: completedRun.defaultDatasetId, items };
+        const previewItems = ensureOutputWithinCharLimit(datasetItems.items, importantProperties, TOOL_MAX_OUTPUT_CHARS);
+
+        return {
+            runId: actorRun.id,
+            datasetId: completedRun.defaultDatasetId,
+            itemCount: datasetItems.count,
+            schema,
+            previewItems,
+        };
     } catch (error) {
-        log.error('Error calling actor', { error, actorName, input });
+        log.error('Error calling Actor', { error, actorName, input });
         throw new Error(`Error calling Actor: ${error}`);
     }
 }

 /**
  * This function is used to fetch normal non-MCP server Actors as a tool.
  *
- * Fetches actor input schemas by Actor IDs or Actor full names and creates MCP tools.
+ * Fetches Actor input schemas by Actor IDs or Actor full names and creates MCP tools.
  *
- * This function retrieves the input schemas for the specified actors and compiles them into MCP tools.
+ * This function retrieves the input schemas for the specified Actors and compiles them into MCP tools.
  * It uses the AJV library to validate the input schemas.
  *
  * Tool name can't contain /, so it is replaced with _
@@ -228,7 +245,7 @@ export async function getActorsAsTools(
     actorIdsOrNames: string[],
     apifyToken: string,
 ): Promise<ToolEntry[]> {
-    log.debug('Fetching actors as tools', { actorNames: actorIdsOrNames });
+    log.debug('Fetching Actors as tools', { actorNames: actorIdsOrNames });

     const actorsInfo: (ActorInfo | null)[] = await Promise.all(
         actorIdsOrNames.map(async (actorIdOrName) => {
@@ -325,7 +342,7 @@ The step parameter enforces this workflow - you cannot call an Actor without fir

             try {
                 if (step === 'info') {
-                    // Step 1: Return actor card and schema directly
+                    // Step 1: Return Actor card and schema directly
                     const details = await fetchActorDetails(apifyToken, actorName);
                     if (!details) {
                         return {
@@ -369,7 +386,7 @@ The step parameter enforces this workflow - you cannot call an Actor without fir
                     }
                 }

-                const result = await callActorGetDataset(
+                const callResult = await callActorGetDataset(
                     actorName,
                     input,
                     apifyToken,
@@ -378,23 +395,13 @@ The step parameter enforces this workflow - you cannot call an Actor without fir
                     extra.signal,
                 );

-                if (!result) {
+                if (!callResult) {
                     // Receivers of cancellation notifications SHOULD NOT send a response for the cancelled request
                     // https://modelcontextprotocol.io/specification/2025-06-18/basic/utilities/cancellation#behavior-requirements
                     return { };
                 }

-                const { runId, datasetId, items } = result;
-
-                const content = [
-                    { type: 'text', text: `Actor finished with runId: ${runId}, datasetId ${datasetId}` },
-                ];
-
-                const itemContents = items.items.map((item: Record<string, unknown>) => ({
-                    type: 'text',
-                    text: JSON.stringify(item),
-                }));
-                content.push(...itemContents);
+                const content = buildActorResponseContent(actorName, callResult);

                 return { content };
             } catch (error) {
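`ensureOutputWithinCharLimit` is imported from `src/utils/actor.ts`, whose body is not included in this excerpt. Based on the comment in `callActorGetDataset` (important fields "MAY be used in filtering to ensure the output fits the tool output limits"), one plausible strategy is sketched below: return everything if it fits, otherwise narrow items to the important fields, and finally drop trailing items. This is an assumption about the approach, not the actual implementation, and it only handles top-level field names rather than dot-separated paths.

```typescript
type DatasetItem = Record<string, unknown>;

// True when the serialized items fit within the character budget.
function fitsLimit(items: DatasetItem[], maxChars: number): boolean {
    return JSON.stringify(items).length <= maxChars;
}

// Keep only the listed top-level fields of an item.
function pickFields(item: DatasetItem, fields: string[]): DatasetItem {
    const picked: DatasetItem = {};
    for (const field of fields) {
        if (field in item) picked[field] = item[field];
    }
    return picked;
}

// Hypothetical stand-in for ensureOutputWithinCharLimit.
function limitPreviewSketch(items: DatasetItem[], importantFields: string[], maxChars: number): DatasetItem[] {
    // 1. The full output fits: return it unchanged.
    if (fitsLimit(items, maxChars)) return items;

    // 2. Otherwise narrow each item to the important fields, if any are defined.
    const narrowed = importantFields.length > 0
        ? items.map((item) => pickFields(item, importantFields))
        : items.slice();

    // 3. Still too large: drop items from the end until the preview fits.
    while (narrowed.length > 0 && !fitsLimit(narrowed, maxChars)) {
        narrowed.pop();
    }
    return narrowed;
}

const sampleItems: DatasetItem[] = [
    { a: 'x'.repeat(50), b: 1 },
    { a: 'y'.repeat(50), b: 2 },
];
console.log(limitPreviewSketch(sampleItems, ['b'], 20));
```

Whatever the real strategy is, the caller still reports the full `itemCount`, so the client can tell when the preview is partial and fetch the rest with `get-actor-output`.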

src/tools/build.ts

Lines changed: 1 addition & 3 deletions
@@ -1,4 +1,3 @@
-import { Ajv } from 'ajv';
 import { z } from 'zod';
 import zodToJsonSchema from 'zod-to-json-schema';

@@ -13,10 +12,9 @@ import type {
     ISchemaProperties,
     ToolEntry,
 } from '../types.js';
+import { ajv } from '../utils/ajv.js';
 import { filterSchemaProperties, shortenProperties } from './utils.js';

-const ajv = new Ajv({ coerceTypes: 'array', strict: false });
-
 /**
  * Get Actor input schema by Actor name.
  * First, fetch the Actor details to get the default build tag and buildId.
