Merged
25 changes: 17 additions & 8 deletions src/tools/actor.ts
@@ -302,7 +302,7 @@ const callActorArgs = z.object({
.describe('The name of the Actor to call. For example, "apify/rag-web-browser".'),
step: z.enum(['info', 'call'])
.default('info')
.describe(`Step to perform: "info" to get Actor details and input schema (required first step), "call" to execute the Actor (only after getting info).`),
.describe(`Step to perform: "info" to get Actor details and input schema (required first step), "call" to run the Actor (only after getting info).`),
input: z.object({}).passthrough()
.optional()
.describe(`The input JSON to pass to the Actor. For example, {"query": "apify", "maxResults": 5, "outputFormats": ["markdown"]}. Required only when step is "call".`),
@@ -325,15 +325,23 @@ export const callActor: ToolEntry = {
tool: {
name: HelperTools.ACTOR_CALL,
actorFullName: HelperTools.ACTOR_CALL,
description: `Call any Actor from Apify Store - two-step process
This tool uses a mandatory two-step process to safely call any Actor from the Apify store.
description: `Call any Actor from the Apify Store using a mandatory two-step workflow.
This ensures you first get the Actor’s input schema and details before executing it safely.

USAGE:
• ONLY for Actors that are NOT available as dedicated tools
• If a dedicated tool exists (e.g., ${actorNameToToolName('apify/rag-web-browser')}), use that instead
There are two ways to run Actors:
1. Dedicated Actor tools (e.g., ${actorNameToToolName('apify/rag-web-browser')}): These are pre-configured tools, offering a simpler and more direct experience.
2. Generic call-actor tool (${HelperTools.ACTOR_CALL}): Use this when a dedicated tool is not available or when you want to run any Actor dynamically. This tool is especially useful if you do not want to add specific tools or your client does not support dynamic tool registration.

**Important:**

MANDATORY TWO-STEP WORKFLOW:
A successful run returns a \`datasetId\` (the Actor's output stored as an Apify dataset) and a short preview of items.
To fetch the full output, use the ${HelperTools.ACTOR_OUTPUT_GET} tool with the \`datasetId\`.

USAGE:
- Always use dedicated tools when available (e.g., ${actorNameToToolName('apify/rag-web-browser')})
- Use the generic call-actor tool only if a dedicated tool does not exist for your Actor.

MANDATORY TWO-STEP WORKFLOW:
Step 1: Get Actor Info (step="info", default)
- First call this tool with step="info" to get Actor details and input schema
- This returns the Actor description, documentation, and required input schema
@@ -344,7 +352,8 @@ Step 2: Call Actor (step="call")
- This calls and runs the Actor. It will create an output as an Apify dataset (with datasetId).
- This step returns a dataset preview, typically JSON-formatted tabular data.

The step parameter enforces this workflow - you cannot call an Actor without first getting its info.`,
EXAMPLES:
- user_input: Get instagram posts using apify/instagram-scraper`,
inputSchema: zodToJsonSchema(callActorArgs),
ajvValidate: ajv.compile({
...zodToJsonSchema(callActorArgs),
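Note on the two-step workflow described above: a minimal sketch of how a client might order its requests, assuming the Actor-name argument is called `actor` (the key name is not visible in this hunk). `buildTwoStepRequests` and `validateArgs` are hypothetical helpers for illustration, not part of the PR.

```typescript
// Hypothetical argument shape; the real one is defined by callActorArgs in src/tools/actor.ts.
type Step = 'info' | 'call';

interface CallActorArgs {
    actor: string; // assumed key name for the Actor to call
    step?: Step; // defaults to 'info', as in the schema above
    input?: Record<string, unknown>;
}

// Build the two requests in the required order: first "info", then "call".
function buildTwoStepRequests(actor: string, input: Record<string, unknown>): CallActorArgs[] {
    return [
        { actor, step: 'info' },
        { actor, step: 'call', input },
    ];
}

// Client-side check mirroring the description: input is required only when step is "call".
function validateArgs(args: CallActorArgs): string | null {
    const step = args.step ?? 'info';
    if (step === 'call' && args.input === undefined) {
        return 'input is required when step is "call"';
    }
    return null;
}

const requests = buildTwoStepRequests('apify/rag-web-browser', { query: 'apify', maxResults: 5 });
console.log(requests.map((r) => r.step).join(' -> '));
```

The check lives on the client here only to illustrate the contract; whether and how the server enforces the ordering is up to the tool implementation.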
51 changes: 33 additions & 18 deletions src/tools/dataset.ts
@@ -46,11 +46,17 @@ export const getDataset: ToolEntry = {
tool: {
name: HelperTools.DATASET_GET,
actorFullName: HelperTools.DATASET_GET,
description: 'Dataset is a collection of structured data created by an Actor run. '
+ 'Returns information about dataset object with metadata (itemCount, schema, fields, stats). '
+ `Fields describe the structure of the dataset and can be used to filter the data with the ${HelperTools.DATASET_GET_ITEMS} tool. `
+ 'Note: itemCount updates may have 5s delay.'
+ 'The dataset can be accessed with the dataset URL: GET: https://api.apify.com/v2/datasets/:datasetId',
description: `Get metadata for a dataset (collection of structured data created by an Actor run).
The results will include dataset details such as itemCount, schema, fields, and stats.
Use fields to understand structure for filtering with ${HelperTools.DATASET_GET_ITEMS}.
Note: itemCount updates may be delayed by up to ~5 seconds.

USAGE:
- Use when you need dataset metadata to understand its structure before fetching items.

EXAMPLES:
- user_input: Show info for dataset 8TtYhCwKzQeQk7dJx
- user_input: What fields does username~my-dataset have?`,
inputSchema: zodToJsonSchema(getDatasetArgs),
ajvValidate: ajv.compile(zodToJsonSchema(getDatasetArgs)),
call: async (toolArgs) => {
@@ -74,16 +80,18 @@ export const getDatasetItems: ToolEntry = {
tool: {
name: HelperTools.DATASET_GET_ITEMS,
actorFullName: HelperTools.DATASET_GET_ITEMS,
description: 'Returns dataset items with pagination support. '
+ 'Items can be sorted (newest to oldest) and filtered (clean mode skips empty items and hidden fields). '
+ 'Supports field selection - include specific fields or exclude unwanted ones using comma-separated lists. '
+ 'For nested objects, you must first flatten them using the flatten parameter before accessing their fields. '
+ 'Example: To get URLs from items like [{"metadata":{"url":"example.com"}}], '
+ 'use flatten="metadata" and then fields="metadata.url". '
+ 'The flattening transforms nested objects into dot-notation format '
+ '(e.g. {"metadata":{"url":"x"}} becomes {"metadata.url":"x"}). '
+ 'Retrieve only the fields you need, reducing the response size and improving performance. '
+ 'The response includes total count, offset, limit, and items array.',
description: `Retrieve dataset items with pagination, sorting, and field selection.
Use clean=true to skip empty items and hidden fields. Include or omit fields using comma-separated lists.
For nested objects, first flatten them (e.g., flatten="metadata"), then reference nested fields via dot notation (e.g., fields="metadata.url").

The results will include items along with pagination info (limit, offset) and total count.

USAGE:
- Use when you need to read data from a dataset (all items or only selected fields).

EXAMPLES:
- user_input: Get first 100 items from dataset 8TtYhCwKzQeQk7dJx
- user_input: Get only metadata.url and title from dataset username~my-dataset (flatten metadata)`,
inputSchema: zodToJsonSchema(getDatasetItemsArgs),
ajvValidate: ajv.compile(zodToJsonSchema(getDatasetItemsArgs)),
call: async (toolArgs) => {
@@ -136,9 +144,16 @@ export const getDatasetSchema: ToolEntry = {
tool: {
name: HelperTools.DATASET_SCHEMA_GET,
actorFullName: HelperTools.DATASET_SCHEMA_GET,
description: 'Generates a JSON schema from dataset items. '
+ 'The schema describes the structure of the data in the dataset, which can be used for validation, documentation, or data processing.'
+ 'Since the dataset can be large it is convenient to understand the structure of the dataset before getting dataset items.',
description: `Generate a JSON schema from a sample of dataset items.
The schema describes the structure of the data and can be used for validation, documentation, or processing.
Use this to understand the dataset before fetching many items.

USAGE:
- Use when you need to infer the structure of dataset items for downstream processing or validation.

EXAMPLES:
- user_input: Generate schema for dataset 8TtYhCwKzQeQk7dJx using 10 items
- user_input: Show schema of username~my-dataset (clean items only)`,
inputSchema: zodToJsonSchema(getDatasetSchemaArgs),
ajvValidate: ajv.compile(zodToJsonSchema(getDatasetSchemaArgs)),
call: async (toolArgs) => {
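Note on the flatten-then-select behavior in the get-dataset-items description: the sketch below reproduces the transformation locally, where `{"metadata":{"url":"x"}}` becomes `{"metadata.url":"x"}` and fields are then picked by their dotted names. `flattenItem` and `pickFields` are hypothetical helpers; the recursive flattening is an assumption, and the actual work is done by the Apify API.

```typescript
type Item = Record<string, unknown>;

function isPlainObject(value: unknown): value is Record<string, unknown> {
    return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Flatten only the listed top-level fields into dot-notation keys.
function flattenItem(item: Item, flatten: string[]): Item {
    const out: Item = {};
    const visit = (prefix: string, value: unknown): void => {
        if (isPlainObject(value)) {
            for (const [key, nested] of Object.entries(value)) visit(`${prefix}.${key}`, nested);
        } else {
            out[prefix] = value;
        }
    };
    for (const [key, value] of Object.entries(item)) {
        if (flatten.includes(key)) visit(key, value);
        else out[key] = value;
    }
    return out;
}

// Keep only the requested fields (names refer to the flattened keys).
function pickFields(item: Item, fields: string[]): Item {
    const out: Item = {};
    for (const field of fields) {
        if (field in item) out[field] = item[field];
    }
    return out;
}

const item: Item = { title: 'Example', metadata: { url: 'example.com', lang: 'en' } };
const flat = flattenItem(item, ['metadata']);
console.log(pickFields(flat, ['metadata.url', 'title']));
```

Selecting `metadata.url` without flattening first would find nothing in this sketch, which is the reason the description asks for `flatten="metadata"` before `fields="metadata.url"`.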
18 changes: 12 additions & 6 deletions src/tools/dataset_collection.ts
@@ -30,12 +30,18 @@ export const getUserDatasetsList: ToolEntry = {
tool: {
name: HelperTools.DATASET_LIST_GET,
actorFullName: HelperTools.DATASET_LIST_GET,
description: 'Lists datasets (collections of Actor run data). '
+ 'Actor runs automatically produce unnamed datasets (use unnamed=true to include these). '
+ 'Users can also create named datasets manually. '
+ 'Each dataset includes itemCount, access settings, and usage stats (readCount, writeCount). '
+ 'Results are sorted by createdAt in ascending order (use desc=true for descending). '
+ 'Supports pagination with limit (max 20) and offset parameters.',
description: `List datasets (collections of Actor run data) for the authenticated user.
Review comment (Contributor): Question: should we mention that this is for the authenticated user? It will always be an authenticated user (even with Skyfire, via the PAY token).

Actor runs automatically produce unnamed datasets (set unnamed=true to include them). Users can also create named datasets.

The results will include datasets with itemCount, access settings, and usage stats, sorted by createdAt (ascending by default).
Use limit (max 20), offset, and desc to paginate and sort.

USAGE:
- Use when you need to browse available datasets (named or unnamed) to locate data.

EXAMPLES:
- user_input: List my last 10 datasets (newest first)
- user_input: List unnamed datasets`,
inputSchema: zodToJsonSchema(getUserDatasetsListArgs),
ajvValidate: ajv.compile(zodToJsonSchema(getUserDatasetsListArgs)),
call: async (toolArgs) => {
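Note on pagination in the dataset list description: since `limit` is capped at 20, listing more datasets takes several calls with increasing `offset`. A minimal sketch of planning those calls follows; `planPages` and the `ListDatasetsArgs` shape are hypothetical and only mirror the parameter names mentioned in the description.

```typescript
// Hypothetical argument shape using the parameter names from the description.
interface ListDatasetsArgs {
    limit: number;
    offset: number;
    desc: boolean;
    unnamed: boolean;
}

const MAX_LIMIT = 20; // per the description: "limit (max 20)"

// Split a request for `wanted` datasets into pages of at most MAX_LIMIT items.
function planPages(wanted: number, desc = false, unnamed = false): ListDatasetsArgs[] {
    const pages: ListDatasetsArgs[] = [];
    for (let offset = 0; offset < wanted; offset += MAX_LIMIT) {
        pages.push({ limit: Math.min(MAX_LIMIT, wanted - offset), offset, desc, unnamed });
    }
    return pages;
}

// "List my last 10 datasets (newest first)" fits in a single call.
console.log(planPages(10, true));
// 50 datasets need three calls: offsets 0, 20 and 40.
console.log(planPages(50, true).map((p) => `${p.offset}+${p.limit}`));
```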
15 changes: 8 additions & 7 deletions src/tools/fetch-actor-details.ts
@@ -17,16 +17,17 @@ export const fetchActorDetailsTool: ToolEntry = {
type: 'internal',
tool: {
name: HelperTools.ACTOR_GET_DETAILS,
description: `Get detailed information about an Actor by its ID or full name.
This tool returns title, description, URL, README (Actor's documentation), input schema, and usage statistics.
The Actor name is always composed of "username/name", for example, "apify/rag-web-browser".
Present Actor information in user-friendly format as an Actor card.
description: `Get detailed information about an Actor by its ID or full name (format: "username/name", e.g., "apify/rag-web-browser").
This returns the Actor’s title, description, URL, README (documentation), input schema, pricing/usage information, and basic stats.
Present the information in a user-friendly Actor card.

USAGE:
- Use when user asks about an Actor its details, description, input schema, etc.
- Use when a user asks about an Actor’s details, input schema, README, or how to use it.

EXAMPLES:
- user_input: How to use apify/rag-web-browser
- user_input: What is the input schema for apify/rag-web-browser,
- user_input: What is pricing of apify/instagram-scraper?`,
- user_input: What is the input schema for apify/rag-web-browser?
- user_input: What is the pricing for apify/instagram-scraper?`,
inputSchema: zodToJsonSchema(fetchActorDetailsToolArgsSchema),
ajvValidate: ajv.compile(zodToJsonSchema(fetchActorDetailsToolArgsSchema)),
call: async (toolArgs) => {
10 changes: 9 additions & 1 deletion src/tools/fetch-apify-docs.ts
@@ -19,7 +19,15 @@ export const fetchApifyDocsTool: ToolEntry = {
type: 'internal',
tool: {
name: HelperTools.DOCS_FETCH,
description: `Apify documentation fetch tool. This tool allows you to fetch the full content of an Apify documentation page by its URL.`,
description: `Fetch the full content of an Apify documentation page by its URL.
Use this after finding a relevant page with the ${HelperTools.DOCS_SEARCH} tool.

USAGE:
- Use when you need the complete content of a specific docs page for detailed answers.

EXAMPLES:
- user_input: Fetch https://docs.apify.com/platform/actors/running#builds
- user_input: Fetch https://docs.apify.com/academy`,
args: fetchApifyDocsToolArgsSchema,
inputSchema: zodToJsonSchema(fetchApifyDocsToolArgsSchema),
ajvValidate: ajv.compile(zodToJsonSchema(fetchApifyDocsToolArgsSchema)),
20 changes: 12 additions & 8 deletions src/tools/get-actor-output.ts
@@ -68,19 +68,23 @@ export const getActorOutput: ToolEntry = {
tool: {
name: HelperTools.ACTOR_OUTPUT_GET,
actorFullName: HelperTools.ACTOR_OUTPUT_GET,
description: `Fetch the dataset of a specific Actor run based on datasetId.
You can also retrieve only specific fields from the output if needed.
description: `Retrieve the output dataset items of a specific Actor run using its datasetId.
You can select specific fields to return (supports dot notation like "crawl.statusCode") and paginate results with offset and limit.
This tool is a simplified version of the get-dataset-items tool, focused on Actor run outputs.

The results will include the dataset items from the specified dataset. If you provide fields, only those fields will be included (nested fields supported via dot notation).

You can obtain the datasetId from an Actor run (e.g., after calling an Actor with the call-actor tool) or from the Apify Console (Runs → Run details → Dataset ID).

USAGE:
- Use this tool to get Actor dataset outside of the preview, or to access fields from the Actor output
dataset schema that are not included in the preview.
- Use when you need to read Actor output data (full items or selected fields), especially when preview does not include all fields.

EXAMPLES:
- user_input: Get data of my last Actor run?
- user_input: Get number_of_likes from my dataset?
- user_input: Get data of my last Actor run
- user_input: Get number_of_likes from my dataset
- user_input: Return only crawl.statusCode and url from dataset 8TtYhCwKzQeQk7dJx

Note: This tool is automatically included if the Apify MCP Server is configured with any Actor tools
(e.g. "apify-slash-rag-web-browser") or tools that can interact with Actors (e.g. "call-actor", "add-actor").`,
Note: This tool is automatically included if the Apify MCP Server is configured with any Actor tools (e.g., "apify-slash-rag-web-browser") or tools that can interact with Actors (e.g., "call-actor", "add-actor").`,
inputSchema: zodToJsonSchema(getActorOutputArgs),
/**
* Allow additional properties for Skyfire mode to pass `skyfire-pay-id`.
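Note on dot-notation field selection in the get-actor-output description: the sketch below shows one way to resolve paths such as `crawl.statusCode` against a nested item. `getByPath` and `selectFields` are hypothetical helpers, and keying the result by the dotted path is an assumption about the output shape, not something the PR confirms.

```typescript
// Walk a dotted path through nested objects; returns undefined if any segment is missing.
function getByPath(item: unknown, path: string): unknown {
    let current: unknown = item;
    for (const key of path.split('.')) {
        if (typeof current !== 'object' || current === null) return undefined;
        current = (current as Record<string, unknown>)[key];
    }
    return current;
}

// Return only the requested fields, keyed by their dotted path (assumed output shape).
function selectFields(item: Record<string, unknown>, fields: string[]): Record<string, unknown> {
    const out: Record<string, unknown> = {};
    for (const field of fields) {
        const value = getByPath(item, field);
        if (value !== undefined) out[field] = value;
    }
    return out;
}

const outputItem = {
    url: 'https://example.com',
    crawl: { statusCode: 200, depth: 1 },
    text: 'Example Domain',
};
console.log(selectFields(outputItem, ['crawl.statusCode', 'url']));
```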
12 changes: 11 additions & 1 deletion src/tools/get-html-skeleton.ts
@@ -41,7 +41,17 @@ export const getHtmlSkeleton: ToolEntry = {
tool: {
name: HelperTools.GET_HTML_SKELETON,
actorFullName: HelperTools.GET_HTML_SKELETON,
description: `Retrieves the HTML skeleton (clean structure) from a given URL by stripping unwanted elements like scripts, styles, and non-essential attributes. This tool keeps only the core HTML structure, links, images, and data attributes for analysis. Supports optional JavaScript rendering for dynamic content and provides chunked output to handle large HTML. This tool is useful for building web scrapers and data extraction tasks where a clean HTML structure is needed for writing concrete selectors or parsers.`,
description: `Retrieve the HTML skeleton (clean structure) of a webpage by stripping scripts, styles, and non-essential attributes.
This keeps the core HTML structure, links, images, and data attributes for analysis. Supports optional JavaScript rendering for dynamic pages.

The results will include a chunked HTML skeleton if the content is large. Use the chunk parameter to paginate through the output.

USAGE:
- Use when you need a clean HTML structure to design selectors or parsers for scraping.

EXAMPLES:
- user_input: Get HTML skeleton for https://example.com
- user_input: Get next chunk of HTML skeleton for https://example.com (chunk=2)`,
inputSchema: zodToJsonSchema(getHtmlSkeletonArgs),
ajvValidate: ajv.compile(zodToJsonSchema(getHtmlSkeletonArgs)),
call: async (toolArgs) => {
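Note on chunked output in the get-html-skeleton description: a minimal sketch of paginating a large string with a `chunk` parameter. It assumes 1-based chunk numbers (the example above uses chunk=2 for the next chunk) and a fixed chunk size in characters; `getChunk` and its return shape are hypothetical and do not reflect the tool's actual implementation.

```typescript
interface ChunkResult {
    content: string;
    totalChunks: number;
    hasMore: boolean;
}

// Return the requested 1-based chunk of `text`, split into pieces of `chunkSize` characters.
function getChunk(text: string, chunk: number, chunkSize: number): ChunkResult {
    if (!Number.isInteger(chunkSize) || chunkSize < 1) {
        throw new RangeError('chunkSize must be a positive integer');
    }
    const totalChunks = Math.max(1, Math.ceil(text.length / chunkSize));
    if (!Number.isInteger(chunk) || chunk < 1 || chunk > totalChunks) {
        throw new RangeError(`chunk must be between 1 and ${totalChunks}`);
    }
    const start = (chunk - 1) * chunkSize;
    return {
        content: text.slice(start, start + chunkSize),
        totalChunks,
        hasMore: chunk < totalChunks,
    };
}

const skeleton = '<html><body><a href="/docs">Docs</a></body></html>';
const first = getChunk(skeleton, 1, 20);
console.log(first.content, first.hasMore);
```

A caller would keep requesting `chunk + 1` while `hasMore` is true.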
17 changes: 11 additions & 6 deletions src/tools/helpers.ts
@@ -15,12 +15,17 @@ export const addTool: ToolEntry = {
type: 'internal',
tool: {
name: HelperTools.ACTOR_ADD,
description: `Add an Actor or MCP server to the available tools of the Apify MCP server.\n`
+ 'A tool is an Actor or MCP server that can be called by the user.\n'
+ 'Do not execute the tool, only add it and list it in the available tools.\n'
+ 'For example, when a user wants to scrape a website, first search for relevant Actors\n'
+ `using ${HelperTools.STORE_SEARCH} tool, and once the user selects one they want to use,\n`
+ 'add it as a tool to the Apify MCP server.',
description: `Add an Actor or MCP server to the Apify MCP Server as an available tool.
This does not execute the Actor; it only registers it so it can be called later.

You can first discover Actors using the ${HelperTools.STORE_SEARCH} tool, then add the selected Actor as a tool.

USAGE:
- Use when a user has chosen an Actor to work with and you need to make it available as a callable tool.

EXAMPLES:
- user_input: Add apify/rag-web-browser as a tool
- user_input: Add apify/instagram-scraper as a tool`,
inputSchema: zodToJsonSchema(addToolArgsSchema),
ajvValidate: ajv.compile(zodToJsonSchema(addToolArgsSchema)),
// TODO: I don't like that we are passing apifyMcpServer and mcpServer to the tool