New Components - mistral_ai #16155

michelle0927 · 2025-04-02T21:10:57Z

Resolves #16107

Note to QA (@vunguyenhung): I was unable to create a batch job in the UI, and the API endpoint to create a batch job returns this error for me:
You cannot launch batch jobs this big with your free trial.
Therefore, create-batch-job, new-batch-job-completed, and new-batch-job-failed haven't been fully tested.

Summary by CodeRabbit

New Features
- Introduced new actions for creating batch jobs, generating text responses, creating embeddings, downloading batch job results, and uploading files.
- Enabled listing available AI models and retrieving batch job details with enhanced configuration for file and model selection.
- Rolled out new event sources for notifications on batch job completions, failures, and new model additions.
- Added constants for default pagination size and batch job status options.
Chores
- Updated the component version and dependencies to improve overall stability and compatibility.

vercel · 2025-04-02T21:11:02Z

3 Skipped Deployments

Name	Status	Preview	Updated (UTC)
docs-v2	⬜️ Ignored (Inspect)	Visit Preview	Apr 8, 2025 4:57pm
pipedream-docs	⬜️ Ignored (Inspect)		Apr 8, 2025 4:57pm
pipedream-docs-redirect-do-not-edit	⬜️ Ignored (Inspect)		Apr 8, 2025 4:57pm

coderabbitai · 2025-04-02T21:11:07Z

Walkthrough

This PR introduces multiple new modules and enhancements to the Mistral AI component. New actions for batch job creation, embedding creation, text generation, and model listing have been added. Additional sources are implemented to emit events for new model registrations, batch job completions, and failures. Common constants and utility functions have been introduced, and the primary application has been strengthened with new properties, methods for API interactions, and pagination support. The package version was updated, and a dependency on "@pipedream/platform" was added.

Changes

Files	Change Summary
`components/mistral_ai/actions/…/create-batch-job.mjs`, `…/create-embeddings.mjs`, `…/generate-text.mjs`, `…/list-models.mjs`	Added new action modules for batch job creation, embedding creation, text generation, and model listing.
`components/mistral_ai/common/constants.mjs`, `components/mistral_ai/common/utils.mjs`	Introduced constants and utility functions (`parseArray`, `parseObj`) to support the component’s operations.
`components/mistral_ai/mistral_ai.app.mjs`	Enhanced the app with new properties (`fileIds`, `modelId`, `batchJobId`) and methods (`_baseUrl`, `_makeRequest`, `listModels`, etc.) including paginated requests.
`components/mistral_ai/package.json`	Updated version from `"0.0.1"` to `"0.1.0"` and added a dependency on `"@pipedream/platform": "^3.0.3"`.
`components/mistral_ai/sources/…/base.mjs`, `…/new-batch-job-completed/new-batch-job-completed.mjs`, `…/new-batch-job-failure/new-batch-job-failure.mjs`, `…/new-model-added/new-model-added.mjs`, `…/new-model-added/test-event.mjs`	Added new source modules for emitting events on batch job completions, failures, and new model registrations, with a common base for event processing.
`components/mistral_ai/actions/…/download-batch-job-results.mjs`, `…/get-batch-job-details.mjs`, `…/upload-file.mjs`	Introduced new actions for downloading batch job results, retrieving batch job details, and uploading files to the Mistral AI service.

Sequence Diagram(s)

sequenceDiagram
    participant U as User
    participant A as Generate Text Action
    participant M as Mistral AI API
    U->>A: Initiate text generation (send prompt)
    A->>M: Call sendPrompt with parameters
    M-->>A: Return response with ID
    A-->>U: Return full response & summary

sequenceDiagram
    participant T as Timer/Trigger
    participant N as New Batch Job Completed Source
    participant M as Mistral AI API
    T->>N: Trigger event processing
    N->>M: Call listBatchJobs to fetch jobs
    M-->>N: Return job list
    N->>N: Filter & generate metadata for completed jobs
    N-->>T: Emit event with job details

Assessment against linked issues

Objective	Addressed	Explanation
Emit new events for AI model registration, batch job completions, and job failures (#16107: new-model-added, new-batch-job-completed, new-batch-job-failure)	✅
Provide actions for text generation, model listing, and embedding creation (#16107: generate-text, list-models, create-embeddings)	✅

Suggested labels

ai-assisted

Poem

I hopped along the code so bright,
With actions and sources in clear sight.
Batch jobs and models now take flight,
New events emerge in the night.
My whiskers twitch with programming delight,
Each change a hop, precise and light!
🐰✨

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

components/mistral_ai/actions/download-batch-job-results/download-batch-job-results.mjs

Oops! Something went wrong! :(

ESLint: 8.57.1

Error [ERR_MODULE_NOT_FOUND]: Cannot find package 'jsonc-eslint-parser' imported from /eslint.config.mjs
at packageResolve (node:internal/modules/esm/resolve:839:9)
at moduleResolve (node:internal/modules/esm/resolve:908:18)
at defaultResolve (node:internal/modules/esm/resolve:1038:11)
at ModuleLoader.defaultResolve (node:internal/modules/esm/loader:557:12)
at ModuleLoader.resolve (node:internal/modules/esm/loader:525:25)
at ModuleLoader.getModuleJob (node:internal/modules/esm/loader:246:38)
at ModuleJob._link (node:internal/modules/esm/module_job:126:49)

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (17)

components/mistral_ai/actions/create-embeddings/create-embeddings.mjs (1)
19-26: Consider adding error handling for API calls.

The action would benefit from explicit error handling to provide better feedback when API calls fail.
 async run({ $ }) {
-    const response = await this.mistralAI.createEmbeddings({
-      $,
-      data: {
-        model: constants.EMBEDDINGS_MODEL,
-        input: parseArray(this.input),
-      },
-    });
+    try {
+      const response = await this.mistralAI.createEmbeddings({
+        $,
+        data: {
+          model: constants.EMBEDDINGS_MODEL,
+          input: parseArray(this.input),
+        },
+      });
+      if (response?.id) {
+        $.export("$summary", `Successfully created embedding with ID: ${response.id}`);
+      }
+      return response;
+    } catch (error) {
+      $.export("error", error);
+      throw new Error(`Error creating embeddings: ${error.message}`);
+    }
components/mistral_ai/actions/create-batch-job/create-batch-job.mjs (2)
45-59: Add error handling for robust user experience.

Similar to the create-embeddings action, this would benefit from explicit error handling to provide better feedback to users when API calls fail.
 async run({ $ }) {
-    const response = await this.mistralAI.createBatchJob({
-      $,
-      data: {
-        input_files: this.inputFiles,
-        endpoint: this.endpoint,
-        model: this.modelId,
-        metadata: parseObj(this.metadata),
-        timeoutHours: this.timeoutHours,
-      },
-    });
-    if (response?.id) {
-      $.export("$summary", `Successfully created batch job with ID: ${response.id}`);
-    }
-    return response;
+    try {
+      const response = await this.mistralAI.createBatchJob({
+        $,
+        data: {
+          input_files: this.inputFiles,
+          endpoint: this.endpoint,
+          model: this.modelId,
+          metadata: parseObj(this.metadata),
+          timeoutHours: this.timeoutHours,
+        },
+      });
+      if (response?.id) {
+        $.export("$summary", `Successfully created batch job with ID: ${response.id}`);
+      }
+      return response;
+    } catch (error) {
+      $.export("error", error);
+      throw new Error(`Error creating batch job: ${error.message}`);
+    }
8-8: Note testing limitation mentioned in PR description.

As mentioned in the PR objectives, the batch job functionality couldn't be fully tested due to trial account limitations. Consider adding a note in the documentation or code comments about this limitation.
-  description: "Create a new batch job, it will be queued for processing. [See the Documentation](https://docs.mistral.ai/api/#tag/batch/operation/jobs_api_routes_batch_get_batch_jobs)",
+  description: "Create a new batch job, it will be queued for processing. Note: This may require a paid account to use. [See the Documentation](https://docs.mistral.ai/api/#tag/batch/operation/jobs_api_routes_batch_get_batch_jobs)",
components/mistral_ai/common/utils.mjs (3)
1-15: Consider adding JSDoc comments to improve code documentation

This utility function's purpose is clear but would benefit from JSDoc comments explaining its parameters, return values, and expected behavior. This would help other developers understand when and how to use it.
+/**
+ * Parses a value into an array if it's a JSON string
+ * @param {any} arr - The value to parse
+ * @return {any} The parsed array, the original value, or undefined
+ */
function parseArray(arr) {
  if (!arr) {
    return undefined;
  }

  if (typeof arr === "string") {
    try {
      return JSON.parse(arr);
    } catch {
      return arr;
    }
  }

  return arr;
}
17-31: Consider adding JSDoc comments and type validation

Similar to the parseArray function, this function would benefit from JSDoc comments. Additionally, consider adding type validation to ensure the parsed result is actually an object.
+/**
+ * Parses a value into an object if it's a JSON string
+ * @param {any} obj - The value to parse
+ * @return {any} The parsed object, the original value, or undefined
+ */
function parseObj(obj) {
  if (!obj) {
    return undefined;
  }

  if (typeof obj === "string") {
    try {
-      return JSON.parse(obj);
+      const parsed = JSON.parse(obj);
+      // Validate that the parsed result is an object
+      return (parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed)) 
+        ? parsed 
+        : obj;
    } catch {
      return obj;
    }
  }

  return obj;
}
1-36: Consider combining similar functions to reduce duplication

Both parseArray and parseObj have almost identical logic. Consider refactoring to reduce code duplication by creating a single parsing function with a type parameter.
/**
 * Parses a value into the specified type if it's a JSON string
 * @param {any} value - The value to parse
 * @param {string} expectedType - The expected type ('array' or 'object')
 * @return {any} The parsed value, the original value, or undefined
 */
function parseValue(value, expectedType) {
  if (!value) {
    return undefined;
  }

  if (typeof value === "string") {
    try {
      const parsed = JSON.parse(value);
      // Validate type if specified
      if (expectedType === 'array' && !Array.isArray(parsed)) {
        return value;
      }
      if (expectedType === 'object' && (typeof parsed !== 'object' || Array.isArray(parsed) || parsed === null)) {
        return value;
      }
      return parsed;
    } catch {
      return value;
    }
  }

  return value;
}

function parseArray(arr) {
  return parseValue(arr, 'array');
}

function parseObj(obj) {
  return parseValue(obj, 'object');
}

export {
  parseArray,
  parseObj,
};
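As a rough illustration of how the combined helper suggested above would behave, the sketch below restates `parseValue` and runs it on a few inputs. The sample inputs are made up for illustration and are not taken from the PR.

```javascript
// Sketch of the suggested combined parser (mirrors the review suggestion above).
function parseValue(value, expectedType) {
  if (!value) {
    return undefined;
  }
  if (typeof value !== "string") {
    return value;
  }
  try {
    const parsed = JSON.parse(value);
    const isArray = Array.isArray(parsed);
    const isPlainObject = parsed !== null && typeof parsed === "object" && !isArray;
    // On a type mismatch, hand the raw string back unchanged.
    if (expectedType === "array" && !isArray) return value;
    if (expectedType === "object" && !isPlainObject) return value;
    return parsed;
  } catch {
    // Not valid JSON: hand the raw string back unchanged.
    return value;
  }
}

const asArray = parseValue('["a", "b"]', "array"); // parsed into an array
const mismatch = parseValue('{"a": 1}', "array");  // type mismatch: original string
const notJson = parseValue("hello", "object");     // invalid JSON: original string
const empty = parseValue("", "array");             // falsy input: undefined
```

Returning the original string on a mismatch keeps the helper non-throwing, at the cost of pushing type errors downstream to the API call.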
components/mistral_ai/actions/list-models/list-models.mjs (1)
6-6: Description inconsistency with action name

The action name is "List Models" but the description starts with "Retrieve a list of available Mistral AI models..." which is redundant. Consider rephrasing to be more concise.
-  description: "Retrieve a list of available Mistral AI models that the user is authorized to access. [See the Documentation](https://docs.mistral.ai/api/#tag/models)",
+  description: "Returns all available Mistral AI models that the user is authorized to access. [See the Documentation](https://docs.mistral.ai/api/#tag/models)",
components/mistral_ai/sources/new-model-added/new-model-added.mjs (1)
20-26: Consider adding defensive checks in generateMeta

To prevent potential runtime errors if a model object is missing expected properties, consider adding defensive checks.
  generateMeta(model) {
+   if (!model) {
+     throw new Error("Missing model data");
+   }
    return {
-     id: model.id,
-     summary: `New Model: ${model.name}`,
-     ts: model.created,
+     id: model.id || "unknown_id",
+     summary: `New Model: ${model.name || "Unnamed"}`,
+     ts: model.created || Date.now(),
    };
  },
components/mistral_ai/common/constants.mjs (1)
1-26: LGTM - Constants are well-defined and organized

The constants are clearly named and appropriately categorized for their intended use. They provide useful configuration values that will be referenced throughout the Mistral AI component.

Consider adding JSDoc comments above each constant to document their purpose and usage, which would enhance maintainability:
+/**
+ * Default page size for pagination when fetching results from the Mistral AI API
+ */
const DEFAULT_PAGE_SIZE = 100;

+/**
+ * Default model to use for embedding operations
+ */
const EMBEDDINGS_MODEL = "mistral-embed";
components/mistral_ai/sources/new-batch-job-completed/new-batch-job-completed.mjs (1)
1-41: Component implementation looks good

The source component is well-structured and follows best practices for Pipedream components. It properly extends the common base, handles deduplication, and provides clear metadata.

One minor enhancement suggestion for the generateMeta method:
generateMeta(job) {
  return {
    id: job.id,
-    summary: `New Batch Job ${job.status} with ID: ${job.id}`,
+    summary: `Batch Job Completed: ${job.id} (Status: ${job.status})`,
    ts: job.created_at,
  };
},
This makes the summary more descriptive and consistent with the component's purpose, especially since this component can emit events for jobs with various status values, not just "SUCCESS".
components/mistral_ai/sources/new-batch-job-failure/new-batch-job-failure.mjs (1)
1-30: Implementation looks good but could benefit from consistent imports

The source component is well-structured and correctly implements the required methods for handling failed batch jobs.

Unlike the new-batch-job-completed component, this one doesn't import the constants module. While this is fine since it doesn't use any constants (the status is hardcoded as "FAILED"), for consistency across the codebase, consider:
import common from "../common/base.mjs";
+import constants from "../../common/constants.mjs";

export default {
  ...common,
  key: "mistral_ai-new-batch-job-failure",
  name: "New Batch Job Failure",
  // ...rest of the code
  getParams() {
    return {
-      status: "FAILED",
+      status: constants.BATCH_JOB_STATUS_OPTIONS[1], // "FAILED"
      created_after: this._getLastTs(),
    };
  },
This ensures that if the status values ever change, your component will automatically use the updated value.
components/mistral_ai/sources/common/base.mjs (3)
38-57: Add error handling for the processEvent method

The processEvent method assumes that when paginated is false, the function returns an object with a data property. This could lead to runtime errors if the function returns a different structure.

Consider adding error handling:
async processEvent(max) {
  const fn = this.getResourceFn();
  const params = this.getParams();
  const paginated = this.isPaginated();

  let results;
  if (paginated) {
    results = await this.getPaginatedResults(fn, params, max);
    this.findAndSetLastTs(results);
  } else {
-    results = (await fn({
-      params,
-    })).data;
+    try {
+      const response = await fn({
+        params,
+      });
+      results = response.data || [];
+      if (!Array.isArray(results)) {
+        console.log("Warning: Expected array of results, got:", typeof results);
+        results = [];
+      }
+    } catch (error) {
+      console.error("Error fetching results:", error);
+      results = [];
+    }
  }

  results.forEach((result) => {
    const meta = this.generateMeta(result);
    this.$emit(result, meta);
  });
}
58-64: Add check for created_at property in findAndSetLastTs

The findAndSetLastTs method assumes that each result has a created_at property. If a result doesn't have this property, it could lead to unexpected behavior.

Consider adding a check:
findAndSetLastTs(results) {
  let maxTs = this._getLastTs();
  for (const result of results) {
-    maxTs = Math.max(result.created_at, maxTs);
+    if (result && 'created_at' in result) {
+      maxTs = Math.max(result.created_at, maxTs);
+    }
  }
  this._setLastTs(maxTs);
},
24-37: Consider optimizing the getPaginatedResults method

Currently, the method creates an iterable with paginate but then manually builds an array from it. This approach works but could be simplified if the paginate method provides a way to get all results at once.

If this.mistralAI.paginate returns an async iterable (which appears to be the case), the current implementation is fine. However, if there's a method to get all results at once, consider using that instead to simplify the code.
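For reference, collecting an async iterable into an array up to a cap can be sketched as follows. `collect` and `fakePaginate` are hypothetical names used only for illustration; the PR's actual `getPaginatedResults` and `paginate` are not shown in this thread.

```javascript
// Hypothetical stand-in for an async-generator paginate method.
async function* fakePaginate(items) {
  for (const item of items) {
    yield item;
  }
}

// Drain an async iterable into an array, stopping early once `max` items are collected.
async function collect(iterable, max) {
  const results = [];
  for await (const item of iterable) {
    results.push(item);
    if (max && results.length >= max) {
      break;
    }
  }
  return results;
}
```

With these definitions, `await collect(fakePaginate([1, 2, 3, 4]), 2)` resolves to `[1, 2]`; breaking out of `for await` also signals the generator to stop, so no further pages would be requested.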
components/mistral_ai/mistral_ai.app.mjs (3)

7-26: Handle potential errors when listing files.

The fileIds property’s async options method calls listFiles directly. Consider adding defensive error handling (e.g., try/catch) or fallback logic to gracefully handle API failures or partial data retrieval. This ensures a better user experience if the Mistral API request fails.
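One way to make an async options loader degrade gracefully is sketched below. `buildFileOptions` is a hypothetical helper, and the `{ data: [{ id, filename }] }` response shape is an assumption about what `listFiles` returns, not something confirmed in this thread.

```javascript
// Hypothetical helper: turn a listFiles call into dropdown options,
// returning an empty list instead of throwing when the request fails.
async function buildFileOptions(listFiles, page = 0) {
  try {
    const response = await listFiles({ params: { page } });
    // Guard against partial or unexpected payloads.
    const files = Array.isArray(response?.data) ? response.data : [];
    return files.map(({ id, filename }) => ({
      label: filename ?? id,
      value: id,
    }));
  } catch (error) {
    console.error("Failed to load files:", error.message);
    return [];
  }
}
```

Whether to swallow the error or surface it is a judgment call: an empty list keeps the dropdown usable but hides the cause (for example, an invalid API key) from the user.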

43-59: Central request logic is straightforward; consider timeouts or retries.

The _makeRequest method provides a clean, centralized wrapper around axios. For more robustness, consider adding:
- Custom timeouts
- Retry logic (especially for transient 5xx or network errors)
- Logging or error transformation
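A minimal sketch of the retry idea, assuming errors carry an axios-style `error.response.status`. `withRetry` is a hypothetical helper and is not part of the PR or of `@pipedream/platform`; the error shape that platform's axios wrapper actually throws should be checked before relying on this.

```javascript
// Hypothetical retry wrapper: retries when there is no HTTP status
// (treated here as a network error) or when the status is 5xx.
async function withRetry(requestFn, { retries = 3, baseDelayMs = 200 } = {}) {
  let attempt = 0;
  for (;;) {
    try {
      return await requestFn();
    } catch (error) {
      const status = error?.response?.status;
      const transient = status === undefined || status >= 500;
      if (!transient || attempt >= retries) {
        throw error;
      }
      // Exponential backoff: baseDelayMs, then 2x, 4x, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
      attempt += 1;
    }
  }
}
```

A call site might look like `withRetry(() => this._makeRequest(opts))`. Note that treating a missing status as transient also retries non-HTTP failures such as programming errors, so a stricter check may be preferable in practice.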

85-91: Prompt handling is valid; consider logging prompt usage if needed.

The sendPrompt method is correct. If there’s a need for compliance or usage tracking, you might consider logging or capturing metadata before sending.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d8a832 and dd3a641.

⛔ Files ignored due to path filters (1)

pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (13)

components/mistral_ai/actions/create-batch-job/create-batch-job.mjs (1 hunks)
components/mistral_ai/actions/create-embeddings/create-embeddings.mjs (1 hunks)
components/mistral_ai/actions/generate-text/generate-text.mjs (1 hunks)
components/mistral_ai/actions/list-models/list-models.mjs (1 hunks)
components/mistral_ai/common/constants.mjs (1 hunks)
components/mistral_ai/common/utils.mjs (1 hunks)
components/mistral_ai/mistral_ai.app.mjs (1 hunks)
components/mistral_ai/package.json (2 hunks)
components/mistral_ai/sources/common/base.mjs (1 hunks)
components/mistral_ai/sources/new-batch-job-completed/new-batch-job-completed.mjs (1 hunks)
components/mistral_ai/sources/new-batch-job-failure/new-batch-job-failure.mjs (1 hunks)
components/mistral_ai/sources/new-model-added/new-model-added.mjs (1 hunks)
components/mistral_ai/sources/new-model-added/test-event.mjs (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: pnpm publish
GitHub Check: Verify TypeScript components
GitHub Check: Publish TypeScript components

🔇 Additional comments (17)

components/mistral_ai/package.json (2)

3-3: Version bump is appropriate for new features added.

The version increment from 0.0.1 to 0.1.0 correctly follows semantic versioning practices, reflecting the significant new functionality being added (batch jobs, embeddings, etc.) while maintaining backward compatibility.

15-16: Dependency addition looks good.

Adding the @pipedream/platform dependency is necessary to support the new functionality being implemented. The version constraint (^3.0.3) appropriately allows for compatible updates.
components/mistral_ai/sources/new-model-added/test-event.mjs (2)

1-23: Test event structure looks good; double-check the timestamp.

The test event correctly models a Mistral AI model configuration. Note that the created timestamp (1743621323) is a Unix time in seconds that corresponds to 2025-04-02 19:15:23 UTC, roughly two hours before this PR was opened, so it is recent rather than in the future. If date-based validations or comparisons need a clearly past value, an older timestamp could be used.

For example, this value corresponds to 2024-04-20 13:55:23 UTC:
-  "created": 1743621323,
+  "created": 1713621323,
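A Unix timestamp like the one in the test event can be checked by converting it to a calendar date. The helper name below is made up for illustration.

```javascript
// Convert a Unix timestamp in seconds to an ISO 8601 string (UTC).
function unixSecondsToIso(seconds) {
  // JavaScript Dates use milliseconds, so scale up by 1000.
  return new Date(seconds * 1000).toISOString();
}

console.log(unixSecondsToIso(1743621323)); // 2025-04-02T19:15:23.000Z
console.log(unixSecondsToIso(1713621323)); // 2024-04-20T13:55:23.000Z
```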
6-13:
✅ Verification successful

Verify model capabilities are accurate.

Ensure that the capabilities object accurately reflects the actual capabilities of the mistral-ocr-2503 model. Currently, all capabilities are set to false, which might be correct for an OCR-specific model, but should be verified against the official Mistral AI documentation.

🌐 Web query:
What are the capabilities of Mistral AI OCR models?
💡 Result:

Mistral AI's OCR models represent a significant advancement in document processing, combining high-speed optical character recognition with advanced AI-driven understanding. These models excel at extracting structured data from complex documents while preserving layout, formatting, and multimedia elements. Here are their key capabilities:

High Accuracy & Layout Preservation

Mistral OCR achieves 94.89% overall accuracy in benchmarks, outperforming Google Document AI, Azure OCR, and GPT-4o in critical tasks[2][5][7]. Key features:

Maintains document hierarchy (headers, paragraphs, tables) in Markdown/JSON outputs[3][10]

Handles multi-column layouts, LaTeX equations, and mixed text/graphics[2][5][9]

Accurately extracts tables (94.3% accuracy) and mathematical expressions (96.1%)[2][7]

Multilingual & Multimodal Processing

The models support 2,000+ languages and scripts, with exceptional performance across diverse linguistic environments[1][5][10]:

Language	Accuracy Score
Russian	99.09%
French	99.20%
Hindi	97.55%
Chinese	97.11%

Can process interleaved content types:

Text, images, tables, and equations in single documents[9][10]

Scanned documents and handwritten notes[7][8]

PDFs with embedded images (extracts visuals with text)[6][9]

Enterprise-Grade Performance

Processes 2,000 pages/minute per node - 10x faster than Azure OCR[2][7]

Batch processing doubles efficiency (2,000 pages/$)[2][5][10]

50MB file/1,000 page limits per API call[9]

AI Integration Features

Document-as-prompt functionality enables natural language queries about content[2][10]

Direct integration with RAG systems for multimodal document analysis[9][10]

LLM-powered summarization and contextual understanding[3][5][9]

Deployment Flexibility

Cloud API through "la Plateforme" developer suite[5][10]

Self-hosting options for sensitive data[2][8][10]

Coming soon to AWS/Azure/Google Cloud[7]

While Mistral OCR shows impressive baseline performance, real-world tests reveal challenges with financial tables (17% misalignment) and legal document checkboxes[8]. However, its combination of speed, multilingual support, and AI integration makes it particularly valuable for research institutions[4][10], legal firms[7], and enterprises managing large document repositories[2][5]. The models continue evolving, with Mistral actively expanding on-premises deployment options[10].

For developers, the Python API allows easy integration [9]:
# Process PDF with Mistral OCR
response = client.ocr.process(
    document=DocumentURLChunk(document_url=signed_url.url),
    model="mistral-ocr-latest",
    include_image_base64=True
)

This combination of technical capabilities positions Mistral OCR as a leading solution for transforming unstructured documents into AI-ready formats at scale.

Citations:

- 1: https://mojoauth.com/blog/news-mistral-ocr/
- 2: https://www.maginative.com/article/mistral-unveils-new-ocr-api-with-advanced-document-understanding-capabilities/
- 3: https://parsio.io/blog/what-is-mistral-ocr/
- 4: https://www.turing.com/blog/exploring-mistral-ocr
- 5: https://pureai.com/Articles/2025/03/11/Mistral-AI-Launches-AIPowered-OCR.aspx
- 6: https://parsio.io/blog/mistral-ocr-test-review/
- 7: https://www.eweek.com/news/mistral-ai-new-ocr-api/
- 8: https://www.runpulse.com/blog/beyond-the-hype-real-world-tests-of-mistrals-ocr
- 9: https://www.infoq.com/news/2025/03/mistral-ai-ocr-api/
- 10: https://mistral.ai/news/mistral-ocr

Model Capabilities Verification: No Changes Required

Based on the official documentation for Mistral AI’s OCR models, the mistral-ocr-2503 model is purpose-built for high-accuracy document processing and text extraction rather than for chat, generative completions, or function calling. The capabilities shown in the code (`completion_chat`, `completion_fim`, `function_calling`, `fine_tuning`, `vision`, and `classification`, all set to `false`) accurately reflect that this model does not support such generative or LLM-specific functionalities. The advanced OCR features (e.g., layout preservation, multilingual extraction) operate through a different integration and are not represented within this generic capabilities object.

No changes are needed; however, if OCR-specific properties are ever required for clarity within the system, that adjustment can be considered separately.
components/mistral_ai/actions/create-embeddings/create-embeddings.mjs (2)

5-32: Action structure looks good and follows best practices.

The implementation of the Create Embeddings action is well-structured, with clear descriptions, appropriate input handling, and proper export of summary information. Using constants for the model and utility functions for parsing is a good practice.

23-23:
✅ Verification successful

Verify EMBEDDINGS_MODEL constant definition.

Ensure the EMBEDDINGS_MODEL constant is properly defined in the constants.mjs file.

🏁 Script executed:
#!/bin/bash
# Check if EMBEDDINGS_MODEL is defined in constants.mjs
grep -r "EMBEDDINGS_MODEL" components/mistral_ai/common/
Length of output: 204
EMBEDDINGS_MODEL Verified

The constant is correctly defined in components/mistral_ai/common/constants.mjs as follows:
const EMBEDDINGS_MODEL = "mistral-embed";
No further changes are required.
components/mistral_ai/actions/create-batch-job/create-batch-job.mjs (2)

5-61: Well-structured batch job creation action implementation.

The Create Batch Job action is well-implemented with appropriate props, input validation, and response handling. The documentation link and descriptive labels enhance usability.

29-29:
❓ Verification inconclusive

Verify BATCH_JOB_ENDPOINT_OPTIONS constant.

Ensure that the BATCH_JOB_ENDPOINT_OPTIONS constant is defined in the constants.mjs file with appropriate values.

🏁 Script executed:
#!/bin/bash
# Check if BATCH_JOB_ENDPOINT_OPTIONS is defined in constants.mjs
grep -r "BATCH_JOB_ENDPOINT_OPTIONS" components/mistral_ai/common/
Length of output: 219
Action: Verify the values within the BATCH_JOB_ENDPOINT_OPTIONS constant definition

The grep output confirms that the BATCH_JOB_ENDPOINT_OPTIONS constant is indeed defined in components/mistral_ai/common/constants.mjs as an array.

However, note that the array includes an element that repeats the constant name itself (BATCH_JOB_ENDPOINT_OPTIONS,). Please double-check if this self-reference is intentional or if it should be replaced with the appropriate endpoint configuration values.
components/mistral_ai/actions/list-models/list-models.mjs (1)

3-23: LGTM! Clean implementation of model listing action

The implementation follows best practices for Pipedream components, correctly handles the response, and provides a user-friendly summary message.

components/mistral_ai/sources/new-model-added/new-model-added.mjs (1)

4-29: LGTM! Well-structured event source implementation

The source is properly implemented, extending the common base component and providing clear methods for deduplication and metadata generation.
components/mistral_ai/mistral_ai.app.mjs (7)

27-40: Model ID retrieval looks good; consider similar error handling.

As with fileIds, the modelId property depends on listModels. Consider a similar approach to handle unexpected HTTP errors or empty data gracefully.

60-71: Listing models and batch jobs is straightforward.

These helper methods are concise. No immediate issues found with correctness or style.

72-77: File listing approach is aligned with existing patterns.

listFiles matches the structure of the other listing methods. Looks consistent.

78-84: Embedding creation logic is clear.

The createEmbeddings method follows the established pattern for making POST requests. The approach is consistent.

92-98: Batch job creation method is consistent with the API pattern.

No functional concerns. For better observability, you might track job creation responses or errors in logs.

1-3:
✅ Verification successful

Imports look valid, but consider verifying the constants path.

These imports appear correct. However, ensure that "./common/constants.mjs" is the right relative path and that this file is present and exportable. If there's any doubt, you can verify using a script.

🏁 Script executed:
#!/bin/bash
# Verify the existence of "common/constants.mjs" and check for its exports
fd 'constants.mjs' --exec cat
Length of output: 131070
Verification Complete: Constants Import Verified

The import in components/mistral_ai/mistral_ai.app.mjs is valid. Our script confirmed that the relative path ./common/constants.mjs correctly resolves to a file that exports the expected constants.

99-123:
❓ Verification inconclusive

Pagination logic is well-structured; confirm large datasets.

The paginate method correctly yields items until the maximum count or no more data is returned. If working with very large datasets, confirm that the memory usage and API rate limits are handled. Otherwise, this approach is good.

🏁 Script executed:
#!/bin/bash
# Check for usage of `paginate` across the codebase to ensure large dataset handling is tested.
rg -A 10 -B 5 'paginate\({'
Length of output: 115

I'll re-run the usage check with a corrected regex pattern so that occurrences of the paginate function are captured. The script below should output the relevant code references (if any) without the regex parsing error seen previously. Based on the results, we can then verify whether large dataset handling is properly tested. If no usage is found or more context is needed, manual verification may be required.

🏁 Script executed:
#!/bin/bash
# Revised search command for the usage of the paginate method using an updated regex pattern.
rg -A 10 -B 5 "paginate\\(\\{"
Length of output: 66244
Pagination logic in mistral_ai.app.mjs looks solid and is consistent with similar implementations across the repository. The asynchronous generator efficiently yields data items until the maximum count is reached or no more data is returned. Given that numerous modules (e.g., in BigML, Ortto, QuickBooks, etc.) use analogous paginate patterns, please ensure that for very large datasets you’ve manually verified that memory usage stays under control and that API rate limits are respected during load testing.
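
For reference, the pattern described above (an async generator that yields items until a maximum count is reached or no more data is returned) can be sketched as follows. This is an illustration under assumptions, not the component's actual code: `fetchPage`, `pageSize`, `max`, and the `page`/`page_size` parameter names are hypothetical.

```javascript
// Hypothetical sketch of a paginating async generator.
// `fetchPage` stands in for an API method that resolves to { data: [...] }.
async function* paginate({ fetchPage, pageSize = 100, max = Infinity }) {
  let page = 0;
  let count = 0;
  while (count < max) {
    const { data } = await fetchPage({ page, page_size: pageSize });
    const items = data ?? []; // tolerate a null or undefined `data`
    if (items.length === 0) return; // no more data
    for (const item of items) {
      yield item;
      count += 1;
      if (count >= max) return; // stop once the cap is reached
    }
    if (items.length < pageSize) return; // short page means last page
    page += 1;
  }
}

// Drains an async iterable into an array (only sensible for bounded results).
async function collect(iterable) {
  const out = [];
  for await (const item of iterable) out.push(item);
  return out;
}
```

Because items are yielded one at a time, memory use is bounded by the page size as long as the consumer does not accumulate everything; API rate limits still depend on how quickly the consumer pulls pages.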

components/mistral_ai/actions/list-models/list-models.mjs

components/mistral_ai/actions/generate-text/generate-text.mjs

luancazarine

Hi @michelle0927, LGTM! Ready for QA!

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (8)

components/mistral_ai/actions/get-batch-job-details/get-batch-job-details.mjs (1)
19-22: Consider adding error handling for missing batch jobs

While the code handles successful responses well, it might be helpful to add specific error handling for cases where the batch job doesn't exist or other API errors occur.
 async run({ $ }) {
-    const response = await this.mistralAI.getBatchJob({
-      $,
-      jobId: this.batchJobId,
-    });
+    try {
+      const response = await this.mistralAI.getBatchJob({
+        $,
+        jobId: this.batchJobId,
+      });
+      
+      if (response?.id) {
+        $.export("$summary", `Successfully retrieved details for batch job with ID: ${this.batchJobId}`);
+      }
+      return response;
+    } catch (error) {
+      $.export("$summary", `Failed to retrieve batch job: ${error.message}`);
+      throw error;
+    }
components/mistral_ai/actions/upload-file/upload-file.mjs (1)
17-18: Consider adding file size validation

While the description mentions a 512MB file size limit, the code doesn't verify this constraint. Consider adding validation to provide a better user experience before the API rejects oversized files.
 async run({ $ }) {
   const filePath = this.filePath.startsWith("/tmp/")
     ? this.filePath
     : `/tmp/${this.filePath}`;

   if (!fs.existsSync(filePath)) {
     throw new ConfigurationError(`File \`${filePath}\` not found`);
   }
+  
+  // Check file size (512MB limit)
+  const stats = fs.statSync(filePath);
+  const fileSizeInMB = stats.size / (1024 * 1024);
+  if (fileSizeInMB > 512) {
+    throw new ConfigurationError(`File size exceeds the 512MB limit. Current size: ${fileSizeInMB.toFixed(2)}MB`);
+  }
components/mistral_ai/actions/download-batch-job-results/download-batch-job-results.mjs (2)
41-41: Use asynchronous file operations for large files

Using writeFileSync can block the Node.js event loop, which might impact performance for large files. Consider using the asynchronous version instead.
-  fs.writeFileSync(filePath, buffer);
+  await fs.promises.writeFile(filePath, buffer);
7-7: Fix typo in action description

There's a small typo in the description: "direcory" should be "directory".
-  description: "Download a batch job results file to the /tmp direcory. [See the Documentation](https://docs.mistral.ai/api/#tag/files/operation/files_api_routes_download_file)",
+  description: "Download a batch job results file to the /tmp directory. [See the Documentation](https://docs.mistral.ai/api/#tag/files/operation/files_api_routes_download_file)",
components/mistral_ai/mistral_ai.app.mjs (4)
44-57: Consider returning label-value pairs for consistency.
Currently, batchJobId returns an array of IDs only, whereas fileIds and modelId return objects. If you prefer uniform usage, consider returning { value: id, label: id }.
 return data?.map(({ id }) => ({
-  id
+  value: id,
+  label: id
})) || [];
60-78: Add optional error-handling or retry logic in _makeRequest.
Consider wrapping requests or adding a configurable timeout/retry to handle intermittent network errors.
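
A rough sketch of what such a retry wrapper could look like, assuming a hypothetical `withRetry` helper (not part of @pipedream/platform) with illustrative `retries` and `baseDelayMs` options:

```javascript
// Hypothetical helper: retries an async function with exponential backoff.
// It retries on any thrown error; a real version would likely retry only
// on network errors or specific HTTP status codes.
async function withRetry(fn, { retries = 3, baseDelayMs = 200 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (error) {
      lastError = error;
      if (attempt === retries) break; // out of attempts
      // Wait baseDelayMs, then 2x, 4x, ... before the next attempt.
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

Inside `_makeRequest`, the existing request call could be passed as `fn`, for example `withRetry(() => axios($, config))`, where `config` is whatever the method already builds.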

134-140: createBatchJob respects the new batch endpoint.
Due to free trial limitations, consider adding a clearer error message or fallback if the user cannot create large batch jobs.

141-165: paginate generator works but may need a data existence check.
If the API unexpectedly returns data as null or undefined, the for...of loop could throw an error. Consider adding a fallback to an empty array.
const { data } = await fn({ params });
-for (const item of data) {
+for (const item of (data || [])) {
  yield item;
  ...
}

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between afa869a and be67973.

⛔ Files ignored due to path filters (1)

pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (5)

components/mistral_ai/actions/download-batch-job-results/download-batch-job-results.mjs (1 hunks)
components/mistral_ai/actions/get-batch-job-details/get-batch-job-details.mjs (1 hunks)
components/mistral_ai/actions/upload-file/upload-file.mjs (1 hunks)
components/mistral_ai/mistral_ai.app.mjs (1 hunks)
components/mistral_ai/package.json (2 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: Verify TypeScript components
GitHub Check: pnpm publish
GitHub Check: Publish TypeScript components

🔇 Additional comments (18)

components/mistral_ai/package.json (2)

3-3: Version bump aligns with feature additions

The version increment from 0.0.1 to 0.1.0 follows semantic versioning principles, appropriately reflecting the addition of new features without breaking changes.

15-17: Dependencies look appropriate for new functionality

The added dependencies support the new features being introduced:

@pipedream/platform provides core Pipedream functionality

form-data is necessary for file upload operations with the Mistral AI API

These dependencies align well with the PR objectives.

components/mistral_ai/actions/get-batch-job-details/get-batch-job-details.mjs (1)

1-28: Well-structured action implementation

This action follows Pipedream's best practices with clear documentation, appropriate error handling through async/await, and helpful summary exports for the user interface.

components/mistral_ai/actions/upload-file/upload-file.mjs (3)

1-5: Appropriate imports for file handling

The imports are well-chosen for this action's requirements, particularly the use of FormData for multipart file uploads.

32-38: Good file path validation and error handling

The code appropriately normalizes the file path and verifies the file's existence before attempting to upload it.

40-45: Well-implemented FormData construction

The FormData object is properly constructed with the file content and optional purpose parameter.

components/mistral_ai/actions/download-batch-job-results/download-batch-job-results.mjs (1)

30-35: Appropriate API call with correct response type

The code correctly specifies responseType: "arraybuffer" for binary file download, which is essential for handling non-text responses.

components/mistral_ai/mistral_ai.app.mjs (11)

1-2: Imports look appropriate and consistent with project structure.
No major concerns. Ensure constants.mjs remains well-maintained and that the axios import from @pipedream/platform is suitably pinned to avoid version discrepancies.

7-28: Well-defined fileIds prop with dynamic options.
The asynchronous options method is straightforward and effectively utilizes this.listFiles(). Confirm that sampleType is always defined or gracefully handled if omitted.

Would you like me to scan the codebase to verify all calls to this property ensure sampleType is passed in?

30-43: Property modelId aligns with supported models.
The logic is clear, mapping each model to { value, label }. This property retrieval approach is consistent with typical Pipedream patterns.

79-90: listModels and listBatchJobs methods appear well-structured.
They clearly separate API endpoints and can be easily reused elsewhere.

91-96: listFiles method is straightforward.
No red flags or potential edge cases spotted.

97-104: getBatchJobDetails method is concise and direct.
Looks good for fetching job status.

105-112: downloadFile method is properly scoped.
No immediate concerns regarding correctness or security.

113-119: uploadFile method remains simple and coherent.
Useful for creating user-facing actions or for automated flows.

120-126: createEmbeddings aligns with typical embedding endpoint usage.
Straightforward approach, no issues observed.

127-133: sendPrompt is well-defined for text completions.
Make sure user input or prompts are sanitized if exposed externally.

Do you want me to check if there's code to handle user-supplied content properly?

167-167: Closing brace finalizes the definition.
File structure is successfully wrapped, no errors observed.

components/mistral_ai/actions/download-batch-job-results/download-batch-job-results.mjs

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

components/mistral_ai/actions/download-batch-job-results/download-batch-job-results.mjs (1)

37-39: Simplified buffer handling.

The buffer handling implementation follows the previous recommendation for better efficiency by checking if the response is already a Buffer before creating a new one.
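
The check described here can be sketched roughly as below. `toBuffer` is a hypothetical helper name used for illustration, and the actual action may inline this logic differently.

```javascript
// Hypothetical helper: normalizes a binary HTTP response body to a Buffer
// without copying when it already is one.
function toBuffer(response) {
  if (Buffer.isBuffer(response)) {
    return response; // already a Buffer, reuse as-is
  }
  if (response instanceof ArrayBuffer) {
    return Buffer.from(response); // view over the same memory
  }
  if (ArrayBuffer.isView(response)) {
    // Typed arrays and DataViews: respect their offset and length.
    return Buffer.from(response.buffer, response.byteOffset, response.byteLength);
  }
  throw new TypeError("Unsupported response type for binary download");
}
```

With `responseType: "arraybuffer"`, the response would normally take the first or second branch; the typed-array branch is there for completeness.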

🧹 Nitpick comments (2)

components/mistral_ai/actions/download-batch-job-results/download-batch-job-results.mjs (2)
44-49: Consider adding error handling for file operations.

While the API call errors would be caught by Pipedream's framework, it would be good to add explicit error handling for file system operations.
-    fs.writeFileSync(filePath, buffer);
+    try {
+      fs.writeFileSync(filePath, buffer);
+    } catch (error) {
+      $.export("error", `Failed to write file: ${error.message}`);
+      throw error;
+    }

     $.export("$summary", "Successfully downloaded batch results file");

     return [
-      filename,
+      this.filename, // Use the original filename property for consistency
       filePath,
     ];
50-51: Consider returning a structured object for clarity.

Instead of returning an array, a structured object might be more self-documenting and easier for consumers to use.
     return [
       filename,
       filePath,
     ];
+    // Alternative: return a structured object
+    // return {
+    //   filename,
+    //   filePath,
+    // };

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between be67973 and 20e5090.

📒 Files selected for processing (1)

components/mistral_ai/actions/download-batch-job-results/download-batch-job-results.mjs (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: pnpm publish
GitHub Check: Verify TypeScript components
GitHub Check: Publish TypeScript components

🔇 Additional comments (4)

components/mistral_ai/actions/download-batch-job-results/download-batch-job-results.mjs (4)

1-3: LGTM! Good imports.

The imports look clean and appropriate for this component. The file system module is correctly imported for file operations.

4-9: Component metadata is well-defined.

The component is properly structured with a clear key, descriptive name, and good documentation link. The version and type are correctly specified.

10-29: Props are well-defined with clear descriptions.

The component properties are well structured with appropriate labels and descriptions. The fileId prop correctly uses the propDefinition from the mistralAI app with the proper sample type.

30-36: API call looks correct.

The API call to download the file uses the correct parameters and properly specifies the responseType as "arraybuffer" for binary data.

components/mistral_ai/actions/download-batch-job-results/download-batch-job-results.mjs

michelle0927 · 2025-04-09T14:21:47Z

/approve

michelle0927 added 2 commits April 2, 2025 16:59

new components

91d220f

pnpm-lock.yaml

dd3a641

coderabbitai bot reviewed Apr 2, 2025

fix description

afa869a

pipedream-component-development requested a review from luancazarine April 2, 2025 21:32

luancazarine previously approved these changes Apr 7, 2025

michelle0927 added 2 commits April 8, 2025 12:26

new actions

aa6d496

pnpm-lock.yaml

be67973

michelle0927 dismissed luancazarine’s stale review via be67973 April 8, 2025 16:29

coderabbitai bot reviewed Apr 8, 2025

components/mistral_ai/actions/download-batch-job-results/download-batch-job-results.mjs (outdated, resolved)

updates

20e5090

coderabbitai bot reviewed Apr 8, 2025

components/mistral_ai/actions/download-batch-job-results/download-batch-job-results.mjs (resolved)

pipedream-component-development requested a review from luancazarine April 9, 2025 14:21

luancazarine approved these changes Apr 9, 2025

michelle0927 merged commit ad8ded4 into master Apr 9, 2025
11 checks passed

michelle0927 deleted the issue-16107 branch April 9, 2025 14:34
