Merged
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,6 @@
# 2.1.0
- Added support for Mistral OCR. See README.md for details.

# 2.0.0

- **BREAKING**: Major refactor of message handling for chat completions:
72 changes: 68 additions & 4 deletions README.md
@@ -4,7 +4,7 @@

**Mistral-java-client** is a Java client for the [Mistral AI](https://mistral.ai/) API. It allows you to easily interact
with the Mistral AI API from your Java application.
Supports all chat completion and embedding models available in the API.
Supports all chat completion, OCR, and embedding models available in the API.

New models or models not listed here may be already supported without any updates to the library.

@@ -15,6 +15,7 @@ Mistral-java-client is built against version 0.0.2 of the [Mistral AI API](https
- [Chat Completion](https://docs.mistral.ai/api/#tag/chat/operation/chat_completion_v1_chat_completions_post)
- [List Models](https://docs.mistral.ai/api/#tag/models/operation/list_models_v1_models_get)
- [Embeddings](https://docs.mistral.ai/api/#tag/embeddings/operation/embeddings_v1_embeddings_post)
- [OCR](https://docs.mistral.ai/api/#tag/ocr)

# Requirements

@@ -33,7 +34,7 @@ repositories {
}

dependencies {
implementation 'com.github.Dannyj1:mistral-java-client:2.0.0'
implementation 'com.github.Dannyj1:mistral-java-client:2.1.0'
}
```

@@ -51,7 +52,7 @@ dependencies {
<dependency>
<groupId>com.github.Dannyj1</groupId>
<artifactId>mistral-java-client</artifactId>
<version>2.0.0</version>
<version>2.1.0</version>
</dependency>
```

@@ -397,11 +398,74 @@ Example output:
[-0.02015686, 0.04272461, 0.05529785, ... , -0.006855011, 0.009529114, -0.016448975]
```


## OCR Completion

This example shows how to use the Mistral AI API to perform OCR on a document.

```java
import nl.dannyj.mistral.MistralClient;
import nl.dannyj.mistral.models.completion.content.DocumentURLChunk;
import nl.dannyj.mistral.models.ocr.OCRPageObject;
import nl.dannyj.mistral.models.ocr.OCRRequest;
import nl.dannyj.mistral.models.ocr.OCRResponse;

import java.io.File;
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.util.Base64;

public class MinimalOcrExample {

    public static void main(String[] args) throws IOException {
        // Replace "C:\\path\\to\\file.pdf" with the actual path to your document
        String filePath = "C:\\path\\to\\file.pdf";
        File documentFile = new File(filePath);

        // Replace "YOUR_API_KEY" with your actual Mistral AI API key,
        // or set the MISTRAL_API_KEY environment variable and use the no-argument constructor
        MistralClient client = new MistralClient("YOUR_API_KEY");

        // Read the document and encode it as a base64 data URI
        byte[] documentBytes = Files.readAllBytes(documentFile.toPath());
        String documentBase64 = Base64.getEncoder().encodeToString(documentBytes);
        URI documentUrl = URI.create("data:application/pdf;base64," + documentBase64);

        DocumentURLChunk documentChunk = DocumentURLChunk.builder()
                .documentUrl(documentUrl)
                .documentName("your_document.pdf") // Replace with your document name
                .build();

        OCRRequest request = OCRRequest.builder()
                .model("mistral-ocr-latest") // Or another supported OCR model
                .document(documentChunk)
                .build();

        try {
            System.out.println("Performing OCR...");
            // Blocking call; performOcrAsync returns a CompletableFuture instead
            OCRResponse response = client.performOcr(request);

            System.out.println("OCR Results:");
            if (response.getPages() != null && !response.getPages().isEmpty()) {
                // Print the markdown content of the first page
                OCRPageObject firstPage = response.getPages().get(0);
                System.out.println("--- Page " + firstPage.getIndex() + " ---");
                System.out.println("Markdown Content:");
                System.out.println(firstPage.getMarkdown());
            } else {
                System.out.println("No pages processed or results found.");
            }

        } catch (Exception e) {
            System.err.println("An error occurred: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
```
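
The base64 conversion step in the example above can be pulled into a small helper. The following is a minimal sketch that uses only the JDK; the class `DataUriSketch` and the method `toPdfDataUri` are hypothetical names chosen for illustration and are not part of the library.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration; not part of mistral-java-client.
class DataUriSketch {

    // Encodes raw document bytes as a "data:application/pdf;base64,..." URI,
    // the same form the README example builds inline.
    static URI toPdfDataUri(byte[] pdfBytes) {
        String base64 = Base64.getEncoder().encodeToString(pdfBytes);
        return URI.create("data:application/pdf;base64," + base64);
    }

    public static void main(String[] args) {
        // Two sample bytes stand in for a real PDF file.
        byte[] sample = "hi".getBytes(StandardCharsets.UTF_8);
        System.out.println(toPdfDataUri(sample)); // prints data:application/pdf;base64,aGk=
    }
}
```

The resulting `URI` could then be passed to `DocumentURLChunk.builder().documentUrl(...)` as in the example above.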

# Roadmap

- [ ] Make multi-modal usage more convenient (through builders, etc.)
- [ ] Make JSON schemas for function calling more developer-friendly
- [ ] Add support for all missing features (e.g. OCR)
- [ ] Add support for all missing features (e.g. Codestral)
- [ ] Handle rate limits
- [ ] Unit tests

2 changes: 1 addition & 1 deletion build.gradle
@@ -6,7 +6,7 @@ plugins {
}

group = "nl.dannyj"
version = "2.0.0"
version = "2.1.0"

repositories {
mavenCentral()
33 changes: 31 additions & 2 deletions src/main/java/nl/dannyj/mistral/MistralClient.java
@@ -30,11 +30,14 @@
import nl.dannyj.mistral.models.embedding.EmbeddingRequest;
import nl.dannyj.mistral.models.embedding.EmbeddingResponse;
import nl.dannyj.mistral.models.model.ListModelsResponse;
import nl.dannyj.mistral.models.ocr.OCRRequest;
import nl.dannyj.mistral.models.ocr.OCRResponse;
import nl.dannyj.mistral.net.ChatCompletionChunkCallback;
import nl.dannyj.mistral.services.HttpService;
import nl.dannyj.mistral.services.MistralService;
import okhttp3.OkHttpClient;

import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

@@ -72,7 +75,7 @@ public MistralClient(@NonNull String apiKey) {
* Default constructor that initializes the MistralClient with the API key from the environment variable "MISTRAL_API_KEY".
*/
public MistralClient() {
this.apiKey = System.getenv(API_KEY_ENV_VAR);
this.apiKey = Objects.requireNonNull(System.getenv(API_KEY_ENV_VAR), "API key not found in environment variable " + API_KEY_ENV_VAR);
this.httpClient = buildHttpClient(120, 10, 10);
this.objectMapper = buildObjectMapper();
this.mistralService = buildMistralService();
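
The `Objects.requireNonNull` call added to the no-argument constructor makes a missing environment variable fail at construction time with a descriptive message, rather than leaving a `null` key to surface later. A minimal sketch of that behavior follows; `RequireNonNullSketch` and `requireKey` are hypothetical names, and a plain parameter stands in for `System.getenv(...)` so the example does not depend on the environment.

```java
import java.util.Objects;

// Hypothetical illustration of the fail-fast null check; not library code.
class RequireNonNullSketch {

    // Returns the value unchanged, or throws NullPointerException with a message
    // naming the variable when the value is null.
    static String requireKey(String value, String variableName) {
        return Objects.requireNonNull(value,
                "API key not found in environment variable " + variableName);
    }

    public static void main(String[] args) {
        System.out.println(requireKey("secret", "MISTRAL_API_KEY"));
        try {
            requireKey(null, "MISTRAL_API_KEY");
        } catch (NullPointerException e) {
            // The message identifies which variable was missing.
            System.out.println(e.getMessage());
        }
    }
}
```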
@@ -134,7 +137,7 @@ public MistralClient(@NonNull String apiKey, int readTimeoutSeconds, int connect
}

/**
* Default constructor that initializes the MistralClient with the API key from the environment variable "MISTRAL_API_KEY" and custom timeouts.
* Default constructor that initializes the MistralClient with the API key from the environment variable "MISTRAL_API_KEY".
*
* @param readTimeoutSeconds The read timeout in seconds
* @param connectTimeoutSeconds The connect timeout in seconds
@@ -241,6 +244,32 @@ public CompletableFuture<ListModelsResponse> listModelsAsync() {
return mistralService.listModelsAsync();
}

/**
* Use the Mistral AI API to perform OCR on a document.
* This is a blocking method.
*
* @param request The request to perform OCR. See {@link OCRRequest}.
* @return The response from the Mistral AI API containing the OCR results. See {@link OCRResponse}.
* @throws ConstraintViolationException if the request does not pass validation
* @throws UnexpectedResponseException if an unexpected response is received from the Mistral AI API
*/
public OCRResponse performOcr(@NonNull OCRRequest request) {
return mistralService.performOcr(request);
}

/**
* Use the Mistral AI API to perform OCR on a document.
* This is a non-blocking/asynchronous method.
*
* @param request The request to perform OCR. See {@link OCRRequest}.
* @return A CompletableFuture that will complete with the OCR results from the Mistral AI API. See {@link OCRResponse}.
* @throws ConstraintViolationException if the request does not pass validation
* @throws UnexpectedResponseException if an unexpected response is received from the Mistral AI API
*/
public CompletableFuture<OCRResponse> performOcrAsync(@NonNull OCRRequest request) {
return mistralService.performOcrAsync(request);
}

public void createChatCompletionStream(@NonNull ChatCompletionRequest request, @NonNull ChatCompletionChunkCallback callback) {
mistralService.createChatCompletionStream(request, callback);
}
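
Because `performOcrAsync` returns a `CompletableFuture`, callers can chain transformations and error handling instead of blocking immediately. The sketch below shows that pattern with JDK types only; `fakePerformOcrAsync` is a hypothetical stand-in that returns a `String`, whereas the real method returns `CompletableFuture<OCRResponse>`.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration of consuming an asynchronous OCR result; not library code.
class AsyncOcrSketch {

    // Stand-in for client.performOcrAsync(request): completes with fake markdown.
    static CompletableFuture<String> fakePerformOcrAsync(String documentName) {
        return CompletableFuture.supplyAsync(() -> "# Contents of " + documentName);
    }

    public static void main(String[] args) {
        CompletableFuture<String> result = fakePerformOcrAsync("file.pdf")
                // Transform the result once it is available.
                .thenApply(String::toUpperCase)
                // Map a failure to a fallback value instead of propagating it.
                .exceptionally(error -> "OCR failed: " + error.getMessage());

        // join() blocks until the future completes; a real application might
        // attach a callback with thenAccept(...) instead.
        System.out.println(result.join());
    }
}
```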
@@ -21,6 +21,7 @@
import jakarta.annotation.Nullable;
import jakarta.validation.constraints.NotNull;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Getter;
import lombok.NoArgsConstructor;

@@ -31,6 +32,7 @@
*/
@NoArgsConstructor
@AllArgsConstructor
@Builder
public class DocumentURLChunk implements ContentChunk {

/**
Expand Down
77 changes: 77 additions & 0 deletions src/main/java/nl/dannyj/mistral/models/ocr/OCRImageObject.java
@@ -0,0 +1,77 @@
package nl.dannyj.mistral.models.ocr;

import com.fasterxml.jackson.annotation.JsonProperty;
import jakarta.annotation.Nullable;
import jakarta.validation.constraints.NotNull;
import jakarta.validation.constraints.PositiveOrZero;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

/**
* Represents an extracted image object within an OCR page.
*/
@Data
@NoArgsConstructor
@AllArgsConstructor
@Builder
public class OCRImageObject {

/**
* Image ID for extracted image in a page.
*
* @return The image ID.
*/
@NotNull
private String id;

/**
* X coordinate of top-left corner of the extracted image.
*
* @return The top-left X coordinate.
*/
@NotNull
@PositiveOrZero
@JsonProperty("top_left_x")
private Integer topLeftX;

/**
* Y coordinate of top-left corner of the extracted image.
*
* @return The top-left Y coordinate.
*/
@NotNull
@PositiveOrZero
@JsonProperty("top_left_y")
private Integer topLeftY;

/**
* X coordinate of bottom-right corner of the extracted image.
*
* @return The bottom-right X coordinate.
*/
@NotNull
@PositiveOrZero
@JsonProperty("bottom_right_x")
private Integer bottomRightX;

/**
* Y coordinate of bottom-right corner of the extracted image.
*
* @return The bottom-right Y coordinate.
*/
@NotNull
@PositiveOrZero
@JsonProperty("bottom_right_y")
private Integer bottomRightY;

/**
* Base64 string of the extracted image.
*
* @return The Base64 image string.
*/
@Nullable
@JsonProperty("image_base64")
private String imageBase64;
}
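
The corner coordinates and the optional base64 payload in `OCRImageObject` are enough to derive an image's size and raw bytes. The following is a rough sketch using plain values in place of the Lombok-generated getters; `ImageBoxSketch` and its methods are hypothetical, and stripping a `data:...;base64,` prefix is an assumption about how the payload may be formatted, not something the source confirms.

```java
import java.util.Base64;

// Hypothetical helpers for working with OCRImageObject values; not library code.
class ImageBoxSketch {

    // Width of the bounding box in the same units as the coordinates.
    static int width(int topLeftX, int bottomRightX) {
        return bottomRightX - topLeftX;
    }

    // Height of the bounding box in the same units as the coordinates.
    static int height(int topLeftY, int bottomRightY) {
        return bottomRightY - topLeftY;
    }

    // Decodes the base64 image string. If the value carries a data URI prefix
    // (an assumption), everything up to the first comma is dropped first.
    static byte[] decodeImage(String imageBase64) {
        int comma = imageBase64.indexOf(',');
        String payload = imageBase64.startsWith("data:") && comma >= 0
                ? imageBase64.substring(comma + 1)
                : imageBase64;
        return Base64.getDecoder().decode(payload);
    }

    public static void main(String[] args) {
        System.out.println(width(10, 110) + "x" + height(20, 70)); // prints 100x50
        System.out.println(decodeImage("data:image/png;base64,aGk=").length); // prints 2
    }
}
```

Since `imageBase64` is marked `@Nullable`, a caller would need to check it for `null` before decoding.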
45 changes: 45 additions & 0 deletions src/main/java/nl/dannyj/mistral/models/ocr/OCRPageDimensions.java
@@ -0,0 +1,45 @@
package nl.dannyj.mistral.models.ocr;

import jakarta.validation.constraints.NotNull;
import jakarta.validation.constraints.PositiveOrZero;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

/**
* Represents the dimensions of a PDF page's screenshot image.
*/
@Data
@NoArgsConstructor
@AllArgsConstructor
@Builder
public class OCRPageDimensions {

/**
* Dots per inch of the page image.
*
* @return The DPI of the page image.
*/
@NotNull
@PositiveOrZero
private Integer dpi;

/**
* Height of the image in pixels.
*
* @return The height of the image.
*/
@NotNull
@PositiveOrZero
private Integer height;

/**
* Width of the image in pixels.
*
* @return The width of the image.
*/
@NotNull
@PositiveOrZero
private Integer width;
}
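
With `dpi`, `height`, and `width` available, the physical page size can be estimated by dividing pixels by dots per inch. A minimal sketch follows; `PageSizeSketch` and `pixelsToInches` are hypothetical names. The guard on `dpi` is there because `@PositiveOrZero` permits a value of 0.

```java
// Hypothetical helper for converting OCRPageDimensions values; not library code.
class PageSizeSketch {

    // Converts a pixel length to inches for a given DPI.
    static double pixelsToInches(int pixels, int dpi) {
        if (dpi <= 0) {
            throw new IllegalArgumentException("dpi must be positive");
        }
        return (double) pixels / dpi;
    }

    public static void main(String[] args) {
        // A 1700 x 2200 pixel page image at 200 DPI corresponds to 8.5 x 11 inches.
        System.out.println(pixelsToInches(1700, 200) + " x " + pixelsToInches(2200, 200));
    }
}
```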
53 changes: 53 additions & 0 deletions src/main/java/nl/dannyj/mistral/models/ocr/OCRPageObject.java
@@ -0,0 +1,53 @@
package nl.dannyj.mistral.models.ocr;

import jakarta.validation.constraints.NotNull;
import jakarta.validation.constraints.PositiveOrZero;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

import java.util.List;

/**
* Represents the OCR information for a single page.
*/
@Data
@NoArgsConstructor
@AllArgsConstructor
@Builder
public class OCRPageObject {

/**
* The page index in a PDF document starting from 0.
*
* @return The page index.
*/
@NotNull
@PositiveOrZero
private Integer index;

/**
* The markdown string response of the page.
*
* @return The markdown string response.
*/
@NotNull
private String markdown;

/**
* List of all extracted images in the page.
*
* @return The list of extracted images.
*/
@NotNull
private List<OCRImageObject> images;

/**
* The dimensions of the PDF page's screenshot image.
*
* @return The dimensions of the page.
*/
@NotNull
private OCRPageDimensions dimensions;
}
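
A common follow-up to receiving a list of pages is to stitch their markdown back into one document in page order. The sketch below illustrates this with a small stand-in class; `JoinPagesSketch`, `Page`, and `joinMarkdown` are hypothetical, and with the real model the same stream would use `OCRPageObject::getIndex` and `OCRPageObject::getMarkdown`.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of combining per-page markdown; not library code.
class JoinPagesSketch {

    // Minimal stand-in for OCRPageObject with only the fields used here.
    static final class Page {
        final int index;
        final String markdown;

        Page(int index, String markdown) {
            this.index = index;
            this.markdown = markdown;
        }
    }

    // Sorts pages by their zero-based index and joins the markdown with blank lines.
    static String joinMarkdown(List<Page> pages) {
        return pages.stream()
                .sorted(Comparator.comparingInt((Page p) -> p.index))
                .map(p -> p.markdown)
                .collect(Collectors.joining("\n\n"));
    }

    public static void main(String[] args) {
        List<Page> pages = Arrays.asList(new Page(1, "Second page"), new Page(0, "First page"));
        System.out.println(joinMarkdown(pages));
    }
}
```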