
Commit 098e4da

Updated server docs
1 parent 8863fab commit 098e4da


docs/server/usage.mdx

Lines changed: 119 additions & 43 deletions
@@ -1,6 +1,4 @@
----
-title: Server Usage
----
+# Server Usage Guide
 
 ## Starting the Server
 
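The body of the "Starting the Server" section is elided in this diff. Based on the `async_interpreter.server.run(port=8000)` call visible in the next hunk's context, a minimal sketch of starting the server might look like this (the `from interpreter import AsyncInterpreter` import path is an assumption, not shown in the diff):

```python
# Minimal sketch: start the WebSocket server on the default port.
from interpreter import AsyncInterpreter  # assumed import path

async_interpreter = AsyncInterpreter()
async_interpreter.server.run(port=8000)  # Default port is 8000; pass another value to change it
```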
@@ -27,35 +25,45 @@ async_interpreter.server.run(port=8000) # Default port is 8000, but you can cus
 Connect to the WebSocket server at `ws://localhost:8000/`.
 
 ### Message Format
-Messages must follow the LMC format with start and end flags. For detailed specifications, see the [LMC messages documentation](https://docs.openinterpreter.com/protocols/lmc-messages).
+The server uses an extended message format that allows for rich, multi-part messages. Here's the basic structure:
 
-Basic message structure:
 ```json
-{"role": "user", "type": "message", "start": true}
+{"role": "user", "start": true}
 {"role": "user", "type": "message", "content": "Your message here"}
-{"role": "user", "type": "message", "end": true}
+{"role": "user", "end": true}
 ```
 
+### Multi-part Messages
+You can send complex messages with multiple components:
+
+1. Start with `{"role": "user", "start": true}`
+2. Add various types of content (message, file, image, etc.)
+3. End with `{"role": "user", "end": true}`
+
+### Content Types
+You can include various types of content in your messages:
+
+- Text messages: `{"role": "user", "type": "message", "content": "Your text here"}`
+- File paths: `{"role": "user", "type": "file", "content": "path/to/file"}`
+- Images: `{"role": "user", "type": "image", "format": "path", "content": "path/to/photo"}`
+- Audio: `{"role": "user", "type": "audio", "format": "wav", "content": "path/to/audio.wav"}`
+
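To make the multi-part structure and content types concrete, here is a minimal sketch of the frames a client could send for a text-plus-file message, using the same `websockets`/`json` pattern as the interaction example later in this guide (the file path and prompt text are hypothetical, and the file must be readable by the server):

```python
# Minimal sketch: send one multi-part user message containing text and a file path.
import asyncio
import json

import websockets

async def send_text_and_file():
    async with websockets.connect("ws://localhost:8000/") as websocket:
        frames = [
            {"role": "user", "start": True},
            {"role": "user", "type": "message", "content": "Summarize this file:"},
            {"role": "user", "type": "file", "content": "path/to/report.txt"},  # hypothetical path
            {"role": "user", "end": True},
        ]
        for frame in frames:
            await websocket.send(json.dumps(frame))

asyncio.run(send_text_and_file())
```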
 ### Control Commands
 To control the server's behavior, send the following commands:
 
 1. Stop execution:
 ```json
-{"role": "user", "type": "command", "start": True},
 {"role": "user", "type": "command", "content": "stop"}
-{"role": "user", "type": "command", "end": True}
 ```
 This stops all execution and message processing.
 
 2. Execute code block:
 ```json
-{"role": "user", "type": "command", "start": True},
 {"role": "user", "type": "command", "content": "go"}
-{"role": "user", "type": "command", "end": True}
 ```
 This executes a generated code block and allows the agent to proceed.
 
-**Important**: If `auto_run` is set to `False`, the agent will pause after generating code blocks. You must send the "go" command to continue execution.
+**Note**: If `auto_run` is set to `False`, the agent will pause after generating code blocks. You must send the "go" command to continue execution.
 
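As a concrete illustration of the commands above, a client running with `auto_run` disabled might send the "go" (or "stop") command like this. This is a minimal sketch reusing the `websockets`/`json` pattern from the interaction example later in this guide; only the command frames themselves come from the docs:

```python
# Minimal sketch: approve or interrupt execution over an open websocket connection.
import json

async def approve_code_block(websocket):
    # Sent after the agent pauses on a generated code block (auto_run = False).
    await websocket.send(json.dumps({"role": "user", "type": "command", "content": "go"}))

async def stop_execution(websocket):
    # Stops all execution and message processing.
    await websocket.send(json.dumps({"role": "user", "type": "command", "content": "stop"}))
```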
 ### Completion Status
 The server indicates completion with the following message:
@@ -71,8 +79,46 @@ If an error occurs, the server will send an error message in the following forma
 ```
 Your client should be prepared to handle these error messages appropriately.
 
-### Example WebSocket Interaction
-Here's a simple example demonstrating the WebSocket interaction:
+## Code Execution Review
+
+After code blocks are executed, you'll receive a review message:
+
+```json
+{
+  "role": "assistant",
+  "type": "review",
+  "content": "Review of the executed code, including safety assessment and potential irreversible actions."
+}
+```
+
+This review provides important information about the safety and potential impact of the executed code. Pay close attention to these messages, especially when dealing with operations that might have significant effects on your system.
+
+The `content` field of the review message may have two possible formats:
+
+1. If the code is deemed completely safe, the content will be exactly `"<SAFE>"`.
+2. Otherwise, it will contain an explanation of why the code might be unsafe or have irreversible effects.
+
+Example of a safe code review:
+```json
+{
+  "role": "assistant",
+  "type": "review",
+  "content": "<SAFE>"
+}
+```
+
+Example of a potentially unsafe code review:
+```json
+{
+  "role": "assistant",
+  "type": "review",
+  "content": "This code performs file deletion operations which are irreversible. Please review carefully before proceeding."
+}
+```
+
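One way a client might act on these review messages, sketched under the assumption that the approval policy (auto-approving `"<SAFE>"` reviews and asking a human otherwise) is up to the client rather than mandated by the protocol:

```python
# Minimal sketch: gate execution on the content of a review message.
import json

async def handle_review(websocket, review_content):
    if review_content == "<SAFE>":
        # The code was judged safe; let execution proceed.
        await websocket.send(json.dumps({"role": "user", "type": "command", "content": "go"}))
    else:
        # Surface the explanation and ask a human before proceeding.
        print(f"Code review: {review_content}")
        answer = input("Run this code anyway? [y/N] ").strip().lower()
        command = "go" if answer == "y" else "stop"
        await websocket.send(json.dumps({"role": "user", "type": "command", "content": command}))
```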
+## Example WebSocket Interaction
+
+Here's an example demonstrating the WebSocket interaction:
 
 ```python
 import websockets
@@ -81,21 +127,25 @@ import asyncio
 
 async def websocket_interaction():
     async with websockets.connect("ws://localhost:8000/") as websocket:
-        # Send a message
-        await websocket.send(json.dumps({"role": "user", "type": "message", "start": True}))
-        await websocket.send(json.dumps({"role": "user", "type": "message", "content": "What's 2 + 2?"}))
-        await websocket.send(json.dumps({"role": "user", "type": "message", "end": True}))
+        # Send a multi-part user message
+        await websocket.send(json.dumps({"role": "user", "start": True}))
+        await websocket.send(json.dumps({"role": "user", "type": "message", "content": "Analyze this image:"}))
+        await websocket.send(json.dumps({"role": "user", "type": "image", "format": "path", "content": "path/to/image.jpg"}))
+        await websocket.send(json.dumps({"role": "user", "end": True}))
 
         # Receive and process messages
         while True:
             message = await websocket.recv()
             data = json.loads(message)
 
             if data.get("type") == "message":
-                print(data.get("content", ""), end="", flush=True)
+                print(f"Assistant: {data.get('content', '')}")
+            elif data.get("type") == "review":
+                print(f"Code Review: {data.get('content')}")
             elif data.get("type") == "error":
                 print(f"Error: {data.get('content')}")
             elif data == {"role": "assistant", "type": "status", "content": "complete"}:
+                print("Interaction complete")
                 break
 
 asyncio.run(websocket_interaction())
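# Editor's sketch, not part of the original example: a thin wrapper that retries
# when the connection drops, in the spirit of Best Practices item 3 below.
async def run_with_retry(attempts=3):
    for attempt in range(1, attempts + 1):
        try:
            await websocket_interaction()
            return
        except (websockets.exceptions.ConnectionClosed, OSError) as exc:
            print(f"Connection problem ({exc}); retry {attempt}/{attempts}")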
@@ -104,7 +154,7 @@ asyncio.run(websocket_interaction())
 ## HTTP API
 
 ### Modifying Settings
-To change server settings, send a POST request to `http://localhost:8000/settings`. The payload should conform to [the interpreter object's settings](https://docs.openinterpreter.com/settings/all-settings).
+To change server settings, send a POST request to `http://localhost:8000/settings`. The payload should conform to the interpreter object's settings.
 
 Example:
 ```python
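# The body of this example is elided between hunks. An editor's sketch of such a
# request, assuming the `requests` library and illustrative setting values:
import requests

settings = {"auto_run": False, "custom_instructions": "You only write Python code."}
response = requests.post("http://localhost:8000/settings", json=settings)
print(response.status_code)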
@@ -126,9 +176,56 @@ Example:
 ```python
 response = requests.get("http://localhost:8000/settings/custom_instructions")
 print(response.json())
-# Output: {"custom_instructions": "You only write react."}
+# Output: {"custom_instructions": "You only write Python code."}
+```
+
+## OpenAI-Compatible Endpoint
+
+The server provides an OpenAI-compatible endpoint at `/openai`. This allows you to use the server with any tool or library that's designed to work with the OpenAI API.
+
+### Chat Completions Endpoint
+
+The chat completions endpoint is available at:
+
+```
+[server_url]/openai/chat/completions
+```
+
+To use this endpoint, set the `api_base` in your OpenAI client or configuration to `[server_url]/openai`. For example:
+
+```python
+import openai
+
+openai.api_base = "http://localhost:8000/openai"  # Replace with your server URL if different
+openai.api_key = "dummy"  # The key is not used but required by the OpenAI library
+
+response = openai.ChatCompletion.create(
+    model="gpt-3.5-turbo",  # This model name is ignored, but required
+    messages=[
+        {"role": "system", "content": "You are a helpful assistant."},
+        {"role": "user", "content": "What's the capital of France?"}
+    ]
+)
+
+print(response.choices[0].message['content'])
 ```
 
+Note that only the chat completions endpoint (`/chat/completions`) is implemented. Other OpenAI API endpoints are not available.
+
+When using this endpoint:
+- The `model` parameter is required but ignored.
+- The `api_key` is required by the OpenAI library but not used by the server.
+
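The example above uses the pre-1.0 `openai` Python interface (`openai.api_base`, `openai.ChatCompletion.create`). On `openai` 1.x the equivalent request looks like the sketch below; the same assumptions apply (the model name is ignored and the key is unused by the server):

```python
# Sketch: the same request with the openai>=1.0 client interface.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/openai", api_key="dummy")

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # Ignored by the server, but required by the client
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's the capital of France?"},
    ],
)

print(response.choices[0].message.content)
```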
+## Best Practices
+
+1. Always handle the "complete" status message to ensure your client knows when the server has finished processing.
+2. If `auto_run` is set to `False`, remember to send the "go" command to execute code blocks and continue the interaction.
+3. Implement proper error handling in your client to manage potential connection issues, unexpected server responses, or server-sent error messages.
+4. Use the AsyncInterpreter class when working with the server in Python to ensure compatibility with asynchronous operations.
+5. Pay attention to the code execution review messages for important safety and operational information.
+6. Utilize the multi-part user message structure for complex inputs, including file paths and images.
+7. When sending file paths or image paths, ensure they are accessible to the server.
+
 ## Advanced Usage: Accessing the FastAPI App Directly
 
 The FastAPI app is exposed at `async_interpreter.server.app`. This allows you to add custom routes or host the app using Uvicorn directly.
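The full example for this section is only partially visible in the diff (its tail appears in the next hunk). A minimal sketch of the pattern described here, with an illustrative `/health` route and an assumed `AsyncInterpreter` import path that are not part of the original docs:

```python
# Sketch: add a custom route to the exposed FastAPI app and serve it with Uvicorn.
import uvicorn
from interpreter import AsyncInterpreter  # assumed import path

async_interpreter = AsyncInterpreter()
app = async_interpreter.server.app

@app.get("/health")  # illustrative custom route
async def health():
    return {"status": "ok"}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
```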
@@ -151,25 +248,4 @@ if __name__ == "__main__":
     uvicorn.run(app, host="0.0.0.0", port=8000)
 ```
 
-## Using Docker
-
-You can also run the server using Docker. First, build the Docker image from the root of the repository:
-
-```bash
-docker build -t open-interpreter .
-```
-
-Then, run the container:
-
-```bash
-docker run -p 8000:8000 open-interpreter
-```
-
-This will expose the server on port 8000 of your host machine.
-
-## Best Practices
-1. Always handle the "complete" status message to ensure your client knows when the server has finished processing.
-2. If `auto_run` is set to `False`, remember to send the "go" command to execute code blocks and continue the interaction.
-3. Implement proper error handling in your client to manage potential connection issues, unexpected server responses, or server-sent error messages.
-4. Use the AsyncInterpreter class when working with the server in Python to ensure compatibility with asynchronous operations.
-5. When deploying in production, consider using the Docker container for easier setup and consistent environment across different machines.
+This guide covers the WebSocket API, the HTTP API, the OpenAI-compatible endpoint, and code execution review, with examples showing how to interact with the server effectively.
