# MCTS OpenAI API Wrapper

Monte Carlo Tree Search (MCTS) is a method that uses extra compute to explore different candidate responses before selecting a final answer. It works by building a tree of options and running multiple iterations over it. The idea is similar to inference-time scaling: the model generates several output candidates, iteratively refines them, and picks the best one. Every incoming request is wrapped with an MCTS pipeline to iteratively refine language model outputs.

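As a rough sketch of the idea (not this server's actual implementation), the refinement loop can be pictured as follows; `generate_candidates`, `score`, and `refine` are hypothetical stand-ins for calls to the underlying language model:

```python
# Conceptual sketch of the MCTS-style refinement loop, not the server's actual code.
# generate_candidates(), score(), and refine() are hypothetical helpers standing in
# for calls to the underlying language model.
def mcts_refine(prompt: str, iterations: int = 3, width: int = 4) -> str:
    best_answer, best_score = None, float("-inf")
    # Root expansion: draft several independent candidate answers.
    candidates = generate_candidates(prompt, n=width)
    for _ in range(iterations):
        # Evaluate every candidate and keep the best one seen so far.
        scored = sorted(((score(prompt, c), c) for c in candidates), reverse=True)
        top_score, top_candidate = scored[0]
        if top_score > best_score:
            best_score, best_answer = top_score, top_candidate
        # Expand the most promising candidate into further refinements.
        candidates = [refine(prompt, top_candidate) for _ in range(width)]
    return best_answer
```
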
## Overview

This FastAPI server exposes two endpoints:

| Method | Endpoint               | Description                                                                              |
|--------|------------------------|------------------------------------------------------------------------------------------|
| POST   | `/v1/chat/completions` | Accepts chat completion requests; the call is wrapped with an MCTS refinement pipeline.  |
| GET    | `/v1/models`           | Proxies a request to the underlying LLM provider’s models endpoint.                      |

During a chat completion call, the server executes an MCTS pipeline that generates intermediate updates (including a Mermaid diagram and iteration details). These intermediate responses are aggregated into a single `<details>` block, and the final answer is appended at the end, following a consistent, structured markdown template.

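For illustration only (the exact template is defined by the server), the returned `content` field is shaped roughly like this, with the intermediate steps collapsed and the final answer following:

```markdown
<details>
<summary>MCTS intermediate steps</summary>

(Mermaid diagram of the search tree)

Iteration 1: candidate answer and its evaluation
Iteration 2: refined answer and its evaluation

</details>

The final answer appears here, after the collapsed block.
```
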
## Getting Started

### Prerequisites

- Python 3.8+
- [Poetry](https://python-poetry.org) for dependency management

### Setup

1. **Clone the repository:**

   ```bash
   git clone https://github.com/bearlike/mcts-openai-api.git
   cd mcts-openai-api
   ```

2. **Copy the Environment File:**

   Copy the example environment file to `.env` and set your `OPENAI_API_KEY`:

   ```bash
   cp .env.example .env
   ```

   Open the `.env` file and update the `OPENAI_API_KEY` (and other settings if needed).

3. **Install Dependencies:**

   Use Poetry to install the required packages:

   ```bash
   poetry install
   ```

4. **Run the Server:**

   Start the FastAPI server with Uvicorn:

   ```bash
   # Visit http://server-ip:8000/docs to view the Swagger API documentation
   uvicorn main:app --reload
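   # To reach the server from other machines (as in the curl example below),
   # bind to all interfaces: uvicorn main:app --host 0.0.0.0 --port 8000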
   ```

## Testing the Server

You can test the server using `curl` or any HTTP client.

### Example Request

```bash
curl -X 'POST' \
  'http://192.168.1.198:8000/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "user",
      "content": "How many R in STRAWBERRY?"
    }
  ],
  "max_tokens": 1024,
  "temperature": 0.5
}' | jq -r '.choices[0].message.content'
```

This request will return a JSON response with the aggregated intermediate responses wrapped inside a single `<details>` block, followed by the final answer.

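Since the wrapper mirrors the OpenAI chat completions API, an OpenAI-compatible client can also be pointed at it. Below is a minimal sketch using the `openai` Python package; the base URL, port, and API key are placeholders for your own deployment:

```python
# Minimal sketch using the openai Python client against the local wrapper.
# The base_url and api_key values are placeholders; adjust for your deployment.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-your-key")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "How many R in STRAWBERRY?"}],
    max_tokens=1024,
    temperature=0.5,
)

# The content holds the <details> block with intermediate steps plus the final answer.
print(response.choices[0].message.content)
```
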
---

## Endpoints

### POST /v1/chat/completions

- **Description:**
  Wraps a chat completion request in an MCTS pipeline that refines the answer by generating intermediate updates and a final response.

- **Request Body Parameters:**

  - `model`: string (e.g., `"gpt-4o-mini"`)
  - `messages`: an array of chat messages (each with `role` and `content` properties)
  - `max_tokens`: (optional) number
  - `temperature`: (optional) number
  - `stream`: (optional) boolean (if enabled, intermediate responses are still aggregated with the final answer into a single JSON response)

### GET /v1/models

- **Description:**
  Proxies requests to list available models from the underlying LLM provider using the `OPENAI_API_BASE_URL`.

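As a quick check of this endpoint, the `openai` Python client can list the proxied models; the base URL and API key below are placeholders, as in the testing example above:

```python
# List the models proxied from the underlying provider.
# The base_url and api_key values are placeholders; adjust for your deployment.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-your-key")

for model in client.models.list().data:
    print(model.id)
```
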
## License

This project is licensed under the MIT License.