Skip to content

Conversation

@philipithomas
Copy link
Member

@philipithomas philipithomas commented Feb 19, 2025

Description of changes

  • Improvements & Bug fixes
    • Updates axum to 0.8
  • New functionality
    • Adds /openapi.json route that renders an OpenAPI spec
    • Adds /docs/ route that renders Swagger
    • Adds code generation of OpenAPI spec with utoipa
    • Standardizes path params to be tenant (not tenant_id) and database (not database_name)

Here's a video demo of how it works:

OpenAPI.mp4

Notes:

  • The routes and basic functionality mirror current Python functionality (see: https://api.trychroma.com:8000/docs)
  • OpenAPI spec is substantially improved, including: typed errors, typed responses, security headers

@github-actions
Copy link

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of a unexpectedly high quality (Readability, Modularity, Intuitiveness)

@github-actions
Copy link

Please tag your PR title with one of: [ENH | BUG | DOC | TST | BLD | PERF | TYP | CLN | CHORE]. See https://docs.trychroma.com/contributing#contributing-code-and-ideas

@philipithomas philipithomas changed the title Add OpenAPI spec to Rust server [ENH] Add OpenAPI spec to Rust server Feb 19, 2025
@philipithomas philipithomas marked this pull request as ready for review February 19, 2025 23:33
}

#[derive(Serialize)]
#[derive(Serialize, ToSchema)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it might make sense to gate these derivations behind a feature flag

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked to @codetheweb and we decided it was ok to do this as a follow-up task

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please cut ticket

Comment on lines +276 to +277
get,
path = "/api/v2/healthcheck",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious if it's possible for utoipa to infer this so we don't have to define the method and path twice?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, no. I just looked through the docs to confirm.

Comment on lines +297 to +300
responses(
(status = 200, description = "Success", body = HeartbeatResponse),
(status = 500, description = "Server error", body = ErrorResponse)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a way to define a common set of error types instead of re-defining for each route?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good feedback - turns out there is. Working on implementing this now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into the IntoResponses, and it seems that we can't mix per-path response (like HeartBeatResponse) with the CommonResponses. So, it won't work for our use case

.route("/api/v2/tenants/:tenant_name", get(get_tenant))
.route("/api/v2/tenants/:tenant_id/databases", get(list_databases).post(create_database))
.route("/api/v2/tenants/:tenant_id/databases/:name", get(get_database).delete(delete_database))
.route("/api/v2/tenants/{tenant_name}", get(get_tenant))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OOC - why this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Axum 0.8 changed the way path params are specified: https://github.com/tokio-rs/axum/releases/tag/axum-v0.8.0

breaking: Upgrade matchit to 0.8, changing the path parameter syntax from /:single and /*many to /{single} and /{*many}; the old syntax produces a panic to avoid silent change in behavior (tokio-rs/axum#2645)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah but I misnamed it - should keep as tenant_id. Will fix that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I think it was always tenant_name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah looks like you're right. I just renamed everything to standardize on tenant and database, which matches our existing OpenAPI spec.

The different variable names were causing issues with code generation, too.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should standardize on tenant and database tbh. It leads to a lot of bugs in confusion between name vs id for things like database. Wdyt? I guess maybe its fine since id is UUID and will fail to parse? But users get confused so the explicitness is nice.

IIRC @drewkim explicitly did it this way for clarity.

Copy link
Member Author

@philipithomas philipithomas Feb 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was basing it on our client experience, which call the params just tenant and database:

client = chromadb.HttpClient(
  ssl=True,
  host='api.trychroma.com',
  tenant='00000000-0000-0000-0000-000000000000',
  database='foo',
  headers={
      'x-chroma-token': 'YOUR_API_KEY'
  }
)

I think database and tenant look clean in the client libraries, and I think it helps to maintain the same variable name full-stack. If we change the backend to refer to them as database_name and tenant_id, then somewhere becomes a boundary that converts them from tenant -> tenant_id. That can be the client library.

So the main question to answer, which we can work backwards from, is:
Are we happy with the client library dev ex (tenant instead of tenant_id)?

Copy link
Member Author

@philipithomas philipithomas Feb 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✏️ Discussed offline - sticking with tenant and database

request_body = CreateTenantPayload,
responses(
(status = 200, description = "Tenant created successfully", body = CreateTenantResponse),
(status = 401, description = "Unauthorized", body = ErrorResponse),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we forced to manually specify the error type this way? It would be nice if it could somehow be inferred off the return type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Working on defining common error types, which should simplify this: https://docs.rs/utoipa-gen/latest/utoipa_gen/derive.IntoResponses.html

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok - not sure what you have in mind but might talk to @Sicheng-Pan and @codetheweb the first as I know they thought a bit about the current error types.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just posted about this here: #3815 (comment)

Summary is that the IntoResponses standard errors aren't compatible with our pattern. So, we'll stick with the standard ErrorResponse that the frontend already uses for now.

responses(
(status = 200, description = "Database retrieved successfully", body = GetDatabaseResponse),
(status = 401, description = "Unauthorized", body = ErrorResponse),
(status = 404, description = "Database not found", body = ErrorResponse),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish it were possible to infer this from the response type :(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dug in, and it seems that the way we're doing typed returns elsewhere are making it hard right now. I propose we add it as a follow-up task.

Copy link
Contributor

@Sicheng-Pan Sicheng-Pan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@philipithomas philipithomas merged commit d044496 into main Feb 21, 2025
148 checks passed
@philipithomas philipithomas deleted the philipithomas/utoipa branch February 21, 2025 18:52
Inventrohyder pushed a commit to Inventrohyder/chroma that referenced this pull request Aug 5, 2025
## Description of changes

 - Improvements & Bug fixes
   - Updates `axum` to `0.8`
 - New functionality
   - Adds `/openapi.json` route that renders an OpenAPI spec
   - Adds `/docs/` route that renders Swagger
- Adds code generation of OpenAPI spec with
[`utoipa`](https://github.com/juhaku/utoipa)
- Standardizes path params to be `tenant` (*not tenant_id*) and
`database` (*not database_name*)

Here's a video demo of how it works:


https://github.com/user-attachments/assets/e66d9fc0-8b8b-408f-8b3a-2e975646edf3

Notes:
* The routes and basic functionality mirror current Python functionality
(see: https://api.trychroma.com:8000/docs)
* OpenAPI spec is substantially improved, including: typed errors, typed
responses, security headers
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants