`gradio.App` — Low-Level FastAPI Entrypoint

gradio.App is a lower-level entrypoint for Gradio that gives you direct access to the underlying FastAPI application. Use it when you need full control: add custom routes that return pages entirely in HTML/JS/CSS, define REST endpoints with Pydantic validation and dependency injection, or mix standard web routes with Gradio's backend features like GPU management, request queuing, batching, and streaming.

Quickstart

from gradio import App

app = App()

@app.gpu()
@app.api(name="generate", concurrency_limit=2)
async def generate(prompt: str):
    for token in model.generate(prompt):
        yield token  # streams via SSE

@app.get("/")
async def root(prompt: str):
    tokens = [token async for token in generate(prompt)]
    return {"message": tokens}

app.launch()

Since App extends FastAPI, everything you know works: @app.get(), @app.post(), path params, query params, Depends(), APIRouter, Pydantic models, /docs, /openapi.json.

Custom HTML Routes

Serve full HTML pages alongside your Gradio backend:

from gradio import App
from fastapi.responses import HTMLResponse

app = App()

@app.get("/", response_class=HTMLResponse)
async def homepage():
    return """
    <html>
      <head><script src="/static/app.js"></script></head>
      <body>
        <h1>My App</h1>
        <div id="root"></div>
      </body>
    </html>
    """

@app.gpu()
@app.api(name="predict", concurrency_limit=2)
def predict(text: str):
    return model(text)

app.launch()

Your HTML/JS frontend can call the generated /api/predict endpoint directly, giving you full control over the UI while leveraging Gradio's backend for GPU management and queuing.

GPU Batching

@app.gpu(batch_size=8, batch_timeout=0.05)
@app.api(name="predict")
def predict(images: list[bytes]) -> list[str]:
    return model(images)  # called once with up to 8 inputs

Multi-GPU

@app.gpu(device=0)
@app.api(name="model_a")
def model_a(text: str):
    ...

@app.gpu(device=1)
@app.api(name="model_b")
def model_b(text: str):
    ...

Request Queue

All decorated functions go through the queue with position tracking and ETA:

@app.gpu()
@app.api(name="predict", concurrency_limit=2)
def predict(text: str):
    return model(text)

Clients interact with the queue via two endpoints:

# 1. Join the queue
POST /queue/join  {"endpoint": "predict", "data": {"text": "hello"}}
# Returns: {"event_id": "abc123"}

# 2. Listen for updates via SSE
GET /queue/data?event_id=abc123
# Stream of events:
#   {"msg": "estimation", "rank": 2, "queue_size": 5, "rank_eta": 3.4}
#   {"msg": "process_starts", "eta": 1.2}
#   {"msg": "process_completed", "output": {"data": "result"}, "success": true}

The direct endpoint (POST /api/predict) still works for non-queued access.

Mounting Gradio UIs

Mount a Gradio app alongside your custom routes:

import gradio as gr
from gradio import App

app = App()

@app.gpu()
@app.api(name="predict", concurrency_limit=2)
def predict(text: str):
    return model(text)

demo = gr.Interface(predict, "text", "text")
gr.mount_gradio_app(app, demo, path="/demo")

app.launch()

MCP Server

Mark functions as MCP tools with @app.mcp() so AI agents and LLM clients can discover and call them:

from gradio import App

app = App()

@app.gpu()
@app.mcp(name="generate")
async def generate(prompt: str):
    result = model.generate(prompt)
    return result

@app.gpu()
@app.mcp(name="summarize")
def summarize(text: str) -> str:
    return model.summarize(text)

app.launch()

Each @app.mcp() function becomes an MCP tool. The function name, parameters, and type hints are used to generate the tool schema automatically.

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
fastgradio		fastgradio
.gitignore		.gitignore
README.md		README.md
pyproject.toml		pyproject.toml
test.py		test.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

`gradio.App` — Low-Level FastAPI Entrypoint

Quickstart

Custom HTML Routes

GPU Batching

Multi-GPU

Request Queue

Mounting Gradio UIs

MCP Server

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

gradio.App — Low-Level FastAPI Entrypoint

Quickstart

Custom HTML Routes

GPU Batching

Multi-GPU

Request Queue

Mounting Gradio UIs

MCP Server

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`gradio.App` — Low-Level FastAPI Entrypoint

Packages