Merged
Changes from 3 commits
28 changes: 26 additions & 2 deletions docs/hub/spaces-zerogpu.md
@@ -20,8 +20,15 @@ Unlike traditional single-GPU allocations, ZeroGPU's efficient system lowers bar

## Technical Specifications

- **GPU Type**: Nvidia H200 slice
- **Available VRAM**: 70GB per workload
ZeroGPU supports two GPU sizes:

| GPU size | Backing hardware | VRAM | Quota cost |
|---------------------|------------------|--------------------------|------------|
| `large` *(default)* | Half NVIDIA H200 | 70GB | 1× |
| `xlarge` | Full NVIDIA H200 | 141GB | 2× |

> [!NOTE]
> See [GPU size selection](#gpu-size-selection) to learn how to request a specific size.
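
As a rough illustration of how the table might guide a choice, the sketch below picks the smallest size whose listed VRAM covers an estimated memory requirement. The helper `smallest_size_for` and the `VRAM_GB` mapping are hypothetical and not part of the `spaces` package; the figures are copied from the table, and actual usable memory may be lower than the nominal values.

```python
# Nominal VRAM per ZeroGPU size, copied from the table above.
# This mapping and the helper below are illustrative only.
VRAM_GB = {"large": 70, "xlarge": 141}


def smallest_size_for(required_vram_gb: float) -> str:
    """Return the smallest size whose nominal VRAM covers the requirement."""
    # Walk the sizes from smallest to largest VRAM.
    for size, vram in sorted(VRAM_GB.items(), key=lambda item: item[1]):
        if required_vram_gb <= vram:
            return size
    raise ValueError(
        f"{required_vram_gb} GB exceeds the largest listed size "
        f"({max(VRAM_GB.values())} GB)"
    )
```

Preferring the smallest size that fits keeps quota cost at 1× unless the workload needs more memory.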

## Compatibility

@@ -87,6 +94,23 @@ gr.Interface(

Note: The `@spaces.GPU` decorator is designed to be effect-free in non-ZeroGPU environments, ensuring compatibility across different setups.

## GPU size selection

The default size used by `@spaces.GPU` is `large` (half H200).

You can explicitly request a full H200 by specifying `size="xlarge"`:

```python
@spaces.GPU(size="xlarge")
def generate(prompt):
    return pipe(prompt).images
```

> [!NOTE]
> - `xlarge` consumes **2×** the daily quota of `large` (e.g. a 45s **effective** task duration consumes 90s of quota)
> - `xlarge` requests are usually more likely to be queued, and tend to wait longer in the queue
> - Only use `xlarge` when your workload truly benefits from the additional compute or memory
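
The quota arithmetic in the note above can be sketched as follows. The function `quota_cost_seconds` and the `QUOTA_MULTIPLIER` mapping are hypothetical helpers for illustration, not part of the `spaces` API; the multipliers come from the "Quota cost" column of the sizes table.

```python
# Quota multipliers per size (1× for large, 2× for xlarge).
# Illustrative only: this is not how ZeroGPU exposes quota accounting.
QUOTA_MULTIPLIER = {"large": 1, "xlarge": 2}


def quota_cost_seconds(effective_seconds: float, size: str = "large") -> float:
    """Estimate the quota consumed by a task of the given effective duration."""
    if size not in QUOTA_MULTIPLIER:
        raise ValueError(f"unknown size: {size!r}")
    return effective_seconds * QUOTA_MULTIPLIER[size]


# A 45s effective task on xlarge consumes 90s of quota.
print(quota_cost_seconds(45, "xlarge"))
```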

## Duration Management

For functions expected to exceed the default 60 seconds of GPU runtime, you can specify a custom duration: