imgbot (bot) commented on Nov 19, 2025

Beep boop. Your images are optimized!

Your image file size has been reduced by 18% 🎉

Details
File Before After Percent reduction
/assets/figures/semantic-router/signal-5.png 392.85kb 98.53kb 74.92%
/assets/figures/semantic-router/signal-2.png 438.48kb 111.89kb 74.48%
/assets/figures/semantic-router/signal-7.png 190.01kb 51.13kb 73.09%
/assets/figures/semantic-router/signal-1.png 136.21kb 38.90kb 71.44%
/assets/figures/semantic-router/signal-6.png 172.88kb 50.59kb 70.74%
/assets/figures/semantic-router/signal-3.png 129.68kb 40.34kb 68.89%
/assets/figures/annimation2.gif 183.06kb 63.14kb 65.51%
/assets/figures/minimax-m1/lightning_attention.png 137.93kb 52.24kb 62.12%
/assets/figures/vllm-2024-wrapped-2025-roadmap/gpu-hours-by-vendor.png 155.92kb 59.38kb 61.92%
/assets/figures/2025-torch-compile/figure3.png 158.74kb 61.10kb 61.51%
/assets/figures/2025-11-10-bitwise-exact-rl/floating-point-representation.png 58.85kb 23.65kb 59.82%
/assets/figures/2025-vllm-anatomy/server_setup.png 41.53kb 16.85kb 59.42%
/assets/figures/2025-torch-compile/figure4.png 120.94kb 50.23kb 58.46%
/assets/figures/2025-torch-compile/figure2.png 104.33kb 43.59kb 58.22%
/assets/figures/vllm-2024-wrapped-2025-roadmap/quantization-deployment-percentage.png 238.96kb 103.63kb 56.63%
/assets/figures/2025-torch-compile/figure7.png 35.34kb 15.36kb 56.53%
/assets/figures/2025-torch-compile/figure6.png 177.58kb 77.74kb 56.22%
/assets/figures/semantic-router/signal-4.png 191.29kb 87.34kb 54.34%
/assets/logos/vllm-logo-only-light.png 52.94kb 24.32kb 54.05%
/assets/figures/2025-torch-compile/figure8.png 16.32kb 7.77kb 52.40%
/assets/figures/semantic-router/signal-8.png 263.82kb 127.87kb 51.53%
/assets/figures/blackwell-inferencemax/gpt-oss-120b-1k-1k.png 366.50kb 180.69kb 50.70%
/assets/figures/blackwell-inferencemax/llama-70b-1k-8k.png 366.67kb 181.65kb 50.46%
/assets/figures/spec-decode/figure2.png 198.98kb 101.82kb 48.83%
/assets/figures/2025-shm-ipc-cache/processes2.png 160.40kb 82.54kb 48.55%
/assets/figures/struct-decode-intro/shogoth-gpt.png 53.86kb 27.74kb 48.51%
/assets/figures/2025-shm-ipc-cache/processes1.png 144.34kb 74.98kb 48.05%
/assets/figures/llama4/perf.png 514.08kb 268.36kb 47.80%
/assets/logos/vllm-logo-text-light.png 87.87kb 45.92kb 47.74%
/assets/figures/vllm-tpu/llama3-8b-throughput-progress.png 47.28kb 24.75kb 47.65%
/assets/figures/perf-v060/illustration-async-output-processing.png 189.81kb 99.47kb 47.60%
/assets/figures/vllm-tpu/llama3-70b-throughput-progress.png 43.75kb 23.12kb 47.15%
/assets/figures/spec-decode/figure3.png 147.58kb 78.34kb 46.92%
/assets/figures/annimation3.gif 449.13kb 240.12kb 46.54%
/assets/figures/vllm-2024-wrapped-2025-roadmap/model-architecture-serving-usage.png 209.66kb 112.70kb 46.25%
/assets/logos/vllm-logo-text-dark.png 86.27kb 46.79kb 45.76%
/assets/figures/2025-shm-ipc-cache/shared_memory_object_store.png 247.46kb 134.76kb 45.54%
/assets/figures/perf-v060/illustration-multi-step.png 216.43kb 118.06kb 45.45%
/assets/figures/deepseek-v3-2/dsa-explained.png 1,178.67kb 663.26kb 43.73%
/assets/figures/distributed-inference/kv_cache_effects.png 85.85kb 48.72kb 43.25%
/assets/figures/perf-v060/illustration-api-server.png 194.16kb 112.27kb 42.18%
/assets/figures/v1/v1_tp_architecture.png 172.54kb 101.53kb 41.16%
/assets/figures/v1/v1_scheduling.png 133.57kb 79.02kb 40.84%
/assets/figures/spec-decode/figure8.png 168.19kb 100.25kb 40.39%
/assets/figures/2025-vllm-anatomy/fsm.png 137.98kb 84.14kb 39.02%
/assets/figures/beyond-text/models-diff.png 381.34kb 239.89kb 37.09%
/assets/figures/annimation1.gif 342.45kb 216.73kb 36.71%
/assets/figures/v1/vLLM_V1_Logo.png 37.17kb 23.76kb 36.09%
/assets/figures/lmsys_traffic.png 196.11kb 125.48kb 36.01%
/assets/figures/2025-11-10-bitwise-exact-rl/tensorboard-plot.png 634.28kb 410.35kb 35.30%
/assets/figures/spec-decode/figure10.png 77.45kb 50.38kb 34.96%
/assets/figures/2025-11-10-bitwise-exact-rl/rl-script-demo.png 667.21kb 437.30kb 34.46%
/assets/figures/2025-11-10-bitwise-exact-rl/reward-comparison.png 577.02kb 378.58kb 34.39%
/assets/figures/struct-decode-intro/vllm-xgrammar-decode-time-per-output-token.png 66.28kb 43.66kb 34.12%
/assets/figures/vllm-tpu/vllm-tpu.png 189.11kb 125.13kb 33.83%
/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Mean TPOT (ms).png 31.83kb 21.10kb 33.71%
/assets/figures/v1/persistent_batch.png 78.32kb 52.14kb 33.43%
/assets/figures/2025-vllm-anatomy/prefix_pt2.png 182.58kb 122.45kb 32.93%
/assets/figures/2025-vllm-anatomy/engine_constructor.png 185.46kb 124.41kb 32.92%
/assets/figures/2025-vllm-anatomy/chunked_pt1.png 205.58kb 138.92kb 32.43%
/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Requests Per Second.png 35.06kb 23.71kb 32.36%
/assets/figures/2025-vllm-anatomy/pd.png 466.79kb 317.95kb 31.89%
/assets/figures/2025-vllm-anatomy/prefix_pt3.png 335.28kb 229.30kb 31.61%
/assets/figures/2025-vllm-anatomy/multiprocexecutor.png 77.19kb 52.85kb 31.52%
/assets/figures/minimax-m1/moe.png 153.55kb 105.18kb 31.50%
/assets/figures/agent-lightning/3_agl.png 490.02kb 337.65kb 31.09%
/assets/figures/2025-vllm-anatomy/latency_diagram.png 36.55kb 25.20kb 31.05%
/assets/figures/ptpc/PTPC-Diagram.png 58.00kb 40.03kb 30.98%
/assets/figures/stack/stack-table.png 76.69kb 52.99kb 30.90%
/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Requests per Second.png 30.30kb 20.98kb 30.76%
/assets/figures/v1/torch_compile_cuda_graph.png 116.47kb 80.89kb 30.55%
/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Requests per Second.png 46.36kb 32.27kb 30.40%
/assets/figures/2025-vllm-nvidia-nemotron/figure1.png 275.57kb 191.96kb 30.34%
/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Mean TPOT (ms).png 38.33kb 26.80kb 30.08%
/assets/figures/stack/stack-ttft.png 80.26kb 56.55kb 29.54%
/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Mean TPOT (ms).png 25.92kb 18.35kb 29.19%
/assets/figures/spec-decode/figure9.png 270.40kb 192.07kb 28.97%
/assets/figures/2025-vllm-anatomy/fsm2.png 125.90kb 90.11kb 28.43%
/assets/figures/stack/stack-itl.png 73.40kb 52.56kb 28.39%
/assets/figures/2025-vllm-anatomy/kv_cache_blocks.png 132.97kb 95.24kb 28.37%
/assets/figures/qwen3-next/qwen.png 126.26kb 90.69kb 28.17%
/assets/figures/2025-vllm-anatomy/specdec_pt1.png 177.17kb 127.33kb 28.13%
/assets/figures/agent-lightning/1_rewards.png 114.37kb 82.21kb 28.11%
/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Requests per Second.png 53.67kb 38.97kb 27.38%
/assets/figures/beyond-text/io-plugins-flow.png 205.52kb 149.43kb 27.29%
/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Mean TTFT (ms).png 28.33kb 20.62kb 27.23%
/assets/figures/2025-vllm-anatomy/dpenginecoreproc.png 189.71kb 138.48kb 27.00%
/assets/figures/spec-decode/figure6.png 206.27kb 150.69kb 26.95%
/assets/figures/2025-vllm-anatomy/engine_loop.png 83.08kb 60.91kb 26.69%
/assets/figures/notes-vllm-vs-deepspeed/s2.png 33.97kb 24.92kb 26.64%
/assets/figures/2025-11-10-bitwise-exact-rl/bf16-rounding-example.png 214.02kb 157.31kb 26.50%
/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Mean TTFT (ms).png 35.73kb 26.37kb 26.20%
/assets/figures/spec-decode/figure5.png 226.69kb 167.42kb 26.15%
/assets/figures/notes-vllm-vs-deepspeed/s1.png 38.58kb 28.54kb 26.01%
/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Mean TPOT (ms).png 46.08kb 34.10kb 26.00%
/assets/figures/vllm-serving-amd/case01-chunked-prefill/Mean TPOT (ms).png 22.04kb 16.34kb 25.88%
/assets/figures/v1/v1_server_architecture.png 299.93kb 222.39kb 25.85%
/assets/figures/2025-vllm-anatomy/roofline.png 47.72kb 35.52kb 25.56%
/assets/figures/2025-vllm-anatomy/fwd_pass.png 898.39kb 669.03kb 25.53%
/assets/figures/stack/stack-overview-2.png 435.36kb 326.14kb 25.09%
/assets/figures/struct-decode-intro/vllm-new-xgrammar.png 54.88kb 41.33kb 24.70%
/assets/figures/2025-vllm-anatomy/prefix_pt1.png 340.68kb 257.12kb 24.53%
/assets/figures/spec-decode/figure7.png 316.34kb 238.83kb 24.50%
/assets/figures/vllm-serving-amd/405b1.png 10.03kb 7.60kb 24.23%
/assets/figures/spec-decode/figure4.png 1,027.78kb 780.44kb 24.07%
/assets/figures/2025-11-10-bitwise-exact-rl/rounding-sequence.png 188.65kb 143.32kb 24.03%
/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Mean TTFT (ms).png 23.40kb 17.80kb 23.94%
/assets/figures/vllm-serving-amd/case01-chunked-prefill/Requests Per Second.png 25.50kb 19.44kb 23.76%
/assets/figures/vllm-serving-amd/70b1.png 10.44kb 7.97kb 23.63%
/assets/figures/spec-decode/figure1.png 961.60kb 734.73kb 23.59%
/assets/figures/2025-vllm-anatomy/specdec_pt2.png 203.24kb 155.69kb 23.40%
/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Mean TPOT (ms).png 31.49kb 24.29kb 22.86%
/assets/figures/ptpc/PTPC121.png 26.69kb 20.80kb 22.09%
/assets/figures/struct-decode-intro/mermaid-intro.svg 15.28kb 11.98kb 21.61%
/assets/figures/deepseek-v3-2/mla-indexer-block.png 111.70kb 87.80kb 21.39%
/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Mean TTFT (ms).png 42.56kb 33.46kb 21.37%
/assets/figures/transformers-backend/transformers-backend.png 53.48kb 42.23kb 21.03%
/assets/figures/vllm-serving-amd/case08-max-num-seq/Request per Second.png 42.05kb 33.23kb 20.96%
/assets/figures/vllm-serving-amd/case06-kvcache-type/Mean TPOT (ms).png 33.58kb 26.57kb 20.88%
/assets/figures/2025-torch-compile/figure5_b.png 104.64kb 82.95kb 20.73%
/assets/figures/vllm-serving-amd/case01-chunked-prefill/Mean TTFT (ms).png 20.02kb 15.90kb 20.60%
/assets/figures/openrlhf-vllm/ray.png 106.83kb 84.96kb 20.48%
/assets/figures/vllm-serving-amd/405b2.png 13.67kb 10.87kb 20.47%
/assets/figures/vllm-serving-amd/case06-kvcache-type/Requests per Second.png 36.72kb 29.31kb 20.18%
/assets/figures/vllm-serving-amd/70b2.png 13.96kb 11.29kb 19.14%
/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Requests per Second.png 32.98kb 26.71kb 19.02%
/assets/figures/semantic-router/request.png 104.31kb 84.53kb 18.97%
/assets/figures/aibrix/aibrix-diagram.png 209.78kb 170.00kb 18.96%
/assets/figures/qwen3-next/hybrid.png 24.10kb 19.55kb 18.86%
/assets/figures/vllm-serving-amd/case08-max-num-seq/Mean TPOT (ms).png 35.60kb 29.01kb 18.49%
/assets/figures/v1/v1_qwen2vl.png 177.75kb 145.00kb 18.43%
/assets/figures/vllm-tpu/whats-new.png 244.90kb 200.91kb 17.97%
/assets/figures/vllm-tpu/vllm-serve-model.png 75.67kb 62.20kb 17.80%
/assets/figures/perf-v060/overall_throughput.png 106.84kb 87.95kb 17.68%
/assets/figures/annimation0.gif 114.58kb 94.77kb 17.30%
/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Mean TTFT (ms).png 28.37kb 23.73kb 16.37%
/assets/figures/semantic-router/signal.png 1,710.38kb 1,431.48kb 16.31%
/assets/figures/stack/stack-thumbnail.png 178.63kb 149.58kb 16.26%
/assets/figures/stack/stack-panel.png 204.38kb 171.16kb 16.25%
/assets/figures/agent-lightning/2_having.png 347.56kb 291.18kb 16.22%
/assets/figures/v1/v1_llama.png 275.17kb 230.64kb 16.18%
/assets/figures/llama31/perf_llama3.png 36.10kb 30.31kb 16.05%
/assets/figures/2025-vllm-sleep-mode/sleepmode.png 1,379.70kb 1,168.53kb 15.31%
/assets/figures/perf-v060/llama70B_comparison.png 93.75kb 79.57kb 15.13%
/assets/figures/perf-v060/throughput.png 228.48kb 194.82kb 14.73%
/assets/figures/vllm-serving-amd/case06-kvcache-type/Mean TTFT (ms).png 29.76kb 25.43kb 14.54%
/assets/figures/perf-v060/A100_70B.png 612.05kb 523.07kb 14.54%
/assets/figures/v1/v1_prefix_caching.png 124.52kb 106.58kb 14.41%
/assets/figures/perf-v060/llama8B_comparison.png 96.75kb 82.93kb 14.29%
/assets/figures/2025-torch-compile/figure1.png 31.63kb 27.22kb 13.95%
/assets/figures/vllm-serving-amd/case08-max-num-seq/Mean TTFT (ms).png 33.22kb 28.60kb 13.92%
/assets/figures/perf-v060/H100_70B.png 586.32kb 506.69kb 13.58%
/assets/figures/perf-v060/A100_8B.png 579.32kb 503.23kb 13.14%
/assets/figures/perf-v060/H100_8B.png 563.44kb 492.73kb 12.55%
/assets/figures/2025-vllm-on-intel-arc/perf-figure1.png 75.68kb 66.80kb 11.73%
/assets/figures/ptpc/PTPC-tumbnail.png 47.64kb 42.08kb 11.67%
/assets/figures/vllm-serving-amd/introduction/Throughput (Requests per Second).png 16.65kb 14.92kb 10.42%
/assets/figures/2025-vllm-on-intel-arc/perf-figure2.png 36.49kb 32.98kb 9.61%
/assets/figures/2025-vllm-on-intel-arc/perf-figure3.png 85.38kb 77.59kb 9.12%
/assets/figures/vllm-serving-amd/introduction/Mean TTFT (ms).png 22.26kb 20.50kb 7.91%
/assets/figures/distributed-inference/tp_strategies.png 84.22kb 77.62kb 7.84%
/assets/figures/vllm-2024-wrapped-2025-roadmap/vllm-contributor-groups.png 213.44kb 197.57kb 7.44%
/assets/figures/beyond-text/prithvi-prediction.png 4,898.67kb 4,550.72kb 7.10%
/assets/figures/2025-vllm-on-intel-arc/persistent-kernel2.png 50.61kb 47.51kb 6.12%
/assets/figures/semantic-router/full-params.png 1,609.11kb 1,513.74kb 5.93%
/assets/figures/2025-vllm-on-intel-arc/thread-load2.png 10.19kb 9.64kb 5.44%
/assets/figures/semantic-router/modular.png 1,793.29kb 1,697.53kb 5.34%
/assets/figures/semantic-router/lora.png 1,816.80kb 1,719.94kb 5.33%
/assets/figures/vllm-meetup/vllm_meetup_HSkim.png 3,673.41kb 3,485.31kb 5.12%
/assets/figures/2025-vllm-on-intel-arc/persistent-kernel1.png 48.81kb 46.38kb 4.98%
/assets/figures/lfai/vllm-lfai-light.png 416.27kb 396.23kb 4.81%
/assets/figures/2025-vllm-on-intel-arc/thread-load1.png 10.05kb 9.57kb 4.75%
/assets/figures/perf_a100_n1_dark.png 266.90kb 254.44kb 4.67%
/assets/figures/perf_a100_n3_dark.png 259.18kb 247.68kb 4.44%
/assets/figures/perf_a10g_n3_dark.png 254.55kb 243.36kb 4.40%
/assets/figures/perf_a10g_n1_dark.png 244.00kb 233.32kb 4.38%
/assets/figures/perf_a100_n1_light.png 284.77kb 272.66kb 4.25%
/assets/figures/perf_a10g_n1_light.png 260.24kb 249.59kb 4.09%
/assets/figures/perf_a10g_n3_light.png 271.77kb 261.02kb 3.96%
/assets/figures/vllm-meetup/vllm_meetup_Daniele.png 2,380.53kb 2,287.36kb 3.91%
/assets/figures/perf_a100_n3_light.png 275.74kb 265.05kb 3.88%
/assets/figures/vllm-meetup/vllm_meetup_nicolo.jpg 1,297.26kb 1,256.60kb 3.13%
/assets/figures/vllm-meetup/vllm_meetup_HJKim.jpg 1,113.18kb 1,088.45kb 2.22%
/assets/figures/vllm-meetup/image-3.png 2,606.86kb 2,578.41kb 1.09%
/assets/figures/vllm-meetup/image-6.png 3,119.93kb 3,097.48kb 0.72%
/assets/figures/minimax-m1/benchmark.png 208.27kb 207.39kb 0.42%
/assets/figures/ptpc/PTPCSpeedup.svg 10.01kb 9.97kb 0.39%
/assets/figures/2025-vllm-nvidia-nemotron/figure2.png 220.81kb 220.28kb 0.24%
/assets/figures/vllm-meetup/image-2.png 3,496.04kb 3,491.61kb 0.13%
/assets/figures/ptpc/PTPCReqs.svg 18.27kb 18.26kb 0.07%
/assets/figures/agent-lightning/4_tasks-spans-loop.svg 128.17kb 128.16kb 0.01%
Total : 62,985.50kb 51,682.67kb 17.95%
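
The "Percent reduction" column appears to be the saved size expressed as a fraction of the original size. A minimal sketch of that arithmetic follows, assuming the reported kb figures are used directly; the function name `percent_reduction` is hypothetical and is not part of Imgbot.

```python
def percent_reduction(before_kb: float, after_kb: float) -> float:
    """Return the size reduction as a percentage of the original size.

    Hypothetical helper for illustration; not Imgbot's own code.
    """
    if before_kb <= 0:
        raise ValueError("before_kb must be positive")
    return (before_kb - after_kb) / before_kb * 100.0


# Figures taken from the Total row of the table above.
total = percent_reduction(62985.50, 51682.67)
print(f"{total:.2f}%")  # prints "17.95%"

# Figures taken from the signal-5.png row.
first_row = percent_reduction(392.85, 98.53)
print(f"{first_row:.2f}%")  # prints "74.92%"
```

Rounded to a whole number, the total matches the 18% quoted in the summary line of the comment.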

~Imgbot - Part of Optimole family

Xunzhuo and others added 2 commits November 19, 2025 10:15
*Total -- 62,985.50kb -> 51,682.67kb (17.95%)

/assets/figures/semantic-router/signal-5.png -- 392.85kb -> 98.53kb (74.92%)
/assets/figures/semantic-router/signal-2.png -- 438.48kb -> 111.89kb (74.48%)
/assets/figures/semantic-router/signal-7.png -- 190.01kb -> 51.13kb (73.09%)
/assets/figures/semantic-router/signal-1.png -- 136.21kb -> 38.90kb (71.44%)
/assets/figures/semantic-router/signal-6.png -- 172.88kb -> 50.59kb (70.74%)
/assets/figures/semantic-router/signal-3.png -- 129.68kb -> 40.34kb (68.89%)
/assets/figures/annimation2.gif -- 183.06kb -> 63.14kb (65.51%)
/assets/figures/minimax-m1/lightning_attention.png -- 137.93kb -> 52.24kb (62.12%)
/assets/figures/vllm-2024-wrapped-2025-roadmap/gpu-hours-by-vendor.png -- 155.92kb -> 59.38kb (61.92%)
/assets/figures/2025-torch-compile/figure3.png -- 158.74kb -> 61.10kb (61.51%)
/assets/figures/2025-11-10-bitwise-exact-rl/floating-point-representation.png -- 58.85kb -> 23.65kb (59.82%)
/assets/figures/2025-vllm-anatomy/server_setup.png -- 41.53kb -> 16.85kb (59.42%)
/assets/figures/2025-torch-compile/figure4.png -- 120.94kb -> 50.23kb (58.46%)
/assets/figures/2025-torch-compile/figure2.png -- 104.33kb -> 43.59kb (58.22%)
/assets/figures/vllm-2024-wrapped-2025-roadmap/quantization-deployment-percentage.png -- 238.96kb -> 103.63kb (56.63%)
/assets/figures/2025-torch-compile/figure7.png -- 35.34kb -> 15.36kb (56.53%)
/assets/figures/2025-torch-compile/figure6.png -- 177.58kb -> 77.74kb (56.22%)
/assets/figures/semantic-router/signal-4.png -- 191.29kb -> 87.34kb (54.34%)
/assets/logos/vllm-logo-only-light.png -- 52.94kb -> 24.32kb (54.05%)
/assets/figures/2025-torch-compile/figure8.png -- 16.32kb -> 7.77kb (52.4%)
/assets/figures/semantic-router/signal-8.png -- 263.82kb -> 127.87kb (51.53%)
/assets/figures/blackwell-inferencemax/gpt-oss-120b-1k-1k.png -- 366.50kb -> 180.69kb (50.7%)
/assets/figures/blackwell-inferencemax/llama-70b-1k-8k.png -- 366.67kb -> 181.65kb (50.46%)
/assets/figures/spec-decode/figure2.png -- 198.98kb -> 101.82kb (48.83%)
/assets/figures/2025-shm-ipc-cache/processes2.png -- 160.40kb -> 82.54kb (48.55%)
/assets/figures/struct-decode-intro/shogoth-gpt.png -- 53.86kb -> 27.74kb (48.51%)
/assets/figures/2025-shm-ipc-cache/processes1.png -- 144.34kb -> 74.98kb (48.05%)
/assets/figures/llama4/perf.png -- 514.08kb -> 268.36kb (47.8%)
/assets/logos/vllm-logo-text-light.png -- 87.87kb -> 45.92kb (47.74%)
/assets/figures/vllm-tpu/llama3-8b-throughput-progress.png -- 47.28kb -> 24.75kb (47.65%)
/assets/figures/perf-v060/illustration-async-output-processing.png -- 189.81kb -> 99.47kb (47.6%)
/assets/figures/vllm-tpu/llama3-70b-throughput-progress.png -- 43.75kb -> 23.12kb (47.15%)
/assets/figures/spec-decode/figure3.png -- 147.58kb -> 78.34kb (46.92%)
/assets/figures/annimation3.gif -- 449.13kb -> 240.12kb (46.54%)
/assets/figures/vllm-2024-wrapped-2025-roadmap/model-architecture-serving-usage.png -- 209.66kb -> 112.70kb (46.25%)
/assets/logos/vllm-logo-text-dark.png -- 86.27kb -> 46.79kb (45.76%)
/assets/figures/2025-shm-ipc-cache/shared_memory_object_store.png -- 247.46kb -> 134.76kb (45.54%)
/assets/figures/perf-v060/illustration-multi-step.png -- 216.43kb -> 118.06kb (45.45%)
/assets/figures/deepseek-v3-2/dsa-explained.png -- 1,178.67kb -> 663.26kb (43.73%)
/assets/figures/distributed-inference/kv_cache_effects.png -- 85.85kb -> 48.72kb (43.25%)
/assets/figures/perf-v060/illustration-api-server.png -- 194.16kb -> 112.27kb (42.18%)
/assets/figures/v1/v1_tp_architecture.png -- 172.54kb -> 101.53kb (41.16%)
/assets/figures/v1/v1_scheduling.png -- 133.57kb -> 79.02kb (40.84%)
/assets/figures/spec-decode/figure8.png -- 168.19kb -> 100.25kb (40.39%)
/assets/figures/2025-vllm-anatomy/fsm.png -- 137.98kb -> 84.14kb (39.02%)
/assets/figures/beyond-text/models-diff.png -- 381.34kb -> 239.89kb (37.09%)
/assets/figures/annimation1.gif -- 342.45kb -> 216.73kb (36.71%)
/assets/figures/v1/vLLM_V1_Logo.png -- 37.17kb -> 23.76kb (36.09%)
/assets/figures/lmsys_traffic.png -- 196.11kb -> 125.48kb (36.01%)
/assets/figures/2025-11-10-bitwise-exact-rl/tensorboard-plot.png -- 634.28kb -> 410.35kb (35.3%)
/assets/figures/spec-decode/figure10.png -- 77.45kb -> 50.38kb (34.96%)
/assets/figures/2025-11-10-bitwise-exact-rl/rl-script-demo.png -- 667.21kb -> 437.30kb (34.46%)
/assets/figures/2025-11-10-bitwise-exact-rl/reward-comparison.png -- 577.02kb -> 378.58kb (34.39%)
/assets/figures/struct-decode-intro/vllm-xgrammar-decode-time-per-output-token.png -- 66.28kb -> 43.66kb (34.12%)
/assets/figures/vllm-tpu/vllm-tpu.png -- 189.11kb -> 125.13kb (33.83%)
/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Mean TPOT (ms).png -- 31.83kb -> 21.10kb (33.71%)
/assets/figures/v1/persistent_batch.png -- 78.32kb -> 52.14kb (33.43%)
/assets/figures/2025-vllm-anatomy/prefix_pt2.png -- 182.58kb -> 122.45kb (32.93%)
/assets/figures/2025-vllm-anatomy/engine_constructor.png -- 185.46kb -> 124.41kb (32.92%)
/assets/figures/2025-vllm-anatomy/chunked_pt1.png -- 205.58kb -> 138.92kb (32.43%)
/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Requests Per Second.png -- 35.06kb -> 23.71kb (32.36%)
/assets/figures/2025-vllm-anatomy/pd.png -- 466.79kb -> 317.95kb (31.89%)
/assets/figures/2025-vllm-anatomy/prefix_pt3.png -- 335.28kb -> 229.30kb (31.61%)
/assets/figures/2025-vllm-anatomy/multiprocexecutor.png -- 77.19kb -> 52.85kb (31.52%)
/assets/figures/minimax-m1/moe.png -- 153.55kb -> 105.18kb (31.5%)
/assets/figures/agent-lightning/3_agl.png -- 490.02kb -> 337.65kb (31.09%)
/assets/figures/2025-vllm-anatomy/latency_diagram.png -- 36.55kb -> 25.20kb (31.05%)
/assets/figures/ptpc/PTPC-Diagram.png -- 58.00kb -> 40.03kb (30.98%)
/assets/figures/stack/stack-table.png -- 76.69kb -> 52.99kb (30.9%)
/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Requests per Second.png -- 30.30kb -> 20.98kb (30.76%)
/assets/figures/v1/torch_compile_cuda_graph.png -- 116.47kb -> 80.89kb (30.55%)
/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Requests per Second.png -- 46.36kb -> 32.27kb (30.4%)
/assets/figures/2025-vllm-nvidia-nemotron/figure1.png -- 275.57kb -> 191.96kb (30.34%)
/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Mean TPOT (ms).png -- 38.33kb -> 26.80kb (30.08%)
/assets/figures/stack/stack-ttft.png -- 80.26kb -> 56.55kb (29.54%)
/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Mean TPOT (ms).png -- 25.92kb -> 18.35kb (29.19%)
/assets/figures/spec-decode/figure9.png -- 270.40kb -> 192.07kb (28.97%)
/assets/figures/2025-vllm-anatomy/fsm2.png -- 125.90kb -> 90.11kb (28.43%)
/assets/figures/stack/stack-itl.png -- 73.40kb -> 52.56kb (28.39%)
/assets/figures/2025-vllm-anatomy/kv_cache_blocks.png -- 132.97kb -> 95.24kb (28.37%)
/assets/figures/qwen3-next/qwen.png -- 126.26kb -> 90.69kb (28.17%)
/assets/figures/2025-vllm-anatomy/specdec_pt1.png -- 177.17kb -> 127.33kb (28.13%)
/assets/figures/agent-lightning/1_rewards.png -- 114.37kb -> 82.21kb (28.11%)
/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Requests per Second.png -- 53.67kb -> 38.97kb (27.38%)
/assets/figures/beyond-text/io-plugins-flow.png -- 205.52kb -> 149.43kb (27.29%)
/assets/figures/vllm-serving-amd/case05-amd-recommended-environmental-variables/Mean TTFT (ms).png -- 28.33kb -> 20.62kb (27.23%)
/assets/figures/2025-vllm-anatomy/dpenginecoreproc.png -- 189.71kb -> 138.48kb (27%)
/assets/figures/spec-decode/figure6.png -- 206.27kb -> 150.69kb (26.95%)
/assets/figures/2025-vllm-anatomy/engine_loop.png -- 83.08kb -> 60.91kb (26.69%)
/assets/figures/notes-vllm-vs-deepspeed/s2.png -- 33.97kb -> 24.92kb (26.64%)
/assets/figures/2025-11-10-bitwise-exact-rl/bf16-rounding-example.png -- 214.02kb -> 157.31kb (26.5%)
/assets/figures/vllm-serving-amd/case03-chunked-prefill-and-prefix-caching/Mean TTFT (ms).png -- 35.73kb -> 26.37kb (26.2%)
/assets/figures/spec-decode/figure5.png -- 226.69kb -> 167.42kb (26.15%)
/assets/figures/notes-vllm-vs-deepspeed/s1.png -- 38.58kb -> 28.54kb (26.01%)
/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Mean TPOT (ms).png -- 46.08kb -> 34.10kb (26%)
/assets/figures/vllm-serving-amd/case01-chunked-prefill/Mean TPOT (ms).png -- 22.04kb -> 16.34kb (25.88%)
/assets/figures/v1/v1_server_architecture.png -- 299.93kb -> 222.39kb (25.85%)
/assets/figures/2025-vllm-anatomy/roofline.png -- 47.72kb -> 35.52kb (25.56%)
/assets/figures/2025-vllm-anatomy/fwd_pass.png -- 898.39kb -> 669.03kb (25.53%)
/assets/figures/stack/stack-overview-2.png -- 435.36kb -> 326.14kb (25.09%)
/assets/figures/struct-decode-intro/vllm-new-xgrammar.png -- 54.88kb -> 41.33kb (24.7%)
/assets/figures/2025-vllm-anatomy/prefix_pt1.png -- 340.68kb -> 257.12kb (24.53%)
/assets/figures/spec-decode/figure7.png -- 316.34kb -> 238.83kb (24.5%)
/assets/figures/vllm-serving-amd/405b1.png -- 10.03kb -> 7.60kb (24.23%)
/assets/figures/spec-decode/figure4.png -- 1,027.78kb -> 780.44kb (24.07%)
/assets/figures/2025-11-10-bitwise-exact-rl/rounding-sequence.png -- 188.65kb -> 143.32kb (24.03%)
/assets/figures/vllm-serving-amd/case02-num-scheduler-steps/Mean TTFT (ms).png -- 23.40kb -> 17.80kb (23.94%)
/assets/figures/vllm-serving-amd/case01-chunked-prefill/Requests Per Second.png -- 25.50kb -> 19.44kb (23.76%)
/assets/figures/vllm-serving-amd/70b1.png -- 10.44kb -> 7.97kb (23.63%)
/assets/figures/spec-decode/figure1.png -- 961.60kb -> 734.73kb (23.59%)
/assets/figures/2025-vllm-anatomy/specdec_pt2.png -- 203.24kb -> 155.69kb (23.4%)
/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Mean TPOT (ms).png -- 31.49kb -> 24.29kb (22.86%)
/assets/figures/ptpc/PTPC121.png -- 26.69kb -> 20.80kb (22.09%)
/assets/figures/struct-decode-intro/mermaid-intro.svg -- 15.28kb -> 11.98kb (21.61%)
/assets/figures/deepseek-v3-2/mla-indexer-block.png -- 111.70kb -> 87.80kb (21.39%)
/assets/figures/vllm-serving-amd/case04-max-seq-len-to-capture/Mean TTFT (ms).png -- 42.56kb -> 33.46kb (21.37%)
/assets/figures/transformers-backend/transformers-backend.png -- 53.48kb -> 42.23kb (21.03%)
/assets/figures/vllm-serving-amd/case08-max-num-seq/Request per Second.png -- 42.05kb -> 33.23kb (20.96%)
/assets/figures/vllm-serving-amd/case06-kvcache-type/Mean TPOT (ms).png -- 33.58kb -> 26.57kb (20.88%)
/assets/figures/2025-torch-compile/figure5_b.png -- 104.64kb -> 82.95kb (20.73%)
/assets/figures/vllm-serving-amd/case01-chunked-prefill/Mean TTFT (ms).png -- 20.02kb -> 15.90kb (20.6%)
/assets/figures/openrlhf-vllm/ray.png -- 106.83kb -> 84.96kb (20.48%)
/assets/figures/vllm-serving-amd/405b2.png -- 13.67kb -> 10.87kb (20.47%)
/assets/figures/vllm-serving-amd/case06-kvcache-type/Requests per Second.png -- 36.72kb -> 29.31kb (20.18%)
/assets/figures/vllm-serving-amd/70b2.png -- 13.96kb -> 11.29kb (19.14%)
/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Requests per Second.png -- 32.98kb -> 26.71kb (19.02%)
/assets/figures/semantic-router/request.png -- 104.31kb -> 84.53kb (18.97%)
/assets/figures/aibrix/aibrix-diagram.png -- 209.78kb -> 170.00kb (18.96%)
/assets/figures/qwen3-next/hybrid.png -- 24.10kb -> 19.55kb (18.86%)
/assets/figures/vllm-serving-amd/case08-max-num-seq/Mean TPOT (ms).png -- 35.60kb -> 29.01kb (18.49%)
/assets/figures/v1/v1_qwen2vl.png -- 177.75kb -> 145.00kb (18.43%)
/assets/figures/vllm-tpu/whats-new.png -- 244.90kb -> 200.91kb (17.97%)
/assets/figures/vllm-tpu/vllm-serve-model.png -- 75.67kb -> 62.20kb (17.8%)
/assets/figures/perf-v060/overall_throughput.png -- 106.84kb -> 87.95kb (17.68%)
/assets/figures/annimation0.gif -- 114.58kb -> 94.77kb (17.3%)
/assets/figures/vllm-serving-amd/case07-tensor-parallelism/Mean TTFT (ms).png -- 28.37kb -> 23.73kb (16.37%)
/assets/figures/semantic-router/signal.png -- 1,710.38kb -> 1,431.48kb (16.31%)
/assets/figures/stack/stack-thumbnail.png -- 178.63kb -> 149.58kb (16.26%)
/assets/figures/stack/stack-panel.png -- 204.38kb -> 171.16kb (16.25%)
/assets/figures/agent-lightning/2_having.png -- 347.56kb -> 291.18kb (16.22%)
/assets/figures/v1/v1_llama.png -- 275.17kb -> 230.64kb (16.18%)
/assets/figures/llama31/perf_llama3.png -- 36.10kb -> 30.31kb (16.05%)
/assets/figures/2025-vllm-sleep-mode/sleepmode.png -- 1,379.70kb -> 1,168.53kb (15.31%)
/assets/figures/perf-v060/llama70B_comparison.png -- 93.75kb -> 79.57kb (15.13%)
/assets/figures/perf-v060/throughput.png -- 228.48kb -> 194.82kb (14.73%)
/assets/figures/vllm-serving-amd/case06-kvcache-type/Mean TTFT (ms).png -- 29.76kb -> 25.43kb (14.54%)
/assets/figures/perf-v060/A100_70B.png -- 612.05kb -> 523.07kb (14.54%)
/assets/figures/v1/v1_prefix_caching.png -- 124.52kb -> 106.58kb (14.41%)
/assets/figures/perf-v060/llama8B_comparison.png -- 96.75kb -> 82.93kb (14.29%)
/assets/figures/2025-torch-compile/figure1.png -- 31.63kb -> 27.22kb (13.95%)
/assets/figures/vllm-serving-amd/case08-max-num-seq/Mean TTFT (ms).png -- 33.22kb -> 28.60kb (13.92%)
/assets/figures/perf-v060/H100_70B.png -- 586.32kb -> 506.69kb (13.58%)
/assets/figures/perf-v060/A100_8B.png -- 579.32kb -> 503.23kb (13.14%)
/assets/figures/perf-v060/H100_8B.png -- 563.44kb -> 492.73kb (12.55%)
/assets/figures/2025-vllm-on-intel-arc/perf-figure1.png -- 75.68kb -> 66.80kb (11.73%)
/assets/figures/ptpc/PTPC-tumbnail.png -- 47.64kb -> 42.08kb (11.67%)
/assets/figures/vllm-serving-amd/introduction/Throughput (Requests per Second).png -- 16.65kb -> 14.92kb (10.42%)
/assets/figures/2025-vllm-on-intel-arc/perf-figure2.png -- 36.49kb -> 32.98kb (9.61%)
/assets/figures/2025-vllm-on-intel-arc/perf-figure3.png -- 85.38kb -> 77.59kb (9.12%)
/assets/figures/vllm-serving-amd/introduction/Mean TTFT (ms).png -- 22.26kb -> 20.50kb (7.91%)
/assets/figures/distributed-inference/tp_strategies.png -- 84.22kb -> 77.62kb (7.84%)
/assets/figures/vllm-2024-wrapped-2025-roadmap/vllm-contributor-groups.png -- 213.44kb -> 197.57kb (7.44%)
/assets/figures/beyond-text/prithvi-prediction.png -- 4,898.67kb -> 4,550.72kb (7.1%)
/assets/figures/2025-vllm-on-intel-arc/persistent-kernel2.png -- 50.61kb -> 47.51kb (6.12%)
/assets/figures/semantic-router/full-params.png -- 1,609.11kb -> 1,513.74kb (5.93%)
/assets/figures/2025-vllm-on-intel-arc/thread-load2.png -- 10.19kb -> 9.64kb (5.44%)
/assets/figures/semantic-router/modular.png -- 1,793.29kb -> 1,697.53kb (5.34%)
/assets/figures/semantic-router/lora.png -- 1,816.80kb -> 1,719.94kb (5.33%)
/assets/figures/vllm-meetup/vllm_meetup_HSkim.png -- 3,673.41kb -> 3,485.31kb (5.12%)
/assets/figures/2025-vllm-on-intel-arc/persistent-kernel1.png -- 48.81kb -> 46.38kb (4.98%)
/assets/figures/lfai/vllm-lfai-light.png -- 416.27kb -> 396.23kb (4.81%)
/assets/figures/2025-vllm-on-intel-arc/thread-load1.png -- 10.05kb -> 9.57kb (4.75%)
/assets/figures/perf_a100_n1_dark.png -- 266.90kb -> 254.44kb (4.67%)
/assets/figures/perf_a100_n3_dark.png -- 259.18kb -> 247.68kb (4.44%)
/assets/figures/perf_a10g_n3_dark.png -- 254.55kb -> 243.36kb (4.4%)
/assets/figures/perf_a10g_n1_dark.png -- 244.00kb -> 233.32kb (4.38%)
/assets/figures/perf_a100_n1_light.png -- 284.77kb -> 272.66kb (4.25%)
/assets/figures/perf_a10g_n1_light.png -- 260.24kb -> 249.59kb (4.09%)
/assets/figures/perf_a10g_n3_light.png -- 271.77kb -> 261.02kb (3.96%)
/assets/figures/vllm-meetup/vllm_meetup_Daniele.png -- 2,380.53kb -> 2,287.36kb (3.91%)
/assets/figures/perf_a100_n3_light.png -- 275.74kb -> 265.05kb (3.88%)
/assets/figures/vllm-meetup/vllm_meetup_nicolo.jpg -- 1,297.26kb -> 1,256.60kb (3.13%)
/assets/figures/vllm-meetup/vllm_meetup_HJKim.jpg -- 1,113.18kb -> 1,088.45kb (2.22%)
/assets/figures/vllm-meetup/image-3.png -- 2,606.86kb -> 2,578.41kb (1.09%)
/assets/figures/vllm-meetup/image-6.png -- 3,119.93kb -> 3,097.48kb (0.72%)
/assets/figures/minimax-m1/benchmark.png -- 208.27kb -> 207.39kb (0.42%)
/assets/figures/ptpc/PTPCSpeedup.svg -- 10.01kb -> 9.97kb (0.39%)
/assets/figures/2025-vllm-nvidia-nemotron/figure2.png -- 220.81kb -> 220.28kb (0.24%)
/assets/figures/vllm-meetup/image-2.png -- 3,496.04kb -> 3,491.61kb (0.13%)
/assets/figures/ptpc/PTPCReqs.svg -- 18.27kb -> 18.26kb (0.07%)
/assets/figures/agent-lightning/4_tasks-spans-loop.svg -- 128.17kb -> 128.16kb (0.01%)

Signed-off-by: ImgBotApp <[email protected]>