Commit 55de3d0
authored
Merge branch 'main' into patryk/offline-eplb
Signed-off-by: PatrykSaffer <[email protected]>
File tree
1,211 files changed, +58330 -22843 lines changed
- .buildkite
- performance-benchmarks
- scripts
- tests
- scripts
- hardware_ci
- scheduled_integration_test
- .github
- workflows
- benchmarks
- auto_tune
- disagg_benchmarks
- kernels
- deepgemm
- multi_turn
- cmake
- external_projects
- csrc
- attention
- cpu
- micro_gemm
- moe
- marlin_moe_wna16
- quantization
- fp4
- gptq_allspark
- gptq_marlin
- hadamard/hadacore
- w8a8/cutlass
- c3x
- docker
- docs
- assets/contributing
- benchmarking
- cli
- bench
- sweep
- community
- configuration
- contributing
- ci
- model
- deployment
- frameworks
- integrations
- design
- features
- quantization
- getting_started
- installation
- mkdocs/hooks
- models
- hardware_supported_models
- serving
- usage
- examples
- offline_inference
- pooling
- profiling_tpu
- qwen3_omni
- online_serving
- pooling
- prometheus_grafana
- pooling
- classify
- embed
- openai_embedding_long_text
- plugin
- pooling
- score
- token_classify
- token_embed
- requirements
- tests
- basic_correctness
- benchmarks
- compile
- distributed
- fullgraph
- config
- distributed
- engine
- entrypoints
- offline_mode
- openai
- tool_parsers
- pooling
- basic
- classify
- embed
- pooling
- reward
- score
- sagemaker
- kernels
- attention
- core
- moe
- modular_kernel_tools
- quantization
- lora
- model_executor
- model_loader
- fastsafetensors_loader
- models
- language
- generation
- pooling_mteb_test
- pooling
- multimodal
- generation
- vlm_utils
- pooling
- processing
- quantization
- multimodal
- assets
- plugins_tests
- plugins
- lora_resolvers
- prithvi_io_processor_plugin/prithvi_io_processor
- quantization
- reasoning
- rocm/aiter
- samplers
- tokenization
- tokenizers_
- tool_use
- transformers_utils
- utils_
- v1
- attention
- core
- cudagraph
- determinism
- distributed
- e2e
- ec_connector/integration
- engine
- entrypoints
- llm
- openai
- serving_responses
- kv_connector
- nixl_integration
- unit
- kv_offload
- sample
- spec_decode
- tpu
- worker
- worker
- weight_loading
- tools
- ep_kernels
- pre_commit
- vllm
- attention
- backends
- layers
- ops
- utils
- benchmarks
- sweep
- compilation
- config
- distributed
- device_communicators
- ec_transfer/ec_connector
- eplb
- kv_transfer/kv_connector/v1
- lmcache_integration
- p2p
- engine
- entrypoints
- cli
- openai
- parser
- tool_parsers
- pooling
- classify
- embed
- pooling
- score
- sagemaker
- serve
- disagg
- elastic_ep
- instrumentator
- lora
- profile
- rlhf
- sleep
- tokenize
- inputs
- logging_utils
- lora
- layers
- ops/triton_ops
- punica_wrapper
- model_executor
- layers
- fused_moe
- configs
- mamba
- ops
- quantization
- compressed_tensors
- schemes
- kernels/mixed_precision
- quark
- schemes
- utils
- configs
- rotary_embedding
- model_loader
- models
- transformers
- multimodal
- platforms
- plugins
- io_processors
- lora_resolvers
- profiler
- ray
- reasoning
- tokenizers
- transformers_utils
- configs
- processors
- tokenizers
- triton_utils
- utils
- v1
- attention/backends
- mla
- core
- sched
- engine
- executor
- kv_offload
- worker
- metrics
- pool
- sample
- logits_processor
- ops
- tpu
- spec_decode
- structured_output
- worker
- gpu
- sample
- spec_decode
Lines changed: 43 additions & 5 deletions