diff --git a/docs/source/en/_toctree.yml b/docs/source/en/_toctree.yml
index 14dbfe3ea1d3..856874d51961 100644
--- a/docs/source/en/_toctree.yml
+++ b/docs/source/en/_toctree.yml
@@ -23,11 +23,7 @@
   - local: using-diffusers/reusing_seeds
     title: Reproducibility
   - local: using-diffusers/schedulers
-    title: Load schedulers and models
-  - local: using-diffusers/models
-    title: Models
-  - local: using-diffusers/scheduler_features
-    title: Scheduler features
+    title: Schedulers
   - local: using-diffusers/other-formats
     title: Model files and layouts
   - local: using-diffusers/push_to_hub
diff --git a/docs/source/en/using-diffusers/models.md b/docs/source/en/using-diffusers/models.md
deleted file mode 100644
index 22c78d490ae4..000000000000
--- a/docs/source/en/using-diffusers/models.md
+++ /dev/null
@@ -1,120 +0,0 @@
-
-
-[[open-in-colab]]
-
-# Models
-
-A diffusion model relies on a few individual models working together to generate an output. These models are responsible for denoising, encoding inputs, and decoding latents into the actual outputs.
-
-This guide will show you how to load models.
-
-## Loading a model
-
-All models are loaded with the [`~ModelMixin.from_pretrained`] method, which downloads and caches the latest model version. If the latest files are already in the local cache, [`~ModelMixin.from_pretrained`] reuses them instead of redownloading.
-
-Pass the `subfolder` argument to [`~ModelMixin.from_pretrained`] to specify where to load the model weights from. Omit the `subfolder` argument if the repository doesn't have a subfolder structure or if you're loading a standalone model.
-
-```py
-from diffusers import QwenImageTransformer2DModel
-
-model = QwenImageTransformer2DModel.from_pretrained("Qwen/Qwen-Image", subfolder="transformer")
-```
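-
-Standalone models store their weights at the top level of the repository, so no `subfolder` is needed. As a minimal sketch (the repository below is only an illustration, any standalone model repository works the same way):
-
-```py
-from diffusers import AutoencoderKL
-
-# a standalone VAE repository with weights at the top level,
-# so the subfolder argument is omitted
-vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
-```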
-
-## AutoModel
-
-[`AutoModel`] detects the model class from a `model_index.json` file or a model's `config.json` file. It resolves the correct model class from these files and delegates the actual loading to that class, which is useful when you don't know the exact model class beforehand.
-
-```py
-from diffusers import AutoModel
-
-model = AutoModel.from_pretrained(
-    "Qwen/Qwen-Image", subfolder="transformer"
-)
-```
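-
-Because loading is delegated, the returned object is an instance of the concrete model class rather than [`AutoModel`] itself, which you can verify directly:
-
-```py
-# the resolved class, e.g. QwenImageTransformer2DModel
-print(type(model).__name__)
-```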
-
-## Model data types
-
-Use the `torch_dtype` argument in [`~ModelMixin.from_pretrained`] to load a model with a specific data type. This allows you to load a model in a lower precision to reduce memory usage.
-
-```py
-import torch
-from diffusers import QwenImageTransformer2DModel
-
-model = QwenImageTransformer2DModel.from_pretrained(
-    "Qwen/Qwen-Image",
-    subfolder="transformer",
-    torch_dtype=torch.bfloat16
-)
-```
-
-[nn.Module.to](https://docs.pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.to) can also convert to a specific data type on the fly. However, it converts *all* weights to the requested data type, unlike `torch_dtype`, which respects `_keep_in_fp32_modules`. Using `torch_dtype` preserves any layers listed in `_keep_in_fp32_modules` in `torch.float32` for numerical stability and best generation quality (see this [_keep_in_fp32_modules](https://github.com/huggingface/diffusers/blob/f864a9a352fa4a220d860bfdd1782e3e5af96382/src/diffusers/models/transformers/transformer_wan.py#L374) example).
-
-```py
-import torch
-from diffusers import QwenImageTransformer2DModel
-
-model = QwenImageTransformer2DModel.from_pretrained(
-    "Qwen/Qwen-Image", subfolder="transformer"
-)
-model = model.to(dtype=torch.float16)
-```
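-
-To see the difference between the two approaches, you can inspect the data types present in a loaded model. This is only a sketch; whether any modules remain in `torch.float32` depends on whether the model class defines `_keep_in_fp32_modules`.
-
-```py
-import torch
-from diffusers import QwenImageTransformer2DModel
-
-model = QwenImageTransformer2DModel.from_pretrained(
-    "Qwen/Qwen-Image", subfolder="transformer", torch_dtype=torch.bfloat16
-)
-# modules listed in _keep_in_fp32_modules (if any) stay in torch.float32,
-# so more than one dtype may appear here
-print({param.dtype for param in model.parameters()})
-```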
-
-## Device placement
-
-Use the `device_map` argument in [`~ModelMixin.from_pretrained`] to place a model on an accelerator like a GPU. It is especially helpful when there are multiple GPUs.
-
-Diffusers currently provides three `device_map` options for individual models: `"cuda"`, `"balanced"`, and `"auto"`. Refer to the table below to compare the three placement strategies.
-
-| parameter | description |
-|---|---|
-| `"cuda"` | places the model on a supported accelerator (CUDA) |
-| `"balanced"` | evenly distributes the model across all GPUs |
-| `"auto"` | distributes the model from the fastest device first to the slowest |
-
-Use the `max_memory` argument in [`~ModelMixin.from_pretrained`] to allocate a maximum amount of memory to use on each device. By default, Diffusers uses the maximum amount available.
-
-```py
-import torch
-from diffusers import QwenImagePipeline
-
-max_memory = {0: "16GB", 1: "16GB"}
-pipeline = QwenImagePipeline.from_pretrained(
-    "Qwen/Qwen-Image", 
-    torch_dtype=torch.bfloat16,
-    device_map="cuda",
-    max_memory=max_memory
-)
-```
-
-The `hf_device_map` attribute allows you to access and view the resolved `device_map`.
-
-```py
-from diffusers import QwenImageTransformer2DModel
-
-transformer = QwenImageTransformer2DModel.from_pretrained(
-    "Qwen/Qwen-Image", subfolder="transformer", device_map="cuda"
-)
-print(transformer.hf_device_map)
-# {'': device(type='cuda')}
-```
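-
-On a multi-GPU machine, the model can instead be distributed across devices. The snippet below is a sketch that assumes at least two visible GPUs:
-
-```py
-import torch
-from diffusers import QwenImageTransformer2DModel
-
-# "balanced" evenly distributes the model weights across all available GPUs
-transformer = QwenImageTransformer2DModel.from_pretrained(
-    "Qwen/Qwen-Image",
-    subfolder="transformer",
-    torch_dtype=torch.bfloat16,
-    device_map="balanced",
-)
-print(transformer.hf_device_map)
-```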
-
-## Saving models
-
-Save a model with the [`~ModelMixin.save_pretrained`] method.
-
-```py
-from diffusers import QwenImageTransformer2DModel
-
-model = QwenImageTransformer2DModel.from_pretrained("Qwen/Qwen-Image", subfolder="transformer")
-model.save_pretrained("./local/model")
-```
-
-For large models, it is helpful to use `max_shard_size` to save a model as multiple shards. Shards can be loaded faster and save memory (refer to the [parallel loading](./loading#parallel-loading) docs for more details), especially if there is more than one GPU.
-
-```py
-model.save_pretrained("./local/model", max_shard_size="5GB")
-```
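-
-The saved (and possibly sharded) checkpoint can then be reloaded from the local directory with the same loading API:
-
-```py
-from diffusers import QwenImageTransformer2DModel
-
-# load from the local directory created by save_pretrained;
-# sharded weights are reassembled automatically
-model = QwenImageTransformer2DModel.from_pretrained("./local/model")
-```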
diff --git a/docs/source/en/using-diffusers/scheduler_features.md b/docs/source/en/using-diffusers/scheduler_features.md
deleted file mode 100644
index f7977d53d5d6..000000000000
--- a/docs/source/en/using-diffusers/scheduler_features.md
+++ /dev/null
@@ -1,235 +0,0 @@
-
-
-# Scheduler features
-
-The scheduler is an important component of any diffusion model because it controls the entire denoising (or sampling) process. There are many types of schedulers; some are optimized for speed and others for quality. With Diffusers, you can modify the scheduler configuration to use custom timestep schedules and sigmas, and to rescale the noise schedule. Changing these parameters can have profound effects on inference quality and speed.
-
-This guide will demonstrate how to use these features to improve inference quality.
-
-> [!TIP]
-> Diffusers currently only supports the `timesteps` and `sigmas` parameters for a select list of schedulers and pipelines. Feel free to open a [feature request](https://github.com/huggingface/diffusers/issues/new/choose) if you want to extend these parameters to a scheduler and pipeline that does not currently support it!
-
-## Timestep schedules
-
-The timestep or noise schedule determines the amount of noise at each sampling step. The scheduler uses this to generate an image with the corresponding amount of noise at each step. The timestep schedule is generated from the scheduler's default configuration, but you can customize the scheduler to use new and optimized sampling schedules that aren't in Diffusers yet.
-
-For example, [Align Your Steps (AYS)](https://research.nvidia.com/labs/toronto-ai/AlignYourSteps/) is a method for optimizing a sampling schedule to generate a high-quality image in as little as 10 steps. The optimal [10-step schedule](https://github.com/huggingface/diffusers/blob/a7bf77fc284810483f1e60afe34d1d27ad91ce2e/src/diffusers/schedulers/scheduling_utils.py#L51) for Stable Diffusion XL is:
-
-```py
-from diffusers.schedulers import AysSchedules
-
-sampling_schedule = AysSchedules["StableDiffusionXLTimesteps"]
-print(sampling_schedule)
-"[999, 845, 730, 587, 443, 310, 193, 116, 53, 13]"
-```
-
-You can use the AYS sampling schedule in a pipeline by passing it to the `timesteps` parameter.
-
-```py
-import torch
-from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler
-
-pipeline = StableDiffusionXLPipeline.from_pretrained(
-    "SG161222/RealVisXL_V4.0",
-    torch_dtype=torch.float16,
-    variant="fp16",
-).to("cuda")
-pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config, algorithm_type="sde-dpmsolver++")
-
-prompt = "A cinematic shot of a cute little rabbit wearing a jacket and doing a thumbs up"
-generator = torch.Generator(device="cpu").manual_seed(2487854446)
-image = pipeline(
-    prompt=prompt,
-    negative_prompt="",
-    generator=generator,
-    timesteps=sampling_schedule,
-).images[0]
-```
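-
-Some schedulers and pipelines also accept custom `sigmas` instead of `timesteps` (see the tip above). As a sketch, assuming the current scheduler supports the `sigmas` argument (the sigma values below are illustrative only):
-
-```py
-# a custom 10-step sigma schedule; the final 0.0 denotes the fully denoised sample
-sigmas = [14.615, 6.315, 3.771, 2.181, 1.342, 0.862, 0.555, 0.380, 0.234, 0.113, 0.0]
-image = pipeline(
-    prompt=prompt,
-    negative_prompt="",
-    generator=generator,
-    sigmas=sigmas,
-).images[0]
-```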
-
-<!-- figure: AYS timestep schedule 10 steps vs. linearly-spaced timestep schedule 10 steps vs. linearly-spaced timestep schedule 25 steps -->
-
-<!-- figure: trailing spacing after 5 steps vs. leading spacing after 5 steps -->
-
-<!-- figure: Karras sigmas enabled vs. Karras sigmas disabled -->
-
-<!-- figure: default Stable Diffusion v2-1 image vs. image with zero SNR and trailing timestep spacing enabled -->
-
-<!-- figure: LMSDiscreteScheduler vs. EulerDiscreteScheduler vs. EulerAncestralDiscreteScheduler vs. DPMSolverMultistepScheduler -->