
Commit 497d5c5 ("update", parent: 4188140)

File tree: 5 files changed (+20, −4 lines)


ch05/16_qwen3.5/README.md

Lines changed: 1 addition & 0 deletions

@@ -6,6 +6,7 @@ This folder contains a from-scratch style implementation of [Qwen/Qwen3.5-0.8B](
 
 Qwen3.5 is based on the Qwen3-Next architecture, which I described in more detail in section [2. (Linear) Attention Hybrids](https://magazine.sebastianraschka.com/i/177848019/2-linear-attention-hybrids) of my [Beyond Standard LLMs](https://magazine.sebastianraschka.com/p/beyond-standard-llms) article
 
+<a href="https://magazine.sebastianraschka.com/p/beyond-standard-llms"><img src="https://sebastianraschka.com/images/LLMs-from-scratch-images/bonus/qwen3.5/02.webp" width="500px"></a>
 
 Note that Qwen3.5 alternates `linear_attention` and `full_attention` layers.
 The notebooks keep the full model flow readable while reusing the linear-attention building blocks from the [qwen3_5_transformers.py](qwen3_5_transformers.py), which contains the linear attention code from Hugging Face under an Apache version 2.0 open source license.
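The alternation of layer types is the key architectural point of the hybrid design. As a toy sketch only (the function name, the 4:1 ratio, and the ordering below are illustrative assumptions, not Qwen3.5's actual schedule), interleaving the two attention variants might look like:

```python
def build_layer_schedule(n_layers, full_attention_every=4):
    """Return a list of layer-type strings: a "full_attention" layer at
    every `full_attention_every`-th position, "linear_attention" elsewhere.

    Hypothetical illustration of a hybrid-attention stack; the actual
    Qwen3.5 ratio and ordering may differ.
    """
    return [
        "full_attention" if (i + 1) % full_attention_every == 0
        else "linear_attention"
        for i in range(n_layers)
    ]
```

A model constructor could then dispatch on each entry of the schedule when instantiating its transformer blocks.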

ch05/16_qwen3.5/qwen3.5-plus-kv-cache.ipynb

Lines changed: 8 additions & 1 deletion

@@ -60,6 +60,14 @@
 "- Qwen3.5 is based on the Qwen3-Next architecture, which I described in more detail in section [2. (Linear) Attention Hybrids](https://magazine.sebastianraschka.com/i/177848019/2-linear-attention-hybrids) of my [Beyond Standard LLMs](https://magazine.sebastianraschka.com/p/beyond-standard-llms) article"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "21d38944-0c98-40a6-a6f8-c745769b4618",
+"metadata": {},
+"source": [
+"<a href=\"https://magazine.sebastianraschka.com/p/beyond-standard-llms\"><img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/bonus/qwen3.5/02.webp\" width=\"500px\"></a>"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 1,

@@ -136,7 +144,6 @@
 "source": [
 "import torch\n",
 "import torch.nn as nn\n",
-"import torch.nn.functional as F\n",
 "\n",
 "\n",
 "class FeedForward(nn.Module):\n",

ch05/16_qwen3.5/qwen3.5.ipynb

Lines changed: 8 additions & 1 deletion

@@ -60,6 +60,14 @@
 "- Qwen3.5 is based on the Qwen3-Next architecture, which I described in more detail in section [2. (Linear) Attention Hybrids](https://magazine.sebastianraschka.com/i/177848019/2-linear-attention-hybrids) of my [Beyond Standard LLMs](https://magazine.sebastianraschka.com/p/beyond-standard-llms) article"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "402a446f-4efe-41f5-acc0-4f8455846aa5",
+"metadata": {},
+"source": [
+"<a href=\"https://magazine.sebastianraschka.com/p/beyond-standard-llms\"><img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/bonus/qwen3.5/02.webp\" width=\"500px\"></a>"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 1,

@@ -136,7 +144,6 @@
 "source": [
 "import torch\n",
 "import torch.nn as nn\n",
-"import torch.nn.functional as F\n",
 "\n",
 "\n",
 "class FeedForward(nn.Module):\n",

ch05/16_qwen3.5/tests/qwen3_5_layer_debugger.py

Lines changed: 1 addition & 1 deletion

@@ -102,7 +102,7 @@ def _hf_config_from_dict(cfg):
     return hf_cfg
 
 
-def load_notebook_defs(nb_name="standalone-qwen3.5.ipynb"):
+def load_notebook_defs(nb_name="qwen3.5.ipynb"):
     nb_dir = Path(__file__).resolve().parents[1]
     if str(nb_dir) not in sys.path:
         sys.path.insert(0, str(nb_dir))

ch05/16_qwen3.5/tests/test_qwen3_5_nb.py

Lines changed: 2 additions & 1 deletion

@@ -44,7 +44,8 @@ def import_notebook_defs():
     nb_dir = Path(__file__).resolve().parents[1]
     if str(nb_dir) not in sys.path:
         sys.path.insert(0, str(nb_dir))
-    mod = import_definitions_from_notebook(nb_dir, "standalone-qwen3.5.ipynb")
+
+    mod = import_definitions_from_notebook(nb_dir, "qwen3.5.ipynb")
     return mod