@@ -41,7 +41,7 @@ <h1 class="hero-title">Chatbot Evolution Timeline</h1>
       <span class="era-label">2020<br>BlenderBot</span>
     </div>
     <div class="era era-2020s" data-era="2020s">
-      <span class="era-label">2025<br>DeepSeek-R1</span>
+      <span class="era-label">2024<br>Qwen2.5</span>
     </div>
   </div>
 </section>
@@ -455,20 +455,20 @@ <h2>2020s: GPT & Transformers</h2>
 
       <div class="chatbot-info">
         <div class="info-card">
-          <h3>About DeepSeek-R1</h3>
-          <p><strong>Innovation:</strong> Distilled reasoning from DeepSeek-R1</p>
-          <p><strong>Method:</strong> Decoder-only transformer, chain-of-thought</p>
-          <p><strong>Model:</strong> DeepSeek-R1-Distill-Qwen-1.5B</p>
-          <p><strong>Benchmarks:</strong> AIME 28.9%, MATH-500 83.9%</p>
+          <h3>About Modern LLMs</h3>
+          <p><strong>Innovation:</strong> Instruction-tuned transformers</p>
+          <p><strong>Method:</strong> Decoder-only transformer with chat templates</p>
+          <p><strong>Models:</strong> Qwen2.5 0.5B, SmolLM 360M</p>
+          <p><strong>Context:</strong> Multi-turn conversation support</p>
         </div>
 
         <div class="info-card">
           <h3>How It Works</h3>
           <ul>
-            <li>Distilled from larger reasoning model</li>
-            <li>Chain-of-thought reasoning capabilities</li>
-            <li>Strong math and coding performance</li>
-            <li>Fallback: Gemma 3 1B/270M if needed</li>
+            <li>Pre-trained on vast text corpora</li>
+            <li>Fine-tuned for instruction following</li>
+            <li>System prompts guide behavior</li>
+            <li>Runs entirely in your browser (WASM/WebGPU)</li>
           </ul>
         </div>
       </div>
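
The "How It Works" card now promises fully in-browser inference over WASM/WebGPU. The page's script is outside this diff, but here is a minimal sketch of how such lazy loading typically looks with transformers.js (the library behind the onnx-community checkpoints linked further down); the `loadModel` helper and its progress wiring are illustrative assumptions, not code from this PR:

```ts
import { pipeline, TextGenerationPipeline } from '@huggingface/transformers';

// Illustrative helper (assumed, not from this PR): lazily load a chat model
// in the browser, preferring WebGPU and falling back to WASM.
async function loadModel(modelId: string): Promise<TextGenerationPipeline> {
  const device = 'gpu' in navigator ? 'webgpu' : 'wasm';
  return await pipeline('text-generation', modelId, {
    device,
    dtype: 'q4', // 4-bit weights, matching the quantization card below
    progress_callback: (p: any) => {
      // Would feed the #gpt-progress element shown in the markup below.
      if (p.status === 'progress') console.log(`${p.file}: ${Math.round(p.progress)}%`);
    },
  });
}

// Usage: const generator = await loadModel('onnx-community/Qwen2.5-0.5B-Instruct');
```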
@@ -484,23 +484,30 @@ <h3>How It Works</h3>
       <div class="chat-interface">
         <div class="model-loading-status hidden" id="gpt-loading-status">
           <div class="loading-spinner"></div>
-          <div class="loading-text">Loading DeepSeek-R1...</div>
+          <div class="loading-text">Loading model...</div>
           <div class="loading-progress" id="gpt-progress">Initializing...</div>
         </div>
         <div class="chat-messages" id="gpt-messages"></div>
         <div class="chat-input-area">
-          <input type="text" class="chat-input" id="gpt-input" placeholder="Talk to DeepSeek-R1...">
+          <div class="model-selector-row">
+            <select id="gpt-model-selector" class="model-selector">
+              <option value="0">Qwen 2.5 0.5B (Alibaba)</option>
+              <option value="1">SmolLM 360M (HuggingFace)</option>
+            </select>
+            <button class="btn-clear-chat" id="gpt-clear-btn" title="Clear chat history">Clear</button>
+          </div>
+          <input type="text" class="chat-input" id="gpt-input" placeholder="Talk to the model...">
           <button class="chat-send" id="gpt-send-btn" onclick="sendMessage('gpt')">Send</button>
         </div>
-        <p class="demo-note">Using DeepSeek-R1-Distill 1.5B. Fallback: Gemma 3 1B/270M. Loads on first message.</p>
+        <p class="demo-note">Select a model above. Conversation history is preserved. First message loads the model (~30s).</p>
       </div>
     </div>
 
     <!-- Architecture Tab -->
     <div class="chatbot-tab-content" id="gpt-architecture-tab">
       <div class="architecture-content">
         <div class="architecture-diagram">
-          <h4>Decoder-Only Transformer (DeepSeek-R1)</h4>
+          <h4>Decoder-Only Transformer (Qwen2.5 / SmolLM)</h4>
           <div class="arch-flow">
             <div class="arch-block input-block">
               <div class="block-label">Input Prompt</div>
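
The chat interface added above wires the Send button to `sendMessage('gpt')` and promises preserved conversation history. A hedged sketch of what that handler could look like, reusing the hypothetical `loadModel` from the earlier sketch; `MODEL_IDS` and `history` are likewise illustrative names, not code shipped in this PR:

```ts
// Maps the <select> option values ("0", "1") to the checkpoints in this PR.
const MODEL_IDS = [
  'onnx-community/Qwen2.5-0.5B-Instruct',
  'onnx-community/SmolLM-360M-Instruct',
];

// Multi-turn history, seeded with a system prompt ("System prompts guide behavior").
const history: { role: string; content: string }[] = [
  { role: 'system', content: 'You are a concise, helpful assistant.' },
];

async function sendMessage(prefix: string): Promise<void> {
  const input = document.getElementById(`${prefix}-input`) as HTMLInputElement;
  const selector = document.getElementById(`${prefix}-model-selector`) as HTMLSelectElement;
  history.push({ role: 'user', content: input.value });
  input.value = '';

  // First call downloads the model, matching the "~30s" demo note.
  const generator = await loadModel(MODEL_IDS[Number(selector.value)]);
  const output: any = await generator(history, { max_new_tokens: 256 });

  // transformers.js returns the whole chat; the last message is the new reply.
  history.push(output[0].generated_text.at(-1));
}
```

The Clear button would then just truncate `history` back to the system message (`history.length = 1`).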
@@ -513,12 +520,12 @@ <h4>Decoder-Only Transformer (DeepSeek-R1)</h4>
             </div>
             <div class="arch-arrow">↓</div>
             <div class="arch-block decoder-only-block">
-              <div class="block-label">DeepSeek-R1 Decoder Stack</div>
+              <div class="block-label">Transformer Decoder Stack</div>
               <div class="block-content">
                 <div class="sub-block">Grouped-Query Attention</div>
                 <div class="sub-block">SiLU FFN</div>
                 <div class="sub-block">RMSNorm</div>
-                <div class="block-note">x28 layers</div>
+                <div class="block-note">x24 layers (Qwen) / x32 layers (SmolLM)</div>
               </div>
             </div>
             <div class="arch-arrow">↓</div>
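
The decoder stack above leads with Grouped-Query Attention, and a little KV-cache arithmetic shows why that is the right call for browser inference. The numbers used here (24 layers, 2 KV heads out of 14 query heads, head_dim 64) are Qwen2.5-0.5B's published config, assumed rather than stated anywhere in this diff:

```ts
// KV cache size per token = 2 (K and V) * layers * kv_heads * head_dim * bytes/value.
function kvBytesPerToken(layers: number, kvHeads: number, headDim: number, bytes = 2): number {
  return 2 * layers * kvHeads * headDim * bytes;
}

const gqa = kvBytesPerToken(24, 2, 64);  // GQA, 2 KV heads: 12,288 B ≈ 12 KB/token
const mha = kvBytesPerToken(24, 14, 64); // full MHA, 14 KV heads: 86,016 B ≈ 84 KB/token

// At the 32,768-token context in the spec table: 384 MiB vs 2,688 MiB of fp16 cache.
console.log(gqa * 32768 / 2 ** 20, mha * 32768 / 2 ** 20);
```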
@@ -533,44 +540,40 @@ <h4>Decoder-Only Transformer (DeepSeek-R1)</h4>
           <h4>Key Concepts</h4>
           <div class="concept-grid">
             <div class="concept-card">
-              <h5>Knowledge Distillation</h5>
-              <p>DeepSeek-R1-Distill captures reasoning abilities from a much larger model, enabling strong performance at small size.</p>
+              <h5>Instruction Tuning</h5>
+              <p>Models are fine-tuned on instruction-response pairs, learning to follow user requests and generate helpful outputs.</p>
             </div>
             <div class="concept-card">
-              <h5>Chain-of-Thought</h5>
-              <p>The model learned to break complex problems into steps, improving accuracy on math and coding tasks.</p>
+              <h5>Chat Templates</h5>
+              <p>System prompts and message formatting guide model behavior, enabling multi-turn conversations with context.</p>
             </div>
             <div class="concept-card">
               <h5>Grouped-Query Attention</h5>
-              <p>GQA reduces memory usage by sharing key-value heads across query heads, enabling efficient inference.</p>
+              <p>GQA reduces memory usage by sharing key-value heads across query heads, enabling efficient browser inference.</p>
             </div>
             <div class="concept-card">
-              <h5>RoPE Positions</h5>
-              <p>Rotary Position Embeddings encode position through rotation, enabling long context handling.</p>
+              <h5>Quantization (q4)</h5>
+              <p>4-bit quantization shrinks model size by ~4x while preserving quality, essential for browser deployment.</p>
             </div>
           </div>
         </div>
 
         <div class="model-specs">
-          <h4>DeepSeek-R1-Distill-Qwen-1.5B Specifications</h4>
+          <h4>Available Models</h4>
           <table class="specs-table">
-            <tr><td>Parameters</td><td>1.5 Billion</td></tr>
-            <tr><td>Architecture</td><td>Decoder-Only Transformer</td></tr>
-            <tr><td>Layers</td><td>28</td></tr>
-            <tr><td>Hidden Size</td><td>1536</td></tr>
-            <tr><td>Attention Heads</td><td>12</td></tr>
-            <tr><td>Context Length</td><td>131,072 tokens</td></tr>
-            <tr><td>AIME 2024</td><td>28.9%</td></tr>
-            <tr><td>MATH-500</td><td>83.9%</td></tr>
-            <tr><td>LiveCodeBench</td><td>16.9%</td></tr>
-            <tr><td>Year</td><td>2025 (DeepSeek)</td></tr>
+            <tr><th>Spec</th><th>Qwen 2.5 0.5B</th><th>SmolLM 360M</th></tr>
+            <tr><td>Parameters</td><td>500 Million</td><td>360 Million</td></tr>
+            <tr><td>Architecture</td><td>Decoder-Only</td><td>Decoder-Only</td></tr>
+            <tr><td>Layers</td><td>24</td><td>32</td></tr>
+            <tr><td>Hidden Size</td><td>896</td><td>960</td></tr>
+            <tr><td>Context Length</td><td>32,768 tokens</td><td>2,048 tokens</td></tr>
+            <tr><td>Organization</td><td>Alibaba</td><td>HuggingFace</td></tr>
+            <tr><td>Year</td><td>2024</td><td>2024</td></tr>
           </table>
           <div class="model-note">
-            <strong>Fallback Chain:</strong>
-            <a href="https://huggingface.co/onnx-community/DeepSeek-R1-Distill-Qwen-1.5B-ONNX" target="_blank">DeepSeek-R1 1.5B</a> →
-            <a href="https://huggingface.co/onnx-community/gemma-3-1b-it-ONNX" target="_blank">Gemma 3 1B</a> →
-            <a href="https://huggingface.co/onnx-community/gemma-3-270m-it-ONNX" target="_blank">Gemma 3 270M</a>.
-            Uses largest successful model.
+            <strong>Model Links:</strong>
+            <a href="https://huggingface.co/onnx-community/Qwen2.5-0.5B-Instruct" target="_blank">Qwen2.5-0.5B-Instruct</a> |
+            <a href="https://huggingface.co/onnx-community/SmolLM-360M-Instruct" target="_blank">SmolLM-360M-Instruct</a>
           </div>
         </div>
       </div>
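
Of the new concept cards, "Chat Templates" is the most abstract; concretely, the tokenizer turns the message array into the model's prompt format. A sketch using transformers.js's `apply_chat_template` against the Qwen checkpoint linked above (the ChatML-style markers in the comment reflect Qwen2.5's published template and should be read as indicative):

```ts
import { AutoTokenizer } from '@huggingface/transformers';

const tokenizer = await AutoTokenizer.from_pretrained('onnx-community/Qwen2.5-0.5B-Instruct');

const messages = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Summarize grouped-query attention.' },
];

// Render the conversation into the model's expected prompt string.
const prompt = tokenizer.apply_chat_template(messages, {
  tokenize: false,
  add_generation_prompt: true, // append the assistant header so the model answers next
});

// Indicative output for Qwen2.5 (ChatML-style):
// <|im_start|>system\nYou are a helpful assistant.<|im_end|>\n
// <|im_start|>user\nSummarize grouped-query attention.<|im_end|>\n
// <|im_start|>assistant\n
console.log(prompt);
```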
@@ -639,8 +642,8 @@ <h3>Evolution Stats</h3>
           <span class="stat-value">90M params</span>
         </div>
         <div class="stat-item">
-          <span class="stat-label">DeepSeek-R1 (2025)</span>
-          <span class="stat-value">1.5B params</span>
+          <span class="stat-label">Qwen 2.5 (2024)</span>
+          <span class="stat-value">500M params</span>
         </div>
         <div class="stat-item">
           <span class="stat-label">Claude 4 Opus (2025)</span>