YashBhadange2006/GiPiTy-Shakespeare

GiPiTy-Shakespeare

A lightweight character-level language model trained on the Tiny Shakespeare dataset, a corpus of roughly 1.1 million characters of Shakespeare's writing. The project implements a generative model in PyTorch to demonstrate the basic mechanics of next-character prediction and text generation.


Project Overview

This repository features a character-level model that learns statistical patterns within Shakespearean English. It treats individual characters as tokens, allowing it to learn vocabulary, grammar, and structural elements (like dialogue formatting) from the ground up.


Technical Specifications

  • Framework: PyTorch
  • Model Architecture: Character-level Generative Model
  • Dataset: Tiny Shakespeare (about 1.1 million characters)
  • Vocabulary Size: 65 unique characters
  • Tokenizer: Custom mapping (integer encoding/decoding)
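
The custom integer mapping listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the notebook's exact code: the short `text` string is a stand-in for the contents of shakespeare.txt, and the helper names `encode` and `decode` are assumptions.

```python
# Stand-in text (assumption); the notebook would read shakespeare.txt instead.
text = "First Citizen:\nBefore we proceed any further, hear me speak."

# The vocabulary is every distinct character, sorted for a stable ordering.
chars = sorted(set(text))
vocab_size = len(chars)

# stoi maps a character to its integer id; itos is the inverse mapping.
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}

def encode(s):
    """Turn a string into a list of integer ids (hypothetical helper name)."""
    return [stoi[c] for c in s]

def decode(ids):
    """Turn a list of integer ids back into a string (hypothetical helper name)."""
    return "".join(itos[i] for i in ids)

ids = encode("hear me")
print(ids)
print(decode(ids))
```

Run on the full dataset, the same construction would yield the 65-entry vocabulary reported above. Encoding a character that never appears in the training text raises a `KeyError`, which is the usual trade-off of a closed character vocabulary.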

Dataset Statistics

  • Total Character Count: 1,115,393
  • Data Splitting: 90% Training, 10% Validation
  • Tokenization: Manual character-to-index mapping (stoi / itos)
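
A 90/10 split of this kind is usually a single cut through the encoded sequence, with the first 90% used for training and the remainder held out. The sketch below assumes a floor-based cut and a contiguous (unshuffled) split. The names `split_data`, `get_example`, and `block_size` are hypothetical, and a list of placeholder ids stands in for the encoded text.

```python
def split_data(ids, train_frac=0.9):
    """Cut the sequence once: the first train_frac goes to training, the rest to validation."""
    n = int(train_frac * len(ids))
    return ids[:n], ids[n:]

def get_example(data, start, block_size):
    """Return an (input, target) pair; the target is the input shifted one position right."""
    x = data[start:start + block_size]
    y = data[start + 1:start + block_size + 1]
    return x, y

# Placeholder ids with the same length as the dataset (assumption: real ids come from the tokenizer).
ids = list(range(1_115_393))
train_data, val_data = split_data(ids)
print(len(train_data), len(val_data))

# Each target element is the character that follows the matching input element.
x, y = get_example(train_data, start=0, block_size=8)
print(x)
print(y)
```

Keeping the split contiguous keeps the validation text disjoint from the training text, so validation loss reflects how well the model handles unseen passages.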

Model Performance

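The following is a raw, unedited sample of text generated by the model after training. The output is not yet readable English, but it shows the model picking up surface structure: short uppercase runs followed by a colon that resemble speaker cues, line breaks, and punctuation: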

NIOULELand ccetathe'd?
OMNEObean tithy,
K:
IZAUMalemagidwhars,
O, thyof d:
Jarshmerdof four sthe ha!
Thanuckis!

Sor;


Toul to raghoulis angragathn mioich gherif Viserow wian, angat frest msu sy se

adn ntingh be mere ED m be vewhe r whandr, ch m fltestiomeed ltheak nase owilg Whe pld nth be Wig blo, foolols,
Th, ve ate.
CESAyousand bas re hayowhaiso hesies ce w t hesue chy sonsgenomitheancodiner's pengheandsau p tcagefathid annesitond w, e,

STLInosagethoureangorilourbes grto T:

ARARI himowshisuret'sod wout hed.
Fo y anit!
Honw-teirefoise Bedy
Wht.
Hest f yofoou EWe berstewinaken'serur f uisteis thakir co,
CHenove
IUSou hos bbr fitlaraschenoum'd th thr
Hous! th ieawistoler t st

IICot nd ache bes!

Bonthesthe t ce alootor I to w follofr aed red, he wadfof t an y, bud tcheer medycou foutyo t hasce t thn,
Thendwenngana hangr, seite anst brovetheinthad wneqund dllint ad by anallour scand gehe icofeve ifen
GBUS:
ty thetind pl ir fe bet, pannds he ty tere ve geanougoou ar. veayondo thak
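
Text like the sample above is typically produced one character at a time: the model gives a distribution over the next character, one character is drawn from it, and the result is appended and fed back in. The sketch below shows only that sampling loop. It substitutes a simple bigram count table for the trained PyTorch model, which is an assumption made to keep the example self-contained; the notebook's actual model and generation utility may differ. The function names and the short `corpus` string are hypothetical.

```python
import random
from collections import Counter, defaultdict

def fit_bigram_counts(text):
    """Count how often each character follows each other character."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, max_new_chars, rng):
    """Sample characters one at a time, conditioning only on the previous character."""
    out = [start]
    for _ in range(max_new_chars):
        followers = counts.get(out[-1])
        if not followers:
            break  # no observed successor for this character
        chars = list(followers.keys())
        weights = list(followers.values())
        # Draw the next character in proportion to how often it was observed.
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)

# Stand-in corpus (assumption); the real model is trained on shakespeare.txt.
corpus = "ROMEO:\nBut soft, what light through yonder window breaks?\n"
counts = fit_bigram_counts(corpus)
print(generate(counts, "R", 40, random.Random(0)))
```

A neural model replaces the count lookup with a forward pass that outputs next-character probabilities, but the draw-append-repeat loop stays the same.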

Project Files

  • GiPiTy.ipynb
    The primary Jupyter Notebook containing the data loading logic, model definition, training loop, and generation utility.

  • shakespeare.txt
    The source dataset used for model training and validation.


About

A mini-GPT implementation for character-level text generation. Trained on Shakespearean datasets to predict and generate text sequences.
