
Commit 7a7ffe7

Updated to Karpathy's video notes (without Info folder)
1 parent 584a5ce commit 7a7ffe7

File tree

6 files changed: +137 -2 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -6,3 +6,4 @@ scratch.md
 Templates
 Daily Notes
 Projects
+Info

docs/writing/index.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
# NTT data

## When was the last time you used this skill? - OpenAI

- Developed an internal Q&A chatbot using OpenAI's chat models (GPT-3.5 and GPT-4) and embedding models, implementing a Retrieval-Augmented Generation (RAG) system for enhanced performance and accuracy.

- Designed and conducted evaluations of open-source models using OpenAI's GPT-4 as a benchmark, providing valuable insights into the performance and capabilities of various models.

- Fine-tuned OpenAI's GPT-3.5 model using public datasets available on Hugging Face, successfully reducing hallucinations and improving the model's overall reliability and coherence.

## When was the last time you used this skill? - Language Model

Developed POC generative AI solutions, utilizing advanced prompt engineering techniques (Chain of Thought, ReAct) and fine-tuned models (Google's Text Bison and OpenAI GPT-3.5) for improved performance and reduced hallucinations.

Skilled in creating multimodal models, integrating LLMs with structured databases, and leveraging frameworks like Langchain, DSPy, Instructor, and Pydantic for building generative AI applications.

Architected a system for generating personalized social media content using customer-specific data, and worked with in-memory and cloud vector databases for embedding management and similarity search.

Actively contributed to open-source projects (Needle in a Haystack analysis, Langchain, DSPy) and utilized DevOps platforms (Langsmith, phoenix-arize) for developing, testing, and deploying LLM applications.

## When was the last time you used this skill? - TensorFlow

Used Keras to fine-tune and deploy smaller open-source models like Gemma 2B.

## Speech to Text

Used Google's Speech-to-Text service to create transcriptions of videos.

## LLM

Developed POC generative AI solutions, utilizing advanced prompt engineering techniques (Chain of Thought, ReAct) and fine-tuned models (Google's Text Bison and OpenAI GPT-3.5) for improved performance and reduced hallucinations.

Skilled in creating multimodal models, integrating LLMs with structured databases, and leveraging frameworks like Langchain, DSPy, Instructor, and Pydantic for building generative AI applications.

Architected a system for generating personalized social media content using customer-specific data, and worked with in-memory and cloud vector databases for embedding management and similarity search.

Actively contributed to open-source projects (Needle in a Haystack analysis, Langchain, DSPy) and utilized DevOps platforms (Langsmith, phoenix-arize) for developing, testing, and deploying LLM applications.

## NLTK

Sentiment analysis on customer support emails.

## AI

Worked on a churn model to predict churn 2-3 months before it happens and to find the leading indicators causing it.

## Vector Database

Experienced in working with in-memory and cloud vector databases, such as Pinecone and Weaviate, for efficient embedding management and similarity search.

- Utilized vector databases to support the development of LLM-based applications, enabling fast and accurate retrieval of relevant information for generating personalized content and insights.

- Proficient in setting up schemas and leveraging advanced filtering techniques using metadata in Pinecone and Weaviate cloud databases, ensuring optimized performance and refined search results.

- Implemented on-premise vector database solutions using Postgres with the pgvector extension for customers who prefer to keep their data in-house, adapting to their specific requirements and constraints.

- Integrated vector databases with LLM frameworks, like Langchain and DSPy, to create end-to-end solutions that combine the power of language models with fast and accurate information retrieval.

- Utilized Langchain Indexing for continuous embedding of documents into vector databases, enabling efficient and cost-effective embedding for Retrieval-Augmented Generation (RAG) systems, enhancing the quality and relevance of generated content.
## Langchain

docs/writing/posts/Karpathy's - let's build GPT from scratch.md

Lines changed: 80 additions & 2 deletions
@@ -26,7 +26,7 @@ authors:
 Dataset: people-names dataset from a government website

-## Iteration 1:
+### Iteration 1:
 Character-level language model

 Method: Bigram (predict next char using previous char)
@@ -68,7 +68,7 @@ print(f'{nll/n=}')
 To avoid zero probabilities (and thus infinite loss) for some predictions, people apply model "smoothing" (assigning a very small probability to unlikely scenarios)

-## Iteration 2: Bigram Language Model using Neural Network
+### Iteration 2: Bigram Language Model using Neural Network

 Need to create a dataset for training, i.e. input and output char pairs (x and y).
@@ -126,5 +126,83 @@ We ended up with the same model , in the NN based approach the `W` represents t
## [Building makemore Part 2: MLP - YouTube](https://www.youtube.com/watch?v=TCH_1BHY58I)

In this class we build makemore to predict the next character based on the last 3 characters; a dataset-construction sketch follows below.
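
A minimal sketch of building the (x, y) training pairs with a 3-character context window, following the lecture (the `stoi` character-to-index mapping and the `words` list are assumed from the earlier bigram notes):

```python
import torch

block_size = 3  # context length: how many characters we use to predict the next one

def build_dataset(words):
    X, Y = [], []
    for w in words:
        context = [0] * block_size        # start each word with padding ('.')
        for ch in w + '.':
            ix = stoi[ch]
            X.append(context)             # the last 3 character indices
            Y.append(ix)                  # the next character to predict
            context = context[1:] + [ix]  # slide the window forward
    return torch.tensor(X), torch.tensor(Y)
```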

#### Embedding

As a first step, we need to build an embedding for the characters; we start with a 2-dimensional embedding.

![[Pasted image 20250205123847.png]]

```python
h = torch.tanh(emb.view(-1, 6) @ W1 + b1)  # hidden layer activation (3 chars * 2 dims = 6 inputs)
```

We index into the embedding matrix to get the embedding for a character. Another way to interpret this is one-hot encoding: indexing and multiplying a one-hot vector by the embedding matrix produce the same result, so we can think of the embedding table as the weight matrix of the first layer. A small equivalence sketch follows below.
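
A sketch of that equivalence (the 27 x 2 embedding table `C` is assumed from the lecture setup):

```python
import torch
import torch.nn.functional as F

C = torch.randn((27, 2))  # embedding table: 27 characters, 2 dimensions
ix = torch.tensor(5)      # index of some character

e1 = C[ix]                                      # direct indexing
e2 = F.one_hot(ix, num_classes=27).float() @ C  # one-hot vector times the table
print(torch.allclose(e1, e2))                   # True
```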

```python
logits = h @ W2 + b2                          # output layer
counts = logits.exp()                         # softmax numerator
prob = counts / counts.sum(1, keepdims=True)  # normalize rows to probabilities
prob.shape
# torch.Size([32, 27])
```

In the final layer we get a probability distribution over all 27 characters.

```python
# Negative log-likelihood: average over the probability
# assigned to the correct next character for all 32 examples
loss = -prob[torch.arange(32), Y].log().mean()
loss
```
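
In practice the same loss is computed with PyTorch's built-in `F.cross_entropy`, which works directly on the logits and is more efficient and numerically stable; a minimal sketch, assuming `logits` and `Y` from above:

```python
import torch.nn.functional as F

loss = F.cross_entropy(logits, Y)  # fused, numerically stable equivalent
```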

In practice, we run the forward and backward passes on mini-batches; this is more efficient than optimizing on the entire dataset.

It is much more efficient to take many steps (iterations) with a lower-confidence gradient than a few steps with an exact one; a mini-batch training sketch follows below.
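
A sketch of the mini-batch training loop in the style of the lecture (`X`, `Y` from `build_dataset` above, and `parameters = [C, W1, b1, W2, b2]` with `requires_grad=True` are assumed):

```python
for step in range(10000):
    ix = torch.randint(0, X.shape[0], (32,))   # sample a mini-batch of 32 examples
    emb = C[X[ix]]                             # (32, 3, 2)
    h = torch.tanh(emb.view(-1, 6) @ W1 + b1)  # (32, hidden)
    logits = h @ W2 + b2                       # (32, 27)
    loss = F.cross_entropy(logits, Y[ix])

    for p in parameters:                       # zero gradients
        p.grad = None
    loss.backward()
    for p in parameters:                       # SGD update
        p.data += -0.1 * p.grad
```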

#### Learning rate

The learning rate is an important hyperparameter. We need to find a reasonable range manually, and then we can use different techniques to search for an optimal value within that range; the lecture's exponential sweep is sketched below.
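
A sketch of that sweep (run one training step per candidate rate, record the loss, and pick the region where the loss curve bottoms out):

```python
lre = torch.linspace(-3, 0, 1000)  # exponents from -3 to 0
lrs = 10 ** lre                    # candidate learning rates: 0.001 .. 1.0
# during training, use lrs[step] at step `step` and record (lre[step], loss);
# plotting loss vs. exponent reveals a good rate (~0.1 in the lecture)
```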

#### Dataset split

It is important to split the dataset into three sets (a split sketch follows this list):

- train split: used to fit the model parameters
- dev split: used to tune the hyperparameters
- test split: used to evaluate the model's performance at the very end
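
A sketch of the 80/10/10 split used in the lecture, reusing the `build_dataset` helper from above:

```python
import random

random.seed(42)
random.shuffle(words)
n1 = int(0.8 * len(words))
n2 = int(0.9 * len(words))

Xtr,  Ytr  = build_dataset(words[:n1])    # 80% train
Xdev, Ydev = build_dataset(words[n1:n2])  # 10% dev
Xte,  Yte  = build_dataset(words[n2:])    # 10% test
```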

We improve the model by increasing its complexity, i.e. by adding parameters; for example, the number of hidden-layer neurons can be increased.

In our case the bottleneck may be the embeddings: we are cramming all the characters into just a two-dimensional space. We can increase the embedding dimension from 2 to 10, re-initializing the parameters as sketched below.
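
A sketch of the re-initialized parameters with 10-dimensional embeddings (the hidden size of 200 follows the lecture; the exact sizes are a tuning choice):

```python
g = torch.Generator().manual_seed(2147483647)
C  = torch.randn((27, 10), generator=g)   # 10-dim embeddings now
W1 = torch.randn((30, 200), generator=g)  # 3 chars * 10 dims = 30 inputs
b1 = torch.randn(200, generator=g)
W2 = torch.randn((200, 27), generator=g)
b2 = torch.randn(27, generator=g)
parameters = [C, W1, b1, W2, b2]
for p in parameters:
    p.requires_grad = True
```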

Now we get better name-sounding words than before (when the context was just one character):

```
dex.
marial.
mekiophity.
nevonimitta.
nolla.
kyman.
arreyzyne.
javer.
gota.
mic.
jenna.
osie.
tedo.
kaley.
mess.
suhaiaviyny.
fobs.
mhiriel.
vorreys.
dasdro.
```
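
For reference, a sketch of the sampling loop that produces names like these (the trained parameters and the `itos` index-to-character map are assumed):

```python
g = torch.Generator().manual_seed(2147483647 + 10)

for _ in range(20):
    out = []
    context = [0] * block_size                 # start from all padding
    while True:
        emb = C[torch.tensor([context])]       # (1, block_size, 10)
        h = torch.tanh(emb.view(1, -1) @ W1 + b1)
        logits = h @ W2 + b2
        probs = F.softmax(logits, dim=1)
        ix = torch.multinomial(probs, num_samples=1, generator=g).item()
        context = context[1:] + [ix]
        out.append(ix)
        if ix == 0:                            # index 0 is '.', the end token
            break
    print(''.join(itos[i] for i in out))
```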

img/Pasted.md

Whitespace-only changes.
