
Commit 6cb301e

Improve embeddings blog post
1 parent 1036001 commit 6cb301e

1 file changed: 12 additions & 3 deletions

src/content/blog/introduction-to-embeddings/page.mdx

@@ -19,8 +19,8 @@ export const metadata = createMetadata({
 ---
 
 <Image src={metadata.image} alt="Introduction to vectors"/>
+<figcaption>An overview of vector embeddings.</figcaption>
 
-## Table of contents
 
 In the rapidly evolving world of artificial intelligence (AI) and machine learning, there's a concept that's revolutionizing the way machines understand and process data: embeddings. Embeddings, also known as vectors, are floating-point numerical representations of the "features" of a given piece of data. These powerful tools allow machines to achieve a granular "understanding" of the data we provide, enabling them to process and analyze it in ways that were previously impossible. In this comprehensive guide, we'll explore the basics of embeddings, their history, and how they're revolutionizing various fields.
 
@@ -29,6 +29,7 @@ In the rapidly evolving world of artificial intelligence (AI) and machine learni
 At the heart of embeddings lies the process of feature extraction. When we talk about "features" in this context, we're referring to the key characteristics or attributes of the data that we want our machine learning models to learn and understand. For example, in the case of natural language data (like text), features might include the semantic meaning of words, the syntactic structure of sentences, or the overall sentiment of a document.
 
 <Image src={featureExtraction} alt="Feature extraction"/>
+<figcaption>Neural models extract key features from data.</figcaption>
 
 To obtain embeddings, you feed your data to an embedding model, which uses a neural network to extract these relevant features. The neural network learns to map the input data to a high-dimensional vector space, where each dimension represents a specific feature. The resulting vectors, or embeddings, capture the essential information about the input data in a compact, numerical format that machines can easily process and analyze.
 
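To make this step concrete, here is a minimal sketch of generating an embedding in Python. The sentence-transformers library, the all-MiniLM-L6-v2 model, and the example sentence are illustrative assumptions; the post does not prescribe a specific library or model.

```python
# Minimal embedding-generation sketch (assumed library and model,
# not something the post prescribes).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Encode a sentence into a fixed-length vector of floats.
embedding = model.encode("The quick brown fox jumps over the lazy dog")

print(embedding.shape)  # (384,) for this particular model
print(embedding[:5])    # the first few floating-point values
```
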
@@ -45,7 +46,7 @@ When we pass this string of natural language into an embedding model, the model
 [0.283939734973434, -0.119420836293, 0.0894208490832, ..., -0.20392492842, 0.1294809231993, 0.0329842098324]
 ```
 
-Each value in this vector is a floating-point number, typically ranging from -1 to 1. These numbers represent the presence or absence of specific features in the input data. For example, one dimension of the vector might correspond to the concept of "speed," while another might represent "animal." The embedding model learns to assign higher values to dimensions that are more strongly associated with the input data, and lower values to dimensions that are less relevant.
+Each value in this vector is a floating-point number, often between -1 and 1, though the exact range depends on the model. These numbers indicate how strongly the input exhibits certain learned features. In practice, individual dimensions rarely correspond to a single, human-readable concept. Instead, meaning is distributed across many dimensions, which together encode the semantics of the data. The embedding model learns to assign higher values to dimensions that are more strongly associated with the input data and lower values to those that are less relevant.
 
 So, in our example, the embedding vector might have a high value in the "speed" dimension (capturing the concept of "quick"), a moderate value in the "animal" dimension (representing "fox" and "dog"), and relatively low values in dimensions that are less relevant to the input text (like "technology" or "politics").
 
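Because meaning is spread across all dimensions, relatedness between two embeddings is measured over the whole vector at once, most commonly with cosine similarity. A minimal NumPy sketch, using made-up four-dimensional vectors rather than real model output:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: near 1 means closely
    related, near 0 unrelated, near -1 opposed."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embeddings, invented for illustration; real embeddings have
# hundreds or thousands of dimensions.
fox_sentence  = np.array([0.8, 0.6, -0.1, 0.0])  # high "speed", some "animal"
cheetah_text  = np.array([0.9, 0.7, -0.2, 0.1])
politics_text = np.array([-0.1, 0.0, 0.9, 0.8])

print(cosine_similarity(fox_sentence, cheetah_text))   # close to 1: related
print(cosine_similarity(fox_sentence, politics_text))  # near 0: unrelated
```
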
@@ -58,7 +59,14 @@ The true power of embeddings lies in their ability to capture complex relationsh
 
 The potential applications of embeddings are vast and diverse, spanning across multiple domains and industries. Some of the most prominent areas where embeddings are making a significant impact include:
 
-Natural Language Processing (NLP) In the field of NLP, embeddings have become an essential tool for a wide range of tasks, such as:
+* **Natural Language Processing (NLP)**
+* **Image and Video Analysis**
+* **Recommendation Systems**
+* **Anomaly Detection**
+
+### Natural Language Processing (NLP)
+
+Embeddings have become an essential tool for a wide range of tasks, such as:
 
 ### Text classification
 
@@ -140,6 +148,7 @@ The concept of embeddings has its roots in the field of natural language process
 Word2Vec revolutionized NLP by demonstrating that neural networks could be trained to produce dense vector representations of words, capturing their semantic similarities and relationships in a highly efficient and scalable manner. The key insight behind Word2Vec was that the meaning of a word could be inferred from its context – that is, the words that typically appear around it in a sentence or document.
 
 <Image src={neuralNetwork} alt="Neural network"/>
+<figcaption>A simplified neural network architecture.</figcaption>
 
 By training a shallow neural network to predict the context words given a target word (or vice versa), Word2Vec was able to learn highly meaningful word embeddings that captured semantic relationships like synonymy, antonymy, and analogy. For example, the embedding for the word "king" would be more similar to the embedding for "queen" than to the embedding for "car," reflecting their semantic relatedness.
 
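The training setup described here can be reproduced in a few lines. A minimal sketch with the gensim library; gensim and the toy corpus are assumptions for illustration, not the post's own code:

```python
# Minimal Word2Vec training sketch using gensim (an assumed library;
# the corpus is a toy and far too small to learn meaningful vectors).
from gensim.models import Word2Vec

sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "car", "drives", "on", "the", "road"],
]

# sg=1 selects the skip-gram objective: predict context words
# from a target word, exactly the setup described above.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

# With a real corpus, "king" lands closer to "queen" than to "car".
print(model.wv.similarity("king", "queen"))
print(model.wv.similarity("king", "car"))
```
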