
Commit 21eab1a

Merge pull request #120 from NYU-RTS/cleanup
genai docs: minor cleanup

2 parents 6199194 + 7fad0f1

2 files changed: +14 −9 lines changed

docs/genai/04_how_to_guides/02_embeddings.mdx

Lines changed: 12 additions & 7 deletions
@@ -1,17 +1,19 @@
-# Embeddings
+# Generating embeddings
 
-While Decoder-only LLMs gained massive popularity via their usage in chatbots, Encoder-only LLMs can be used for a wider variety of tasks. Decoder-only LLMs "generate" tokens ("text") one at a time probabilistically. Encoder-only LLMs, on the other hand, take text as their input, tokenize it, and generate "embeddings" as their output. Here, we walk through the task of generating embeddings from a text document.
+While Decoder-only LLMs gained massive popularity via their usage in chatbots, Encoder-only LLMs can be used for a wider variety of tasks. Decoder-only LLMs "generate" tokens ("text") one at a time probabilistically. Encoder-only LLMs, on the other hand, take text as their input, tokenize it, and generate "embeddings" as their output. Here, we walk through the task of generating embeddings from a text snippet.
 
 
 ```mermaid
 flowchart LR;
-    A["natural language text string <br> *GenAI can be used for research*"]
+    A["natural language text: <br> *GenAI can be used for research*"]
     B["encoder-only LLM"]
-    C["vector embedding <br> [0.052587852, 0.094195396, 0.24439038, 0.104940414, ...]"]
+    C["vector embedding <br> [0.052, 0.094, 0.244, ...]"]
     A-- "Input" -->B;
     B-- "Output" -->C;
 ```
 
-## How to generate embeddings from plain text:
+:::tip
+Embeddings have the ability to encode the semantic meaning of natural language text and images!
+:::
 
 The snippet below uses the `text-embedding-3-small` model to create 32-dimensional floating point vector embeddings for the input string:
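The snippet the hunk above refers to is elided by the diff context. For readers browsing only this diff, here is a hedged sketch of the request body such a call typically builds for an OpenAI-style `/v1/embeddings` endpoint. The model name and the 32-dimension truncation come from the docs text above; the input string and everything else is illustrative, not taken from the elided snippet:

```python
import json

# Hypothetical request body for an OpenAI-style /v1/embeddings endpoint.
# Model name and dimension count come from the docs above; the input
# string is just an example.
payload = {
    "model": "text-embedding-3-small",
    "input": "GenAI can be used for research",
    "dimensions": 32,  # ask the server to truncate each embedding to 32 floats
}

body = json.dumps(payload)
print(body)
```

With an API key in hand, you would POST this body to the provider's embeddings endpoint with an `Authorization: Bearer <key>` header; OpenAI-compatible servers return the vector under `data[0].embedding`.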

@@ -41,8 +43,11 @@ and gives the following response:
 
 ## Applications of embeddings
 
-Embeddings have the ability to encode the semantic meaning of the text. Thus, they find applications in:
+Embeddings are typically used for:
 - retrieval-augmented generation
 - search
 - classification
-among others
+
+:::info
+Embeddings are typically stored in a *vector* database, which is designed for efficient storage and fast retrieval of vectors.
+:::
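The applications listed above all reduce to comparing embeddings for similarity, most commonly cosine similarity, which is also what vector databases index for. As an offline illustration (the 4-dimensional vectors below are made-up stand-ins, not real model output), the comparison can be sketched in a few lines of Python:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up 4-dimensional vectors standing in for real embeddings,
# which typically have hundreds or thousands of dimensions.
cat = [0.90, 0.10, 0.00, 0.20]
kitten = [0.85, 0.15, 0.05, 0.25]
car = [0.00, 0.90, 0.80, 0.10]

print(cosine_similarity(cat, kitten))  # high: semantically related concepts
print(cosine_similarity(cat, car))     # low: unrelated concepts
```

Texts with similar meaning map to vectors with high cosine similarity, which is what makes search, classification, and retrieval-augmented generation possible on top of embeddings.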

docs/genai/04_how_to_guides/03_retrieval_augmented_generation.mdx

Lines changed: 2 additions & 2 deletions
@@ -39,9 +39,9 @@ flowchart TB;
 ```
 
 It starts with the "Ingestion" phase, where a document to be used as context is parsed and broken into chunks. These chunks are then converted to embeddings and stored in a vector database (which specializes in storing and retrieving vectors). This setup now allows us to "retrieve" the required context for an incoming prompt before it is sent to an LLM. The "retrieval" phase consists of converting the prompt to an embedding and looking up embeddings for chunks of the document that are similar to it. The text chunks associated with the embeddings similar to the embedding for the query are then added as additional context to the prompt before passing it to an LLM.
-The LLM now has the associated context it needs to generate a relevant response to the prompt.
+The LLM now has the associated context it needs to generate a relevant response to the prompt. Here's a link to a script to test this out yourself, once you have API access to an embedding model and an LLM: https://github.com/NYU-RTS/rts-docs-examples/tree/main/genai/rag .
 
-Here's a script to test this out yourself, once you have API access to an embedding model and an LLM: https://github.com/NYU-RTS/rts-docs-examples/tree/main/genai/rag . You can run it to ask a question about a recent event that occurred after the knowledge cutoff for the dataset used to train the LLM:
+You can run it to ask a question about a recent event that occurred after the knowledge cutoff for the dataset used to train the LLM:
 ```sh
 ss19980@ITS-JQKQGQQMTX ~/D/p/r/g/rag (main)> uv run rag_basic.py \
     https://en.wikipedia.org/wiki/2025_London_Marathon \
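To make the two phases concrete, here is a toy, offline sketch of the ingestion and retrieval steps described above. It substitutes a bag-of-words vector for a real embedding model so it runs without API access; the chunks, prompt, and `embed` helper are all illustrative and are not taken from `rag_basic.py`:

```python
import math
from collections import Counter

# Stand-in for a real embedding model: a sparse bag-of-words vector.
# In a real pipeline this would be a call to an embedding endpoint.
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def similarity(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b)

# "Ingestion": break the document into chunks and index each chunk's vector.
chunks = [
    "The 2025 London Marathon was held in April.",
    "Embeddings are stored in a vector database.",
    "Encoder-only LLMs produce vector embeddings.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# "Retrieval": embed the prompt and pick the most similar chunk as context.
prompt = "When was the 2025 London Marathon held?"
query = embed(prompt)
context = max(index, key=lambda item: similarity(query, item[1]))[0]
print(context)  # the marathon chunk, which would be prepended to the prompt
```

The retrieved chunk is then concatenated with the original prompt before the combined text is sent to the LLM, which is exactly the hand-off the flowchart above shows.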
