---
title: "Agno"
sidebarTitle: "Agno"
description: "Build a search assistant using the Agno agent framework with LanceDB as the knowledge backend."
---

import {
  PyFrameworksAgnoAgent,
  PyFrameworksAgnoCliChat,
  PyFrameworksAgnoIngestYoutube,
  PyFrameworksAgnoSetup,
} from '/snippets/integrations.mdx';

[Agno](https://docs.agno.com/introduction) is a framework for building agentic AI applications.
It supports LanceDB as a knowledge backend, allowing you to easily ingest and retrieve external content for your agents.

When you pair Agno's `Knowledge` system with LanceDB, you get a clean agentic RAG setup.
We'll walk through the steps below to build a YouTube transcript-aware Agno assistant that can:
- Ingest a transcript from a YouTube video via the YouTube API
- Store embeddings and metadata in LanceDB
- Retrieve context during responses with hybrid search
- Answer questions about the video content in a CLI chat loop

## Prerequisites

Install dependencies:

<CodeGroup>
```bash pip icon="terminal"
pip install -U agno openai lancedb youtube-transcript-api beautifulsoup4
```

```bash uv icon="terminal"
uv add agno openai lancedb youtube-transcript-api beautifulsoup4
```
</CodeGroup>

## Step 1: Configure LanceDB-backed knowledge

First, initialize the core `Knowledge` object that your agent will use for retrieval.
It configures LanceDB as the vector store, enables hybrid search with native LanceDB FTS, and sets the embedding model.

<CodeBlock filename="Python" language="Python" icon="python">
  {PyFrameworksAgnoSetup}
</CodeBlock>
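As a rough inline sketch of what such a configuration can look like (based on Agno's documented LanceDB integration; the storage path, table name, and embedder model below are placeholder assumptions, exact import paths vary by Agno version, and the snippet above remains authoritative):

```python
from agno.embedder.openai import OpenAIEmbedder
from agno.knowledge.knowledge import Knowledge
from agno.vectordb.lancedb import LanceDb, SearchType

# LanceDB as the vector store; hybrid search combines vector similarity
# with LanceDB's native full-text search (FTS).
vector_db = LanceDb(
    uri="tmp/lancedb",                 # local LanceDB directory (assumed path)
    table_name="youtube_transcripts",  # assumed table name
    search_type=SearchType.hybrid,
    embedder=OpenAIEmbedder(id="text-embedding-3-small"),
)

knowledge = Knowledge(vector_db=vector_db)
```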

## Step 2: Fetch and ingest the YouTube transcript

Next, extract the YouTube video ID, fetch the full transcript, and flatten it into text for indexing.
The snippet below then inserts that transcript text into the Agno knowledge base, which writes vectors and metadata to LanceDB.

<CodeBlock filename="Python" language="Python" icon="python">
  {PyFrameworksAgnoIngestYoutube}
</CodeBlock>

<Info>
This path explicitly fetches the transcript first, then inserts the transcript text into LanceDB through Agno.
</Info>

## Step 3: Create an Agno agent with knowledge search

The next step is to construct an Agno `Agent` and attach the knowledge base you just populated.
With `search_knowledge=True`, the agent performs retrieval before answering, so responses stay grounded in transcript context.

In Agno, retrieval is exposed as a tool call that the model can invoke at runtime.
When `search_knowledge=True`, Agno makes a knowledge-search tool (shown in output as `search_knowledge_base(...)`) available to the model; the model decides when to call it, Agno executes the tool, and the returned context is fed back into the final answer.

<CodeBlock filename="Python" language="Python" icon="python">
  {PyFrameworksAgnoAgent}
</CodeBlock>
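To make that control flow concrete, here is a framework-free toy of the same pattern (all names here are illustrative stand-ins, not Agno APIs): the model requests a tool call, the runtime executes the search, and the retrieved context is folded into the final answer.

```python
def search_knowledge_base(query: str) -> list[str]:
    """Stand-in for the knowledge-search tool Agno would back with LanceDB."""
    corpus = [
        "LanceDB handles images, audio, video, and other multimodal data.",
        "LanceDB OSS can run from a local directory.",
    ]
    terms = query.lower().split()
    return [doc for doc in corpus if any(t in doc.lower() for t in terms)]


def answer(question: str) -> str:
    # 1. The "model" decides a knowledge search is needed (here: always).
    context = search_knowledge_base(question)
    # 2. The runtime runs the tool and feeds results back into the answer.
    if not context:
        return "I could not find that in the transcript."
    return "Based on the transcript: " + context[0]


print(answer("What kinds of data can LanceDB handle?"))
```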

## Step 4: Start a CLI chat loop

You can now ask an initial question and then start an interactive loop for follow-up queries.
Each prompt runs through the same retrieval pipeline, so you can iteratively inspect what the transcript contains.

<CodeBlock filename="Python" language="Python" icon="python">
  {PyFrameworksAgnoCliChat}
</CodeBlock>
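The loop itself is ordinary Python. As a sketch of the structure (the `ask` callable stands in for the agent call, and a real CLI would read prompts from `input()` rather than a list):

```python
def chat_loop(ask, prompts):
    """Run each prompt through the same ask() pipeline until the user exits."""
    replies = []
    for prompt in prompts:
        if prompt.strip().lower() in {"exit", "quit"}:
            break
        replies.append(ask(prompt))
    return replies


# Simulated session; in the real script, prompts come from input().
replies = chat_loop(lambda q: f"(answer to: {q})", ["What is LanceDB?", "exit"])
print(replies)  # ['(answer to: What is LanceDB?)']
```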

<Info>
Want local-first inference? Replace the OpenAI model and embedder classes with Agno's Ollama providers. See Agno's Ollama knowledge examples: [docs.agno.com/examples/models/ollama/chat/knowledge](https://docs.agno.com/examples/models/ollama/chat/knowledge).
</Info>

### Question 1

The following question is asked in the CLI chat loop:
```
┏━ Message ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                                                                   ┃
┃ Q: What kinds of data can LanceDB handle?                                         ┃
┃                                                                                   ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
┏━ Tool Calls ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                                                                   ┃
┃ • search_knowledge_base(query=What kinds of data can LanceDB handle?)             ┃
┃ • search_knowledge_base(query=LanceDB images audio video handle kinds of data     ┃
┃   can handle 'LanceDB can handle' 'kinds of data' 'images audio video' transcript)┃
┃                                                                                   ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
┏━ Response (19.1s) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                                                                   ┃
┃ • Images, audio, video — i.e., multimodal AI data and “all manners of things      ┃
┃   you don't put into traditional databases” (per the transcript).                 ┃
┃                                                                                   ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
```

As expected, the response is grounded in the transcript's contents.

### Question 2

Let's ask a more specific question about the CEO of LanceDB, who is also mentioned in the transcript:

```
You: What is the name of the CEO of LanceDB?
INFO Found 10 documents
┏━ Message ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                                                                   ┃
┃ What is the name of the CEO of LanceDB?                                           ┃
┃                                                                                   ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
┏━ Tool Calls ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                                                                   ┃
┃ • search_knowledge_base(query=CEO of LanceDB)                                     ┃
┃                                                                                   ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
┏━ Response (16.7s) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                                                                   ┃
┃ • According to the retrieved YouTube transcript/title, the CEO of LanceDB is      ┃
┃   Chang She.                                                                      ┃
┃                                                                                   ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
```

As expected, the answer is grounded in the transcript's contents and the video title.

## Why this works well

To start, LanceDB OSS runs from a local directory, so with the OSS stack the transcript data stays on your machine.

- You do not need to maintain a separate transcript parser in your application code.
- You do not need to hand-roll chunking and retrieval orchestration across multiple modules.
- A single explicit Agno `Knowledge` object, backed by LanceDB, defines both ingestion and search behavior in one place.
- Fewer moving parts keeps the tutorial readable, and the same pattern is easier to carry into production code.

As your application's needs grow, you can migrate to LanceDB [Enterprise](/enterprise) for
features such as automatic compaction and reindexing, and the ability to scale to
very large datasets.