
Commit 8fac45c

Add new summaries
1 parent cd3559e commit 8fac45c

4 files changed: +326 -0 lines changed

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
---
layout: post
---
# What is NLWeb?

- URL: [Original article](https://glama.ai/blog/2025-06-01-what-is-nlweb)
- Added At: 2025-06-06 02:01:28
- [Link To Text](_posts/2025-06-06-what-is-nlweb_raw.md)

## TL;DR

NLWeb is a protocol open-sourced by Microsoft that adds conversational AI interfaces to websites by leveraging Schema.org structured data and the Model Context Protocol (MCP). It processes queries through multiple LLM calls to deliver more precise answers and address the shortcomings of traditional search. While a quick prototype is easy to stand up, production deployment at scale, cost optimization, and data freshness remain challenges. NLWeb is worth watching for anyone who already has structured data and wants conversational search and AI-agent access.
## Summary

1. **What NLWeb is**: NLWeb is a Microsoft open-source protocol for adding conversational interfaces to websites. It leverages Schema.org structured data and supports MCP (Model Context Protocol), enabling both human-to-machine and machine-to-machine communication.
    * **Key idea**: Turn any website into a conversational interface that both humans and AI agents can query naturally.

2. **Problem it solves**: NLWeb addresses the lack of a standard way for AI to access the structured data websites already have, and improves on traditional search's weakness with context-aware, multi-turn queries.
    * **Analogy**: Just as RSS did for content syndication, NLWeb provides a standard protocol for AI interactions.
    * **Core advantage**: It leverages LLMs' existing understanding of Schema.org to create conversational interfaces quickly.

3. **How it works**: NLWeb uses a two-component system and a query-processing pipeline.
    * **Two-component system**:
        - **Protocol layer**: A REST API (`/ask` endpoint) and an MCP server (`/mcp` endpoint) that accept natural-language queries and return Schema.org JSON responses.
        - **Implementation layer**: A reference implementation that orchestrates multiple LLM calls for query processing.
    * **Query-processing pipeline**:
      User Query → Parallel Pre-processing → Vector Retrieval → LLM Ranking → Response (with Relevancy Check, Decontextualization, Memory Detection, and a Fast Track Path)
    * **LLM calls**: A single query may trigger 50+ targeted LLM calls for query decontextualization, relevancy scoring, result ranking, and more.
    * **Fast-track optimization**: Retrieval is launched in parallel during pre-processing, but results are blocked until relevancy checks complete.

4. **Many LLM calls**: NLWeb breaks each query into many small, specific LLM calls rather than one large prompt.
    * **Example questions**:
        - "Is this query about recipes?"
        - "Does it reference something mentioned earlier?"
        - "Is the user asking to remember dietary preferences?"
        - "How relevant is this specific result?"
    * **Main benefits**:
        - **No hallucination**: Results come only from the actual database.
        - **Better accuracy**: Each LLM call has one clear job it can do well.
5. **Quick start**: Microsoft provides a quick-start guide for setting up an NLWeb server with the Behind The Tech RSS feed.
    * **Example steps**:
      ```bash
      git clone https://github.com/microsoft/NLWeb
      cd NLWeb
      python -m venv myenv
      source myenv/bin/activate
      cd code
      pip install -r requirements.txt
      # Configure (copy .env.template → .env, update API keys)
      # Load data
      python -m tools.db_load https://feeds.libsyn.com/121695/rss Behind-the-Tech
      # Run
      python app-file.py
      ```
    * **Verification**: Visit `localhost:8000` to check that the NLWeb server is working.
    * **CLI tool**: The repository includes a CLI tool to simplify configuration, testing, and execution, but the author could not get it working.
6. **Glama NLWeb server**: The author built a simple NLWeb server that can be used to query the MCP server directory.
    * **Example request**:
      ```bash
      curl -X POST https://glama.ai/nlweb/ask \
        -H "Content-Type: application/json" \
        -d '{"query": "MCP servers for working with GitHub"}'
      ```
    * **Other features**: It can continue a conversation, and summarize or generate results.
    * **Why it was easy to implement**: Embeddings and a vector store for the MCP servers already existed, along with a way to call LLMs.
7. **REST API**: NLWeb exposes two APIs at the `/ask` and `/mcp` endpoints, with essentially the same parameters.
    * **`/mcp` endpoint**: Returns results in a format MCP clients can use and supports the core MCP methods.
    * **`/ask` endpoint**:
        - **Parameters**:
            - `query`: Natural-language question
            - `site`: Scope to a specific data subset
            - `prev`: Comma-separated previous queries
            - `decontextualized_query`: Skips decontextualization if provided
            - `streaming`: Enables SSE streaming
            - `query_id`: Tracks the conversation
            - `mode`: `list`, `summarize`, or `generate`
8. **MCP integration**: NLWeb includes an MCP server by default, and Claude for Desktop can be configured to talk to it.
    * **Configuration**: Add the relevant configuration to `claude_desktop_config.json`.

9. **Implementation reality**: With existing Schema.org markup or an RSS feed, a basic prototype can be running quickly.
    * **What's straightforward**:
        - Loading RSS feeds or Schema.org data
        - Basic search functionality with the provided prompts
        - Local development with Qdrant
    * **What takes more effort**:
        - Production deployment at scale
        - Optimizing the 50+ LLM calls per query
        - Custom prompt engineering for your domain
        - Maintaining data freshness between the vector store and live data
    * **Cost considerations**: The cost of the LLM calls needs to be accounted for.

10. **Should you care?**
    * **Yes if**: You already have structured data, want conversational search beyond keywords, need programmatic AI-agent access via MCP, and can experiment with early-stage technology.
    * **No if**: You need battle-tested production code, cannot absorb significant LLM API costs, your content is poorly structured, or you expect plug-and-play simplicity.

11. **Bottom line**: NLWeb makes more sense as a strategic direction than as current technology. Developed by R.V. Guha, creator of Schema.org, RSS, and RDF, it has proven viability, but going from prototype to production still takes work.
Lines changed: 218 additions & 0 deletions
@@ -0,0 +1,218 @@
---
layout: post
---
Title: What is NLWeb?

URL Source: https://glama.ai/blog/2025-06-01-what-is-nlweb

Microsoft recently open-sourced [NLWeb](https://github.com/microsoft/NLWeb), a protocol for adding conversational interfaces to websites.[1](https://glama.ai/blog/2025-06-01-what-is-nlweb#user-content-fn-1) It leverages [Schema.org](https://schema.org/) structured data that many sites already have and includes built-in support for MCP (Model Context Protocol), enabling both human conversations and agent-to-agent communication.

**The key idea:** NLWeb creates a standard protocol that turns any website into a conversational interface that both humans and AI agents can query naturally.

**Want to try NLWeb?**

As part of writing this post, I've implemented NLWeb search for MCP servers. Refer to the "Glama NLWeb Server" section for examples.

What Problem Does NLWeb Solve?
------------------------------

Currently, websites have structured data (Schema.org) but no standard way for AI agents or conversational interfaces to access it. Every implementation is bespoke. Traditional search interfaces struggle with context-aware, multi-turn queries.

NLWeb creates a standard protocol for conversational access to web content. What RSS did for syndication, NLWeb aims to do for AI interactions: one implementation serves both human chat interfaces and programmatic agent access.

The key insight: instead of building custom NLP for every site, NLWeb leverages LLMs' existing understanding of Schema.org to create instant conversational interfaces.

The real power comes from multi-turn conversations that preserve context:

1. "Find recipes for dinner parties"
2. "Only vegetarian options"
3. "That can be prepared in under an hour"

Each query builds on the previous context - something traditional search interfaces struggle with.

How NLWeb Works
---------------

### Two-Component System

1. **Protocol Layer**: REST API (`/ask` endpoint) and MCP server (`/mcp` endpoint) that accept natural language queries and return Schema.org JSON responses
2. **Implementation Layer**: Reference implementation that orchestrates multiple LLM calls for query processing

### Query Processing Pipeline

    User Query → Parallel Pre-processing → Vector Retrieval → LLM Ranking → Response
                 ├─ Relevancy Check
                 ├─ Decontextualization
                 ├─ Memory Detection
                 └─ Fast Track Path

In this flow, a single query may trigger 50+ targeted LLM calls for:

* Query decontextualization based on conversation history
* Relevancy scoring against site content
* Result ranking with custom prompts per content type
* Optional post-processing (summarization/generation)

The "fast track" optimization launches a parallel path to retrieval (step 3) while pre-processing occurs, but results are blocked until relevancy checks complete[2](https://glama.ai/blog/2025-06-01-what-is-nlweb#user-content-fn-2).
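That gating behavior — launch retrieval early, but hold the response until the relevancy check passes — can be sketched with asyncio. The function names and canned data below are illustrative stand-ins, not NLWeb's actual internals:

```python
import asyncio

# Illustrative stand-ins for NLWeb's internal steps (not its real API).
async def relevancy_check(query):
    await asyncio.sleep(0.01)  # simulated LLM call
    return "recipe" in query

async def vector_retrieval(query):
    await asyncio.sleep(0.02)  # simulated vector-store lookup
    return ["Vegetarian Lasagna", "Mushroom Risotto"]

async def handle_query(query):
    # Fast track: kick off retrieval in parallel with pre-processing...
    retrieval_task = asyncio.create_task(vector_retrieval(query))
    relevant = await relevancy_check(query)
    # ...but block on the relevancy check before releasing results.
    results = await retrieval_task
    return results if relevant else []

print(asyncio.run(handle_query("recipe for dinner")))
# → ['Vegetarian Lasagna', 'Mushroom Risotto']
```

The point of the pattern is that the retrieval latency overlaps with the pre-processing latency, so the gate costs almost nothing when the check passes.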
### Video Explanation

After I wrote this article, I was sent a video that includes a great introduction to NLWeb by R.V. Guha (creator of Schema.org, RSS, and RDF).

### Why 50+ LLM Calls?

Instead of using one large prompt to handle everything, NLWeb breaks each query into dozens of small, specific questions:

* "Is this query about recipes?"
* "Does it reference something mentioned earlier?"
* "Is the user asking to remember dietary preferences?"
* "How relevant is this specific result?"

This approach has two major benefits:

1. **No hallucination** - Results only come from your actual database
2. **Better accuracy** - Each LLM call has one clear job it can do well

Think of it like having a team of specialists instead of one generalist.

Even if you don't use NLWeb, this pattern—using many focused LLM calls instead of one complex prompt—is worth borrowing.
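A minimal sketch of that pattern, with a stubbed `ask_llm` standing in for a real model call (none of these names come from NLWeb itself):

```python
# Sketch of the "many focused LLM calls" pattern. ask_llm is a stub
# standing in for a real model call; the prompts are illustrative.
def ask_llm(prompt):
    canned = {
        "Is this query about recipes? Query: vegetarian dinner ideas": "yes",
        "Does it reference something mentioned earlier? Query: vegetarian dinner ideas": "no",
    }
    return canned.get(prompt, "no")

def preprocess(query):
    # Each check is one small LLM call with a single, verifiable job.
    return {
        "about_recipes": ask_llm(f"Is this query about recipes? Query: {query}") == "yes",
        "needs_context": ask_llm(f"Does it reference something mentioned earlier? Query: {query}") == "yes",
    }

print(preprocess("vegetarian dinner ideas"))
# → {'about_recipes': True, 'needs_context': False}
```

Because every call answers one narrow yes/no question, each answer is easy to validate, and a wrong answer affects only one branch of the pipeline rather than the whole response.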
Quick Start
-----------

The best way to wrap your head around NLWeb is to try it out.

Microsoft provides a [quick start guide](https://github.com/microsoft/NLWeb/blob/main/docs/nlweb-hello-world.md) for setting up an example NLWeb server with the [Behind The Tech](https://www.microsoft.com/en-us/behind-the-tech) RSS feed.

```bash
# Setup
git clone https://github.com/microsoft/NLWeb
cd NLWeb
python -m venv myenv
source myenv/bin/activate
cd code
pip install -r requirements.txt
# Configure (copy .env.template → .env, update API keys)
# Load data
python -m tools.db_load https://feeds.libsyn.com/121695/rss Behind-the-Tech
# Run
python app-file.py
```

Go to [localhost:8000](http://localhost:8000/) and you should have a working NLWeb server.

I have also noticed that the repository contains a [CLI](https://github.com/microsoft/NLWeb/blob/main/docs/nlweb-cli.md) to simplify configuration, testing, and execution of the application. However, I struggled to get it working.

Once you have the server running, you can ask it questions like:

```bash
curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{
    "query": "tell me more about the first one",
    "prev": "find podcasts about AI,what topics do they cover"
  }'
```

which will return a JSON response like:

```json
{
  "query_id": "abc123",
  "results": [{
    "url": "https://...",
    "name": "AI Safety with Stuart Russell",
    "score": 85,
    "description": "Discussion on alignment challenges...",
    "schema_object": { "@type": "PodcastEpisode", ... }
  }]
}
```
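A client consuming a response of that shape might pull out the ranked results like this (a minimal sketch; the sample payload mirrors the structure shown above, with made-up data):

```python
# Minimal sketch of handling an /ask response; the payload mirrors the
# response shape shown above, with made-up sample data.
response = {
    "query_id": "abc123",
    "results": [
        {"url": "https://example.com/ep2", "name": "Episode 2", "score": 61},
        {"url": "https://example.com/ep1", "name": "Episode 1", "score": 85},
    ],
}

# Order results by NLWeb's LLM-assigned relevance score, best first.
ranked = sorted(response["results"], key=lambda r: r["score"], reverse=True)
print([r["name"] for r in ranked])
# → ['Episode 1', 'Episode 2']
```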
### Glama NLWeb Server

As part of writing this post, I've built a simple NLWeb server using Node.js. You can use it to query our [MCP server](https://glama.ai/mcp/servers) directory:

```bash
curl -X POST https://glama.ai/nlweb/ask \
  -H "Content-Type: application/json" \
  -d '{"query": "MCP servers for working with GitHub"}'
```

As far as I can tell, this is the first ever public NLWeb endpoint!

Due to the volume of LLM calls, it takes a few seconds to respond.

Or, if you want to continue the conversation:

```bash
curl -X POST https://glama.ai/nlweb/ask \
  -H "Content-Type: application/json" \
  -d '{
    "query": "servers that can create PRs",
    "prev": "MCP servers for working with GitHub"
  }'
```

Or, if you want to summarize the results:

```bash
curl -X POST https://glama.ai/nlweb/ask \
  -H "Content-Type: application/json" \
  -d '{
    "query": "MCP servers for working with GitHub",
    "mode": "summarize"
  }'
```

This is useful when you want an overview rather than just a list of results.

Or, if you want to generate a response:

```bash
curl -X POST https://glama.ai/nlweb/ask \
  -H "Content-Type: application/json" \
  -d '{
    "query": "MCP servers for working with GitHub",
    "mode": "generate"
  }'
```

This mode attempts to answer the question using the retrieved results (like traditional RAG).

Things that made it easy to implement:

* We have existing embeddings for every MCP server and a vector store
* We already have a way to make LLM calls

A few questions came to mind as I was implementing this:

* It seems that NLWeb doesn't dictate where the `/ask` endpoint needs to be hosted—does it have to be `https://glama.ai/ask` or can it be `https://glama.ai/nlweb/ask`?
* It wasn't super clear to me which Schema.org data is best suited to describe MCP servers.

Not surprisingly, the slowest part of the pipeline is the LLM calls.

REST API
--------

Currently, NLWeb supports two APIs at the endpoints `/ask` and `/mcp`. The arguments are the same for both, as is most of the functionality. The `/mcp` endpoint returns the answers in a format that MCP clients can use. The `/mcp` endpoint also supports the core MCP methods (`list_tools`, `list_prompts`, `call_tool`, and `get_prompt`).

The `/ask` endpoint supports the following parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| `query` | `string` | Natural language question |
| `site` | `string` | Scope to specific data subset |
| `prev` | `string` | Comma-separated previous queries |
| `decontextualized_query` | `string` | Skip decontextualization if provided |
| `streaming` | `bool` | Enable SSE streaming |
| `query_id` | `string` | Track conversation |
| `mode` | `string` | `list`, `summarize`, or `generate` |
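Putting several of those parameters together, a follow-up query in `summarize` mode could be issued from Python's standard library like this (a sketch; the endpoint URL follows the Glama examples above, so adjust it for your own server):

```python
import json
import urllib.request

# Sketch of an /ask request combining the parameters from the table above.
# The endpoint URL follows the Glama examples; adjust for your own server.
payload = {
    "query": "servers that can create PRs",
    "prev": "MCP servers for working with GitHub",  # prior turns, comma-separated
    "mode": "summarize",                            # list | summarize | generate
}

req = urllib.request.Request(
    "https://glama.ai/nlweb/ask",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# response = urllib.request.urlopen(req)  # uncomment to actually send it
print(req.get_method(), req.full_url)
# → POST https://glama.ai/nlweb/ask
```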
Integrating with MCP
--------------------

Since NLWeb includes an MCP server by default, you can configure Claude for Desktop to talk to NLWeb.

If you already have the NLWeb server running, this should be as simple as adding the following to your `~/Library/Application Support/Claude/claude_desktop_config.json` configuration:

```json
{
  "mcpServers": {
    "ask_nlw": {
      "command": "/Users/yourname/NLWeb/myenv/bin/python",
      "args": [
        "/Users/yourname/NLWeb/code/chatbot_interface.py",
        "--server", "http://localhost:8000",
        "--endpoint", "/mcp"
      ],
      "cwd": "/Users/yourname/NLWeb/code"
    }
  }
}
```

Implementation Reality
----------------------

The documentation suggests you can get a basic prototype running quickly if you have existing Schema.org markup or RSS feeds.

**What's actually straightforward:**

* Loading RSS feeds or Schema.org data
* Basic search functionality with provided prompts
* Local development with Qdrant

**What requires more effort:**

* Production deployment at scale
* Optimizing 50+ LLM calls per query (mentioned in docs)
* Custom prompt engineering for your domain
* Maintaining data freshness between vector store and live data

I already had a lot of these components in place, so I was able to get a basic prototype running in an hour. However, to make this production-ready, I'd need to spend a lot more time thinking about the cost of the LLM calls.
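To see why that cost deserves attention, here is a back-of-the-envelope estimate. All numbers below are illustrative assumptions (only the 50-calls figure comes from the docs), not measurements:

```python
# Back-of-the-envelope cost estimate per query. All numbers are
# illustrative assumptions; only the 50-calls figure is from the docs.
llm_calls_per_query = 50          # the "50+ calls" figure
tokens_per_call = 500             # assumed prompt + completion tokens
price_per_1k_tokens = 0.0005      # assumed cheap-model price, USD

cost_per_query = llm_calls_per_query * tokens_per_call / 1000 * price_per_1k_tokens
print(f"${cost_per_query:.4f} per query")                    # → $0.0125 per query
print(f"${cost_per_query * 100_000:,.0f} per 100k queries")  # → $1,250 per 100k queries
```

Even at cheap-model prices the per-query cost is dominated by call count, which is why batching, caching, or pruning some of the 50+ calls matters for production use.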
Should You Care?
----------------

**Yes if:**

* You have structured data (Schema.org, RSS) already
* You want to enable conversational search beyond keywords
* You need programmatic AI agent access via MCP
* You can experiment with early-stage tech

**No if:**

* You need battle-tested production code
* You can't handle significant LLM API costs
* Your content isn't well-structured
* You expect plug-and-play simplicity

Bottom Line
-----------

NLWeb is more interesting as a strategic direction than as current technology. NLWeb was conceived and developed by R.V. Guha (creator of Schema.org, RSS, and RDF), now a CVP and Technical Fellow at Microsoft[3](https://glama.ai/blog/2025-06-01-what-is-nlweb#user-content-fn-3). That's serious pedigree.

The O'Reilly prototype proves it's viable for content-heavy sites. The quick start shows it's approachable for developers. But "prototype in days" doesn't mean "production in weeks."

Think of it as an investment in making your content natively conversational. The technical foundation is solid—REST API, standard formats, proven vector stores. The vision is compelling. The code needs work.

Want to experiment? Clone the repo and try the quick start above.

Footnotes
---------

1. [https://news.microsoft.com/source/features/company-news/introducing-nlweb-bringing-conversational-interfaces-directly-to-the-web/](https://news.microsoft.com/source/features/company-news/introducing-nlweb-bringing-conversational-interfaces-directly-to-the-web/)
2. [https://github.com/microsoft/NLWeb](https://github.com/microsoft/NLWeb)
3. [https://techcommunity.microsoft.com/blog/azure-ai-services-blog/nlweb-pioneer-qa-oreilly/4415299](https://techcommunity.microsoft.com/blog/azure-ai-services-blog/nlweb-pioneer-qa-oreilly/4415299)

data.json

Lines changed: 6 additions & 0 deletions
@@ -370,5 +370,11 @@
    "title": "[ On | No ] syntactic support for error handling - The Go Programming Language",
    "url": "https://go.dev/blog/error-syntax",
    "timestamp": 1749175237
  },
  {
    "month": "202506",
    "title": "What is NLWeb?",
    "url": "https://glama.ai/blog/2025-06-01-what-is-nlweb",
    "timestamp": 1749175326
  }
]

summary.md

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@
Reads bookmarks from bookmark-collection, fetches the text content with jina reader, then summarizes the text with an LLM. See process_changes.py for the detailed implementation. Must be used together with the GitHub Action in bookmark-collection.

## Summarized Bookmarks

- (2025-06-06) [What is NLWeb?](_posts/202506/2025-06-06-what-is-nlweb.md)
- (2025-06-06) [[ On | No ] syntactic support for error handling - The Go Programming Language](_posts/202506/2025-06-06-%5B-on-no-%5D-syntactic-support-for-error-handling---the-go-programming-language.md)
- (2025-06-05) [Lossless image compression for repository images with GitHub Actions](_posts/202506/2025-06-05-%E5%88%A9%E7%94%A8github-actions%E8%87%AA%E5%8A%A8%E5%AF%B9%E4%BB%93%E5%BA%93%E5%86%85%E5%9B%BE%E7%89%87%E8%BF%9B%E8%A1%8C%E6%97%A0%E6%8D%9F%E5%8E%8B%E7%BC%A9.md)
- (2025-06-04) [Embedding bilibili and YouTube videos in Hugo with shortcodes](_posts/202506/2025-06-04-hugo%E4%BD%BF%E7%94%A8shortcode%E6%8F%92%E5%85%A5bilibili%E3%80%81youtube%E8%A7%86%E9%A2%91.md)
