Pixiu 2026 work plan #858

AlexStocks · 2026-01-09T06:59:54Z

AlexStocks
Jan 9, 2026
Collaborator

With the release of 1.1.0 (https://mp.weixin.qq.com/s/u42e_NKe8T6ayhFaxHR48Q), Pixiu still faces the following issues:

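1 AI capabilities need to keep advancing
Two areas need work. The first is the set of capabilities still missing for the evolution of the AI inference plane, together with the metrics required for KV-cache offloading. The second is cost accounting: the accounting conventions currently differ from model to model. The following documents can serve as references:
- 1.1 Dubbo-Go-Pixiu AI inference: detailed implementation plan #859
- 1.2 Distributed KV Cache scheme for LLM inference #860
  - https://mp.weixin.qq.com/s/QJ_G4VLqYqhB_7Y4deVSYA Building an engineering-grade RAG Gateway: a practical approach and technical roadmap for RAG support in Dubbo-Go-Pixiu
  - https://mp.weixin.qq.com/s/Ry2-k3N4CUCP4Azb0asoTA Building an efficient stateless AI gateway: deep integration of Dubbo-Go-Pixiu with SGLang/vLLM
    - https://mp.weixin.qq.com/s/M-6dZ53iK-0j92wkcjZHiw Dubbo-Go-Pixiu 2026 roadmap: KVCache and LLM inference acceleration
    - https://github.com/percent4/embedding_rerank_retrieval Evaluation experiments on recall techniques and algorithm quality for the Retrieve stage of RAG, built mainly on LlamaIndex.
    - https://github.com/kaori-seasons/recommnd-supplier-system AI-based supplier recommendation and procurement system
- 1.3 Token counting
  Last year, when sentinel-go explored token-based rate limiting driven by token estimation during the Open Source Promotion Plan (开源之夏), a student investigated how tokens are counted. Besides tiktoken-go, it turned out that some vendors also provide APIs for counting tokens. Earlier measurements showed that, for some vendors, the tiktoken-go result differs somewhat from the actual token count, so counting tokens directly through the vendor API is worth considering as a more accurate implementation.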

Below are some of the vendor-provided APIs found in that earlier investigation:

mistral: https://docs.mistral.ai/guides/tokenization/
gemini: https://ai.google.dev/gemini-api/docs/tokens?hl=zh-cn&lang=go
claude (Anthropic): https://docs.anthropic.com/zh-TW/docs/build-with-claude/token-counting
Tencent Hunyuan: https://cloud.tencent.com/document/product/1729/101835
ByteDance Doubao: https://www.volcengine.com/docs/82379/1528728
Zhipu: https://docs.bigmodel.cn/api-reference/%E6%A8%A1%E5%9E%8B-api/%E6%96%87%E6%9C%AC%E5%88%86%E8%AF%8D%E5%99%A8

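2 Many parameters are hard-coded and need to be moved into pixiu-admin;
3 Upgrade pixiu's dubbogo capabilities, e.g. generic invocation needs to be upgraded to dubbo v3;
4 API gateway enhancements: provide pluggable request validation + request processing, so that business teams no longer need to care about basic checks such as field length, required fields, and enums, with everything driven by the OpenAPI specification. See the [FEATURE] issue of the same title, #857.
5 Dynamic LDS: deliver Listener configuration dynamically through the xDS/pixiu-admin control plane, without restarting the process or editing static config files.
6 Continue evolving Pixiu Ingress.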
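
To make item 4 more concrete, here is a minimal sketch of the kind of check a validation filter could run before a request reaches the business service. `FieldRule` and `Validate` are hypothetical names; the struct is a simplified stand-in for the `required`, `maxLength`, and `enum` constraints an OpenAPI schema can declare, and a real filter would load the rules from the OpenAPI document rather than from Go literals.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// FieldRule is a hypothetical, simplified stand-in for the constraints an
// OpenAPI schema can declare on a string field.
type FieldRule struct {
	Name      string
	Required  bool
	MaxLength int      // 0 means no limit
	Enum      []string // empty means any value is allowed
}

// Validate checks the string fields of a request against the rules and
// returns one message per violation, in rule order.
func Validate(rules []FieldRule, fields map[string]string) []string {
	var problems []string
	for _, r := range rules {
		v, ok := fields[r.Name]
		if !ok {
			if r.Required {
				problems = append(problems, fmt.Sprintf("%s: required field is missing", r.Name))
			}
			continue
		}
		if r.MaxLength > 0 && utf8.RuneCountInString(v) > r.MaxLength {
			problems = append(problems, fmt.Sprintf("%s: longer than %d characters", r.Name, r.MaxLength))
		}
		if len(r.Enum) > 0 && !contains(r.Enum, v) {
			problems = append(problems, fmt.Sprintf("%s: %q is not an allowed value", r.Name, v))
		}
	}
	return problems
}

// contains reports whether s is one of the values in list.
func contains(list []string, s string) bool {
	for _, item := range list {
		if item == s {
			return true
		}
	}
	return false
}
```

A gateway filter built this way could reject the request with a 4xx response whenever `Validate` returns a non-empty slice, so the backend only sees requests that already satisfy the declared schema.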

AlexStocks · 2026-01-09T10:09:55Z

AlexStocks
Jan 9, 2026
Collaborator Author

Static LDS (traditional)

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8080

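Characteristics:
1 Changing the config = restarting pixiu
2 Not suitable for large-scale / multi-tenant / Mesh deployments
3 Listeners cannot be delivered on demand

✅ Dynamic LDS (the one you asked about)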

dynamic_resources:
  lds_config:
    ads: {}

After Pixiu starts:
1 It connects to the xDS Server (Istio / Control Plane / in-house)
2 It subscribes to the Listener list
3 The control plane can push / revoke / update listeners at any time
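
The three steps above imply that the data plane must reconcile its running listeners against whatever the control plane pushes. The sketch below shows only that reconciliation step, under the assumption that each push carries the full desired listener set. `Listener`, `ListenerTable`, and `Diff` are hypothetical types, not Pixiu's actual xDS code, and a real implementation would also open and close sockets and guard the table with a lock.

```go
package main

import "sort"

// Listener is a hypothetical, trimmed-down listener config.
type Listener struct {
	Name    string
	Address string
	Port    int
}

// Diff lists the listener names affected by one update.
type Diff struct {
	Added, Updated, Removed []string
}

// ListenerTable holds the currently active listeners, keyed by name.
// It is not safe for concurrent use.
type ListenerTable struct {
	active map[string]Listener
}

func NewListenerTable() *ListenerTable {
	return &ListenerTable{active: map[string]Listener{}}
}

// Apply treats snapshot as the full desired set: names not seen before are
// added, names whose config changed are updated, and active names missing
// from the snapshot are removed.
func (t *ListenerTable) Apply(snapshot []Listener) Diff {
	var d Diff
	desired := make(map[string]Listener, len(snapshot))
	for _, l := range snapshot {
		desired[l.Name] = l
	}
	for name, l := range desired {
		old, ok := t.active[name]
		switch {
		case !ok:
			d.Added = append(d.Added, name)
		case old != l:
			d.Updated = append(d.Updated, name)
		}
	}
	for name := range t.active {
		if _, ok := desired[name]; !ok {
			d.Removed = append(d.Removed, name)
		}
	}
	t.active = desired
	// Map iteration order is random, so sort for stable output.
	sort.Strings(d.Added)
	sort.Strings(d.Updated)
	sort.Strings(d.Removed)
	return d
}
```

Returning a `Diff` rather than mutating sockets directly keeps the reconciliation logic testable; the caller decides how to start, drain, or stop each affected listener.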

Ray7788 · 2026-01-09T14:19:04Z

Ray7788
Jan 9, 2026

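There is no need to worry much about the evolution side. I have already organized both the docs and the design so that it can proceed smoothly. For example, the tasks include traffic management; although I only implemented the HTTP part, it is a good reference example of the feature, and triple can follow. The second area is security, the third is observability, then the AI gateway, and so on. All of these are really content from dubbogopixiusample; in the end it comes down to continuing to extend the Gateway API.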

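Next I also want to support hot reload, in the style of Prometheus. Because pixiu uses a static file, a ConfigMap change currently takes effect only after the pixiu proxy is restarted. There are also the pixiu helm charts, and so on.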

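pixiu does not currently use the Ingress object, but it works the same way as Ingress, or even better. As the docs also mention, Ingress can only be extended through annotations, whereas Gateway can be extended with custom APIs, which leaves plenty of room to build on.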

Oxidaner · 2026-01-09T14:58:09Z

Oxidaner
Jan 9, 2026

Please let me try the token computing functions
