Commit cf52321

Merge branch 'main' into fix/pii-lora-auto-detect
2 parents 10f2a66 + ba207ab commit cf52321

9 files changed: +1169 -3 lines changed
Lines changed: 212 additions & 0 deletions
@@ -0,0 +1,212 @@
+name: Anti-Spam Comment Moderator
+
+on:
+  issue_comment:
+    types: [created, edited]
+  pull_request_review_comment:
+    types: [created, edited]
+
+permissions:
+  issues: write # needed to delete issue comments
+  pull-requests: write # needed to delete PR review comments
+  contents: write # needed to delete commit comments
+  # (discussions not handled here; API differs)
+
+jobs:
+  moderate:
+    if: ${{ github.event.action == 'created' || github.event.action == 'edited' }}
+    runs-on: ubuntu-latest
+    steps:
+      - name: Run spam filter
+        uses: actions/github-script@v7
+        with:
+          script: |
+            // 1) Collect event/comment info
+            const ev = context.eventName;
+            const comment = context.payload.comment || {};
+            const body = (comment.body || "").trim();
+            const bodyLower = body.toLowerCase();
+            const assoc = comment.author_association || "NONE";
+            const actor = comment.user?.login || "unknown";
+            const owner = context.repo.owner;
+            const repo = context.repo.repo;
+
+            // Block specific user outright
+            if ((actor || "").toLowerCase() === "phuole818") {
+              try {
+                if (ev === "issue_comment") {
+                  await github.rest.issues.deleteComment({ owner, repo, comment_id: comment.id });
+                  core.notice(`Deleted comment from blocked user @${actor} (issue comment).`);
+                } else if (ev === "pull_request_review_comment") {
+                  await github.rest.pulls.deleteReviewComment({ owner, repo, comment_id: comment.id });
+                  core.notice(`Deleted comment from blocked user @${actor} (PR review comment).`);
+                } else if (ev === "commit_comment") {
+                  await github.rest.repos.deleteCommitComment({ owner, repo, comment_id: comment.id });
+                  core.notice(`Deleted comment from blocked user @${actor} (commit comment).`);
+                } else {
+                  core.warning(`Unhandled event while blocking user: ${ev}`);
+                }
+              } catch (err) {
+                core.setFailed(`Failed to delete blocked user's comment: ${err?.message || err}`);
+              }
+              return;
+            }
+
+            // 2) Skip trusted roles or explicitly allowed text
+            const trustedRoles = new Set(["OWNER","MEMBER","COLLABORATOR"]);
+            if (trustedRoles.has(assoc)) {
+              core.info(`Skipping trusted author (${assoc}) @${actor}`);
+              return;
+            }
+            if (/#allow|#nospamfilter/i.test(body)) {
+              core.info("Skipping due to explicit allow tag in comment.");
+              return;
+            }
+
+            // 3) Heuristic + sentiment-lite checks
+            // Link analysis with domain allowlist (do not penalize common safe docs/code links)
+            const safeDomains = [
+              "github.com","docs.github.com","githubusercontent.com","gitlab.com","bitbucket.org",
+              "readthedocs.io","arxiv.org","pypi.org","npmjs.com","crates.io","stackoverflow.com","stackexchange.com"
+            ];
+            const urlMatches = (body.match(/https?:\/\/[^\s)]+/gi) || []);
+            let safeLinkCount = 0;
+            let suspiciousLinkCount = 0;
+            for (const u of urlMatches) {
+              try {
+                const h = new URL(u).hostname.replace(/^www\./i, "");
+                const isShortHost = /^(bit\.ly|t\.co|tinyurl\.com|goo\.gl|ow\.ly)$/i.test(h);
+                const isSafe = safeDomains.some(d => h === d || h.endsWith(`.${d}`));
+                if (isSafe && !isShortHost) safeLinkCount += 1;
+                else suspiciousLinkCount += 1;
+              } catch {
+                suspiciousLinkCount += 1;
+              }
+            }
+            const linkCount = urlMatches.length;
+            const emailCount = (body.match(/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}/gi) || []).length;
+            const phoneCount = (body.match(/(\+?\d[\d\s().-]{8,}\d)/g) || []).length;
+            const mentions = (body.match(/@\w{1,39}/g) || []).length;
+            const exclaimBlk = /!{3,}/.test(body);
+            const repeatedChr = /(.)\1{6,}/.test(body);
+            const shortened = /https?:\/\/(?:bit\.ly|t\.co|tinyurl\.com|goo\.gl|ow\.ly)\//i.test(body);
+
+            const lettersOnly = body.replace(/\s/g, "");
+            const uniqueRatio = lettersOnly.length ? (new Set(lettersOnly).size / lettersOnly.length) : 1;
+            const lowUnique = lettersOnly.length > 80 && uniqueRatio < 0.30;
+
+            // English/ASCII spam terms (word-boundary safe)
+            const blacklistAscii = [
+              "whatsapp","telegram","crypto","forex","investment","binary options","broker",
+              "dm me","contact me","private message","girls","porn","xxx","nude","sex",
+              "loan approval","free followers","click here","visit my profile","earn $","% off",
+              "sugar daddy","promo code","join my group","passive income","weixin","vx","wx"
+            ];
+            // Chinese/CJK spam phrases (substring match; \b doesn't work for CJK)
+            const blacklistCJK = [
+              "微信","加我微信","添加微信","VX","V信","私信","联系我","电报","比特币","加密货币","外汇","投资","理财","二元期权",
+              "裸聊","色情","黄片","成人网站","约炮","兼职","推广","优惠","促销","关注我","点击这里","访问我的主页","我的主页",
+              "加入群","交流群","被动收入","糖爹","金主","优惠码","贷款","快速贷款","网贷","免费粉丝","粉丝增长",
+              "赚快钱","快速赚钱","轻松赚钱","保证收益","零风险","无风险","稳赚","返利","优惠券"
+            ];
+            const asciiHit = blacklistAscii.some(k => new RegExp(`\\b${k.replace(/[-/\\^$*+?.()|[\]{}]/g, "\\$&")}\\b`, "i").test(body));
+            const cjkHit = blacklistCJK.some(k => body.includes(k));
+            const keywordHit = asciiHit || cjkHit;
+            const hype = /(100%|guarantee|risk[- ]?free|no (fees|risk)|quick money|make money)/i.test(body) ||
+              /(保证|无风险|零风险|快速赚钱|轻松赚钱|立即联系|添加微信|加我微信|稳赚|包赚)/.test(body);
+
+            // Attack/Insult/Tech-context term lists (EN + CJK)
+            const attackTermsAscii = [
+              "fake stars","astroturf","bot accounts","paid stars","star farming","star boosting","shill",
+              "manipulated stars","kpi","kpi boosting","no maintainer","ignore issues","ignore prs",
+              "close pr","close issue","no response","waste of time","trash project","scam project",
+              "archive this project","unmaintained","low quality docs","unreadable docs","pitfall","avoid this project"
+            ];
+            const attackTermsCJK = [
+              "刷星","水军","kpi刷单","假号","买粉","造假","刷榜",
+              "别踩坑","大坑","浪费时间","赶紧换","不靠谱","建议归档","建议archive",
+              "没人理你","没人管","装没看见","秒关","石沉大海",
+              "问题一大堆","一塌糊涂","堪忧","离谱","看不懂","入不了门",
+              "警告","大踩雷","失望透顶","全靠刷星","社区大踩雷"
+            ];
+            const insultTermsAscii = [
+              "trash","garbage","bullshit","idiot","moron","stupid","dumb","shameful","useless"
+            ];
+            const insultTermsCJK = [
+              "垃圾","辣鸡","废物","弱智","傻逼","脑残","狗屎","丢人"
+            ];
+            const techContextAscii = [
+              "bug","repro","reproduce","steps to reproduce","minimal repro","expected","actual",
+              "stack trace","traceback","stacktrace","log","logs","error","panic","poc","cve",
+              "version","v1","v2","v3","config","configuration","file","line","code snippet"
+            ];
+            const techContextCJK = [
+              "复现","复现步骤","最小复现","期望行为","实际行为","堆栈","栈追踪","日志","报错",
+              "版本","配置","文件","行号","代码片段","poc","cve"
+            ];
+
+            const escapeRe = (s) => s.replace(/[-/\\^$*+?.()|[\]{}]/g, "\\$&");
+            const countMatchesAscii = (terms) =>
+              terms.reduce((n, k) => n + (new RegExp(`\\b${escapeRe(k)}\\b`, "i").test(body) ? 1 : 0), 0);
+            const countMatchesCJK = (terms) =>
+              terms.reduce((n, k) => n + (body.includes(k) ? 1 : 0), 0);
+
+            const attackHits = countMatchesAscii(attackTermsAscii) + countMatchesCJK(attackTermsCJK);
+            const insultHit = (countMatchesAscii(insultTermsAscii) + countMatchesCJK(insultTermsCJK)) > 0;
+            const techCtxHit = (countMatchesAscii(techContextAscii) + countMatchesCJK(techContextCJK)) > 0;
+            const strongCJK = /(失望透顶|离谱|警告|大踩雷)/.test(body);
+
+            // Sentiment-lite (AFINN-style mini-lexicon)
+            const afinn = {
+              "amazing": 2, "great": 2, "free": 1, "guaranteed": -1,
+              "scam": -3, "profit": 1, "winner": 1, "urgent": -1, "risk-free": -2
+            };
+            const tokens = body.toLowerCase().split(/[^a-z0-9+\-]+/);
+            let sentiment = 0;
+            for (const t of tokens) if (afinn[t] != null) sentiment += afinn[t];
+
+            // Score: Only use attack/insult signals for blocking (ignore links/emails/phones)
+            let points = 0;
+            // Attack/insult scoring with guardrails for technical context
+            let attackContribution = 0;
+            if (insultHit) attackContribution += 2;
+            if (attackHits >= 3) attackContribution += 2;
+            else if (attackHits >= 1) attackContribution += 1;
+            if ((exclaimBlk || strongCJK) && attackContribution > 0) attackContribution += 1;
+            if (techCtxHit) attackContribution = Math.min(1, attackContribution); // cap if technical context detected
+            points += attackContribution;
+
+            core.info(`Spam score for @${actor} = ${points} (attackOnly; links/emails/phones ignored) (links:${linkCount} safe:${safeLinkCount} suspicious:${suspiciousLinkCount}, emails:${emailCount}, phones:${phoneCount}, mentions:${mentions}, sentiment:${sentiment}, attackHits:${attackHits}, insult:${insultHit}, techCtx:${techCtxHit})`);
+
+            // Only block when attack/insult crosses threshold
+            const isSpam = attackContribution >= 2; // adjust threshold if needed
+            if (!isSpam) {
+              core.info("Comment not flagged as spam.");
+              return;
+            }
+
+            // 4) Delete the comment using the appropriate endpoint
+            try {
+              if (ev === "issue_comment") {
+                await github.rest.issues.deleteComment({
+                  owner, repo, comment_id: comment.id
+                });
+                core.notice(`Deleted spam issue comment from @${actor}.`);
+              } else if (ev === "pull_request_review_comment") {
+                await github.rest.pulls.deleteReviewComment({
+                  owner, repo, comment_id: comment.id
+                });
+                core.notice(`Deleted spam PR review comment from @${actor}.`);
+              } else if (ev === "commit_comment") {
+                await github.rest.repos.deleteCommitComment({
+                  owner, repo, comment_id: comment.id
+                });
+                core.notice(`Deleted spam commit comment from @${actor}.`);
+              } else {
+                core.warning(`Unhandled event: ${ev}`);
+              }
+            } catch (err) {
+              core.setFailed(`Failed to delete comment: ${err?.message || err}`);
+            }
+
+
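
The workflow's comment on the CJK list notes that `\b` does not work for CJK text, which is why ASCII terms are matched with word-boundary regexes and CJK terms with plain substring checks. The sketch below illustrates that difference in isolation. `hitsAscii` and `hitsCJK` are hypothetical helper names introduced here for illustration and are not part of the commit; `escapeRe` is copied from the workflow.

```javascript
// escapeRe is taken from the workflow; the two helpers below are
// hypothetical names used only for this illustration.
const escapeRe = (s) => s.replace(/[-/\\^$*+?.()|[\]{}]/g, "\\$&");

// ASCII terms: case-insensitive, must sit on word boundaries.
const hitsAscii = (terms, text) =>
  terms.some((k) => new RegExp(`\\b${escapeRe(k)}\\b`, "i").test(text));

// CJK terms: plain, case-sensitive substring match.
const hitsCJK = (terms, text) => terms.some((k) => text.includes(k));

// Word boundaries stop a short term from matching inside a longer word.
console.log(hitsAscii(["sex"], "Sussex County")); // false
console.log(hitsAscii(["vx"], "add my VX now"));  // true

// \b needs an ASCII word character on one side, so it never fires
// between two CJK characters; the substring check does match.
console.log(new RegExp("\\b微信\\b").test("请加我微信聊")); // false
console.log(hitsCJK(["微信"], "请加我微信聊"));             // true
```

One consequence of the substring approach is that short CJK terms match anywhere they appear, including inside longer phrases, so the CJK lists are likely to be more sensitive to false positives than the ASCII ones.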

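The scoring rules in step 3 of the script can be checked outside the workflow. The following is a minimal sketch, assuming the signals (`insultHit`, `attackHits`, `exclaimBlk`, `strongCJK`, `techCtxHit`) have already been computed as in the script above. `scoreAttack` is a hypothetical helper name, not a function from the commit; the arithmetic and the threshold of 2 are copied from the workflow.

```javascript
// Hypothetical helper (not part of the commit): mirrors the attack/insult
// scoring from the workflow so the threshold can be examined in isolation.
function scoreAttack({ insultHit, attackHits, exclaimBlk, strongCJK, techCtxHit }) {
  let contribution = 0;
  if (insultHit) contribution += 2;
  if (attackHits >= 3) contribution += 2;
  else if (attackHits >= 1) contribution += 1;
  // Emphasis only amplifies a comment that already scored.
  if ((exclaimBlk || strongCJK) && contribution > 0) contribution += 1;
  // Technical context caps the score below the deletion threshold.
  if (techCtxHit) contribution = Math.min(1, contribution);
  return { contribution, isSpam: contribution >= 2 };
}

const base = { insultHit: false, attackHits: 0, exclaimBlk: false, strongCJK: false, techCtxHit: false };

const insultOnly = scoreAttack({ ...base, insultHit: true });
const oneAttackTerm = scoreAttack({ ...base, attackHits: 1 });
const angryBugReport = scoreAttack({ ...base, insultHit: true, attackHits: 2, exclaimBlk: true, techCtxHit: true });

console.log(insultOnly);     // { contribution: 2, isSpam: true }
console.log(oneAttackTerm);  // { contribution: 1, isSpam: false }
console.log(angryBugReport); // { contribution: 1, isSpam: false }
```

Under these rules a single insult term is enough to reach the threshold, while any comment that also contains a tech-context term (for example `file`, `log`, or `error`) is capped at 1 and is therefore never deleted.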
deploy/kubernetes/istio/README.md

Lines changed: 7 additions & 3 deletions
@@ -1,7 +1,11 @@
-# vLLM Semantic Router as ExtProc server for Istio Gateway
+# vLLM Semantic Router as ExtProc server for Istio Gateway
+
+This guide provides step-by-step instructions for deploying the vLLM Semantic Router (vSR) with Istio Gateway on Kubernetes. Istio Gateway uses Envoy under the covers, so it is possible to use vSR with it. Istio is a common choice of gateway with the Kubernetes Gateway API Inference Extension, in the LLM-D project, and in common Kubernetes distributions such as Red Hat OpenShift. In our experience, there are low-level differences in how different Envoy-based gateways process the ExtProc protocol to assist with LLM inference; hence this guide and some others cover the specific case of vSR working with an Istio-based gateway.
+
+There are multiple deployment guides in this repo related to vSR+Istio deployments. This document describes deployment of vSR with Istio Gateway and two local LLMs served using vLLM. Additional guides in this repo build on this deployment to add support for integrating LLM-D and to illustrate routing to remote/public cloud LLMs ([llm-d guide](../llmd-base/README.md) and [public llm routing guide](../llmd-base/llmd+public-llm/README.md)).
+
+With that background in mind, this guide describes the vSR + Istio + locally hosted LLMs use case. Afterwards, the reader may optionally follow the additional guides linked above to deploy the more advanced use cases.

-This guide provides step-by-step instructions for deploying the vLLM Semantic Router (vsr) with Istio Gateway on Kubernetes. Istio Gateway uses Envoy under the covers so it is possible to use vsr with it. However, there are differences between how different Envoy based Gateways process the ExtProc protocol, hence the deployment described here is different from the deployment of vsr along with other types of Envoy based Gateways as described in the other guides in this repo. There are multiple architecture options possible to combine Istio Gateway with vsr. This document describes one of the options.
-
 ## Architecture Overview

 The deployment consists of:
